1. 20 Apr, 2017 1 commit
  2. 27 Mar, 2017 2 commits
  3. 21 Mar, 2017 1 commit
  4. 30 Nov, 2016 1 commit
    • perf script: Add option to stop printing callchain · 64eff7d9
      Committed by David Ahern
      Allow user to specify list of symbols which cause the dump of callchains
      to stop at that symbol.
      
      Committer notes:
      
      Testing it:
      
        # perf record -ag usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.177 MB perf.data (33 samples) ]
        #
        # # Without it:
        #
        # perf script
        swapper   0 [000]  9693.370039:          1 cycles:ppp:
                        2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                       137ffeb start_kernel ([kernel.vmlinux].init.text)
                       137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
                       137f419 x86_64_start_kernel ([kernel.vmlinux].init.text)
      
        swapper   0 [000]  9693.370044:          1 cycles:ppp:
                        20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                       137ffeb start_kernel ([kernel.vmlinux].init.text)
                       137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
        #
        # # Using it to see just what are the calls from the 'remote_function' function:
        #
        # perf script --stop-bt remote_function
        swapper   0 [000]  9693.370039:          1 cycles:ppp:
                        2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
      
        swapper   0 [000]  9693.370044:          1 cycles:ppp:
                        20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1480104021-36275-1-git-send-email-dsahern@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
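The effect of --stop-bt in the test above can be modelled in a few lines. This is only a sketch of the observable behaviour, not perf's C implementation: the function name `truncate_callchain` and the list-of-strings representation of a callchain are assumptions made for illustration. Frames are ordered innermost first, as perf script prints them, and the stop symbol itself is still printed.

```python
def truncate_callchain(frames, stop_symbols):
    """Return the frames up to and including the first stop symbol.

    `frames` is ordered innermost first, the way perf script prints them.
    If no frame matches, the whole callchain is returned unchanged.
    """
    kept = []
    for sym in frames:
        kept.append(sym)
        if sym in stop_symbols:
            break  # the stop symbol is the last frame shown
    return kept


# Leading part of the first sample shown in the commit message.
chain = [
    "x86_pmu_enable",
    "perf_pmu_enable.part.90",
    "ctx_resched",
    "__perf_event_enable",
    "event_function",
    "remote_function",
    "flush_smp_call_function_queue",
    "generic_smp_call_function_single_interrupt",
]

for sym in truncate_callchain(chain, {"remote_function"}):
    print(sym)
```

With `{"remote_function"}` as the stop set, the printed frames end at `remote_function`, matching the shortened output in the commit message.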
  5. 25 Nov, 2016 1 commit
    • perf sched timehist: Mark schedule function in callchains · cdeb01bf
      Committed by Namhyung Kim
      The sched_switch event is always captured from the scheduler function, so
      it'd be great to omit those functions from the callchain.  This patch marks
      the functions to be omitted by a later patch.
      
      Committer notes:
      
      Testing it:
      
      Before:
      
        [root@jouet experimental]# perf sched record -g ls
        Dockerfile  perf.data  x-mips64
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.355 MB perf.data (29 samples) ]
        [root@jouet experimental]# perf sched timehist
            time  cpu  task name         wait time sch delay run time
                       [tid/pid]             (msec) (msec) (msec)
        ----------- -----  ----------------- ------ ------ ------
        6.494998 [001] <idle>                0.000  0.000  0.000
        6.495027 [002] perf[519]             0.000  0.000  0.000 __schedule <- schedule <- schedule_hrtimeout_range_clock <- schedule_hrtimeou
        6.495096 [003] <idle>                0.000  0.000  0.000
        6.495100 [003] rcuos/0[9]            0.000  0.005  0.003 __schedule <- schedule <- rcu_nocb_kthread <- kthread <- ret_from_fork
        6.495113 [001] perf[520]             0.000  0.008  0.114 __schedule <- preempt_schedule_common <- _cond_resched <- wait_for_completion
        6.495121 [000] <idle>                0.000  0.000  0.000
        6.495129 [001] migration/1[17]       0.000  0.003  0.016 __schedule <- schedule <- smpboot_thread_fn <- kthread <- ret_from_fork
        6.496085 [002] <idle>                0.000  0.000  1.057
        6.496096 [002] kworker/u16:1[31169]  0.000  0.004  0.011 __schedule <- schedule <- worker_thread <- kthread <- ret_from_fork
        6.496096 [003] <idle>                0.003  0.000  0.996
        6.496169 [002] <idle>                0.011  0.000  0.072
        6.496171 [000] ls[520]               0.008  0.000  1.049 __schedule <- schedule <- do_exit <- do_group_exit <- [unknown]
        6.496172 [003] gnome-terminal-[4391] 0.000  0.003  0.076 __schedule <- schedule <- schedule_hrtimeout_range_clock <- schedule_hrtimeo
      
      After:
      
        [root@jouet experimental]# perf sched timehist
            time  cpu  task name         wait time sch delay run time
                       [tid/pid]            (msec)  (msec)  (msec)
        ----------- -----  ----------------- -----  -----  ------
        6.494998 [001] <idle>                0.000  0.000  0.000
        6.495027 [002] perf[519]             0.000  0.000  0.000 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_t
        6.495096 [003] <idle>                0.000  0.000  0.000
        6.495100 [003] rcuos/0[9]            0.000  0.005  0.003 rcu_nocb_kthread <- kthread <- ret_from_fork
        6.495113 [001] perf[520]             0.000  0.008  0.114 preempt_schedule_common <- _cond_resched <- wait_for_completion <- stop_one_c
        6.495121 [000] <idle>                0.000  0.000  0.000
        6.495129 [001] migration/1[17]       0.000  0.003  0.016 smpboot_thread_fn <- kthread <- ret_from_fork
        6.496085 [002] <idle>                0.000  0.000  1.057
        6.496096 [002] kworker/u16:1[31169]  0.000  0.004  0.011 worker_thread <- kthread <- ret_from_fork
        6.496096 [003] <idle>                0.003  0.000  0.996
        6.496169 [002] <idle>                0.011  0.000  0.072
        6.496171 [000] ls[520]               0.008  0.000  1.049 do_exit <- do_group_exit <- [unknown]
        6.496172 [003] gnome-terminal-[4391] 0.000  0.003  0.076 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_
        [root@jouet experimental]#
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20161124011114.7102-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
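The before/after listings can be reproduced with a small filter over the callchain. This is a sketch under assumptions: the set `IGNORED` is inferred from the two listings above (`__schedule` and `schedule` disappear, `preempt_schedule_common` stays), and `format_callchain` is a hypothetical helper, not a perf function.

```python
# Assumption: inferred from the before/after output in the commit message.
IGNORED = {"__schedule", "schedule"}


def format_callchain(frames, ignored=IGNORED):
    """Join the frames timehist-style, leaving out the marked functions."""
    return " <- ".join(sym for sym in frames if sym not in ignored)


before = ["__schedule", "schedule", "rcu_nocb_kthread", "kthread", "ret_from_fork"]
print(format_callchain(before))
```

Marking the symbols once and filtering at print time keeps the recorded callchain intact, so other consumers still see the full chain.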
  6. 23 Nov, 2016 1 commit
  7. 15 Nov, 2016 1 commit
  8. 29 Sep, 2016 1 commit
  9. 05 Sep, 2016 2 commits
  10. 30 Aug, 2016 2 commits
  11. 05 Jul, 2016 1 commit
    • perf sdt: ELF support for SDT · 060fa0c7
      Committed by Hemant Kumar
      This patch serves the initial support to identify and list SDT events in
      binaries.  When programs containing SDT markers are compiled, gcc with
      the help of assembler directives identifies them and places them in the
      section ".note.stapsdt".
      
      To find these markers from the binaries, one needs to traverse through
      this section and parse the relevant details like the name, type and
      location of the marker. Also, the original location could be skewed due
      to the effect of prelinking. If that is the case, the locations need to
      be adjusted.
      
      The functions in this patch open a given ELF, find out the SDT section,
      parse the relevant details, adjust the location (if necessary) and
      populate them in a list.
      
      A typical note entry in the ".note.stapsdt" section is as follows:
      
                                       |--nhdr.n_namesz--|
                      ------------------------------------
                      |      nhdr      |     "stapsdt"   |
              -----   |----------------------------------|
               |      |  <location>       <base_address> |
               |      |  <semaphore>                     |
      nhdr.n_descsz   |  "provider_name"   "note_name"   |
               |      |   <args>                         |
              -----   |----------------------------------|
                      |      nhdr      |     "stapsdt"   |
                      |...
      
      The above shows an excerpt from the section ".note.stapsdt".  'nhdr' is
      a structure which has the note name size (n_namesz), note description
      size (n_descsz) and note type (n_type).
      
      So, in order to parse the note info, we need nhdr to tell us where
      to start from.  As can be seen from <sys/sdt.h>, the name given to SDT
      notes is "stapsdt".  But this is not the identifier of the note.
      
      After that, we go to the description of the note to find out its location,
      the address of the ".stapsdt.base" section and the semaphore address.
      Then come the provider name and the SDT marker name, followed by the
      arguments.
      Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/146736022628.27797.1201368329092908163.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
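The note layout described above can be walked with Python's `struct` module. This is a minimal sketch, not the perf code. It assumes a 64-bit little-endian ELF, 4-byte padding of the name and description fields, and a note type value of 3 for SDT notes. The helpers `build_sdt_note` and `parse_sdt_note` are hypothetical names, and the note is synthetic. The prelink adjustment mentioned in the commit message is not modelled.

```python
import struct

NT_STAPSDT = 3  # assumption: note type value used for SDT notes


def align4(n):
    """Round n up to the 4-byte alignment used inside ELF note sections."""
    return (n + 3) & ~3


def build_sdt_note(location, base, semaphore, provider, marker, args):
    """Build one synthetic 64-bit little-endian note, for demonstration only."""
    name = b"stapsdt\0"
    desc = struct.pack("<3Q", location, base, semaphore)
    desc += b"".join(s.encode() + b"\0" for s in (provider, marker, args))
    nhdr = struct.pack("<3I", len(name), len(desc), NT_STAPSDT)
    return (nhdr
            + name.ljust(align4(len(name)), b"\0")
            + desc.ljust(align4(len(desc)), b"\0"))


def parse_sdt_note(buf, offset=0):
    """Parse one note at `offset`; return (fields, offset of the next note)."""
    # nhdr: n_namesz, n_descsz, n_type, each a 32-bit word.
    n_namesz, n_descsz, n_type = struct.unpack_from("<3I", buf, offset)
    name_off = offset + struct.calcsize("<3I")
    name = buf[name_off:name_off + n_namesz].rstrip(b"\0").decode()

    # The description starts after the padded name.
    desc_off = name_off + align4(n_namesz)
    desc = buf[desc_off:desc_off + n_descsz]

    # Three addresses, then three NUL-terminated strings.
    location, base, semaphore = struct.unpack_from("<3Q", desc, 0)
    provider, marker, args = (s.decode() for s in desc[24:].split(b"\0")[:3])

    fields = {
        "name": name, "type": n_type,
        "location": location, "base": base, "semaphore": semaphore,
        "provider": provider, "marker": marker, "args": args,
    }
    return fields, desc_off + align4(n_descsz)


section = build_sdt_note(0x400544, 0x4005F0, 0, "myprov", "mymark", "-4@%eax")
note, next_off = parse_sdt_note(section)
print(note, next_off)
```

The returned offset lets a caller loop over consecutive notes until the end of the section is reached.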
  12. 23 May, 2016 1 commit
    • perf report: Add srcline_from/to branch sort keys · 508be0df
      Committed by Andi Kleen
      Add "srcline_from" and "srcline_to" branch sort keys that show
      the source lines of a branch.
      
      That makes it much easier to track down where particular branches happen
      in the program, for example to examine branch mispredictions, or to
      associate it with cycle counts:
      
        % perf record -b -e cycles:p ./tcall
        % perf report --sort srcline_from,srcline_to,mispredict
        ...
          15.10%  tcall.c:18       tcall.c:10       N
          14.83%  tcall.c:11       tcall.c:5        N
          14.12%  tcall.c:7        tcall.c:12       N
          14.04%  tcall.c:12       tcall.c:5        N
          12.42%  tcall.c:17       tcall.c:18       N
          12.39%  tcall.c:7        tcall.c:13       N
          12.27%  tcall.c:13       tcall.c:17       N
        ...
      
        % perf report --sort srcline_from,srcline_to,cycles
        ...
          17.12%  tcall.c:18       tcall.c:11       1
          17.01%  tcall.c:12       tcall.c:6        1
          16.98%  tcall.c:11       tcall.c:6        1
          15.91%  tcall.c:17       tcall.c:18       1
           6.38%  tcall.c:7        tcall.c:17       7
           4.80%  tcall.c:7        tcall.c:12       8
           4.21%  tcall.c:7        tcall.c:17       8
           2.67%  tcall.c:7        tcall.c:12       7
           2.62%  tcall.c:7        tcall.c:12       10
           2.10%  tcall.c:7        tcall.c:17       9
           1.58%  tcall.c:7        tcall.c:12       6
           1.44%  tcall.c:7        tcall.c:12       5
           1.38%  tcall.c:7        tcall.c:12       9
           1.06%  tcall.c:7        tcall.c:17       13
           1.05%  tcall.c:7        tcall.c:12       4
           1.01%  tcall.c:7        tcall.c:17       6
      
      Open issues:
      
      - Some kernel symbols get misresolved.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/r/1463775308-32748-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  13. 20 May, 2016 1 commit
  14. 17 May, 2016 1 commit
  15. 11 May, 2016 1 commit
    • perf symbols: Add dso__insert_symbol function · ae93a6c7
      Committed by Chris Phlipot
      The current method for inserting symbols is to use the symbols__insert()
      function. However symbols__insert() does not update the dso symbol
      cache.  This causes problems in the following scenario:
      
      1. symbol not found at addr using dso__find_symbol
      
      2. symbol inserted at addr using the existing symbols__insert function
      
      3. symbol still not found at addr using dso__find_symbol() because cache isn't
         updated. This is undesired behavior.
      
      The undesired behavior in (3) is addressed by creating a new function,
      dso__insert_symbol() to both insert the symbol and update the symbol
      cache if necessary.
      
      If dso__insert_symbol() is used in (2) instead of symbols__insert(),
      then the undesired behavior in (3) is avoided.
      Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1462937209-6032-2-git-send-email-cphlipot0@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
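The three-step scenario above is a stale-cache problem, and it can be reproduced with a toy model. This is a sketch under assumptions: the class `Dso` and its one-entry "last lookup" cache are a hypothetical stand-in for the dso symbol cache, and the method names only echo the C functions named in the commit message.

```python
import bisect


class Dso:
    """Toy symbol table with a one-entry cache of the last lookup."""

    def __init__(self):
        self._starts = []    # sorted start addresses
        self._symbols = []   # (start, end, name), parallel to _starts
        self._last_addr = None
        self._last_result = None

    def symbols_insert(self, start, end, name):
        """Insert only; the cache is left untouched (the problematic path)."""
        i = bisect.bisect_left(self._starts, start)
        self._starts.insert(i, start)
        self._symbols.insert(i, (start, end, name))

    def insert_symbol(self, start, end, name):
        """Insert, then drop the cached lookup if the new symbol covers it."""
        self.symbols_insert(start, end, name)
        if self._last_addr is not None and start <= self._last_addr < end:
            self._last_addr = None
            self._last_result = None

    def find_symbol(self, addr):
        """Return the symbol name covering addr, or None; caches the answer."""
        if addr == self._last_addr:
            return self._last_result
        i = bisect.bisect_right(self._starts, addr) - 1
        result = None
        if i >= 0:
            start, end, name = self._symbols[i]
            if start <= addr < end:
                result = name
        self._last_addr, self._last_result = addr, result
        return result


stale = Dso()
stale.find_symbol(0x1000)                  # step 1: miss is cached
stale.symbols_insert(0x1000, 0x1100, "foo")  # step 2: cache not updated
print(stale.find_symbol(0x1000))           # step 3: still the cached miss

fresh = Dso()
fresh.find_symbol(0x1000)
fresh.insert_symbol(0x1000, 0x1100, "foo")
print(fresh.find_symbol(0x1000))
```

The first lookup sequence keeps returning the cached miss, while the second one finds the symbol because the insert invalidated the cache entry it made obsolete.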
  16. 06 May, 2016 1 commit
  17. 19 Apr, 2016 1 commit
  18. 15 Apr, 2016 1 commit
  19. 12 Apr, 2016 1 commit
    • perf evsel: Allow unresolved symbol names to be printed as addresses · fd4be130
      Committed by Arnaldo Carvalho de Melo
      The fprintf_sym() and fprintf_callchain() methods now allow users to
      change the existing behaviour of showing "[unknown]" as the name of
      unresolved symbols to instead show "[0x123456]", i.e. its address.
      
      The current patch doesn't change tools to use this facility, so the results
      from 'perf trace' and 'perf script' continue to look like:
      
      70.109 ( 0.001 ms): qemu-system-x8/10153 poll(ufds: 0x7f2d93ffe870, nfds: 1) = 0 Timeout
                                         [unknown] (/usr/lib64/libc-2.22.so)
                                         [unknown] (/usr/lib64/libspice-server.so.1.10.0)
                                         [unknown] (/usr/lib64/libspice-server.so.1.10.0)
                                         [unknown] (/usr/lib64/libspice-server.so.1.10.0)
                                         start_thread+0xca (/usr/lib64/libpthread-2.22.so)
                                         __clone+0x6d (/usr/lib64/libc-2.22.so)
      
      The next patch will make 'perf trace' use the new formatting.
      Suggested-by: Milian Wolff <milian.wolff@kdab.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
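The formatting choice described above comes down to one branch. The following is an illustrative sketch only: `format_sym` and its `unknown_as_addr` parameter are hypothetical names, not the signature of fprintf_sym() or fprintf_callchain().

```python
def format_sym(name, addr, offset=None, unknown_as_addr=False):
    """Format one callchain entry.

    Resolved symbols print as `name` or `name+0xoff`. Unresolved ones print
    as "[unknown]" by default, or as the bracketed address on request.
    """
    if name is None:
        return f"[{addr:#x}]" if unknown_as_addr else "[unknown]"
    if offset is not None:
        return f"{name}+{offset:#x}"
    return name


print(format_sym(None, 0x123456))
print(format_sym(None, 0x123456, unknown_as_addr=True))
print(format_sym("start_thread", 0x7F2D00001234, offset=0xCA))
```

Keeping "[unknown]" as the default preserves the existing output of tools that do not opt in.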
  20. 24 Mar, 2016 1 commit
  21. 25 Feb, 2016 1 commit
  22. 07 Jan, 2016 1 commit
    • perf report/top: Add --raw-trace option · 053a3989
      Committed by Namhyung Kim
      The --raw-trace option allows disabling pretty printing by the event's
      print_fmt or plugin.  Besides that, each dynamic sort key now can
      receive a 'raw' suffix separated by '/' to ask for the raw trace of a
      specific field.
      
        $ perf report -s comm,kmem:kmalloc.gfp_flags
        ...
        # Overhead  Command            gfp_flags
        # ........  .......  ...................
        #
            99.89%  perf       GFP_NOFS|GFP_ZERO
             0.06%  sleep             GFP_KERNEL
             0.03%  perf     GFP_KERNEL|GFP_ZERO
             0.01%  perf              GFP_KERNEL
      
      Now
      
        $ perf report -s comm,kmem:kmalloc.gfp_flags --raw-trace
      or
        $ perf report -s comm,kmem:kmalloc.gfp_flags/raw
        ...
        # Overhead  Command   gfp_flags
        # ........  .......  ..........
        #
            99.89%  perf          32848
             0.06%  sleep           208
             0.03%  perf          32976
             0.01%  perf            208
      Suggested-and-Acked-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1450804030-29193-9-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
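The raw numbers and the pretty-printed names in the two listings are the same values in two notations. The sketch below decodes the raw integers back into names. The mask table is an assumption: it holds only the three masks needed for this output, with values chosen to be consistent with the numbers shown (208 = 0xd0, 32848 = 0x8050, 32976 = 0x80d0). It is not the event's real print_fmt table, which is longer and kernel-version dependent, and `decode_gfp` is a hypothetical helper.

```python
# Assumed masks, consistent with the values in the listings above.
# Composite masks come first so they win over their subsets.
GFP_MASKS = [
    ("GFP_KERNEL", 0xD0),
    ("GFP_NOFS", 0x50),
    ("GFP_ZERO", 0x8000),
]


def decode_gfp(value):
    """Turn a raw gfp_flags integer into a '|'-joined list of names."""
    names = []
    for name, mask in GFP_MASKS:
        if value & mask == mask:
            names.append(name)
            value &= ~mask  # consume the bits so subsets do not match again
    if value:
        names.append(hex(value))  # bits this small table does not know
    return "|".join(names) if names else "0"


for raw in (32848, 208, 32976):
    print(raw, decode_gfp(raw))
```

Sorting on the raw field groups by numeric value, which can be preferable when the pretty-printed form is long or when a plugin is unavailable.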
  23. 27 Nov, 2015 1 commit
  24. 13 Nov, 2015 1 commit
  25. 14 Sep, 2015 1 commit
  26. 29 Aug, 2015 1 commit
  27. 13 Aug, 2015 1 commit
    • perf report: Show call graph from reference events · 9e207ddf
      Committed by Kan Liang
      Introduce --show-ref-call-graph for perf report to print the reference
      call graph for events that have no call graph of their own.
      
      Here is an example.
      
       perf report --show-ref-call-graph --stdio
      
       # To display the perf.data header info, please use
       --header/--header-only options.
       #
       #
       # Total Lost Samples: 0
       #
       # Samples: 5  of event 'cpu/cpu-cycles,call-graph=fp/'
       # Event count (approx.): 144985
       #
       # Children      Self  Command  Shared Object     Symbol
       # ........  ........  .......  ................  ........................................
       #
          72.30%     0.00%  sleep    [kernel.vmlinux]  [k] entry_SYSCALL_64_fastpath
                    |
                    ---entry_SYSCALL_64_fastpath
                       |
                       |--22.62%-- __GI___libc_nanosleep
                        --77.38%-- [...]
      
      ......
      
       # Samples: 6  of event 'cpu/instructions,call-graph=no/', show reference callgraph
       # Event count (approx.): 172780
       #
       # Children      Self  Command  Shared Object     Symbol
       # ........  ........  .......  ................  ........................................
       #
          73.16%     0.00%  sleep    [kernel.vmlinux]  [k] entry_SYSCALL_64_fastpath
                    |
                    ---entry_SYSCALL_64_fastpath
                       |
                       |--31.44%-- __GI___libc_nanosleep
                        --68.56%-- [...]
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1439289050-40510-3-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  28. 13 Jul, 2015 1 commit
    • perf symbols: Store if there is a filter in place · 0bc2f2f7
      Committed by Arnaldo Carvalho de Melo
      When setting up the symbols library we set up several filter lists,
      for dsos, comms, symbols, etc, and there is code that, if there are
      filters, does certain operations, like recalculating the number of
      non-filtered histogram entries in the top/report TUI.
      
      But that code was considering just the "Zoom" filters, when it needs to
      take into account the above mentioned filters as well (perf top --comms,
      --dsos, etc).
      
      So store in symbol_conf.has_filter true if any of those filters is in
      place.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-f5edfmhq69vfvs1kmikq1wep@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  29. 06 May, 2015 1 commit
  30. 04 May, 2015 3 commits
  31. 25 Mar, 2015 1 commit
  32. 12 Mar, 2015 1 commit
  33. 21 Jan, 2015 1 commit
  34. 02 Dec, 2014 1 commit
    • perf callchain: Support handling complete branch stacks as histograms · 8b7bad58
      Committed by Andi Kleen
      Currently branch stacks can be only shown as edge histograms for
      individual branches. I never found this display particularly useful.
      
      This implements an alternative mode that creates histograms over
      complete branch traces, instead of individual branches, similar to how
      normal callgraphs are handled. This is done by putting it in front of
      the normal callgraph and then using the normal callgraph histogram
      infrastructure to unify them.
      
      This way in complex functions we can understand the control flow that
      led to a particular sample, and may even see some control flow in the
      caller for short functions.
      
      Example (simplified; for such simple code this is usually not
      needed).  Please run this after the whole patchkit is in, as at this
      point in the patch order there is no --branch-history; that will be
      added in a patch after this one:
      
      tcall.c:
      
      volatile int a = 10000, b = 100000, c;
      
      __attribute__((noinline)) void f2(void)
      {
      	c = a / b;
      }
      
      __attribute__((noinline)) void f1(void)
      {
      	f2();
      	f2();
      }
      int main(void)
      {
      	int i;
      	for (i = 0; i < 1000000; i++)
      		f1();
      }
      
      % perf record -b -g ./tsrc/tcall
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
      % perf report --no-children --branch-history
      ...
          54.91%  tcall.c:6  [.] f2                      tcall
                  |
                  |--65.53%-- f2 tcall.c:5
                  |          |
                  |          |--70.83%-- f1 tcall.c:11
                  |          |          f1 tcall.c:10
                  |          |          main tcall.c:18
                  |          |          main tcall.c:18
                  |          |          main tcall.c:17
                  |          |          main tcall.c:17
                  |          |          f1 tcall.c:13
                  |          |          f1 tcall.c:13
                  |          |          f2 tcall.c:7
                  |          |          f2 tcall.c:5
                  |          |          f1 tcall.c:12
                  |          |          f1 tcall.c:12
                  |          |          f2 tcall.c:7
                  |          |          f2 tcall.c:5
                  |          |          f1 tcall.c:11
                  |          |
                  |           --29.17%-- f1 tcall.c:12
                  |                     f1 tcall.c:12
                  |                     f2 tcall.c:7
                  |                     f2 tcall.c:5
                  |                     f1 tcall.c:11
                  |                     f1 tcall.c:10
                  |                     main tcall.c:18
                  |                     main tcall.c:18
                  |                     main tcall.c:17
                  |                     main tcall.c:17
                  |                     f1 tcall.c:13
                  |                     f1 tcall.c:13
                  |                     f2 tcall.c:7
                  |                     f2 tcall.c:5
                  |                     f1 tcall.c:12
      
      The default output is unchanged.
      
      This is only implemented in perf report, no change to record or anywhere
      else.
      
      This adds the basic code to report:
      
      - add a new "branch" option to the -g option parser to enable this mode
      - when the flag is set include the LBR into the callstack in machine.c.
      
      The rest of the history code is unchanged and doesn't know the
      difference between LBR entry and normal call entry.
      
      - detect overlaps with the callchain
      - remove small loop duplicates in the LBR
      
      Current limitations:
      
      - The LBR flags (mispredict etc.) are not shown in the history
      and LBR entries have no special marker.
      - It would be nice if annotate marked the LBR entries somehow
      (e.g. with arrows)
      
      v2: Various fixes.
      v3: Merge further patches into this one. Fix white space.
      v4: Improve manpage. Address review feedback.
      v5: Rename functions. Better error message without -g. Fix crash without
          -b.
      v6: Rebase
      v7: Rebase. Use NO_ENTRY in memset.
      v8: Port to latest tip. Move add_callchain_ip to separate
          patch. Skip initial entries in callchain. Minor cleanups.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1415844328-4884-3-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  35. 25 Nov, 2014 1 commit
    • perf symbols: Move bfd_demangle stubbing to its only user · aaba4e12
      Committed by Arnaldo Carvalho de Melo
      We need to define bfd_demangle() to either a wrapper for
      cplus_demangle() or to a stub when NO_DEMANGLE is defined.
      
      That is at odds with using bfd.h for some other reason, as it declares
      bfd_demangle() itself: code that includes both symbol.h, where the
      above stubbing/wrapping is done, and bfd.h ends up with a build error
      because bfd_demangle() is found to be redefined.
      
      Avoid that by moving the stubbing/wrapping to symbol-elf.c, that is the
      only user of such function. If we ever get to a point where there are
      more valid users, we can then introduce a header for that.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-6wzjpe2fy9xtgchshulixlzw@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>