1. 28 Nov, 2011 5 commits
  2. 03 Nov, 2011 1 commit
    • A
      perf top: Fix live annotation in the --stdio interface · f9e3d4b1
      Arnaldo Carvalho de Melo committed
      In the old --stdio interface the annotation is done just after one
      selects a symbol, while in --tui, now the default when the required libs
      are installed, we annotate all symbols with samples so that when
      annotation is asked we see what happened recently on that symbol.
      
      To achieve that, the --stdio variant checks whether the hist_entry being
      processed is the one selected by the user via the 's' hotkey. Now that
      we share the hist_entry abstractions with 'perf report', multiple
      rb_trees are used to minimize locking contention: one for collecting
      the samples and another for browsing/showing them after they are
      periodically resorted by number of samples and decayed.
      
      So the simple test in record_precise_ip doesn't work, as we move
      hist_entries between those rb_trees. To fix it, just check that the
      underlying struct symbol associated with those hist_entries is the same.
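
A minimal sketch of that check, assuming simplified stand-ins for struct symbol and struct hist_entry. The field layout and the helper name is_selected_symbol are hypothetical and are not taken from the perf sources:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the perf structures; only the fields needed here. */
struct symbol {
	const char *name;
};

struct hist_entry {
	struct symbol *sym;   /* symbol this entry aggregates samples for */
	unsigned long period; /* accumulated weight */
};

/*
 * Hypothetical helper: does 'he' refer to the symbol the user selected?
 * Comparing hist_entry pointers breaks once entries move between trees,
 * so compare the underlying symbol instead.
 */
static bool is_selected_symbol(const struct hist_entry *he,
			       const struct hist_entry *selected)
{
	if (he == NULL || selected == NULL)
		return false;
	return he->sym != NULL && he->sym == selected->sym;
}
```

Two distinct hist_entry objects that point at the same struct symbol compare as equal here, which is the property the fix relies on.
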
      Reported-by: Mike Galbraith <efault@gmx.de>
      Tested-by: Mike Galbraith <efault@gmx.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-bcfnraqkux88fox9ba9767ds@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 02 Nov, 2011 2 commits
  4. 26 Oct, 2011 2 commits
  5. 17 Oct, 2011 2 commits
  6. 13 Oct, 2011 2 commits
  7. 08 Oct, 2011 4 commits
    • A
      perf tools: Make --no-asm-raw the default · 64c6f0c7
      Arnaldo Carvalho de Melo committed
      And add the annotation output knobs to all the tools that have
      integrated annotation (top, report).
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-gnlob67mke6sji2kf4nstp7m@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf top: Use the TUI interface by default · 8b1bfdbd
      Arnaldo Carvalho de Melo committed
      To disable it either:
      
      1. Make sure newt-devel is not installed when building it
      
      2. Use 'perf top --stdio' just like with report
      
      3. Edit your ~/.perfconfig or system wide config and have this there:
      
      [tui]
      
      	top = off
      
      But you shouldn't, since the TUI is so much more powerful, has
      integration with annotation, and is where lots more interesting features
      will be developed. If something annoys you (the colors?), just let me
      know and I'll do my best to make it pleasant as a default.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-cy2tn4uj1t7c3aqss5l25of5@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf top: Add callgraph support · 19d4ac3c
      Arnaldo Carvalho de Melo committed
      Just like in 'perf report', but live.
      
      Still needs to decay the callchains, but already somewhat useful as-is.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-cj3rmaf5jpsvi3v0tf7t4uvp@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf top: Reuse the 'report' hist_entry/hists classes · ab81f3fd
      Arnaldo Carvalho de Melo committed
      This actually fixes several problems we had in the old 'perf top':
      
      1. Unresolved symbols were not shown, a limitation that came from the old
         "KernelTop" codebase; to solve it we would need to make changes
         that would give sym_entry most of the hist_entry fields.
      2. It was using the number of samples, not the sum of sample->period.
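
The difference in item 2 can be sketched as follows, assuming a hypothetical, simplified sample type that carries only a period. The struct and function names are illustrative, not the perf ones:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified sample: only the period (event weight) matters. */
struct sample {
	uint64_t period;
};

struct entry_stats {
	uint64_t nr_samples; /* what the old code ranked by */
	uint64_t period;     /* sum of sample->period */
};

static void entry_stats__add(struct entry_stats *st, const struct sample *s)
{
	st->nr_samples += 1;
	st->period += s->period;
}

/* Share of the total, in percent, computed from accumulated periods. */
static double entry_stats__percent(const struct entry_stats *st,
				   uint64_t total_period)
{
	if (total_period == 0)
		return 0.0;
	return 100.0 * (double)st->period / (double)total_period;
}
```

With one sample of period 3000 on one entry and two samples of period 1000 on another, the sample count ranks the second entry first, while the period sum ranks the first one first.
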
      
      And brings the --sort code that allows us to have all the views in
      'perf report', for instance:
      
      [root@emilia ~]# perf top --sort dso
      PerfTop: 5903 irqs/sec kernel:77.5% exact: 0.0% [1000Hz cycles], (all, 8 CPUs)
      ------------------------------------------------------------------------------
      
          31.59%  libcrypto.so.1.0.0
          21.55%  [kernel]
          18.57%  libpython2.6.so.1.0
           7.04%  libc-2.12.so
           6.99%  _backend_agg.so
           4.72%  sshd
           1.48%  multiarray.so
           1.39%  libfreetype.so.6.3.22
           1.37%  perf
           0.71%  libgobject-2.0.so.0.2200.5
           0.53%  [tg3]
           0.48%  libglib-2.0.so.0.2200.5
           0.44%  libstdc++.so.6.0.13
           0.40%  libcairo.so.2.10800.8
           0.38%  libm-2.12.so
           0.34%  umath.so
           0.30%  libgdk-x11-2.0.so.0.1800.9
           0.22%  libpthread-2.12.so
           0.20%  libgtk-x11-2.0.so.0.1800.9
           0.20%  librt-2.12.so
           0.15%  _path.so
           0.13%  libpango-1.0.so.0.2800.1
           0.11%  libatlas.so.3.0
           0.09%  ft2font.so
           0.09%  libpangoft2-1.0.so.0.2800.1
           0.08%  libX11.so.6.3.0
           0.07%  [vdso]
           0.06%  cyclictest
      ^C
      
      All the filter lists can be used as well: --dsos, --comms, --symbols,
      etc.
      
      The 'perf report' TUI is also reused, so it is possible to apply all the
      zoom operations, do annotation, etc.
      
      This change will allow multiple simplifications in the symbol system as
      well, which will be detailed in upcoming changesets.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-xzaaldxq7zhqrrxdxjifk1mh@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  8. 30 Sep, 2011 1 commit
  9. 24 Sep, 2011 1 commit
  10. 21 Jul, 2011 1 commit
  11. 28 May, 2011 3 commits
  12. 22 May, 2011 1 commit
  13. 15 May, 2011 1 commit
    • A
      perf evlist: Fix per thread mmap setup · aece948f
      Arnaldo Carvalho de Melo committed
      The PERF_EVENT_IOC_SET_OUTPUT ioctl was returning -EINVAL when using
      --pid when monitoring multithreaded apps, as we can only share a ring
      buffer for events on the same thread if not doing per cpu.
      
      Fix it by using per thread ring buffers.
      
      Tested with:
      
      [root@felicio ~]# tuna -t 26131 -CP | nl
        1                      thread       ctxt_switches
        2    pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
        3 26131   OTHER     0      0,1  10814276      2397830 chromium-browse
        4  642    OTHER     0      0,1     14688            0 chromium-browse
        5  26148  OTHER     0      0,1    713602       115479 chromium-browse
        6  26149  OTHER     0      0,1    801958         2262 chromium-browse
        7  26150  OTHER     0      0,1   1271128          248 chromium-browse
        8  26151  OTHER     0      0,1         3            0 chromium-browse
        9  27049  OTHER     0      0,1     36796            9 chromium-browse
       10  618    OTHER     0      0,1     14711            0 chromium-browse
       11  661    OTHER     0      0,1     14593            0 chromium-browse
       12  29048  OTHER     0      0,1     28125            0 chromium-browse
       13  26143  OTHER     0      0,1   2202789          781 chromium-browse
      [root@felicio ~]#
      
      So 11 threads under pid 26131, then:
      
      [root@felicio ~]# perf record -F 50000 --pid 26131
      
      [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
        1 7fa4a2538000-7fa4a25b9000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        2 7fa4a25b9000-7fa4a263a000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        3 7fa4a263a000-7fa4a26bb000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        4 7fa4a26bb000-7fa4a273c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        5 7fa4a273c000-7fa4a27bd000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        6 7fa4a27bd000-7fa4a283e000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        7 7fa4a283e000-7fa4a28bf000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        8 7fa4a28bf000-7fa4a2940000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        9 7fa4a2940000-7fa4a29c1000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
       10 7fa4a29c1000-7fa4a2a42000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
       11 7fa4a2a42000-7fa4a2ac3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
      [root@felicio ~]#
      
      11 mmaps, one per thread, since we didn't specify any CPU list, and:
      
      [root@felicio ~]# perf record -F 50000 --pid 26131
      ^M
      ^C[ perf record: Woken up 79 times to write data ]
      [ perf record: Captured and wrote 20.614 MB perf.data (~900639 samples) ]
      
      [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
           1	 371310 26131
           2	  96516 26148
           3	  95694 26149
           4	  95203 26150
           5	   7291 26143
           6	     87 27049
           7	     76 661
           8	     60 29048
           9	     47 618
          10	     43 642
      [root@felicio ~]#
      
      Ok, one of the threads, 26151 was quiescent, so no samples there, but all the
      others are there.
      
      Then, if I specify one CPU:
      
      [root@felicio ~]# perf record -F 50000 --pid 26131 --cpu 1
      ^C[ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.680 MB perf.data (~29730 samples) ]
      
      [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
           1	   8444 26131
           2	   2584 26149
           3	   2518 26148
           4	   2324 26150
           5	    123 26143
           6	      9 661
           7	      9 29048
      [root@felicio ~]#
      
      This machine has two cores, so fewer threads appeared on the radar, and:
      
      [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
       1 7f484b922000-7f484b9a3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
      [root@felicio ~]#
      
      Just one mmap, as now we can use just one per-cpu buffer instead of the
      per-thread needed in the previous case.
      
      For global profiling:
      
      [root@felicio ~]# perf record -F 50000 -a
      ^C[ perf record: Woken up 26 times to write data ]
      [ perf record: Captured and wrote 7.128 MB perf.data (~311412 samples) ]
      
      [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
           1	7fb49b435000-7fb49b4b6000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
           2	7fb49b4b6000-7fb49b537000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
      [root@felicio ~]#
      
      It uses per-cpu buffers.
      
      For just one thread:
      
      [root@felicio ~]# perf record -F 50000 --tid 26148
      ^C[ perf record: Woken up 2 times to write data ]
      [ perf record: Captured and wrote 0.330 MB perf.data (~14426 samples) ]
      
      [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
           1	   9969 26148
      [root@felicio ~]#
      
      [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
           1	7f286a51b000-7f286a59c000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
      [root@felicio ~]#
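
The buffer counts seen in the sessions above can be summarized in a small sketch. The helper name nr_ring_buffers and its parameters are hypothetical; it only restates the observed behaviour and is not the evlist code:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: how many ring buffers (mmaps) get set up.
 *   nr_cpus    - CPUs being monitored (from --cpu or -a)
 *   nr_threads - threads in the target (--pid or --tid)
 *   per_cpu    - true when a CPU list was given or profiling system wide
 */
static int nr_ring_buffers(int nr_cpus, int nr_threads, bool per_cpu)
{
	if (per_cpu)
		return nr_cpus;  /* one buffer per monitored CPU */
	return nr_threads;       /* otherwise one buffer per thread */
}
```

For the four cases tested: --pid with 11 threads gives 11, --pid with --cpu 1 gives 1, -a on the two-CPU machine gives 2, and --tid gives 1.
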
      Tested-by: David Ahern <dsahern@gmail.com>
      Tested-by: Lin Ming <ming.m.lin@intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  14. 15 Apr, 2011 1 commit
  15. 30 Mar, 2011 2 commits
    • D
      perf tools: Emit clearer message for sys_perf_event_open ENOENT return · ca6a4258
      David Ahern committed
      Resend of patch sent back in January 2011 in light of recent confusion around
      unsupported events for a given platform.
      
      Improve sys_perf_event_open ENOENT return handling in top and record, just
      like 5a3446bc does for stat.
      
      Retry of Arnaldo's patch using ui_warning instead of die which allows the
      fallback from hardware cycles to software clock.
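
A rough sketch of the warn-and-fall-back flow, assuming the fallback is triggered by ENOENT and using a hypothetical opener callback in place of sys_perf_event_open. The enum, the callback type and the warning text are illustrative only:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical event identifiers; not the perf_event_attr encoding. */
enum event_kind { EVENT_HW_CYCLES, EVENT_SW_CPU_CLOCK };

/* Hypothetical opener: returns an fd >= 0 on success, or -errno on failure. */
typedef int (*open_fn)(enum event_kind kind);

/*
 * Try the hardware cycles event first. On ENOENT, warn instead of dying
 * and retry with the software clock. Returns the fd or a negative errno,
 * and stores the event finally tried in *used.
 */
static int open_with_fallback(open_fn do_open, enum event_kind *used)
{
	int fd = do_open(EVENT_HW_CYCLES);

	*used = EVENT_HW_CYCLES;
	if (fd == -ENOENT) {
		fprintf(stderr, "Warning: cycles event not supported, "
				"trying to fall back to cpu-clock\n");
		*used = EVENT_SW_CPU_CLOCK;
		fd = do_open(EVENT_SW_CPU_CLOCK);
	}
	return fd;
}

/* Stand-in for a platform without a hardware cycles counter. */
static int open_no_hw(enum event_kind kind)
{
	return kind == EVENT_HW_CYCLES ? -ENOENT : 3;
}
```

Had the first failure called a die()-style routine, the second do_open() would never run, which is why a warning is used.
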
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      LKML-Reference: <1301080271-20945-1-git-send-email-daahern@cisco.com>
      Signed-off-by: David Ahern <daahern@cisco.com>
      [ committer note: Some adjustments to make it apply to newer codebase ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf tools: Fixup exit path when not able to open events · c286c419
      Arnaldo Carvalho de Melo committed
      We have to deal with the TUI mode in perf top, so that we don't end up
      with a garbled screen when, say, a non root user on a machine with a
      paranoid setting (the default) tries to use 'perf top'.
      
      Introduce a ui__warning_paranoid() routine shared by top and record that
      tells the user the valid values for /proc/sys/kernel/perf_event_paranoid.
      
      Cc: David Ahern <daahern@cisco.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  16. 23 Mar, 2011 1 commit
    • A
      perf top: Fix uninitialized 'counter' variable · ce2d17ca
      Akihiro Nagai committed
      builtin-top.c has an uninitialized variable.
      gcc(version 4.5.1) warns about it and it results in build failure:
      
       builtin-top.c: In function 'display_thread':
       builtin-top.c:518:9: error: 'counter' may be used uninitialized
      
      This situation can indeed trigger, if the getline() call in
      prompt_integer() fails.
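
A minimal sketch of the failure mode and of one way to guard against it. The helper names are hypothetical, fgets() stands in for getline() to keep the example portable, and this is not the code from builtin-top.c:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical prompt helper. Returns 0 and stores the value in *out on
 * success; returns -1 and leaves *out untouched on failure.
 */
static int read_integer(FILE *in, int *out)
{
	char buf[64];
	char *end;
	long val;

	if (fgets(buf, sizeof(buf), in) == NULL)
		return -1;      /* read failed: *out is not written */

	val = strtol(buf, &end, 10);
	if (end == buf)
		return -1;      /* no digits found */

	*out = (int)val;
	return 0;
}

/* The caller gives the variable a defined value before the call. */
static int choose_counter(FILE *in, int current)
{
	int counter = current;  /* initialized, so a failed read is harmless */

	if (read_integer(in, &counter) < 0)
		return current;
	return counter;
}
```

Without the initializer, a failed read would leave counter indeterminate, which is the case the compiler warning points at.
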
      Signed-off-by: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <20110323072939.11638.50173.stgit@localhost6.localdomain6>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. 12 Mar, 2011 4 commits
    • A
      perf symbol: Move sym_entry->skip to symbol->ignore · 171b3be9
      Arnaldo Carvalho de Melo committed
      While going through each of the sym_entry fields looking to reduce it to
      the set of entries needed when in an active symbols list, 'skip' should
      really be in symbol, as we set it when loading the symtab.
      
      And the space used by the basic symbol allocation remains the same as
      we had 5 bytes of padding.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf symbols: Rename dso->origin to dso->symtab_type · 878b439d
      Arnaldo Carvalho de Melo committed
      And the DSO__ORIG_ enum to SYMTAB__, to clarify that this is about
      where the symtab was obtained from.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf top: Remove redundant syme->origin field · 8b8ba4a9
      Arnaldo Carvalho de Melo committed
      We can get it from syme->map->dso->kernel (that should be renamed to
      origin, but leave this for another patch).
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf top: Remove redundant perf_top->sym_counter · ec52d976
      Arnaldo Carvalho de Melo committed
      We can get that counter index from perf_top->sym_evsel->idx instead.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  18. 10 Mar, 2011 1 commit
    • A
      perf session: Use evlist/evsel for managing perf.data attributes · a91e5431
      Arnaldo Carvalho de Melo committed
      So that we can reuse things like the id to attr lookup routine
      (perf_evlist__id2evsel) that uses a hash table instead of the linear
      lookup done in the older perf_header_attr routines, etc.
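
To illustrate the hashed lookup idea, here is a small sketch with a chained hash table keyed by sample id. All types and names below are simplified, hypothetical stand-ins, not the perf_evlist implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ID_HASH_BITS 8
#define ID_HASH_SIZE (1u << ID_HASH_BITS)

/* Hypothetical, simplified evsel: just an index. */
struct evsel {
	int idx;
};

struct id_node {
	uint64_t id;          /* sample id */
	struct evsel *evsel;  /* event selector that owns this id */
	struct id_node *next; /* next node in the same bucket */
};

struct evlist {
	struct id_node *heads[ID_HASH_SIZE];
};

static unsigned int id_hash(uint64_t id)
{
	return (unsigned int)(id & (ID_HASH_SIZE - 1));
}

static void evlist__add_id(struct evlist *el, struct id_node *node)
{
	unsigned int h = id_hash(node->id);

	node->next = el->heads[h];
	el->heads[h] = node;
}

/* Walk one bucket instead of scanning every attribute linearly. */
static struct evsel *evlist__id2evsel(const struct evlist *el, uint64_t id)
{
	const struct id_node *n;

	for (n = el->heads[id_hash(id)]; n != NULL; n = n->next)
		if (n->id == id)
			return n->evsel;
	return NULL;
}
```

Only the ids that hash to the same bucket are compared, so the cost no longer grows with the total number of attributes in the file.
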
      
      Also to make evsels/evlist a more pervasive API, simplifying the use of
      the emerging perf lib.
      
      Cc: Arun Sharma <arun@sharma-home.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  19. 01 Mar, 2011 2 commits
  20. 22 Feb, 2011 1 commit
    • A
      perf top: Live TUI Annotation · c97cf422
      Arnaldo Carvalho de Melo committed
      Now one has just to press the right key, 'a' or Enter on the main 'perf
      top --tui' screen to live annotate the symbol under the cursor.
      
      The annotate window starts centered on the hottest line (the one with
      the most samples so far); TAB and shift+TAB can then be used to go to
      the next/prev hot line.
      
      Pressing 'H' at any point will center again the screen on the hottest
      line.
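
Finding the line to center on can be sketched as a scan over per-line sample counts. The array layout and the function name hottest_line are assumptions made for illustration; the TAB/shift+TAB navigation is not modeled:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: index of the line with the most samples so far,
 * or -1 when there are no lines. On ties the first such line wins.
 */
static int hottest_line(const unsigned int *hits, size_t nr_lines)
{
	size_t i, best = 0;

	if (nr_lines == 0)
		return -1;
	for (i = 1; i < nr_lines; i++)
		if (hits[i] > hits[best])
			best = i;
	return (int)best;
}
```

Since samples keep arriving in a live view, the result can change between calls, which is why re-centering with 'H' is useful.
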
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  21. 10 Feb, 2011 1 commit
    • A
      perf tools: Fix thread_map event synthesizing in top and record · 401b8e13
      Arnaldo Carvalho de Melo committed
      Jeff Moyer reported these messages:
      
        Warning:  ... trying to fall back to cpu-clock-ticks
      
      couldn't open /proc/-1/status
      couldn't open /proc/-1/maps
      [ls output]
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.008 MB perf.data (~363 samples) ]
      
      That led me and David Ahern to see that something was fishy in the thread
      synthesizing routines, at least for the case where the workload is started
      from 'perf record', as -1 is the default for target_tid in the 'perf record
      --tid' parameter, so somehow we were trying to synthesize the
      PERF_RECORD_MMAP and PERF_RECORD_COMM events for the thread -1, a bug.
      
      So I investigated this and noticed that some bugs were introduced when we
      added support for recording a process and its threads using --pid. The fix
      is to pass the event synthesizing routines the thread_map, which has the
      list of threads for a --pid or just the single thread for a --tid, instead
      of the target_tid.
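
The shape of that change can be sketched as follows, assuming a hypothetical, simplified thread map and a stub synthesizer that only validates the id and counts successes. None of these definitions come from the perf sources:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified thread map: a counted list of thread ids. */
struct thread_map {
	int nr;
	int tids[16];
};

/*
 * Hypothetical stub: a real version would emit the COMM and MMAP events
 * for one thread by reading /proc/<tid>/.
 */
static int synthesize_thread(int tid, int *nr_done)
{
	if (tid <= 0)
		return -1;  /* e.g. the -1 default must never reach /proc */
	(*nr_done)++;
	return 0;
}

/* Iterate over the map rather than passing a single target_tid around. */
static int synthesize_thread_map(const struct thread_map *threads,
				 int *nr_done)
{
	int i;

	for (i = 0; i < threads->nr; i++)
		if (synthesize_thread(threads->tids[i], nr_done) < 0)
			return -1;
	return 0;
}
```

A --pid target fills the map with every thread of the process, a --tid target with a single entry, and the same loop serves both.
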
      
      Checked in the following ways:
      
      On an 8-way machine, run cyclictest:
      
      [root@emilia ~]# perf record cyclictest -a -t -n -p99 -i100 -d50
      policy: fifo: loadavg: 0.00 0.13 0.31 2/139 28798
      
      T: 0 (28791) P:99 I:100 C:  25072 Min:      4 Act:    5 Avg:    6 Max:     122
      T: 1 (28792) P:98 I:150 C:  16715 Min:      4 Act:    6 Avg:    5 Max:      27
      T: 2 (28793) P:97 I:200 C:  12534 Min:      4 Act:    5 Avg:    4 Max:       8
      T: 3 (28794) P:96 I:250 C:  10028 Min:      4 Act:    5 Avg:    5 Max:      96
      T: 4 (28795) P:95 I:300 C:   8357 Min:      5 Act:    6 Avg:    5 Max:      12
      T: 5 (28796) P:94 I:350 C:   7163 Min:      5 Act:    6 Avg:    5 Max:      12
      T: 6 (28797) P:93 I:400 C:   6267 Min:      4 Act:    5 Avg:    5 Max:       9
      T: 7 (28798) P:92 I:450 C:   5571 Min:      4 Act:    5 Avg:    5 Max:       9
      ^C[ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.108 MB perf.data (~4719 samples) ]
      
      [root@emilia ~]#
      
      This will create one extra thread per CPU:
      
      [root@emilia ~]# tuna -t cyclictest -CP
                            thread       ctxt_switches
          pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
       28825   OTHER     0     0xff      2169          671      cyclictest
        28832   FIFO    93        6     52338            1      cyclictest
        28833   FIFO    92        7     46524            1      cyclictest
        28826   FIFO    99        0    209360            1      cyclictest
        28827   FIFO    98        1    139577            1      cyclictest
        28828   FIFO    97        2    104686            0      cyclictest
        28829   FIFO    96        3     83751            1      cyclictest
        28830   FIFO    95        4     69794            1      cyclictest
        28831   FIFO    94        5     59825            1      cyclictest
      [root@emilia ~]#
      
      So we should expect only samples for the above 9 threads when using the
      --dump-raw-trace|-D perf report switch to look at the column with the tid:
      
      [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
          629 28825
          110 28826
          491 28827
          308 28828
          198 28829
          621 28830
          225 28831
          203 28832
           89 28833
      [root@emilia ~]#
      
      So it seems to work for workloads started by 'perf record'. Now for
      existing workloads, just run cyclictest first, without 'perf record':
      
      [root@emilia ~]# tuna -t cyclictest -CP
                            thread       ctxt_switches
          pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
       28859   OTHER     0     0xff       594          200      cyclictest
        28864   FIFO    95        4     16587            1      cyclictest
        28865   FIFO    94        5     14219            1      cyclictest
        28866   FIFO    93        6     12443            0      cyclictest
        28867   FIFO    92        7     11062            1      cyclictest
        28860   FIFO    99        0     49779            1      cyclictest
        28861   FIFO    98        1     33190            1      cyclictest
        28862   FIFO    97        2     24895            1      cyclictest
        28863   FIFO    96        3     19918            1      cyclictest
      [root@emilia ~]#
      
      and then later did:
      
      [root@emilia ~]# perf record --pid 28859 sleep 3
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.027 MB perf.data (~1195 samples) ]
      [root@emilia ~]#
      
      To collect 3 seconds worth of samples for pid 28859 and its children:
      
      [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
           15 28859
           33 28860
           19 28861
           13 28862
           13 28863
           10 28864
           11 28865
            9 28866
          255 28867
      [root@emilia ~]#
      
      Works, last thing is to check if looking at just one of those threads also works:
      
      [root@emilia ~]# perf record --tid 28866 sleep 3
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.006 MB perf.data (~242 samples) ]
      [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
            3 28866
      [root@emilia ~]#
      
      Works too.
      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  22. 09 Feb, 2011 1 commit
    • A
      perf annotate: Fix annotate context lines regression · d5e3d747
      Arnaldo Carvalho de Melo committed
      The live annotation done in 'perf top' needs to limit the number of
      context lines shown before lines that aren't filtered out by the min
      percent filter. If we don't do that, the screen in a tty is often not big
      enough to show what is interesting: lines with hits and a few source code
      lines before them.
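
One way to picture the limit is a filter that keeps each line above the minimum percentage plus a bounded number of lines before it. The function name, parameters and data layout are hypothetical and are not the annotate code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical filter: mark which annotated lines to show.
 *   percent[i]  - share of samples on line i
 *   min_pcnt    - lines below this are hidden unless they are context
 *   max_context - how many lines before each shown line to keep
 * show[] receives one flag per line.
 */
static void select_lines(const double *percent, size_t nr_lines,
			 double min_pcnt, size_t max_context, bool *show)
{
	size_t i, j;

	for (i = 0; i < nr_lines; i++)
		show[i] = false;

	for (i = 0; i < nr_lines; i++) {
		if (percent[i] < min_pcnt)
			continue;
		show[i] = true;
		/* keep at most max_context lines right before the hit */
		for (j = 1; j <= max_context && j <= i; j++)
			show[i - j] = true;
	}
}
```

With a single hit on the fifth line and max_context set to 2, only the third, fourth and fifth lines are shown.
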
      Reported-by: Mike Galbraith <efault@gmx.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>