1. 16 5月, 2012 2 次提交
  2. 08 5月, 2012 2 次提交
  3. 03 5月, 2012 3 次提交
  4. 17 3月, 2012 1 次提交
  5. 05 3月, 2012 1 次提交
  6. 01 3月, 2012 1 次提交
  7. 14 2月, 2012 1 次提交
  8. 02 2月, 2012 1 次提交
  9. 25 1月, 2012 1 次提交
  10. 07 1月, 2012 1 次提交
  11. 24 12月, 2011 1 次提交
  12. 20 12月, 2011 1 次提交
  13. 29 11月, 2011 2 次提交
    • A
      perf evlist: Always do automatic allocation of pollfd and mmap structures · 806fb630
      Arnaldo Carvalho de Melo 提交于
      At first tools were required to do that, but while writing the python
      bindings to simplify the API I made them auto-allocate when needed.
      
      This just makes record, stat and top use that auto allocation,
      simplifying them a bit.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-iokhcvkzzijr3keioubx8hlq@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      806fb630
    • A
      perf tools: Save some loops using perf_evlist__id2evsel · ee29be62
      Arnaldo Carvalho de Melo 提交于
      Since we already ask for PERF_SAMPLE_ID and use it to quickly find the
      associated evsel, add handler func + data to struct perf_evsel to avoid
      using chains of if(strcmp(event_name)) and also to avoid all the linear
      list searches via trace_event_find.
      
      To demonstrate the technique convert 'perf sched' to it:
      
       # perf sched record sleep 5m
      
      And then:
      
       Performance counter stats for '/tmp/oldperf sched lat':
      
              646.929438 task-clock                #    0.999 CPUs utilized
                       9 context-switches          #    0.000 M/sec
                       0 CPU-migrations            #    0.000 M/sec
                  20,901 page-faults               #    0.032 M/sec
           1,290,144,450 cycles                    #    1.994 GHz
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
           1,606,158,439 instructions              #    1.24  insns per cycle
             339,088,395 branches                  #  524.151 M/sec
               4,550,735 branch-misses             #    1.34% of all branches
      
             0.647524759 seconds time elapsed
      
      Versus:
      
       Performance counter stats for 'perf sched lat':
      
              473.564691 task-clock                #    0.999 CPUs utilized
                       9 context-switches          #    0.000 M/sec
                       0 CPU-migrations            #    0.000 M/sec
                  20,903 page-faults               #    0.044 M/sec
             944,367,984 cycles                    #    1.994 GHz
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
           1,442,385,571 instructions              #    1.53  insns per cycle
             308,383,106 branches                  #  651.195 M/sec
               4,481,784 branch-misses             #    1.45% of all branches
      
             0.474215751 seconds time elapsed
      
      [root@emilia ~]#
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-1kbzpl74lwi6lavpqke2u2p3@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ee29be62
  14. 28 11月, 2011 6 次提交
  15. 26 10月, 2011 1 次提交
  16. 07 10月, 2011 1 次提交
  17. 24 9月, 2011 1 次提交
  18. 18 8月, 2011 1 次提交
  19. 25 7月, 2011 1 次提交
  20. 03 6月, 2011 2 次提交
  21. 02 6月, 2011 2 次提交
  22. 22 5月, 2011 1 次提交
  23. 15 5月, 2011 2 次提交
    • A
      perf evlist: Fix per thread mmap setup · aece948f
      Arnaldo Carvalho de Melo 提交于
      The PERF_EVENT_IOC_SET_OUTPUT ioctl was returning -EINVAL when using
      --pid when monitoring multithreaded apps, as we can only share a ring
      buffer for events on the same thread if not doing per cpu.
      
      Fix it by using per thread ring buffers.
      
      Tested with:
      
      [root@felicio ~]# tuna -t 26131 -CP | nl
        1                      thread       ctxt_switches
        2    pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
        3 26131   OTHER     0      0,1  10814276      2397830 chromium-browse
        4  642    OTHER     0      0,1     14688            0 chromium-browse
        5  26148  OTHER     0      0,1    713602       115479 chromium-browse
        6  26149  OTHER     0      0,1    801958         2262 chromium-browse
        7  26150  OTHER     0      0,1   1271128          248 chromium-browse
        8  26151  OTHER     0      0,1         3            0 chromium-browse
        9  27049  OTHER     0      0,1     36796            9 chromium-browse
       10  618    OTHER     0      0,1     14711            0 chromium-browse
       11  661    OTHER     0      0,1     14593            0 chromium-browse
       12  29048  OTHER     0      0,1     28125            0 chromium-browse
       13  26143  OTHER     0      0,1   2202789          781 chromium-browse
      [root@felicio ~]#
      
      So 11 threads under pid 26131, then:
      
      [root@felicio ~]# perf record -F 50000 --pid 26131
      
      [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
        1 7fa4a2538000-7fa4a25b9000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        2 7fa4a25b9000-7fa4a263a000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        3 7fa4a263a000-7fa4a26bb000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        4 7fa4a26bb000-7fa4a273c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        5 7fa4a273c000-7fa4a27bd000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        6 7fa4a27bd000-7fa4a283e000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        7 7fa4a283e000-7fa4a28bf000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        8 7fa4a28bf000-7fa4a2940000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        9 7fa4a2940000-7fa4a29c1000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
       10 7fa4a29c1000-7fa4a2a42000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
       11 7fa4a2a42000-7fa4a2ac3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
      [root@felicio ~]#
      
      11 mmaps, one per thread since we didn't specify any CPU list, so we need one
      mmap per thread and:
      
      [root@felicio ~]# perf record -F 50000 --pid 26131
      ^M
      ^C[ perf record: Woken up 79 times to write data ]
      [ perf record: Captured and wrote 20.614 MB perf.data (~900639 samples) ]
      
      [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
           1	 371310 26131
           2	  96516 26148
           3	  95694 26149
           4	  95203 26150
           5	   7291 26143
           6	     87 27049
           7	     76 661
           8	     60 29048
           9	     47 618
          10	     43 642
      [root@felicio ~]#
      
      Ok, one of the threads, 26151 was quiescent, so no samples there, but all the
      others are there.
      
      Then, if I specify one CPU:
      
      [root@felicio ~]# perf record -F 50000 --pid 26131 --cpu 1
      ^C[ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.680 MB perf.data (~29730 samples) ]
      
      [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
           1	   8444 26131
           2	   2584 26149
           3	   2518 26148
           4	   2324 26150
           5	    123 26143
           6	      9 661
           7	      9 29048
      [root@felicio ~]#
      
      This machine has two cores, so fewer threads appeared on the radar, and:
      
      [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
       1 7f484b922000-7f484b9a3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
      [root@felicio ~]#
      
      Just one mmap, as now we can use just one per-cpu buffer instead of the
      per-thread needed in the previous case.
      
      For global profiling:
      
      [root@felicio ~]# perf record -F 50000 -a
      ^C[ perf record: Woken up 26 times to write data ]
      [ perf record: Captured and wrote 7.128 MB perf.data (~311412 samples) ]
      
      [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
           1	7fb49b435000-7fb49b4b6000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
           2	7fb49b4b6000-7fb49b537000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
      [root@felicio ~]#
      
      It uses per-cpu buffers.
      
      For just one thread:
      
      [root@felicio ~]# perf record -F 50000 --tid 26148
      ^C[ perf record: Woken up 2 times to write data ]
      [ perf record: Captured and wrote 0.330 MB perf.data (~14426 samples) ]
      
      [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
           1	   9969 26148
      [root@felicio ~]#
      
      [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
           1	7f286a51b000-7f286a59c000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
      [root@felicio ~]#
      Tested-by: NDavid Ahern <dsahern@gmail.com>
      Tested-by: NLin Ming <ming.m.lin@intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      aece948f
    • A
      perf tools: Honour the cpu list parameter when also monitoring a thread list · b9019418
      Arnaldo Carvalho de Melo 提交于
      The perf_evlist__create_maps was discarding the --cpu parameter when a
      --pid or --tid was specified, fix that.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b9019418
  24. 15 4月, 2011 1 次提交
  25. 10 3月, 2011 1 次提交
    • A
      perf session: Use evlist/evsel for managing perf.data attributes · a91e5431
      Arnaldo Carvalho de Melo 提交于
      So that we can reuse things like the id to attr lookup routine
      (perf_evlist__id2evsel) that uses a hash table instead of the linear
      lookup done in the older perf_header_attr routines, etc.
      
      Also to make evsels/evlist more pervasive an API, simplyfing using the
      emerging perf lib.
      
      cc: Arun Sharma <arun@sharma-home.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a91e5431
  26. 07 3月, 2011 1 次提交
    • A
      perf evlist: Split perf_evlist__id_hash · 3d3b5e95
      Arnaldo Carvalho de Melo 提交于
      The previous situation was to receive an fd from where to read the event
      ID.
      
      Spin off a routine for when we have the ID handy, not having to read it
      from some fd.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3d3b5e95
  27. 02 3月, 2011 1 次提交
    • F
      perf: Set filters before mmaping events · 0a102479
      Frederic Weisbecker 提交于
      We currently set the filters after we mmap the events, this is a
      race that let undesired events record themselves in the buffer before
      we had the time to set the filters.
      
      So set the filters before they can be recorded. That also librarizes
      the filters setting so that filtering can be done more easily
      from other tools than perf record later.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      0a102479