1. 03 10月, 2012 1 次提交
  2. 15 9月, 2012 1 次提交
  3. 12 9月, 2012 4 次提交
    • A
      perf sched: Don't read all tracepoint variables in advance · 9ec3f4e4
      Arnaldo Carvalho de Melo 提交于
      Do it just at the actual consumer of these fields, that way we avoid
      needless lookups:
      
        [root@sandy ~]# perf sched record sleep 30s
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 8.585 MB perf.data (~375063 samples) ]
      
      Before:
      
        [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null
      
         Performance counter stats for 'perf sched lat' (10 runs):
      
                103.592215 task-clock                #    0.993 CPUs utilized            ( +-  0.33% )
                        12 context-switches          #    0.114 K/sec                    ( +-  3.29% )
                         0 cpu-migrations            #    0.000 K/sec
                     7,605 page-faults               #    0.073 M/sec                    ( +-  0.00% )
               345,796,112 cycles                    #    3.338 GHz                      ( +-  0.07% ) [82.90%]
               106,876,796 stalled-cycles-frontend   #   30.91% frontend cycles idle     ( +-  0.38% ) [83.23%]
                62,060,877 stalled-cycles-backend    #   17.95% backend  cycles idle     ( +-  0.80% ) [67.14%]
               628,246,586 instructions              #    1.82  insns per cycle
                                                     #    0.17  stalled cycles per insn  ( +-  0.04% ) [83.64%]
               134,962,057 branches                  # 1302.820 M/sec                    ( +-  0.10% ) [83.64%]
                 1,233,037 branch-misses             #    0.91% of all branches          ( +-  0.29% ) [83.41%]
      
               0.104333272 seconds time elapsed                                          ( +-  0.33% )
      
        [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null
      
         Performance counter stats for 'perf sched lat' (10 runs):
      
               98.848272 task-clock                #    0.993 CPUs utilized            ( +-  0.48% )
                      11 context-switches          #    0.112 K/sec                    ( +-  2.83% )
                       0 cpu-migrations            #    0.003 K/sec                    ( +- 50.92% )
                   7,604 page-faults               #    0.077 M/sec                    ( +-  0.00% )
             332,216,085 cycles                    #    3.361 GHz                      ( +-  0.14% ) [82.87%]
             100,623,710 stalled-cycles-frontend   #   30.29% frontend cycles idle     ( +-  0.53% ) [82.95%]
              58,788,692 stalled-cycles-backend    #   17.70% backend  cycles idle     ( +-  0.59% ) [67.15%]
             609,402,433 instructions              #    1.83  insns per cycle
                                                   #    0.17  stalled cycles per insn  ( +-  0.04% ) [83.76%]
             131,277,138 branches                  # 1328.067 M/sec                    ( +-  0.06% ) [83.77%]
               1,117,871 branch-misses             #    0.85% of all branches          ( +-  0.32% ) [83.51%]
      
             0.099580430 seconds time elapsed                                          ( +-  0.48% )
      
        [root@sandy ~]#
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-kracdpw8wqlr0xjh75uk8g11@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9ec3f4e4
    • A
      perf sched: Use perf_evsel__{int,str}val · 2b7fcbc5
      Arnaldo Carvalho de Melo 提交于
      This patch also stops reading the common fields, as they were not being used except
      for one ->common_pid case that was replaced by sample->tid, i.e. the info is already
      in the perf_sample struct.
      
      Also it only fills the _event structures when there is a handler.
      
        [root@sandy ~]# perf sched record sleep 30s
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 8.585 MB perf.data (~375063 samples) ]
      
      Before:
      
        [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null
      
         Performance counter stats for 'perf sched lat' (10 runs):
      
                129.117838 task-clock                #    0.994 CPUs utilized            ( +-  0.28% )
                        14 context-switches          #    0.111 K/sec                    ( +-  2.10% )
                         0 cpu-migrations            #    0.002 K/sec                    ( +- 66.67% )
                     7,654 page-faults               #    0.059 M/sec                    ( +-  0.67% )
               438,121,661 cycles                    #    3.393 GHz                      ( +-  0.06% ) [83.06%]
               150,808,605 stalled-cycles-frontend   #   34.42% frontend cycles idle     ( +-  0.14% ) [83.10%]
                80,748,941 stalled-cycles-backend    #   18.43% backend  cycles idle     ( +-  0.64% ) [66.73%]
               758,605,879 instructions              #    1.73  insns per cycle
                                                     #    0.20  stalled cycles per insn  ( +-  0.08% ) [83.54%]
               162,164,321 branches                  # 1255.940 M/sec                    ( +-  0.10% ) [83.70%]
                 1,609,903 branch-misses             #    0.99% of all branches          ( +-  0.08% ) [83.62%]
      
               0.129949153 seconds time elapsed                                          ( +-  0.28% )
      
      After:
      
        [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null
      
         Performance counter stats for 'perf sched lat' (10 runs):
      
                103.592215 task-clock                #    0.993 CPUs utilized            ( +-  0.33% )
                        12 context-switches          #    0.114 K/sec                    ( +-  3.29% )
                         0 cpu-migrations            #    0.000 K/sec
                     7,605 page-faults               #    0.073 M/sec                    ( +-  0.00% )
               345,796,112 cycles                    #    3.338 GHz                      ( +-  0.07% ) [82.90%]
               106,876,796 stalled-cycles-frontend   #   30.91% frontend cycles idle     ( +-  0.38% ) [83.23%]
                62,060,877 stalled-cycles-backend    #   17.95% backend  cycles idle     ( +-  0.80% ) [67.14%]
               628,246,586 instructions              #    1.82  insns per cycle
                                                     #    0.17  stalled cycles per insn  ( +-  0.04% ) [83.64%]
               134,962,057 branches                  # 1302.820 M/sec                    ( +-  0.10% ) [83.64%]
                 1,233,037 branch-misses             #    0.91% of all branches          ( +-  0.29% ) [83.41%]
      
               0.104333272 seconds time elapsed                                          ( +-  0.33% )
      
        [root@sandy ~]#
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-weu9t63zkrfrazkn0gxj48xy@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2b7fcbc5
    • A
      perf sched: Use perf_tool as ancestor · 0e9b07e5
      Arnaldo Carvalho de Melo 提交于
      So that we can remove all the globals.
      
      Before:
      
         text	   data	    bss	    dec	    hex	filename
      1586833	 110368	1438600	3135801	 2fd939	/tmp/oldperf
      
      After:
      
         text	   data	    bss	    dec	    hex	filename
      1629329	  93568	 848328	2571225	 273bd9	/root/bin/perf
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-oph40vikij0crjz4eyapneov@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0e9b07e5
    • A
      perf sched: Remove unused thread parameter · 4218e673
      Arnaldo Carvalho de Melo 提交于
      From the tracepoint handling routines.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-mcqd9mv34z6he0wqiz4a3mh9@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4218e673
  4. 11 9月, 2012 1 次提交
    • I
      perf tools: Use __maybe_used for unused variables · 1d037ca1
      Irina Tirdea 提交于
      perf defines both __used and __unused variables to use for marking
      unused variables. The variable __used is defined to
      __attribute__((__unused__)), which contradicts the kernel definition to
      __attribute__((__used__)) for new gcc versions. On Android, __used is
      also defined in system headers and this leads to warnings like: warning:
      '__used__' attribute ignored
      
      __unused is not defined in the kernel and is not a standard definition.
      If __unused is included everywhere instead of __used, this leads to
      conflicts with glibc headers, since glibc has a variables with this name
      in its headers.
      
      The best approach is to use __maybe_unused, the definition used in the
      kernel for __attribute__((unused)). In this way there is only one
      definition in perf sources (instead of 2 definitions that point to the
      same thing: __used and __unused) and it works on both Linux and Android.
      This patch simply replaces all instances of __used and __unused with
      __maybe_unused.
      Signed-off-by: NIrina Tirdea <irina.tirdea@intel.com>
      Acked-by: NPekka Enberg <penberg@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
      [ committer note: fixed up conflict with a116e05d in builtin-sched.c ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1d037ca1
  5. 09 9月, 2012 1 次提交
  6. 08 8月, 2012 2 次提交
    • A
      perf sched: Use perf_sample · 7f7f8d0b
      Arnaldo Carvalho de Melo 提交于
      To reduce the number of parameters passed to the various event handling
      functions.
      
      Cc: Andrey Wagin <avagin@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-fc537qykjjqzvyol5fecx6ug@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7f7f8d0b
    • A
      perf evsel: Cache associated event_format · fcf65bf1
      Arnaldo Carvalho de Melo 提交于
      We already lookup the associated event_format when reading the perf.data
      header, so that we can cache the tracepoint name in evsel->name, so do
      it a little further and save the event_format itself, so that we can
      avoid relookups in tools that need to access it.
      
      Change the tools to take the most obvious advantage, when they were
      using pevent_find_event directly. More work is needed for further
      removing the need of a pointer to pevent, such as when asking for event
      field values ("common_pid" and the other common fields and per
      event_format fields).
      
      This is something that was planned but only got actually done when
      Andrey Wagin needed to do this lookup at perf_tool->sample() time, when
      we don't have access to pevent (session->pevent) to use with
      pevent_find_event().
      
      Cc: Andrey Wagin <avagin@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/n/tip-txkvew2ckko0b594ae8fbnyk@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fcf65bf1
  7. 28 6月, 2012 1 次提交
    • A
      perf tools: Stop using a global trace events description list · da378962
      Arnaldo Carvalho de Melo 提交于
      The pevent thing is per perf.data file, so I made it stop being static
      and become a perf_session member, so tools processing perf.data files
      use perf_session and _there_ we read the trace events description into
      session->pevent and then change everywhere to stop using that single
      global pevent variable and use the per session one.
      
      Note that it _doesn't_ fall backs to trace__event_id, as we're not
      interested at all in what is present in the
      /sys/kernel/debug/tracing/events in the workstation doing the analysis,
      just in what is in the perf.data file.
      
      This patch also introduces perf_session__set_tracepoints_handlers that
      is the perf perf.data/session way to associate handlers to tracepoint
      events by resolving their IDs using the events descriptions stored in a
      perf.data file. Make 'perf sched' use it.
      Reported-by: NDmitry Antipov <dmitry.antipov@linaro.org>
      Tested-by: NDmitry Antipov <dmitry.antipov@linaro.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linaro-dev@lists.linaro.org
      Cc: patches@linaro.org
      Link: http://lkml.kernel.org/r/20120625232016.GA28525@infradead.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      da378962
  8. 20 6月, 2012 1 次提交
  9. 25 4月, 2012 1 次提交
    • S
      perf: Have perf use the new libtraceevent.a library · aaf045f7
      Steven Rostedt 提交于
      The event parsing code in perf was originally copied from trace-cmd
      but never was kept up-to-date with the changes that was done there.
      The trace-cmd libtraceevent.a code is much more mature than what is
      currently in perf.
      
      This updates the code to use wrappers to handle the calls to the
      new event parsing code. The new code requires a handle to be pass
      around, which removes the global event variables and allows
      more than one event structure to be read from different files
      (and different machines).
      
      But perf still has the old global events and the code throughout
      perf does not yet have a nice way to pass around a handle.
      A global 'pevent' has been made for perf and the old calls have
      been created as wrappers to the new event parsing code that uses
      the global pevent.
      
      With this change, perf can later incorporate the pevent handle into
      the perf structures and allow more than one file to be read and
      compared, that contains different events.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      aaf045f7
  10. 04 4月, 2012 1 次提交
  11. 24 12月, 2011 1 次提交
    • R
      perf report: Accept fifos as input file · efad1415
      Robert Richter 提交于
      The default input file for perf report is not handled the same way as
      perf record does it for its output file. This leads to unexpected
      behavior of perf report, etc. E.g.:
      
       # perf record -a -e cpu-cycles sleep 2 | perf report | cat
       failed to open perf.data: No such file or directory  (try 'perf record' first)
      
      While perf record writes to a fifo, perf report expects perf.data to be
      read. This patch changes this to accept fifos as input file.
      
      Applies to the following commands:
      
       perf annotate
       perf buildid-list
       perf evlist
       perf kmem
       perf lock
       perf report
       perf sched
       perf script
       perf timechart
      
      Also fixes char const* -> const char* type declaration for filename
      strings.
      
      v2:
      * Prevent potential null pointer access to input_name in
        builtin-report.c. Needed due to removal of patch "perf report: Setup
        browser if stdout is a pipe"
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.comSigned-off-by: NRobert Richter <robert.richter@amd.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      efad1415
  12. 29 11月, 2011 1 次提交
    • A
      perf tools: Save some loops using perf_evlist__id2evsel · ee29be62
      Arnaldo Carvalho de Melo 提交于
      Since we already ask for PERF_SAMPLE_ID and use it to quickly find the
      associated evsel, add handler func + data to struct perf_evsel to avoid
      using chains of if(strcmp(event_name)) and also to avoid all the linear
      list searches via trace_event_find.
      
      To demonstrate the technique convert 'perf sched' to it:
      
       # perf sched record sleep 5m
      
      And then:
      
       Performance counter stats for '/tmp/oldperf sched lat':
      
              646.929438 task-clock                #    0.999 CPUs utilized
                       9 context-switches          #    0.000 M/sec
                       0 CPU-migrations            #    0.000 M/sec
                  20,901 page-faults               #    0.032 M/sec
           1,290,144,450 cycles                    #    1.994 GHz
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
           1,606,158,439 instructions              #    1.24  insns per cycle
             339,088,395 branches                  #  524.151 M/sec
               4,550,735 branch-misses             #    1.34% of all branches
      
             0.647524759 seconds time elapsed
      
      Versus:
      
       Performance counter stats for 'perf sched lat':
      
              473.564691 task-clock                #    0.999 CPUs utilized
                       9 context-switches          #    0.000 M/sec
                       0 CPU-migrations            #    0.000 M/sec
                  20,903 page-faults               #    0.044 M/sec
             944,367,984 cycles                    #    1.994 GHz
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
           1,442,385,571 instructions              #    1.53  insns per cycle
             308,383,106 branches                  #  651.195 M/sec
               4,481,784 branch-misses             #    1.45% of all branches
      
             0.474215751 seconds time elapsed
      
      [root@emilia ~]#
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-1kbzpl74lwi6lavpqke2u2p3@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ee29be62
  13. 28 11月, 2011 4 次提交
  14. 10 8月, 2011 2 次提交
  15. 24 3月, 2011 1 次提交
    • A
      perf session: Pass evsel in event_ops->sample() · 9e69c210
      Arnaldo Carvalho de Melo 提交于
      Resolving the sample->id to an evsel since the most advanced tools,
      report and annotate, and the others will too when they evolve to
      properly support multi-event perf.data files.
      
      Good also because it does an extra validation, checking that the ID is
      valid when present. When that is not the case, the overhead is just a
      branch + function call (perf_evlist__id2evsel).
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9e69c210
  16. 07 2月, 2011 1 次提交
    • K
      perf tool: Fix gcc 4.6.0 issues · fb7d0b3c
      Kyle McMartin 提交于
      GCC 4.6.0 in Fedora rawhide turned up some compile errors in tools/perf
      due to the -Werror=unused-but-set-variable flag.
      
      I've gone through and annotated some of the assignments that had side
      effects (ie: return value from a function) with the __used annotation,
      and in some cases, just removed unused code.
      
      In a few cases, we were assigning something useful, but not using it in
      later parts of the function.
      
      kyle@dreadnought:~/src% gcc --version
      gcc (GCC) 4.6.0 20110122 (Red Hat 4.6.0-0.3)
      
      Cc: Ingo Molnar <mingo@redhat.com>
      LKML-Reference: <20110124161304.GK27353@bombadil.infradead.org>
      Signed-off-by: NKyle McMartin <kyle@redhat.com>
      [ committer note: Fixed up the annotation fixes, as that code moved recently ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fb7d0b3c
  17. 30 1月, 2011 2 次提交
  18. 23 1月, 2011 1 次提交
    • A
      perf tools: Fix 64 bit integer format strings · 9486aa38
      Arnaldo Carvalho de Melo 提交于
      Using %L[uxd] has issues in some architectures, like on ppc64.  Fix it
      by making our 64 bit integers typedefs of stdint.h types and using
      PRI[ux]64 like, for instance, git does.
      
      Reported by Denis Kirjanov that provided a patch for one case, I went
      and changed all cases.
      Reported-by: NDenis Kirjanov <dkirjanov@kernel.org>
      Tested-by: NDenis Kirjanov <dkirjanov@kernel.org>
      LKML-Reference: <20110120093246.GA8031@hera.kernel.org>
      Cc: Denis Kirjanov <dkirjanov@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pingtian Han <phan@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9486aa38
  19. 13 1月, 2011 1 次提交
  20. 11 1月, 2011 1 次提交
  21. 10 1月, 2011 1 次提交
  22. 22 12月, 2010 1 次提交
    • I
      perf session: Fallback to unordered processing if no sample_id_all · 21ef97f0
      Ian Munsie 提交于
      If we are running the new perf on an old kernel without support for
      sample_id_all, we should fall back to the old unordered processing of
      events. If we didn't than we would *always* process events without
      timestamps out of order, whether or not we hit a reordering race. In
      other words, instead of there being a chance of not attributing samples
      correctly, we would guarantee that samples would not be attributed.
      
      While processing all events without timestamps before events with
      timestamps may seem like an intuitive solution, it falls down as
      PERF_RECORD_EXIT events would also be processed before any samples.
      Even with a workaround for that case, samples before/after an exec would
      not be attributed correctly.
      
      This patch allows commands to indicate whether they need to fall back to
      unordered processing, so that commands that do not care about timestamps
      on every event will not be affected. If we do fallback, this will print
      out a warning if report -D was invoked.
      
      This patch adds the test in perf_session__new so that we only need to
      test once per session. Commands that do not use an event_ops (such as
      record and top) can simply pass NULL in it's place.
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <1291951882-sup-6069@au1.ibm.com>
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      21ef97f0
  23. 06 12月, 2010 1 次提交
  24. 05 12月, 2010 1 次提交
    • A
      perf session: Parse sample earlier · 640c03ce
      Arnaldo Carvalho de Melo 提交于
      At perf_session__process_event, so that we reduce the number of lines in eache
      tool sample processing routine that now receives a sample_data pointer already
      parsed.
      
      This will also be useful in the next patch, where we'll allow sample the
      identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu,
      timestamp) just after before every event.
      
      Also validate callchains in perf_session__process_event, i.e. as early as
      possible, and keep a counter of the number of events discarded due to invalid
      callchains, warning the user about it if it happens.
      
      There is an assumption that was kept that all events have the same sample_type,
      that will be dealt with in the future, when this preexisting limitation will be
      removed.
      Tested-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NIan Munsie <imunsie@au1.ibm.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ian Munsie <imunsie@au1.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      640c03ce
  25. 17 11月, 2010 1 次提交
  26. 01 6月, 2010 1 次提交
    • F
      perf: Use event__process_task from perf sched · af64865b
      Frederic Weisbecker 提交于
      perf sched uses event__process_comm(), which means it can resolve
      comms from:
      
      - tasks that have exec'ed (kernel comm events)
      - tasks that were running when perf record started the actual
        recording (synthetized comm events)
      
      But perf sched can't resolve the pids of tasks that were created
      after the recording started.
      
      To solve this, we need to inherit the comms on fork events using
      event__process_task().
      
      This fixes various unresolved pids in perf sched, easily visible
      with:
      	perf sched record perf bench sched messaging
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      af64865b
  27. 18 5月, 2010 2 次提交
    • A
      perf options: Type check all the remaining OPT_ variants · edb7c60e
      Arnaldo Carvalho de Melo 提交于
      OPT_SET_INT was renamed to OPT_SET_UINT since the only use in these
      tools is to set something that has an enum type, that is builtin
      compatible with unsigned int.
      
      Several string constifications were done to make OPT_STRING require a
      const char * type.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      edb7c60e
    • A
      perf options: Check v type in OPT_U?INTEGER · 1967936d
      Arnaldo Carvalho de Melo 提交于
      To avoid problems like the one fixed by Stephane Eranian in 3de29cab, now
      we'll got this instead:
      
      	bench/sched-messaging.c:259: error: negative width in bit-field ‘<anonymous>’
      	bench/sched-messaging.c:261: error: negative width in bit-field ‘<anonymous>’
      
      Which is rather cryptic, but is how BUILD_BUG_ON_ZERO works, so kernel
      hackers should be already used to this.
      
      With it in place found some problems, fixed by changing the affected
      variables to sensible types or changed some OPT_INTEGER to OPT_UINTEGER.
      
      Next csets will go thru converting each of the remaining OPT_ so that
      review can be made easier by grouping changes per type per patch.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1967936d
  28. 15 5月, 2010 1 次提交
    • A
      perf hist: Clarify events_stats fields usage · cee75ac7
      Arnaldo Carvalho de Melo 提交于
      The events_stats.total field is too generic, rename it to .total_period,
      and also add a comment explaining that it is the sum of all the .period
      fields in samples, that is needed because we use auto-freq to avoid
      sampling artifacts.
      
      Ditto for events_stats.lost, that is the sum of all lost_event.lost
      fields, i.e. the number of events the kernel dropped.
      
      Looking at the users, builtin-sched.c can make use of these fields and
      stop doing it again.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cee75ac7
  29. 03 5月, 2010 1 次提交
    • T
      perf: add perf-inject builtin · 454c407e
      Tom Zanussi 提交于
      Currently, perf 'live mode' writes build-ids at the end of the
      session, which isn't actually useful for processing live mode events.
      
      What would be better would be to have the build-ids sent before any of
      the samples that reference them, which can be done by processing the
      event stream and retrieving the build-ids on the first hit.  Doing
      that in perf-record itself, however, is off-limits.
      
      This patch introduces perf-inject, which does the same job while
      leaving perf-record untouched.  Normal mode perf still records the
      build-ids at the end of the session as it should, but for live mode,
      perf-inject can be injected in between the record and report steps
      e.g.:
      
      perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -
      
      perf-inject reads a perf-record event stream and repipes it to stdout.
      At any point the processing code can inject other events into the
      event stream - in this case build-ids (-b option) are read and
      injected as needed into the event stream.
      
      Build-ids are just the first user of perf-inject - potentially
      anything that needs userspace processing to augment the trace stream
      with additional information could make use of this facility.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com>
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      454c407e
  30. 24 4月, 2010 1 次提交
    • F
      perf: Use generic sample reordering in perf sched · a64eae70
      Frederic Weisbecker 提交于
      Use the new generic sample events reordering from perf sched,
      this drops the need of multiplexing the buffers on record time,
      improving the scalability of perf sched.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      a64eae70