1. 19 Sep, 2012 1 commit
    • Merge tag 'perf-core-for-mingo' of... · bea8f354
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
       * Fix handling of unresolved samples when --symbols is used in 'report',
         from Feng Tang.
      
       * Add --symbols to 'script', similar to the one in 'report', from Feng Tang.
      
       * Add union member access support to 'probe', from Hyeoncheol Lee.
      
       * Make 'archive' work on Android, tweaking some of the utility parameters
         used (tar, rm), from Irina Tirdea.
      
       * Fixups to die() removal, from Namhyung Kim.
      
       * Render fixes for the TUI, from Namhyung Kim.
      
       * Don't enable annotation in non symbolic view, from Namhyung Kim.
      
       * Fix pipe mode in 'report', from Namhyung Kim.
      
       * Move related stats code from stat to util/, will be used by the 'stat'
         kvm tool, from Xiao Guangrong.
      
       * Add cpumask for uncore pmu, use it in 'stat', from Yan, Zheng.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 18 Sep, 2012 12 commits
  3. 15 Sep, 2012 6 commits
  4. 14 Sep, 2012 7 commits
  5. 13 Sep, 2012 2 commits
    • Merge branch 'core/rcu' into perf/core · 4553f0b9
      Ingo Molnar committed
      Steve Rostedt asked for the merge of a single commit, into both
      the RCU and the perf/tracing tree:
      
       | Josh made a change to the tracing code that affects both the
       | work Paul McKenney and I are currently doing. At the last
       | Kernel Summit back in August, Linus said when such a case
       | exists, it is best to make a separate branch based off of his
       | tree and place the change there. This way, the repositories
       | that need to share the change can both pull them in and the
       | SHA1 will match for both. Whichever branch is pulled in first
       | by Linus will also pull in the necessary change for the other
       | branch as well.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge tag 'perf-core-for-mingo' of... · be267be8
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
       * Remove die()/exit() calls from several tools.
      
       * Add missing perf_regs.h file to MANIFEST
      
       * Clean up and improve 'perf sched' performance by eliminating lots of
         needless calls to libtraceevent.
      
       * More patches to make perf build on Android, from Irina Tirdea
      
       * Resolve vdso callchains, from Jiri Olsa
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 12 Sep, 2012 6 commits
    • trace: Don't declare trace_*_rcuidle functions in modules · 7ece55a4
      Josh Triplett committed
      Tracepoints declare a static inline trace_*_rcuidle variant of the trace
      function, to support safely generating trace events from the idle loop.
      Module code never actually uses that variant of trace functions, because
      modules don't run code that needs tracing with RCU idled.  However, the
      declaration of those otherwise unused functions causes the module to
      reference rcu_idle_exit and rcu_idle_enter, which RCU does not export to
      modules.
      
      To avoid this, don't generate trace_*_rcuidle functions for tracepoints
      declared in module code.
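
The conditional generation described above can be sketched with plain preprocessor logic. This is only an illustration: DECLARE_TRACE_SKETCH and DECLARE_RCUIDLE_SKETCH are hypothetical names, the function bodies are placeholders, and the kernel's real tracepoint macros take more arguments. The only real name assumed here is MODULE, which the kernel build defines when compiling module code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical macro names for illustration; the kernel's real tracepoint
 * macros are more involved.
 */
#ifndef MODULE
/* Built-in code: also generate the _rcuidle variant. */
#define DECLARE_RCUIDLE_SKETCH(name)					\
	static inline void trace_##name##_rcuidle(int *hits)		\
	{								\
		assert(hits != NULL);					\
		(*hits)++;						\
	}
#else
/* Module code: generate nothing, so no unexported RCU symbol is referenced. */
#define DECLARE_RCUIDLE_SKETCH(name)
#endif

#define DECLARE_TRACE_SKETCH(name)					\
	static inline void trace_##name(int *hits)			\
	{								\
		assert(hits != NULL);					\
		(*hits)++;						\
	}								\
	DECLARE_RCUIDLE_SKETCH(name)

/* Declares trace_sched_wakeup() and, outside modules, the _rcuidle variant. */
DECLARE_TRACE_SKETCH(sched_wakeup)
```

Built with -DMODULE, the same declaration produces only trace_sched_wakeup(), which is the effect the patch is after.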
      
      Link: http://lkml.kernel.org/r/20120905062306.GA14756@leaf
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Josh Triplett <josh@joshtriplett.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • perf sched: Don't read all tracepoint variables in advance · 9ec3f4e4
      Arnaldo Carvalho de Melo committed
      Do it just at the actual consumer of these fields, that way we avoid
      needless lookups:
      
        [root@sandy ~]# perf sched record sleep 30s
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 8.585 MB perf.data (~375063 samples) ]
      
      Before:
      
        [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null
      
         Performance counter stats for 'perf sched lat' (10 runs):
      
                103.592215 task-clock                #    0.993 CPUs utilized            ( +-  0.33% )
                        12 context-switches          #    0.114 K/sec                    ( +-  3.29% )
                         0 cpu-migrations            #    0.000 K/sec
                     7,605 page-faults               #    0.073 M/sec                    ( +-  0.00% )
               345,796,112 cycles                    #    3.338 GHz                      ( +-  0.07% ) [82.90%]
               106,876,796 stalled-cycles-frontend   #   30.91% frontend cycles idle     ( +-  0.38% ) [83.23%]
                62,060,877 stalled-cycles-backend    #   17.95% backend  cycles idle     ( +-  0.80% ) [67.14%]
               628,246,586 instructions              #    1.82  insns per cycle
                                                     #    0.17  stalled cycles per insn  ( +-  0.04% ) [83.64%]
               134,962,057 branches                  # 1302.820 M/sec                    ( +-  0.10% ) [83.64%]
                 1,233,037 branch-misses             #    0.91% of all branches          ( +-  0.29% ) [83.41%]
      
               0.104333272 seconds time elapsed                                          ( +-  0.33% )
      
      After:

        [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null
      
         Performance counter stats for 'perf sched lat' (10 runs):
      
               98.848272 task-clock                #    0.993 CPUs utilized            ( +-  0.48% )
                      11 context-switches          #    0.112 K/sec                    ( +-  2.83% )
                       0 cpu-migrations            #    0.003 K/sec                    ( +- 50.92% )
                   7,604 page-faults               #    0.077 M/sec                    ( +-  0.00% )
             332,216,085 cycles                    #    3.361 GHz                      ( +-  0.14% ) [82.87%]
             100,623,710 stalled-cycles-frontend   #   30.29% frontend cycles idle     ( +-  0.53% ) [82.95%]
              58,788,692 stalled-cycles-backend    #   17.70% backend  cycles idle     ( +-  0.59% ) [67.15%]
             609,402,433 instructions              #    1.83  insns per cycle
                                                   #    0.17  stalled cycles per insn  ( +-  0.04% ) [83.76%]
             131,277,138 branches                  # 1328.067 M/sec                    ( +-  0.06% ) [83.77%]
               1,117,871 branch-misses             #    0.85% of all branches          ( +-  0.32% ) [83.51%]
      
             0.099580430 seconds time elapsed                                          ( +-  0.48% )
      
        [root@sandy ~]#
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-kracdpw8wqlr0xjh75uk8g11@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf sched: Use perf_evsel__{int,str}val · 2b7fcbc5
      Arnaldo Carvalho de Melo committed
      This patch also stops reading the common fields, as they were not being used except
      for one ->common_pid case that was replaced by sample->tid, i.e. the info is already
      in the perf_sample struct.
      
      Also it only fills the _event structures when there is a handler.
      
        [root@sandy ~]# perf sched record sleep 30s
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 8.585 MB perf.data (~375063 samples) ]
      
      Before:
      
        [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null
      
         Performance counter stats for 'perf sched lat' (10 runs):
      
                129.117838 task-clock                #    0.994 CPUs utilized            ( +-  0.28% )
                        14 context-switches          #    0.111 K/sec                    ( +-  2.10% )
                         0 cpu-migrations            #    0.002 K/sec                    ( +- 66.67% )
                     7,654 page-faults               #    0.059 M/sec                    ( +-  0.67% )
               438,121,661 cycles                    #    3.393 GHz                      ( +-  0.06% ) [83.06%]
               150,808,605 stalled-cycles-frontend   #   34.42% frontend cycles idle     ( +-  0.14% ) [83.10%]
                80,748,941 stalled-cycles-backend    #   18.43% backend  cycles idle     ( +-  0.64% ) [66.73%]
               758,605,879 instructions              #    1.73  insns per cycle
                                                     #    0.20  stalled cycles per insn  ( +-  0.08% ) [83.54%]
               162,164,321 branches                  # 1255.940 M/sec                    ( +-  0.10% ) [83.70%]
                 1,609,903 branch-misses             #    0.99% of all branches          ( +-  0.08% ) [83.62%]
      
               0.129949153 seconds time elapsed                                          ( +-  0.28% )
      
      After:
      
        [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null
      
         Performance counter stats for 'perf sched lat' (10 runs):
      
                103.592215 task-clock                #    0.993 CPUs utilized            ( +-  0.33% )
                        12 context-switches          #    0.114 K/sec                    ( +-  3.29% )
                         0 cpu-migrations            #    0.000 K/sec
                     7,605 page-faults               #    0.073 M/sec                    ( +-  0.00% )
               345,796,112 cycles                    #    3.338 GHz                      ( +-  0.07% ) [82.90%]
               106,876,796 stalled-cycles-frontend   #   30.91% frontend cycles idle     ( +-  0.38% ) [83.23%]
                62,060,877 stalled-cycles-backend    #   17.95% backend  cycles idle     ( +-  0.80% ) [67.14%]
               628,246,586 instructions              #    1.82  insns per cycle
                                                     #    0.17  stalled cycles per insn  ( +-  0.04% ) [83.64%]
               134,962,057 branches                  # 1302.820 M/sec                    ( +-  0.10% ) [83.64%]
                 1,233,037 branch-misses             #    0.91% of all branches          ( +-  0.29% ) [83.41%]
      
               0.104333272 seconds time elapsed                                          ( +-  0.33% )
      
        [root@sandy ~]#
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-weu9t63zkrfrazkn0gxj48xy@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Introduce perf_evsel__{str,int}val methods · 5555ded4
      Arnaldo Carvalho de Melo committed
      Wrappers to the libtraceevent routines, so that we can further reduce
      the surface contact perf builtins have with it.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-rtmgzptvrifzjxqwb9vs6g1b@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf sched: Use perf_tool as ancestor · 0e9b07e5
      Arnaldo Carvalho de Melo committed
      So that we can remove all the globals.
      
      Before:
      
         text	   data	    bss	    dec	    hex	filename
      1586833	 110368	1438600	3135801	 2fd939	/tmp/oldperf
      
      After:
      
         text	   data	    bss	    dec	    hex	filename
      1629329	  93568	 848328	2571225	 273bd9	/root/bin/perf
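
A rough sketch of the "ancestor" pattern, under stated assumptions: the tool-specific struct embeds the generic tool struct, and callbacks recover the outer struct from the pointer they are handed, so state no longer needs to live in globals. struct tool_sketch and struct sched_sketch are hypothetical stand-ins, not perf's actual types; container_of is defined locally in the same spirit as the kernel macro of that name.

```c
#include <assert.h>
#include <stddef.h>

/* Recover the enclosing struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down stand-in for a generic tool descriptor. */
struct tool_sketch {
	int (*sample)(struct tool_sketch *tool, int pid);
};

/* Tool-specific state that would otherwise sit in file-scope globals. */
struct sched_sketch {
	struct tool_sketch tool;	/* the embedded "ancestor" */
	unsigned long nr_samples;
};

/* Callback: only the generic pointer is passed in, the rest is derived. */
static int sched_sample(struct tool_sketch *tool, int pid)
{
	struct sched_sketch *sched;

	assert(tool != NULL);
	sched = container_of(tool, struct sched_sketch, tool);
	(void)pid;		/* unused in this sketch */
	sched->nr_samples++;
	return 0;
}
```

A caller would fill in .tool.sample = sched_sample on a struct sched_sketch instance and pass &instance.tool to the generic code.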
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-oph40vikij0crjz4eyapneov@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf sched: Remove unused thread parameter · 4218e673
      Arnaldo Carvalho de Melo committed
      From the tracepoint handling routines.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-mcqd9mv34z6he0wqiz4a3mh9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  7. 11 Sep, 2012 6 commits
    • perf tools: Use __maybe_unused for unused variables · 1d037ca1
      Irina Tirdea committed
      perf defines both __used and __unused variables to use for marking
      unused variables. The variable __used is defined to
      __attribute__((__unused__)), which contradicts the kernel definition to
      __attribute__((__used__)) for new gcc versions. On Android, __used is
      also defined in system headers and this leads to warnings like: warning:
      '__used__' attribute ignored
      
      __unused is not defined in the kernel and is not a standard definition.
      If __unused is included everywhere instead of __used, this leads to
      conflicts with glibc headers, since glibc has variables with this name
      in its headers.
      
      The best approach is to use __maybe_unused, the definition used in the
      kernel for __attribute__((unused)). In this way there is only one
      definition in perf sources (instead of 2 definitions that point to the
      same thing: __used and __unused) and it works on both Linux and Android.
      This patch simply replaces all instances of __used and __unused with
      __maybe_unused.
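
As a minimal sketch of what the replacement looks like in use, assuming a GCC-compatible compiler: __maybe_unused expands to __attribute__((unused)), as stated above, and is attached to a parameter the function does not touch. count_event and its parameters are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Single definition, matching the kernel's spelling for the attribute. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

/*
 * Hypothetical callback: the signature is dictated by the caller, but this
 * implementation has no use for 'ctx', so the parameter is annotated rather
 * than left to trigger an unused-parameter warning.
 */
static int count_event(void *ctx __maybe_unused, int *counter)
{
	assert(counter != NULL);
	return ++(*counter);
}
```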
      Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
      [ committer note: fixed up conflict with a116e05d in builtin-sched.c ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Back [vdso] DSO with real data · 7dbf4dcf
      Jiri Olsa committed
      Storing data for VDSO shared object, because we need it for the post
      unwind processing.
      
      The VDSO shared object is the same for all processes on a running system, so
      it makes no difference when we store it inside the tracer - perf.
      
      When [vdso] map memory is hit, we retrieve [vdso] DSO image and store it
      into temporary file.
      
      During the build-id processing phase, the [vdso] DSO image is stored in
      build-id db, and build-id reference is made inside perf.data. The
      build-id vdso file object is called '[vdso]'. We don't use temporary
      file name which gets removed when record is finished.
      
      During report phase the vdso build-id object is treated as any other
      build-id DSO object.
      
      Adding following API for vdso object:
      
        bool is_vdso_map(const char *filename)
          - returns true if the filename matches vdso map name
      
        struct dso *vdso__dso_findnew(struct list_head *head)
          - find/create proper vdso DSO object
      
        vdso__exit(void)
          - removes temporary VDSO image if there's any
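
The first entry of the API above can be sketched as a plain string comparison. This is an assumption about the shape of the check, not the patch's code: the map name '[vdso]' is taken from the description above, and the _sketch suffix marks the names as hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Map name as given in the commit message. */
#define VDSO_MAP_NAME_SKETCH "[vdso]"

/* Returns true if the filename matches the vdso map name. */
static bool is_vdso_map_sketch(const char *filename)
{
	assert(filename != NULL);
	return strcmp(filename, VDSO_MAP_NAME_SKETCH) == 0;
}
```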
      
      This change makes backtrace dwarf post unwind possible from [vdso] maps.
      
      Following output is current report of [vdso] sample dwarf backtrace:
      
        # Overhead  Command      Shared Object                         Symbol
        # ........  .......  .................  .............................
        #
            99.52%       ex  [vdso]             [.] 0x00007fff3ace89af
                         |
                         --- 0x7fff3ace89af
      
      Following output is new report of [vdso] sample dwarf backtrace:
      
        # Overhead  Command      Shared Object                         Symbol
        # ........  .......  .................  .............................
        #
            99.52%       ex  [vdso]             [.] 0x00000000000009af
                         |
                         --- 0x7fff3ace89af
                             main
                             __libc_start_main
                             _start
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1347295819-23177-5-git-send-email-jolsa@redhat.com
      [ committer note: s/ALIGN/PERF_ALIGN/g to cope with the android build changes ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf symbols: Make dsos__find function globally available · 1c4be9ff
      Jiri Olsa committed
      Changing dsos__find function from static to be globally available.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1347295819-23177-4-git-send-email-jolsa@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add memdup function · b232e073
      Jiri Olsa committed
      Adding memdup function to duplicate region of memory.
      
        void *memdup(const void *src, size_t len)
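
A function with this signature is typically an allocate-and-copy helper. The following is a sketch under that assumption, not the patch's implementation; it is named memdup_sketch so it cannot clash with a real memdup.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate 'len' bytes starting at 'src' into freshly allocated memory. */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p;

	assert(src != NULL);
	p = malloc(len);
	if (p)
		memcpy(p, src, len);
	return p;	/* NULL on allocation failure; the caller frees */
}
```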
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1347295819-23177-3-git-send-email-jolsa@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Do backtrace post unwind only if regs and stack were captured · bdde3716
      Jiri Olsa committed
      Bail out without error if we want to do backtrace post unwind, but were
      not able to capture user registers or user stack during the record
      phase, which is possible and valid case.
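
The bail-out described above might look roughly like this. Everything here is hypothetical: struct sample_sketch is a simplified stand-in for a recorded sample, and the unwinder itself is reduced to a placeholder. The point shown is only that missing registers or stack yield an empty result with a success return code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified sample record; not perf's own sample struct. */
struct sample_sketch {
	const void *user_regs;	/* NULL if registers were not captured */
	const void *user_stack;	/* NULL if the stack was not captured */
	size_t stack_size;
};

/* Returns 0 on success, including the "nothing to unwind" case. */
static int unwind_sketch(const struct sample_sketch *s, int *nr_frames)
{
	assert(s != NULL && nr_frames != NULL);
	*nr_frames = 0;

	/* Missing inputs are a valid case: skip unwinding, report no error. */
	if (!s->user_regs || !s->user_stack || s->stack_size == 0)
		return 0;

	*nr_frames = 1;	/* placeholder for the real DWARF unwinder */
	return 0;
}
```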
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1347295819-23177-2-git-send-email-jolsa@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: fix ALIGN redefinition in system headers · 9ac3e487
      Irina Tirdea committed
      On some systems (e.g. Android), ALIGN is defined in system headers as
      ALIGN(p).  The definition of ALIGN used in perf takes 2 parameters:
      ALIGN(x,a).  This leads to redefinition conflicts.
      
      Redefinition error on Android:
      In file included from util/include/linux/list.h:1:0,
      from util/callchain.h:5,
      from util/hist.h:6,
      from util/session.h:4,
      from util/build-id.h:4,
      from util/annotate.c:11:
      util/include/linux/kernel.h:11:0: error: "ALIGN" redefined [-Werror]
      bionic/libc/include/sys/param.h:38:0: note: this is the location of
      the previous definition
      
      Conflicts with system defined ALIGN in Android:
      util/event.c: In function 'perf_event__synthesize_comm':
      util/event.c:115:32: error: macro "ALIGN" passed 2 arguments, but takes just 1
      util/event.c:115:9: error: 'ALIGN' undeclared (first use in this function)
      util/event.c:115:9: note: each undeclared identifier is reported only once for
      each function it appears in
      
      In order to avoid this redefinition, ALIGN is renamed to PERF_ALIGN.
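
A sketch of a two-argument alignment macro under a perf-specific name, which is what keeps it clear of a one-argument ALIGN(p) from system headers. The macro body is a common round-up idiom and assumes the alignment is a power of two; it is named PERF_ALIGN_SKETCH because the exact definition used by the patch is not shown here. padded_len is a hypothetical caller.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of a; a must be a power of two. */
#define PERF_ALIGN_SKETCH(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* Hypothetical caller: pad a length to a multiple of sizeof(unsigned long long). */
static size_t padded_len(size_t len)
{
	size_t out = PERF_ALIGN_SKETCH(len, sizeof(unsigned long long));

	assert(out >= len && out % sizeof(unsigned long long) == 0);
	return out;
}
```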
      Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Irina Tirdea <irina.tirdea@intel.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1347315303-29906-5-git-send-email-irina.tirdea@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>