1. 21 5月, 2010 18 次提交
  2. 20 5月, 2010 8 次提交
    • A
      perf annotate: Use build-ids to find the right DSO · b36f19d5
      Arnaldo Carvalho de Melo 提交于
      We were still using the pathname found on the MMAP event, that could not
      be the one we used when recording, so use the build-id cache for that,
      only falling back to use the pathname in the MMAP event if no build-ids
      are available.
      
      With this we now also are able to do secure, seamless offline annotation.
      
      Example:
      
      [root@doppio linux-2.6-tip]# perf report -g none -v 2> /dev/null | head -10
           8.12%     Xorg  /usr/lib64/libpixman-1.so.0.14.0       0x0000000000026d02 B [.] pixman_rasterize_edges
           4.68%  firefox  /usr/lib64/xulrunner-1.9.1/libxul.so   0x00000000005dbdba B [.] 0x000000005dbdba
           3.70%  swapper  /lib/modules/2.6.34-rc6/build/vmlinux  0xffffffff81022cea ! [k] read_hpet
           2.96%     init  /lib/modules/2.6.34-rc6/build/vmlinux  0xffffffff81022cea ! [k] read_hpet
           2.73%  swapper  /lib/modules/2.6.34-rc6/build/vmlinux  0xffffffff8100a738 ! [k] mwait_idle_with_hints
      [root@doppio linux-2.6-tip]# perf annotate -v pixman_rasterize_edges 2>&1 | grep Executing
      Executing: objdump --start-address=0x000000371ce26670 --stop-address=0x000000371ce2709f -dS /root/.debug/.build-id/bd/6ac5199137aaeb279f864717d8d061477466c1|grep -v /root/.debug/.build-id/bd/6ac5199137aaeb279f864717d8d061477466c1|expand
      [root@doppio linux-2.6-tip]# perf buildid-list | grep libpixman-1.so.0.14.0
      bd6ac5199137aaeb279f864717d8d061477466c1 /usr/lib64/libpixman-1.so.0.14.0
      [root@doppio linux-2.6-tip]#
      Reported-by: NStephane Eranian <eranian@google.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b36f19d5
    • A
      perf TUI: Make 'space' be an alias to 'PgDn' · 17930b40
      Arnaldo Carvalho de Melo 提交于
      Just like if one is using the stdio based pager, or more/less, for that
      matter.
      Suggested-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      17930b40
    • I
      Merge branch 'perf/urgent' of... · dfacc4d6
      Ingo Molnar 提交于
      Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
      dfacc4d6
    • F
      perf: Fix unaligned accesses while fetching trace values · 85cb68b2
      Frederic Weisbecker 提交于
      Accessing trace values of an 8 size may end up in a segfault
      on archs that can't deal with misaligned access, which is the
      case for sparc 64. This is because PERF_SAMPLE_RAW are aligned
      to 4 and not to 8.
      
      Fix this on the macros that get the values of 8 size.
      
      This fixes segfaults on perf tools in sparc 64.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: David Miller <davem@davemloft.net>
      85cb68b2
    • F
      perf: Comply with new rcu checks API · 49f135ed
      Frederic Weisbecker 提交于
      The software events hlist doesn't fully comply with the new
      rcu checks api.
      
      We need to consider three different sides that access the hlist:
      
      - the hlist allocation/release side. This side happens when an
        events is created or released, accesses to the hlist are
        serialized under the cpuctx mutex.
      
      - the events insertion/removal in the hlist. This side is always
        serialized against the above one. The hlist is always present
        during such operations. This side happens when a software event
        is scheduled in/out. The serialization that ensures the software
        event is really attached to the context is made under the
        ctx->lock.
      
      - events triggering. This is the read side, it can happen
        concurrently with any update side.
      
      This patch deals with them one by one and anticipates with the
      separate rcu mem space patches in preparation.
      
      This patch fixes various annoying rcu warnings.
      Reported-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      49f135ed
    • T
      perf: Use read() instead of lseek() in trace_event_read.c:skip() · cbb5cf7f
      Tom Zanussi 提交于
      This is a small fix for a problem affecting live-mode, introduced
      recently:
      
      root@tropicana:~# perf trace rwtop
      perf trace started with Perl
      script /root/libexec/perf-core/scripts/perl/rwtop.pl
      
        Fatal: did not read header event
      
      commit d00a47cc added a skip()
      function to skip over e.g. header_page, but this doesn't work for
      live mode.  This patch re-implements skip() to use read() instead of
      lseek() to fix that.
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1273032130.6383.28.camel@tropicana>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      cbb5cf7f
    • A
      perf session: Make read_build_id routines look at the host_machine too · f869097e
      Arnaldo Carvalho de Melo 提交于
      The changes made to support host and guest machines in a session, that
      started when the 'perf kvm' tool was introduced ended up introducing a
      bug where the host_machine was not having its DSOs traversed for
      build-id processing.
      
      Fix it by moving some methods to the right classes and considering the
      host_machine when processing build-ids.
      Reported-by: NTom Zanussi <tzanussi@gmail.com>
      Reported-by: NStephane Eranian <eranian@google.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f869097e
    • A
      perf symbols: Don't try to read the build-id twice · f6e1467d
      Arnaldo Carvalho de Melo 提交于
      In __dsos__read_build_ids if the dso already had its build-id read,
      don't try again.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f6e1467d
  3. 19 5月, 2010 14 次提交
    • C
      perf, x86: P4 PMU -- add missing bit in CCCR mask · ce7f1545
      Cyrill Gorcunov 提交于
      Should be there for the sake of RAW events.
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      CC: Lin Ming <ming.m.lin@intel.com>
      CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100518212439.354345151@openvz.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ce7f1545
    • C
      perf, x86: P4_pmu_schedule_events -- use smp_processor_id instead of raw_ · 9d36dfcf
      Cyrill Gorcunov 提交于
      This snippet somehow escaped the commit:
      
       | commit 137351e0
       | Author: Cyrill Gorcunov <gorcunov@openvz.org>
       | Date:   Sat May 8 15:25:52 2010 +0400
       |
       |    x86, perf: P4 PMU -- protect sensible procedures from preemption
      
      so bring it eventually back. It helps to catch
      preemption issue (if there will be, rule of thumb --
      don't use raw_ if you can).
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100518212439.167259349@openvz.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9d36dfcf
    • C
      perf, x86: P4 PMU -- do a real check for ESCR address being in hash · 623aab89
      Cyrill Gorcunov 提交于
      To prevent from clashes in future code modifications
      do a real check for ESCR address being in hash. At
      moment the callers are known to pass sane values but
      better to be on a safe side.
      
      And comment fix.
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      CC: Lin Ming <ming.m.lin@intel.com>
      CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100518212439.004503600@openvz.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      623aab89
    • A
      perf tools: remove xstrndup, xmalloc, xzalloc · 151f85a4
      Arnaldo Carvalho de Melo 提交于
      All the functions that call this can handle the equivalent, non
      panic'ing wrapped routines.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      151f85a4
    • A
      perf probe: Don't call die() · 8a7ddad8
      Arnaldo Carvalho de Melo 提交于
      Functions that were calling xzalloc also returned -1 when, for other
      reasons, it could fail, and the calleds are coping with failures, so
      stop using die() and xzalloc().
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8a7ddad8
    • A
      perf probe: Fix some error exit paths · b448c4b6
      Arnaldo Carvalho de Melo 提交于
      That could leave filedescriptors open and leak memory. Also stop using
      xmalloc, use malloc and handle results just like other error cases in
      the same routine that used it.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b448c4b6
    • A
      perf tools: Remove some unused functions · a41794cd
      Arnaldo Carvalho de Melo 提交于
      Without the bloated cplus_demangle from binutils, i.e building with:
      
      $ make NO_DEMANGLE=1 O=~acme/git/build/perf -j3 -C tools/perf/ install
      
      Before:
      
         text	   data	    bss	    dec	    hex	filename
       471851	  29280	4025056	4526187	 45106b	/home/acme/bin/perf
      
      After:
      
      [acme@doppio linux-2.6-tip]$ size ~/bin/perf
         text	   data	    bss	    dec	    hex	filename
       446886	  29232	4008576	4484694	 446e56	/home/acme/bin/perf
      
      So its a 5.3% size reduction in code, but the interesting part is in the git
      diff --stat output:
      
       19 files changed, 20 insertions(+), 1909 deletions(-)
      
      If we ever need some of the things we got from git but weren't using, we just
      have to go to the git repo and get fresh, uptodate source code bits.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a41794cd
    • S
      perf stat: add perf stat -B to pretty print large numbers · 5af52b51
      Stephane Eranian 提交于
      It is hard to read very large numbers so provide an option to perf stat
      to separate thousands using a separator. The patch leverages the locale
      support of stdio. You need to set your LC_NUMERIC appropriately, for
      instance LC_NUMERIC=en_US.UTF8. You need to pass -B to activate this
      feature. This way existing scripts parsing the output do not need to be
      changed. Here is an example.
      
      $ perf stat noploop 2
      noploop for 2 seconds
      
       Performance counter stats for 'noploop 2':
      
              1998.347031  task-clock-msecs         #      0.998 CPUs
                       61  context-switches         #      0.000 M/sec
                        0  CPU-migrations           #      0.000 M/sec
                      118  page-faults              #      0.000 M/sec
            4,138,410,900  cycles                   #   2070.917 M/sec  (scaled from 70.01%)
            2,062,650,268  instructions             #      0.498 IPC    (scaled from 70.01%)
            2,057,653,466  branches                 #   1029.678 M/sec  (scaled from 70.01%)
                   40,267  branch-misses            #      0.002 %      (scaled from 30.04%)
            2,055,961,348  cache-references         #   1028.831 M/sec  (scaled from 30.03%)
                   53,725  cache-misses             #      0.027 M/sec  (scaled from 30.02%)
      
              2.001393933  seconds time elapsed
      
      $ perf stat -B  noploop 2
      noploop for 2 seconds
      
       Performance counter stats for 'noploop 2':
      
              1998.297883  task-clock-msecs         #      0.998 CPUs
                       59  context-switches         #      0.000 M/sec
                        0  CPU-migrations           #      0.000 M/sec
                      119  page-faults              #      0.000 M/sec
            4,131,380,160  cycles                   #   2067.450 M/sec  (scaled from 70.01%)
            2,059,096,507  instructions             #      0.498 IPC    (scaled from 70.01%)
            2,054,681,303  branches                 #   1028.216 M/sec  (scaled from 70.01%)
                   25,650  branch-misses            #      0.001 %      (scaled from 30.05%)
            2,056,283,014  cache-references         #   1029.017 M/sec  (scaled from 30.03%)
                   47,097  cache-misses             #      0.024 M/sec  (scaled from 30.02%)
      
              2.001391016  seconds time elapsed
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4bf28fe8.914ed80a.01ca.fffff5f5@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5af52b51
    • L
      perf, sparc: Implement group scheduling transactional APIs · a13c3afd
      Lin Ming 提交于
      Convert to the transactional PMU API and remove the duplication of
      group_sched_in().
      
      [cross build only]
      Signed-off-by: NLin Ming <ming.m.lin@intel.com>
      Acked-by: NDavid Miller <davem@davemloft.net>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1272002193.5707.65.camel@minggr.sh.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a13c3afd
    • L
      Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip · 537b60d1
      Linus Torvalds 提交于
      * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        x86, UV: uv_irq.c: Fix all sparse warnings
        x86, UV: Improve BAU performance and error recovery
        x86, UV: Delete unneeded boot messages
        x86, UV: Clean up UV headers for MMR definitions
      537b60d1
    • P
      perf: Optimize perf_output_*() by avoiding local_xchg() · 6d1acfd5
      Peter Zijlstra 提交于
      Since the x86 XCHG ins implies LOCK, avoid the use by
      using a sequence count instead.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6d1acfd5
    • P
      perf: Optimize the hotpath by converting the perf output buffer to local_t · fa588151
      Peter Zijlstra 提交于
      Since there is now only a single writer, we can use
      local_t instead and avoid all these pesky LOCK insn.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fa588151
    • P
      perf: Optimize the perf_output() path by removing IRQ-disables · ef60777c
      Peter Zijlstra 提交于
      Since we can now assume there is only a single writer
      to each buffer, we can remove per-cpu lock thingy and
      use a simply nest-count to the same effect.
      
      This removes the need to disable IRQs.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ef60777c
    • P
      perf: Disallow mmap() on per-task inherited events · c7920614
      Peter Zijlstra 提交于
      Since we now have working per-task-per-cpu events for
      a while, disallow mmap() on per-task inherited
      events. Those things were a performance problem
      anyway, and doing away with it allows us to optimize
      the buffer somewhat by assuming there is only a
      single writer.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c7920614