1. 17 6月, 2010 3 次提交
    • S
      perf record: Add option to avoid updating buildid cache · a1ac1d3c
      Stephane Eranian 提交于
      There are situations where there is enough information in the perf.data
      to process the samples. Updating the buildid cache may add unecessary
      overhead in terms of disk space and time (copying large elf images).
      
      A persistent option to do this already exists via the perfconfig file,
      simply do:
      
      [buildid]
      dir = /dev/null
      
      This patch provides a way to suppress builid cache updates on a per-run
      basis.  It addds a new option, -N, to perf record. Buildids are still
      generated in the perf.data file.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <4c19ef89.93ecd80a.40dc.fffff8e9@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a1ac1d3c
    • E
      perf symbols: Function descriptor symbol lookup · 70c3856b
      Eric B Munson 提交于
      Currently symbol resolution does not work for 64-bit programs on architectures
      that use function descriptors such as ppc64.
      
      The problem is that a symbol doesn't point to a text address, it points to a
      data area that contains (amongst other things) a pointer to the text address.
      
      We look for a section called ".opd" which is the function descriptor area. To
      create the full symbol table, when we see a symbol in the function descriptor
      section we load the first pointer and use that as the text address.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1276523793-15422-1-git-send-email-ebmunson@us.ibm.com>
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NEric B Munson <ebmunson@us.ibm.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      70c3856b
    • S
      perf record: Avoid synthesizing mmap() for all processes in per-thread mode · cf103a14
      Stephane Eranian 提交于
      A bug was introduced by commit c45c6ea2.
      
      Perf record was scanning /proc/PID to create synthetic PERF_RECOR_MMAP
      entries even though it was running in per-thread mode. There was a bogus
      check to select what mmaps to synthesize. We only need all processes in
      system-wide mode.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <4c192107.4f1ee30a.4316.fffff98e@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cf103a14
  2. 10 6月, 2010 2 次提交
    • A
      perf tools: Reorganize the Makefile feature tests · f9af3a4c
      Arnaldo Carvalho de Melo 提交于
      Moving the tests to a separate file, feature-tests.mak and using a try-cc
      function similar to the try-run in Kbuild.
      
      This also makes the output more quiet as we can stop using the INTERMEDIATE
      target to remove the .perf.dev.null file needed for some gcc versions where
      /dev/null can't be used as the output file name.
      
      As the tests get shorter by uninlining the source code used to test for
      features, we can more properly use identation.
      
      The feature tests itself can be made more clear and reused, like when trying to
      see what is needed to have bfd_demangle.
      
      We also get a bit closer to reusing scripts/Kbuild.include, reducing the
      distance from the kernel build system.
      
      Tests performed:
      
      [root@emilia perf]# make -j9 O=/tmp/perf
      PERF_VERSION = 0.0.2.PERF
          GEN /tmp/perf/common-cmds.h
          * new build flags or prefix
          GEN perf-archive
          CC /tmp/perf/builtin-annotate.o
          CC /tmp/perf/bench/sched-messaging.o
          CC /tmp/perf/builtin-diff.o
      <SNIP>
          CC /tmp/perf/scripts/python/Perf-Trace-Util/Context.o
          CC /tmp/perf/perf.o
          CC /tmp/perf/builtin-help.o
          AR /tmp/perf/libperf.a
          LINK /tmp/perf/perf
      [root@emilia perf]#
      
      If we uninstall, for instance newt-devel we get:
      
      [root@emilia perf]# rpm -e newt-devel
      [root@emilia perf]# make -j9 O=/tmp/perf
      Makefile:564: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
          * new build flags or prefix
          GEN perf-archive
          CC /tmp/perf/perf.o
          CC /tmp/perf/builtin-annotate.o
      <SNIP>
          AR /tmp/perf/libperf.a
          LINK /tmp/perf/perf
      [root@emilia perf]#
      
      And then binutils-devel:
      
      [root@emilia perf]# make -j9 O=/tmp/perf
      Makefile:564: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
      Makefile:632: No bfd.h/libbfd found, install binutils-dev[el]/zlib-static to gain symbol demangling
          * new build flags or prefix
          GEN perf-archive
          CC /tmp/perf/perf.o
      <SNIP>
          AR /tmp/perf/libperf.a
          LINK /tmp/perf/perf
      [root@emilia perf]#
      
      And then strictly required devel packages:
      
      [root@emilia perf]# rpm -e elfutils-libelf-devel elfutils-devel
      [root@emilia perf]# make -j9 O=/tmp/perf
      Makefile:509: No libdw.h found or old libdw.h found or elfutils is older than 0.138, disables dwarf support. Please install new elfutils-devel/libdw-dev
      Makefile:542: *** No libelf.h/libelf found, please install libelf-dev/elfutils-libelf-devel.  Stop.
      [root@emilia perf]#
      
      After installing everything back on:
      
      [root@emilia perf]# yum install elfutils-devel binutils-devel newt-devel
      <SNIP>
      Installed:
        binutils-devel.x86_64 0:2.20.51.0.2-5.11.el6
        elfutils-devel.x86_64 0:0.147-1.el6
        elfutils-libelf-devel.x86_64 0:0.147-1.el6
        newt-devel.x86_64 0:0.52.11-1.el6
      
      Complete!
      [root@emilia perf]# make -j9
      PERF_VERSION = 0.0.2.PERF
          GEN common-cmds.h
          * new build flags or prefix
          GEN perf-archive
          CC builtin-annotate.o
      <SNIP>
          AR libperf.a
          LINK perf
      [root@emilia perf]# make -j9
      [root@emilia perf]#
      
      Thanks to Sam for pointing me to try-run.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f9af3a4c
    • I
      Merge branch 'perf/core' of... · c726b61c
      Ingo Molnar 提交于
      Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
      c726b61c
  3. 09 6月, 2010 23 次提交
  4. 07 6月, 2010 2 次提交
  5. 05 6月, 2010 7 次提交
    • A
      perf report: Implement --sort cpu · f60f3593
      Arun Sharma 提交于
      In a shared multi-core environment, users want to analyze why their
      program was slow. In particular, if the code ran slower only on certain
      CPUs due to interference from other programs or kernel threads, the user
      should be able to notice that.
      
      Sample usage:
      
      perf record -f -a -- sleep 3
      perf report --sort cpu,comm
      
      Workload:
      
      program is running on 16 CPUs
      Experiencing interference from an antagonist only on 4 CPUs.
      
        Samples: 106218177676 cycles
      
        Overhead  CPU          Command
        ........  ...  ...............
      
           6.25%  2            program
           6.24%  6            program
           6.24%  11           program
           6.24%  5            program
           6.24%  9            program
           6.24%  10           program
           6.23%  15           program
           6.23%  7            program
           6.23%  3            program
           6.23%  14           program
           6.22%  1            program
           6.20%  13           program
           3.17%  12           program
           3.15%  8            program
           3.14%  0            program
           3.13%  4            program
           3.11%  4         antagonist
           3.11%  0         antagonist
           3.10%  8         antagonist
           3.07%  12        antagonist
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100505181612.GA5091@sharma-home.net>
      Signed-off-by: NArun Sharma <aruns@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f60f3593
    • A
      perf tools: Make event__preprocess_sample parse the sample · 41a37e20
      Arnaldo Carvalho de Melo 提交于
      Simplifying the tools that were using both in sequence and allowing
      upcoming simplifications, such as Arun's patch to sort by cpus.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      41a37e20
    • S
      perf annotate: Ask objdump to demangle symbols · 45d8e802
      Stephane Eranian 提交于
      Perf report is demangling symbols but not annotate.
      
      The former uses internal demangling via libbdf or libiberty. The latter
      executes objdump which by default does not demangle symbols.
      
      This patch adds the -C option to the objdump cmdline to enable symbol
      demangling.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4c07b323.2126e30a.6245.0e1e@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      45d8e802
    • S
      perf buildid: add perfconfig option to specify buildid cache dir · 45de34bb
      Stephane Eranian 提交于
      This patch adds the ability to specify an alternate directory to store the
      buildid cache (buildids, copy of binaries). By default, it is hardcoded to
      $HOME/.debug. This directory contains immutable data. The layout of the
      directory is such that no conflicts in filenames are possible. A modification
      in a file, yields a different buildid and thus a different location in the
      subdir hierarchy.
      
      You may want to put the buildid cache elsewhere because of disk space
      limitation or simply to share the cache between users. It is also useful for
      remote collect vs. local analysis of profiles.
      
      This patch adds a new config option to the perfconfig file.  Under the tag
      'buildid', there is a dir option. For instance, if you have:
      
      $ cat /etc/perfconfig
      [buildid]
      dir = /var/cache/perf-buildid
      
      All buildids and binaries are be saved in the directory specified. The perf
      record, buildid-list, buildid-cache, report, annotate, and archive commands
      will it to pull information out.
      
      The option can be set in the system-wide perfconfig file or in the
      $HOME/.perfconfig file.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4c055fb7.df0ce30a.5f0d.ffffae52@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      45de34bb
    • A
      perf tools: Make target to generate self contained source tarball · 8e5564e6
      Arnaldo Carvalho de Melo 提交于
      Useful for when people want to try some version of the perf tools and don't
      wants to download the kernel tarball.
      
      Here is a session using this new target:
      
        [root@emilia linux-2.6-tip]# make help | grep -i perf
          perf-tar-src-pkg    - Build perf-2.6.35-rc1.tar source tarball
          perf-targz-src-pkg  - Build perf-2.6.35-rc1.tar.gz source tarball
          perf-tarbz2-src-pkg - Build perf-2.6.35-rc1.tar.bz2 source tarball
        [root@emilia linux-2.6-tip]# make perf-tarbz2-src-pkg
          TAR
        [root@emilia linux-2.6-tip]# ls -la perf-2.6.35-rc1.tar.bz2
        -rw-r--r-- 1 root root 295731 May 31 11:18 perf-2.6.35-rc1.tar.bz2
        [root@emilia linux-2.6-tip]# tar xf perf-2.6.35-rc1.tar.bz2
        [root@emilia linux-2.6-tip]# cd perf-2.6.35-rc1
        [root@emilia perf-2.6.35-rc1]# ls
        arch  HEAD  include  lib  tools
        [root@emilia perf-2.6.35-rc1]# cd tools/perf
        [root@emilia perf]# make -j9 2>&1 | tail
            CC arch/x86/util/dwarf-regs.o
            CC util/probe-finder.o
            CC util/newt.o
            CC util/scripting-engines/trace-event-perl.o
            CC scripts/perl/Perf-Trace-Util/Context.o
            CC perf.o
            CC builtin-help.o
            AR libperf.a
            LINK perf
        rm .perf.dev.null
        [root@emilia perf]# ./perf record -a sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.262 MB perf.data (~11457 samples) ]
        [root@emilia perf]# ./perf report | head -12
        # Events: 6K cycles
        #
        # Overhead          Command       Shared Object  Symbol
        # ........  ...............  ..................  ......
        #
             4.73%             perf  [kernel.kallsyms]   [k] format_decode
             4.49%             perf  libc-2.12.so        [.] _IO_file_underflow_internal
             4.38%             init  [kernel.kallsyms]   [k] mwait_idle
             3.29%             perf  [kernel.kallsyms]   [k] vsnprintf
             2.38%             init  [kernel.kallsyms]   [k] sched_clock_local
             2.35%             init  [kernel.kallsyms]   [k] apic_timer_interrupt
             1.86%     sirq-timer/5  [kernel.kallsyms]   [k] find_busiest_group
        [root@emilia perf]#
      Acked-by: NMichal Marek <mmarek@suse.cz>
      Acked-by: NSam Ravnborg <sam@ravnborg.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100528185357.GA28009@ghostprotocols.net>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8e5564e6
    • S
      perf tools: Add the ability to specify list of cpus to monitor · c45c6ea2
      Stephane Eranian 提交于
      This patch adds a -C option to stat, record, top to designate a list of CPUs to
      monitor. CPUs can be specified as a comma-separated list or ranges, no space
      allowed.
      
      Examples:
      $ perf record -a -C0-1,4-7 sleep 1
      $ perf top -C0-4
      $ perf stat -a -C1,2,3,4 sleep 1
      
      With perf record in per-thread mode with inherit mode on, samples are collected
      only when the thread runs on the designated CPUs.
      
      The -C option does not turn on system-wide mode automatically.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4bff9496.d345d80a.41fe.7b00@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c45c6ea2
    • S
      perf report: Make -D print sampled CPU · 761844b9
      Stephane Eranian 提交于
      It is useful to know on which CPU a sample was captured on.
      The information is captured with perf record -R but it was
      not printed out by perf report -D. This patch adds this.
      
      When -R is not used, cpu is set to -1to indicate that
      the CPU is unknown (it is not captured).
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4bff964c.e88cd80a.3106.7d31@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      761844b9
  6. 04 6月, 2010 3 次提交
    • I
      Merge branch 'perf/urgent' of... · 58cc1a9e
      Ingo Molnar 提交于
      Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/urgent
      58cc1a9e
    • A
      perf symbols: Set the DSO long name when using symbol_conf.vmlinux_name · e7dadc00
      Arnaldo Carvalho de Melo 提交于
      We need to set the long name to the name specified via, for instance,
      'perf annotate --vmlinux /path/to/vmlinux', if not it will remain as
      '[kernel.kallsyms]' and that will make annotate fail when passing this
      as the vmlinux name in the call to objdump.
      
      The way this is setup grew unwieldly and dso__load_vmlinux is the
      function that should allocate space for the long name, with callers not
      assuming that filenames should be allocated somehow by then (strdup,
      dso__build_id_filename, etc).
      
      For now this is the minimalistic patch, a proper fix for .36 will be
      made.
      Reported-by: NStephane Eranian <eranian@google.com>
      Tested-by: NStephane Eranian <eranian@google.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100604003900.GD10469@ghostprotocols.net>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e7dadc00
    • S
      tracing: Remove ftrace_preempt_disable/enable · 5168ae50
      Steven Rostedt 提交于
      The ftrace_preempt_disable/enable functions were to address a
      recursive race caused by the function tracer. The function tracer
      traces all functions which makes it easily susceptible to recursion.
      One area was preempt_enable(). This would call the scheduler and
      the schedulre would call the function tracer and loop.
      (So was it thought).
      
      The ftrace_preempt_disable/enable was made to protect against recursion
      inside the scheduler by storing the NEED_RESCHED flag. If it was
      set before the ftrace_preempt_disable() it would not call schedule
      on ftrace_preempt_enable(), thinking that if it was set before then
      it would have already scheduled unless it was already in the scheduler.
      
      This worked fine except in the case of SMP, where another task would set
      the NEED_RESCHED flag for a task on another CPU, and then kick off an
      IPI to trigger it. This could cause the NEED_RESCHED to be saved at
      ftrace_preempt_disable() but the IPI to arrive in the the preempt
      disabled section. The ftrace_preempt_enable() would not call the scheduler
      because the flag was already set before entring the section.
      
      This bug would cause a missed preemption check and cause lower latencies.
      
      Investigating further, I found that the recusion caused by the function
      tracer was not due to schedule(), but due to preempt_schedule(). Now
      that preempt_schedule is completely annotated with notrace, the recusion
      no longer is an issue.
      Reported-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      5168ae50