1. 10 6月, 2010 2 次提交
    • A
      perf tools: Reorganize the Makefile feature tests · f9af3a4c
      Arnaldo Carvalho de Melo 提交于
      Moving the tests to a separate file, feature-tests.mak and using a try-cc
      function similar to the try-run in Kbuild.
      
      This also makes the output more quiet as we can stop using the INTERMEDIATE
      target to remove the .perf.dev.null file needed for some gcc versions where
      /dev/null can't be used as the output file name.
      
      As the tests get shorter by uninlining the source code used to test for
      features, we can more properly use identation.
      
      The feature tests itself can be made more clear and reused, like when trying to
      see what is needed to have bfd_demangle.
      
      We also get a bit closer to reusing scripts/Kbuild.include, reducing the
      distance from the kernel build system.
      
      Tests performed:
      
      [root@emilia perf]# make -j9 O=/tmp/perf
      PERF_VERSION = 0.0.2.PERF
          GEN /tmp/perf/common-cmds.h
          * new build flags or prefix
          GEN perf-archive
          CC /tmp/perf/builtin-annotate.o
          CC /tmp/perf/bench/sched-messaging.o
          CC /tmp/perf/builtin-diff.o
      <SNIP>
          CC /tmp/perf/scripts/python/Perf-Trace-Util/Context.o
          CC /tmp/perf/perf.o
          CC /tmp/perf/builtin-help.o
          AR /tmp/perf/libperf.a
          LINK /tmp/perf/perf
      [root@emilia perf]#
      
      If we uninstall, for instance newt-devel we get:
      
      [root@emilia perf]# rpm -e newt-devel
      [root@emilia perf]# make -j9 O=/tmp/perf
      Makefile:564: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
          * new build flags or prefix
          GEN perf-archive
          CC /tmp/perf/perf.o
          CC /tmp/perf/builtin-annotate.o
      <SNIP>
          AR /tmp/perf/libperf.a
          LINK /tmp/perf/perf
      [root@emilia perf]#
      
      And then binutils-devel:
      
      [root@emilia perf]# make -j9 O=/tmp/perf
      Makefile:564: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
      Makefile:632: No bfd.h/libbfd found, install binutils-dev[el]/zlib-static to gain symbol demangling
          * new build flags or prefix
          GEN perf-archive
          CC /tmp/perf/perf.o
      <SNIP>
          AR /tmp/perf/libperf.a
          LINK /tmp/perf/perf
      [root@emilia perf]#
      
      And then strictly required devel packages:
      
      [root@emilia perf]# rpm -e elfutils-libelf-devel elfutils-devel
      [root@emilia perf]# make -j9 O=/tmp/perf
      Makefile:509: No libdw.h found or old libdw.h found or elfutils is older than 0.138, disables dwarf support. Please install new elfutils-devel/libdw-dev
      Makefile:542: *** No libelf.h/libelf found, please install libelf-dev/elfutils-libelf-devel.  Stop.
      [root@emilia perf]#
      
      After installing everything back on:
      
      [root@emilia perf]# yum install elfutils-devel binutils-devel newt-devel
      <SNIP>
      Installed:
        binutils-devel.x86_64 0:2.20.51.0.2-5.11.el6
        elfutils-devel.x86_64 0:0.147-1.el6
        elfutils-libelf-devel.x86_64 0:0.147-1.el6
        newt-devel.x86_64 0:0.52.11-1.el6
      
      Complete!
      [root@emilia perf]# make -j9
      PERF_VERSION = 0.0.2.PERF
          GEN common-cmds.h
          * new build flags or prefix
          GEN perf-archive
          CC builtin-annotate.o
      <SNIP>
          AR libperf.a
          LINK perf
      [root@emilia perf]# make -j9
      [root@emilia perf]#
      
      Thanks to Sam for pointing me to try-run.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f9af3a4c
    • I
      Merge branch 'perf/core' of... · c726b61c
      Ingo Molnar 提交于
      Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
      c726b61c
  2. 09 6月, 2010 23 次提交
  3. 07 6月, 2010 2 次提交
  4. 05 6月, 2010 7 次提交
    • A
      perf report: Implement --sort cpu · f60f3593
      Arun Sharma 提交于
      In a shared multi-core environment, users want to analyze why their
      program was slow. In particular, if the code ran slower only on certain
      CPUs due to interference from other programs or kernel threads, the user
      should be able to notice that.
      
      Sample usage:
      
      perf record -f -a -- sleep 3
      perf report --sort cpu,comm
      
      Workload:
      
      program is running on 16 CPUs
      Experiencing interference from an antagonist only on 4 CPUs.
      
        Samples: 106218177676 cycles
      
        Overhead  CPU          Command
        ........  ...  ...............
      
           6.25%  2            program
           6.24%  6            program
           6.24%  11           program
           6.24%  5            program
           6.24%  9            program
           6.24%  10           program
           6.23%  15           program
           6.23%  7            program
           6.23%  3            program
           6.23%  14           program
           6.22%  1            program
           6.20%  13           program
           3.17%  12           program
           3.15%  8            program
           3.14%  0            program
           3.13%  4            program
           3.11%  4         antagonist
           3.11%  0         antagonist
           3.10%  8         antagonist
           3.07%  12        antagonist
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100505181612.GA5091@sharma-home.net>
      Signed-off-by: NArun Sharma <aruns@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f60f3593
    • A
      perf tools: Make event__preprocess_sample parse the sample · 41a37e20
      Arnaldo Carvalho de Melo 提交于
      Simplifying the tools that were using both in sequence and allowing
      upcoming simplifications, such as Arun's patch to sort by cpus.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      41a37e20
    • S
      perf annotate: Ask objdump to demangle symbols · 45d8e802
      Stephane Eranian 提交于
      Perf report is demangling symbols but not annotate.
      
      The former uses internal demangling via libbdf or libiberty. The latter
      executes objdump which by default does not demangle symbols.
      
      This patch adds the -C option to the objdump cmdline to enable symbol
      demangling.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4c07b323.2126e30a.6245.0e1e@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      45d8e802
    • S
      perf buildid: add perfconfig option to specify buildid cache dir · 45de34bb
      Stephane Eranian 提交于
      This patch adds the ability to specify an alternate directory to store the
      buildid cache (buildids, copy of binaries). By default, it is hardcoded to
      $HOME/.debug. This directory contains immutable data. The layout of the
      directory is such that no conflicts in filenames are possible. A modification
      in a file, yields a different buildid and thus a different location in the
      subdir hierarchy.
      
      You may want to put the buildid cache elsewhere because of disk space
      limitation or simply to share the cache between users. It is also useful for
      remote collect vs. local analysis of profiles.
      
      This patch adds a new config option to the perfconfig file.  Under the tag
      'buildid', there is a dir option. For instance, if you have:
      
      $ cat /etc/perfconfig
      [buildid]
      dir = /var/cache/perf-buildid
      
      All buildids and binaries are be saved in the directory specified. The perf
      record, buildid-list, buildid-cache, report, annotate, and archive commands
      will it to pull information out.
      
      The option can be set in the system-wide perfconfig file or in the
      $HOME/.perfconfig file.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4c055fb7.df0ce30a.5f0d.ffffae52@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      45de34bb
    • A
      perf tools: Make target to generate self contained source tarball · 8e5564e6
      Arnaldo Carvalho de Melo 提交于
      Useful for when people want to try some version of the perf tools and don't
      wants to download the kernel tarball.
      
      Here is a session using this new target:
      
        [root@emilia linux-2.6-tip]# make help | grep -i perf
          perf-tar-src-pkg    - Build perf-2.6.35-rc1.tar source tarball
          perf-targz-src-pkg  - Build perf-2.6.35-rc1.tar.gz source tarball
          perf-tarbz2-src-pkg - Build perf-2.6.35-rc1.tar.bz2 source tarball
        [root@emilia linux-2.6-tip]# make perf-tarbz2-src-pkg
          TAR
        [root@emilia linux-2.6-tip]# ls -la perf-2.6.35-rc1.tar.bz2
        -rw-r--r-- 1 root root 295731 May 31 11:18 perf-2.6.35-rc1.tar.bz2
        [root@emilia linux-2.6-tip]# tar xf perf-2.6.35-rc1.tar.bz2
        [root@emilia linux-2.6-tip]# cd perf-2.6.35-rc1
        [root@emilia perf-2.6.35-rc1]# ls
        arch  HEAD  include  lib  tools
        [root@emilia perf-2.6.35-rc1]# cd tools/perf
        [root@emilia perf]# make -j9 2>&1 | tail
            CC arch/x86/util/dwarf-regs.o
            CC util/probe-finder.o
            CC util/newt.o
            CC util/scripting-engines/trace-event-perl.o
            CC scripts/perl/Perf-Trace-Util/Context.o
            CC perf.o
            CC builtin-help.o
            AR libperf.a
            LINK perf
        rm .perf.dev.null
        [root@emilia perf]# ./perf record -a sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.262 MB perf.data (~11457 samples) ]
        [root@emilia perf]# ./perf report | head -12
        # Events: 6K cycles
        #
        # Overhead          Command       Shared Object  Symbol
        # ........  ...............  ..................  ......
        #
             4.73%             perf  [kernel.kallsyms]   [k] format_decode
             4.49%             perf  libc-2.12.so        [.] _IO_file_underflow_internal
             4.38%             init  [kernel.kallsyms]   [k] mwait_idle
             3.29%             perf  [kernel.kallsyms]   [k] vsnprintf
             2.38%             init  [kernel.kallsyms]   [k] sched_clock_local
             2.35%             init  [kernel.kallsyms]   [k] apic_timer_interrupt
             1.86%     sirq-timer/5  [kernel.kallsyms]   [k] find_busiest_group
        [root@emilia perf]#
      Acked-by: NMichal Marek <mmarek@suse.cz>
      Acked-by: NSam Ravnborg <sam@ravnborg.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100528185357.GA28009@ghostprotocols.net>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8e5564e6
    • S
      perf tools: Add the ability to specify list of cpus to monitor · c45c6ea2
      Stephane Eranian 提交于
      This patch adds a -C option to stat, record, top to designate a list of CPUs to
      monitor. CPUs can be specified as a comma-separated list or ranges, no space
      allowed.
      
      Examples:
      $ perf record -a -C0-1,4-7 sleep 1
      $ perf top -C0-4
      $ perf stat -a -C1,2,3,4 sleep 1
      
      With perf record in per-thread mode with inherit mode on, samples are collected
      only when the thread runs on the designated CPUs.
      
      The -C option does not turn on system-wide mode automatically.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4bff9496.d345d80a.41fe.7b00@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c45c6ea2
    • S
      perf report: Make -D print sampled CPU · 761844b9
      Stephane Eranian 提交于
      It is useful to know on which CPU a sample was captured on.
      The information is captured with perf record -R but it was
      not printed out by perf report -D. This patch adds this.
      
      When -R is not used, cpu is set to -1to indicate that
      the CPU is unknown (it is not captured).
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4bff964c.e88cd80a.3106.7d31@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      761844b9
  5. 04 6月, 2010 4 次提交
    • I
      Merge branch 'perf/urgent' of... · 58cc1a9e
      Ingo Molnar 提交于
      Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/urgent
      58cc1a9e
    • A
      perf symbols: Set the DSO long name when using symbol_conf.vmlinux_name · e7dadc00
      Arnaldo Carvalho de Melo 提交于
      We need to set the long name to the name specified via, for instance,
      'perf annotate --vmlinux /path/to/vmlinux', if not it will remain as
      '[kernel.kallsyms]' and that will make annotate fail when passing this
      as the vmlinux name in the call to objdump.
      
      The way this is setup grew unwieldly and dso__load_vmlinux is the
      function that should allocate space for the long name, with callers not
      assuming that filenames should be allocated somehow by then (strdup,
      dso__build_id_filename, etc).
      
      For now this is the minimalistic patch, a proper fix for .36 will be
      made.
      Reported-by: NStephane Eranian <eranian@google.com>
      Tested-by: NStephane Eranian <eranian@google.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100604003900.GD10469@ghostprotocols.net>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e7dadc00
    • S
      tracing: Remove ftrace_preempt_disable/enable · 5168ae50
      Steven Rostedt 提交于
      The ftrace_preempt_disable/enable functions were to address a
      recursive race caused by the function tracer. The function tracer
      traces all functions which makes it easily susceptible to recursion.
      One area was preempt_enable(). This would call the scheduler and
      the schedulre would call the function tracer and loop.
      (So was it thought).
      
      The ftrace_preempt_disable/enable was made to protect against recursion
      inside the scheduler by storing the NEED_RESCHED flag. If it was
      set before the ftrace_preempt_disable() it would not call schedule
      on ftrace_preempt_enable(), thinking that if it was set before then
      it would have already scheduled unless it was already in the scheduler.
      
      This worked fine except in the case of SMP, where another task would set
      the NEED_RESCHED flag for a task on another CPU, and then kick off an
      IPI to trigger it. This could cause the NEED_RESCHED to be saved at
      ftrace_preempt_disable() but the IPI to arrive in the the preempt
      disabled section. The ftrace_preempt_enable() would not call the scheduler
      because the flag was already set before entring the section.
      
      This bug would cause a missed preemption check and cause lower latencies.
      
      Investigating further, I found that the recusion caused by the function
      tracer was not due to schedule(), but due to preempt_schedule(). Now
      that preempt_schedule is completely annotated with notrace, the recusion
      no longer is an issue.
      Reported-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      5168ae50
    • S
      tracing/sched: Make preempt_schedule() notrace · d1f74e20
      Steven Rostedt 提交于
      The function tracer code uses ftrace_preempt_disable() to disable
      preemption instead of normal preempt_disable(). But there's a slight
      race condition that may cause it to lose a preemption check.
      
      This was made to keep the function tracer from recursing on itself
      by disabling preemption then having the enable call the function tracer
      again, causing infinite recursion.
      
      The bug was assumed to happen if the call was just in schedule, but
      this is incorrect. The bug is caused by preempt_schedule() which
      is called by preempt_enable(). The calling of preempt_enable() when
      NEED_RESCHED was set would call preempt_schedule() which would call
      the function tracer again.
      
      By making the preempt_schedule() and add_preempt_count() notrace
      then this will prevent the inifinite recursion. This is because
      the add_preempt_count() would stop the preempt_enable() in the
      function tracer from calling preempt_schedule() again.
      
      The sub_preempt_count() is also made notrace just to keep it
      symmetric.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d1f74e20
  6. 03 6月, 2010 1 次提交
    • P
      perf: Fix crash in swevents · c6df8d5a
      Peter Zijlstra 提交于
      Frederic reported that because swevents handling doesn't disable IRQs
      anymore, we can get a recursion of perf_adjust_period(), once from
      overflow handling and once from the tick.
      
      If both call ->disable, we get a double hlist_del_rcu() and trigger
      a LIST_POISON2 dereference.
      
      Since we don't actually need to stop/start a swevent to re-programm
      the hardware (lack of hardware to program), simply nop out these
      callbacks for the swevent pmu.
      Reported-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1275557609.27810.35218.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c6df8d5a
  7. 02 6月, 2010 1 次提交