1. 27 7月, 2010 7 次提交
    • D
      perf tools: Remove unneeded code for tracking the cwd in perf sessions · 88ca895d
      Dave Martin 提交于
      Tidy-up patch to remove some code and struct perf_session data members
      which are no longer needed due to the previous patch: "perf tools: Don't
      abbreviate file paths relative to the cwd".
      
      LKML-Reference: <new-submission>
      Signed-off-by: NDave Martin <dave.martin@linaro.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      88ca895d
    • D
      perf report: Don't abbreviate file paths relative to the cwd · 361d1346
      Dave Martin 提交于
      This avoids around some problems where the full path is executables and DSOs it
      needed for finding debug symbols on platforms with separated debug symbol files
      such as Ubuntu.  This is simpler than tracking an extra name for each image.
      
      The only impact should be that paths in verbose output from the perf tools
      become absolute, instead of relative to .
      
      LKML-Reference: <new-submission>
      Signed-off-by: NDave Martin <dave.martin@linaro.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      361d1346
    • A
      perf ui: New hists tree widget · 0f0cbf7a
      Arnaldo Carvalho de Melo 提交于
      The stock newt checkbox tree widget we were using was not really
      suitable for hist entry + callchain browsing.
      
      The problems with it were manifold:
      
      - We needed to traverse the whole hist_entry rb_tree to add each entry +
        callchains beforehand.
      
      - No control over the colors used for each row
      
      So a new tree widget, based mostly on slang, was written.
      
      It extends the ui_browser class already used for annotate to allow the
      user to fold/unfold branches in the callchains tree, using extra fields
      in the symbol_map class that is embedded in hist_entry and
      callchain_node instances to store the folding state and when changing
      this state calculates the number of rows that are produced when showing
      a particular hist_entry instance.
      
      This greatly speeds up browsing as we don't have to upfront touch all
      the entries and only calculate callchain related operations when some
      callchain branch is actually unfolded.
      
      The memory footprint is also reduced as the data structure is not
      duplicated, just some extra fields for controling callchain state and to
      simplify the process of seeking thru entries (nr_rows, row_offset) were
      added.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0f0cbf7a
    • A
      perf ui: Show the scroll bar over the left window frame · 8d8c369f
      Arnaldo Carvalho de Melo 提交于
      So that we gain two columns and look more like classical (at least in
      TUIs) scroll bars bars.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8d8c369f
    • A
      perf ui: Consider the refreshed dimensions in ui_browser__show · 63160f73
      Arnaldo Carvalho de Melo 提交于
      When we call ui_browser__show we may have called
      ui_browser__refresh_dimensions to check if the maximum lenght for the
      contained entries changed, such as when zooming in and out DSOs or
      threads in the hist browser.
      
      For that to happen we must delete the old form, that will take care of
      deleting the vertical scrollbar, etc, and then recreate them, with the
      new dimensions.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      63160f73
    • A
      perf hist: Introduce routine to measure lenght of formatted entry · 06daaaba
      Arnaldo Carvalho de Melo 提交于
      Will be used to figure out the window width needed in the new tree
      widget.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      06daaaba
    • A
      perf ui: Restore SPACE as an alias to PGDN in annotate · b61b55ed
      Arnaldo Carvalho de Melo 提交于
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b61b55ed
  2. 23 7月, 2010 2 次提交
  3. 18 7月, 2010 3 次提交
  4. 16 7月, 2010 3 次提交
  5. 08 7月, 2010 2 次提交
    • F
      perf: Sync callchains with period based hits · 108553e1
      Frederic Weisbecker 提交于
      Hists have their hits increased by the event period. And this
      period based counting is the foundation of all the stats in
      perf report.
      
      But callchains still use the raw number of hits, without taking
      the period into account. So when we compute the percentage,
      absolute based percentages are totally broken, and relative ones
      too in the first parent level. Because we pass the number of events
      muliplied by their period as the total number of hits to the
      callchain filtering, while callchains expect this number to be
      the number of raw hits.
      
      perf report -g graph was simply not working, showing no graph unless
      the min percent was zero. And even there the percentage of the
      branches was always 0. And may be fractal filtering was broken on
      the first branch level too.
      
      flat also was broken, but it was hidden because of other breakages.
      
      Anyway fix this by counting using periods on callchains.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      108553e1
    • F
      perf: Resurrect flat callchains · 97aa1052
      Frederic Weisbecker 提交于
      Initialize the callchain radix tree root correctly.
      
      When we walk through the parents, we must stop after the root, but
      since it wasn't well initialized, its parent pointer was random.
      
      Also the number of hits was random because uninitialized, hence it
      was part of the callchain while the root doesn't contain anything.
      
      This fixes segfaults and percentages followed by empty callchains
      while running:
      
      	perf report -g flat
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: 2.6.31.x-2.6.34.x <stable@kernel.org>
      97aa1052
  6. 06 7月, 2010 3 次提交
    • M
      perf probe: Support static and global variables · b7dcb857
      Masami Hiramatsu 提交于
      Add static and global variables support to perf probe.
      This allows user to trace non-local variables (and
      structure members) at probe points.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100519195749.2885.17451.stgit@localhost6.localdomain6>
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b7dcb857
    • M
      perf probe: Support tracing an entry of array · b2a3c12b
      Masami Hiramatsu 提交于
      Add array-entry tracing support to perf probe. This enables to trace an entry
      of array which is indexed by constant value, e.g. array[0].
      
      For example:
      
        $ perf probe -a 'bio_split bi->bi_io_vec[0]'
      
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100519195742.2885.5344.stgit@localhost6.localdomain6>
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b2a3c12b
    • M
      perf probe: Support "string" type · 73317b95
      Masami Hiramatsu 提交于
      Support string type casting to event argument. If perf-probe finds an argument
      casted as string, it ensures the target variable is "(unsigned/signed) char
      *(or []). perf-probe also adds dereference if the target is a pointer.
      
      So, both of 'char buf[10];' and 'char *buf;' can be accessed by 'buf:string'
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100519195734.2885.1666.stgit@localhost6.localdomain6>
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      73317b95
  7. 05 7月, 2010 2 次提交
  8. 02 7月, 2010 1 次提交
  9. 30 6月, 2010 1 次提交
  10. 25 6月, 2010 1 次提交
    • F
      perf: Don't use 4 bytes as a default instruction breakpoint length · aa59a485
      Frederic Weisbecker 提交于
      4 bytes is fine as a default access for data breakpoints. But
      instruction breakpoints should take the native pointer length,
      otherwise we get a -EINVAL in x86-64.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      aa59a485
  11. 22 6月, 2010 4 次提交
  12. 18 6月, 2010 2 次提交
    • A
      perf session: fix error message on failure to open perf.data · 0f2c3de2
      Andy Isaacson 提交于
      If we cannot open our data file, print strerror(errno) for a more
      comprehensible error message; and only suggest 'perf record' on ENOENT.
      
      In particular, this fixes the nonsensical advice when:
      
          % sudo perf record sleep 1
          [ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.009 MB perf.data (~381 samples) ]
          % perf trace
          failed to open file: perf.data  (try 'perf record' first)
          %
      
      Cc: Ingo Molnar <mingo@elte.hu>
      LPU-Reference: <20100612033615.GA24731@hexapodia.org>
      Signed-off-by: NAndy Isaacson <adi@hexapodia.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0f2c3de2
    • A
      perf debug: fix hex dump partial final line · 84c104ad
      Andy Isaacson 提交于
      The loop counter math in trace_event was much more complicated than
      necessary, resulting in incorrectly decoding the human-readable
      portion of the partial last line of hexdump in "perf trace -D" output:
      
      .  0020:  00 00 00 00 00 00 00 00 2f 73 62 69 6e 2f 69 6e  ......../sbin/i
      .  0030:  69 74 00 00 00 00 00 00                          /sbin/i
      
      With this fixed (and simpler!) code, we get the correct output:
      
      .  0020:  00 00 00 00 00 00 00 00 2f 73 62 69 6e 2f 69 6e  ......../sbin/in
      .  0030:  69 74 00 00 00 00 00 00                          it......
      
      Cc: Ingo Molnar <mingo@elte.hu>
      LPU-Reference: <20100612024404.GA24469@hexapodia.org>
      Signed-off-by: NAndy Isaacson <adi@hexapodia.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      84c104ad
  13. 17 6月, 2010 4 次提交
    • C
      perf probe: Add kernel source path option · 9ed7e1b8
      Chase Douglas 提交于
      The probe plugin requires access to the source code for some operations.  The
      source code must be in the exact same location as specified by the DWARF tags,
      but sometimes the location is an absolute path that cannot be replicated by a
      normal user. This change adds the -s|--source option to allow the user to
      specify the root of the kernel source tree.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      LKML-Reference: <1276543590-10486-1-git-send-email-chase.douglas@canonical.com>
      Signed-off-by: NChase Douglas <chase.douglas@canonical.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9ed7e1b8
    • S
      perf record: Add option to avoid updating buildid cache · a1ac1d3c
      Stephane Eranian 提交于
      There are situations where there is enough information in the perf.data
      to process the samples. Updating the buildid cache may add unecessary
      overhead in terms of disk space and time (copying large elf images).
      
      A persistent option to do this already exists via the perfconfig file,
      simply do:
      
      [buildid]
      dir = /dev/null
      
      This patch provides a way to suppress builid cache updates on a per-run
      basis.  It addds a new option, -N, to perf record. Buildids are still
      generated in the perf.data file.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <4c19ef89.93ecd80a.40dc.fffff8e9@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a1ac1d3c
    • E
      perf symbols: Function descriptor symbol lookup · 70c3856b
      Eric B Munson 提交于
      Currently symbol resolution does not work for 64-bit programs on architectures
      that use function descriptors such as ppc64.
      
      The problem is that a symbol doesn't point to a text address, it points to a
      data area that contains (amongst other things) a pointer to the text address.
      
      We look for a section called ".opd" which is the function descriptor area. To
      create the full symbol table, when we see a symbol in the function descriptor
      section we load the first pointer and use that as the text address.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1276523793-15422-1-git-send-email-ebmunson@us.ibm.com>
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NEric B Munson <ebmunson@us.ibm.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      70c3856b
    • A
      perf session: Remove threads from tree on PERF_RECORD_EXIT · 720a3aeb
      Arnaldo Carvalho de Melo 提交于
      Move them to a session->dead_threads list just like we do with maps that
      are replaced, because we may have hist_entries pointing to them.
      
      This fixes a bug when inserting maps for a new thread that reused the
      TID, mixing maps for two different threads, causing an endless loop.
      
      The code for insering maps should be made more robust but for .35 this
      is the minimalistic patch.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      720a3aeb
  14. 10 6月, 2010 1 次提交
  15. 05 6月, 2010 4 次提交
    • A
      perf report: Implement --sort cpu · f60f3593
      Arun Sharma 提交于
      In a shared multi-core environment, users want to analyze why their
      program was slow. In particular, if the code ran slower only on certain
      CPUs due to interference from other programs or kernel threads, the user
      should be able to notice that.
      
      Sample usage:
      
      perf record -f -a -- sleep 3
      perf report --sort cpu,comm
      
      Workload:
      
      program is running on 16 CPUs
      Experiencing interference from an antagonist only on 4 CPUs.
      
        Samples: 106218177676 cycles
      
        Overhead  CPU          Command
        ........  ...  ...............
      
           6.25%  2            program
           6.24%  6            program
           6.24%  11           program
           6.24%  5            program
           6.24%  9            program
           6.24%  10           program
           6.23%  15           program
           6.23%  7            program
           6.23%  3            program
           6.23%  14           program
           6.22%  1            program
           6.20%  13           program
           3.17%  12           program
           3.15%  8            program
           3.14%  0            program
           3.13%  4            program
           3.11%  4         antagonist
           3.11%  0         antagonist
           3.10%  8         antagonist
           3.07%  12        antagonist
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100505181612.GA5091@sharma-home.net>
      Signed-off-by: NArun Sharma <aruns@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f60f3593
    • A
      perf tools: Make event__preprocess_sample parse the sample · 41a37e20
      Arnaldo Carvalho de Melo 提交于
      Simplifying the tools that were using both in sequence and allowing
      upcoming simplifications, such as Arun's patch to sort by cpus.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      41a37e20
    • S
      perf annotate: Ask objdump to demangle symbols · 45d8e802
      Stephane Eranian 提交于
      Perf report is demangling symbols but not annotate.
      
      The former uses internal demangling via libbdf or libiberty. The latter
      executes objdump which by default does not demangle symbols.
      
      This patch adds the -C option to the objdump cmdline to enable symbol
      demangling.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4c07b323.2126e30a.6245.0e1e@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      45d8e802
    • S
      perf buildid: add perfconfig option to specify buildid cache dir · 45de34bb
      Stephane Eranian 提交于
      This patch adds the ability to specify an alternate directory to store the
      buildid cache (buildids, copy of binaries). By default, it is hardcoded to
      $HOME/.debug. This directory contains immutable data. The layout of the
      directory is such that no conflicts in filenames are possible. A modification
      in a file, yields a different buildid and thus a different location in the
      subdir hierarchy.
      
      You may want to put the buildid cache elsewhere because of disk space
      limitation or simply to share the cache between users. It is also useful for
      remote collect vs. local analysis of profiles.
      
      This patch adds a new config option to the perfconfig file.  Under the tag
      'buildid', there is a dir option. For instance, if you have:
      
      $ cat /etc/perfconfig
      [buildid]
      dir = /var/cache/perf-buildid
      
      All buildids and binaries are be saved in the directory specified. The perf
      record, buildid-list, buildid-cache, report, annotate, and archive commands
      will it to pull information out.
      
      The option can be set in the system-wide perfconfig file or in the
      $HOME/.perfconfig file.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4c055fb7.df0ce30a.5f0d.ffffae52@mx.google.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      45de34bb