1. 30 1月, 2011 2 次提交
  2. 23 1月, 2011 1 次提交
    • A
      perf tools: Fix 64 bit integer format strings · 9486aa38
      Arnaldo Carvalho de Melo 提交于
      Using %L[uxd] has issues in some architectures, like on ppc64.  Fix it
      by making our 64 bit integers typedefs of stdint.h types and using
      PRI[ux]64 like, for instance, git does.
      
      Reported by Denis Kirjanov that provided a patch for one case, I went
      and changed all cases.
      Reported-by: NDenis Kirjanov <dkirjanov@kernel.org>
      Tested-by: NDenis Kirjanov <dkirjanov@kernel.org>
      LKML-Reference: <20110120093246.GA8031@hera.kernel.org>
      Cc: Denis Kirjanov <dkirjanov@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pingtian Han <phan@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9486aa38
  3. 22 12月, 2010 1 次提交
    • I
      perf session: Fallback to unordered processing if no sample_id_all · 21ef97f0
      Ian Munsie 提交于
      If we are running the new perf on an old kernel without support for
      sample_id_all, we should fall back to the old unordered processing of
      events. If we didn't than we would *always* process events without
      timestamps out of order, whether or not we hit a reordering race. In
      other words, instead of there being a chance of not attributing samples
      correctly, we would guarantee that samples would not be attributed.
      
      While processing all events without timestamps before events with
      timestamps may seem like an intuitive solution, it falls down as
      PERF_RECORD_EXIT events would also be processed before any samples.
      Even with a workaround for that case, samples before/after an exec would
      not be attributed correctly.
      
      This patch allows commands to indicate whether they need to fall back to
      unordered processing, so that commands that do not care about timestamps
      on every event will not be affected. If we do fallback, this will print
      out a warning if report -D was invoked.
      
      This patch adds the test in perf_session__new so that we only need to
      test once per session. Commands that do not use an event_ops (such as
      record and top) can simply pass NULL in it's place.
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <1291951882-sup-6069@au1.ibm.com>
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      21ef97f0
  4. 06 12月, 2010 1 次提交
  5. 05 12月, 2010 1 次提交
    • A
      perf session: Parse sample earlier · 640c03ce
      Arnaldo Carvalho de Melo 提交于
      At perf_session__process_event, so that we reduce the number of lines in eache
      tool sample processing routine that now receives a sample_data pointer already
      parsed.
      
      This will also be useful in the next patch, where we'll allow sample the
      identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu,
      timestamp) just after before every event.
      
      Also validate callchains in perf_session__process_event, i.e. as early as
      possible, and keep a counter of the number of events discarded due to invalid
      callchains, warning the user about it if it happens.
      
      There is an assumption that was kept that all events have the same sample_type,
      that will be dealt with in the future, when this preexisting limitation will be
      removed.
      Tested-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NIan Munsie <imunsie@au1.ibm.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ian Munsie <imunsie@au1.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      640c03ce
  6. 03 5月, 2010 1 次提交
    • T
      perf: add perf-inject builtin · 454c407e
      Tom Zanussi 提交于
      Currently, perf 'live mode' writes build-ids at the end of the
      session, which isn't actually useful for processing live mode events.
      
      What would be better would be to have the build-ids sent before any of
      the samples that reference them, which can be done by processing the
      event stream and retrieving the build-ids on the first hit.  Doing
      that in perf-record itself, however, is off-limits.
      
      This patch introduces perf-inject, which does the same job while
      leaving perf-record untouched.  Normal mode perf still records the
      build-ids at the end of the session as it should, but for live mode,
      perf-inject can be injected in between the record and report steps
      e.g.:
      
      perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -
      
      perf-inject reads a perf-record event stream and repipes it to stdout.
      At any point the processing code can inject other events into the
      event stream - in this case build-ids (-b option) are read and
      injected as needed into the event stream.
      
      Build-ids are just the first user of perf-inject - potentially
      anything that needs userspace processing to augment the trace stream
      with additional information could make use of this facility.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com>
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      454c407e
  7. 30 4月, 2010 1 次提交
  8. 28 4月, 2010 2 次提交
    • A
      perf machine: Adopt some map_groups functions · d28c6223
      Arnaldo Carvalho de Melo 提交于
      Those functions operated on members now grouped in 'struct machine', so
      move those methods to this new class.
      
      The changes made to 'perf probe' shows that using this abstraction
      inserting probes on guests almost got supported for free.
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d28c6223
    • A
      perf tools: Rename "kernel_info" to "machine" · 23346f21
      Arnaldo Carvalho de Melo 提交于
      struct kernel_info and kerninfo__ are too vague, what they really
      describe are machines, virtual ones or hosts.
      
      There are more changes to introduce helpers to shorten function calls
      and to make more clear what is really being done, but I left that for
      subsequent patches.
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      23346f21
  9. 24 4月, 2010 1 次提交
    • F
      perf: Use generic sample reordering in perf kmem · 587570d4
      Frederic Weisbecker 提交于
      Use the new generic sample events reordering from perf kmem,
      this drops the need of multiplexing the buffers on record time,
      improving the scalability of perf kmem.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      587570d4
  10. 19 4月, 2010 1 次提交
  11. 06 4月, 2010 1 次提交
    • A
      perf kmem: Fix breakage introduced by 5a0e3ad6 slab.h script · 8c40041f
      Arnaldo Carvalho de Melo 提交于
      Commit 5a0e3ad6 ("include cleanup: Update gfp.h and slab.h
      includes to prepare for breaking implicit slab.h inclusion
      from percpu.h") added a '#include <linux/slab.h>' to
      tools/perf/builtin-kmem.h because: that tool has lines like
      this:
      
              if (!strcmp(event->name, "kmalloc") ||
                  !strcmp(event->name, "kmem_cache_alloc")) {
                      process_alloc_event(data, event, cpu, timestamp, thread, 0);
                      return;
              }
      
      So, using the script regex:
      
      >>> import re
      >>> s = re.compile(r'^(|.*[^a-zA-Z0-9_])_*(slab_is_available|kmem_cache_|k[mzc]alloc|krealloc|kz?free|ksize|__getname|putname)')
      >>> l = '   !strcmp(event->name, "kmem_cache_alloc")) {'
      >>> s.search(l)
      <_sre.SRE_Match object at 0xb77b1ad0>
      >>>
      
      Remove that file that is not available in the tools/perf include
      path and thus builtin-kmem.c couldn't be compiled.
      Reported-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <1270561053-14308-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8c40041f
  12. 03 4月, 2010 3 次提交
    • A
      perf kmem: Fixup the symbol address before using it · 71cf8b8f
      Arnaldo Carvalho de Melo 提交于
      We get absolute addresses in the events, but relative ones from the
      symbol subsystem, so calculate the absolute address by asking for the
      map where the symbol was found, that has the place where the DSO was
      actually loaded.
      
      For the core kernel this poses no problems if the kernel is not
      relocated by things like kexec, or if we use /proc/kallsyms, but for
      modules we were getting really large, negative offsets.
      
      LKML-Reference: <new-submission>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      71cf8b8f
    • A
      perf kmem: Resolve kernel symbols again · e727ca73
      Arnaldo Carvalho de Melo 提交于
      Due to the assumption in perf_session__new that the kernel maps would be
      created using the fake PERF_RECORD_MMAP event in a perf.data file 'perf
      kmem --stat caller', that doesn't have such event, ends up not being
      able to resolve the kernel addresses.
      
      Fix it by calling perf_session__create_kernel_maps() in __cmd_kmem().
      
      LKML-Reference: <new-submission>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e727ca73
    • A
      perf symbols: map_groups__find_symbol must return the map too · 7e5e1b14
      Arnaldo Carvalho de Melo 提交于
      Tools need to know from which map in the map_group a symbol was resolved
      to, so that, for isntance, we can annotate kernel modules symbols by
      getting its precise name, etc.
      
      Also add the _by_name variants for completeness.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7e5e1b14
  13. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  14. 04 2月, 2010 1 次提交
    • A
      perf symbols: Remove perf_session usage in symbols layer · 9de89fe7
      Arnaldo Carvalho de Melo 提交于
      I noticed while writing the first test in 'perf regtest' that to
      just test the symbol handling routines one needs to create a
      perf session, that is a layer centered on a perf.data file,
      events, etc, so I untied these layers.
      
      This reduces the complexity for the users as the number of
      parameters to most of the symbols and session APIs now was
      reduced while not adding more state to all the map instances by
      only having data that is needed to split the kernel (kallsyms
      and ELF symtab sections) maps and do vmlinux relocation on the
      main kernel map.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265223128-11786-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9de89fe7
  15. 29 1月, 2010 1 次提交
  16. 20 1月, 2010 2 次提交
    • P
      perf kmem: Print usage help for unknown commands · b00eca8c
      Pekka Enberg 提交于
      This patch fixes "perf kmem" to print usage help instead of
      doing nothing.
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      LKML-Reference: <1263921971-10782-1-git-send-email-penberg@cs.helsinki.fi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b00eca8c
    • P
      perf kmem: Increase "Hit" column length · 47103277
      Pekka Enberg 提交于
      It's fairly easy to overflow the "Hit" column with just few
      seconds of tracing so increase the column length to avoid broken
      formatting.
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      LKML-Reference: <1263921803-10214-1-git-send-email-penberg@cs.helsinki.fi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      47103277
  17. 16 1月, 2010 1 次提交
  18. 14 1月, 2010 1 次提交
    • A
      perf tools: Encode kernel module mappings in perf.data · b7cece76
      Arnaldo Carvalho de Melo 提交于
      We were always looking at the running machine /proc/modules,
      even when processing a perf.data file, which only makes sense
      when we're doing 'perf record' and 'perf report' on the same
      machine, and in close sucession, or if we don't use modules at
      all, right Peter? ;-)
      
      Now, at 'perf record' time we read /proc/modules, find the long
      path for modules, and put them as PERF_MMAP events, just like we
      did to encode the reloc reference symbol for vmlinux. Talking
      about that now it is encoded in .pgoff, so that we can use
      .{start,len} to store the address boundaries for the kernel so
      that when we reconstruct the kmaps tree we can do lookups right
      away, without having to fixup the end of the kernel maps like we
      did in the past (and now only in perf record).
      
      One more step in the 'perf archive' direction when we'll finally
      be able to collect data in one machine and analyse in another.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1263396139-4798-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b7cece76
  19. 28 12月, 2009 5 次提交
  20. 16 12月, 2009 2 次提交
  21. 15 12月, 2009 1 次提交
  22. 14 12月, 2009 7 次提交
  23. 12 12月, 2009 2 次提交
    • A
      perf tools: Introduce perf_session class · 94c744b6
      Arnaldo Carvalho de Melo 提交于
      That does all the initialization boilerplate, opening the file,
      reading the header, checking if it is valid, etc.
      
      And that will as well have the threads list, kmap (now) global
      variable, etc, so that we can handle two (or more) perf.data files
      describing sessions to compare.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1260573842-19720-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      94c744b6
    • A
      perf symbols: Rename kthreads to kmaps, using another abstraction for it · 9958e1f0
      Arnaldo Carvalho de Melo 提交于
      Using a struct thread instance just to hold the kernel space maps
      (vmlinux + modules) is overkill and confuses people trying to
      understand the perf symbols abstractions.
      
      The kernel maps are really present in all threads, i.e. the kernel
      is a library, not a separate thread.
      
      So introduce the 'map_groups' abstraction and use it for the kernel
      maps, now in the kmaps global variable.
      
      It, in turn, will move, together with the threads list to the
      perf_file abstraction, so that we can support multiple perf_file
      instances, needed by perf diff.
      
      Brainstormed-with: Eduardo Habkost <ehabkost@redhat.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1260550239-5372-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9958e1f0