1. 14 8月, 2014 1 次提交
  2. 25 7月, 2014 1 次提交
  3. 12 6月, 2014 1 次提交
    • J
      perf tools: Cache register accesses for unwind processing · 0c4e774f
      Jiri Olsa 提交于
      Caching registers value into an array. Got about 4% speed up
      of perf_reg_value function for report command processing
      dwarf unwind stacks.
      
      Output from report over 1.5 GB data with DWARF unwind stacks:
      (TODO fix perf diff)
      
        current code:
         5.84%     perf  perf                       [.] perf_reg_value
        change:
         1.94%     perf  perf                       [.] perf_reg_value
      
      And little bit of overall speed up:
      (perf stat -r 5 -e '{cycles,instructions}:u' ...)
      
        current code:
         310,298,611,754      cycles                     ( +-  0.33% )
         439,669,689,341      instructions               ( +-  0.03% )
      
           188.656753166 seconds time elapsed            ( +-  0.82% )
      
        change:
         291,315,329,878      cycles                     ( +-  0.22% )
         391,763,485,304      instructions               ( +-  0.03%  )
      
           180.742249687 seconds time elapsed            ( +-  0.64% )
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1401892622-30848-2-git-send-email-jolsa@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      0c4e774f
  4. 09 6月, 2014 1 次提交
  5. 05 5月, 2014 1 次提交
  6. 18 2月, 2014 1 次提交
  7. 01 2月, 2014 2 次提交
  8. 13 1月, 2014 1 次提交
  9. 12 11月, 2013 1 次提交
  10. 06 11月, 2013 1 次提交
  11. 11 10月, 2013 1 次提交
    • J
      perf evlist: Fix perf_evlist__mmap_read event overflow · a65cb4b9
      Jiri Olsa 提交于
      The perf_evlist__mmap_read used 'union perf_event' as a placeholder for
      event crossing the mmap boundary.
      
      This is ok for sample shorter than ~PATH_MAX. However we could grow up
      to the maximum sample size which is 16 bits max.
      
      I hit this overflow issue when using 'perf top -G dwarf' which produces
      sample with the size around 8192 bytes.  We could configure any valid
      sample size here using: '-G dwarf,size'.
      
      Using array with sample max size instead for the event placeholder. Also
      adding another safe check for the dynamic size of the user stack.
      
      TODO: The 'struct perf_mmap' is quite big now, maybe we could use some
      lazy allocation for event_copy size.
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1380721599-24285-1-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a65cb4b9
  12. 09 10月, 2013 1 次提交
  13. 04 10月, 2013 1 次提交
  14. 11 9月, 2013 1 次提交
    • S
      perf tools: Add attr->mmap2 support · 5c5e854b
      Stephane Eranian 提交于
      This patch adds support for the new PERF_RECORD_MMAP2 record type
      exposed by the kernel. This is an extended PERF_RECORD_MMAP record.
      
      It adds for each file-backed mapping the device major, minor number and
      the inode number and generation.
      
      This triplet uniquely identifies the source of a file-backed mapping. It
      can be used to detect identical virtual mappings between processes, for
      instance.
      
      The patch will prefer MMAP2 over MMAP.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1377079825-19057-3-git-send-email-eranian@google.com
      [ Cope with 314add6b "Change machine__findnew_thread() to set thread pid",
        fix 'perf test' regression test entry affected,
        use perf_missing_features.mmap2 to fallback to not using .mmap2 in older kernels,
        so that new tools can work with kernels where this feature is not present ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5c5e854b
  15. 30 8月, 2013 5 次提交
  16. 12 8月, 2013 1 次提交
  17. 08 8月, 2013 1 次提交
  18. 16 7月, 2013 1 次提交
  19. 01 4月, 2013 2 次提交
    • S
      perf tools: Add mem access sampling core support · 98a3b32c
      Stephane Eranian 提交于
      This patch adds the sorting and histogram support
      functions to enable profiling of memory accesses.
      
      The following sorting orders are added:
       - symbol_daddr: data address symbol (or raw address)
       - dso_daddr: data address shared object
       - locked: access uses locked transaction
       - tlb : TLB access
       - mem : memory level of the access (L1, L2, L3, RAM, ...)
       - snoop: access snoop mode
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1359040242-8269-12-git-send-email-eranian@google.com
      [ committer note: changed to cope with fc5871ed, the move of methods to
        machine.[ch], and the rename of dsrc to data_src, to match the change
        made in the PERF_SAMPLE_DSRC in a previous patch. ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      98a3b32c
    • A
      perf tools: Add support for weight v7 (modified) · 05484298
      Andi Kleen 提交于
      perf record has a new option -W that enables weightened sampling.
      
      Add sorting support in top/report for the average weight per sample and the
      total weight sum. This allows to both compare relative cost per event
      and the total cost over the measurement period.
      
      Add the necessary glue to perf report, record and the library.
      
      v2: Merge with new hist refactoring.
      v3: Fix manpage. Remove value check.
      Rename global_weight to weight and weight to local_weight.
      v4: Readd sort keys to manpage
      v5: Move weight to end
      v6: Move weight to template
      v7: Rename weight key.
      
      Original patch from Andi modified by Stephane Eranian <eranian@google.com>
      to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com
      [ committer note: changed to cope with fc5871ed and the hists_link perf test entry ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      05484298
  20. 29 10月, 2012 1 次提交
  21. 07 10月, 2012 1 次提交
    • A
      perf event: No need to create a thread when handling PERF_RECORD_EXIT · f62d3f0f
      Arnaldo Carvalho de Melo 提交于
      When we were processing a PERF_RECORD_EXIT event we first used
      machine__findnew_thread for both the thread exiting and for its parent,
      only to use just the thread struct associated with the one exiting, and
      to just delete it.
      
      If it existed, i.e. not created at this very moment in
      machine__findnew_thread, it will be moved to the machine->dead_threads
      linked list, because we may have hist_entries pointing to it, but if it
      was created just do be deleted, it will just sit there with no
      references at all.
      
      Use the new machine__find_thread() method so that if it is not there, we
      don't create it.
      
      As a bonus the parent thread will also not be created at this point.
      
      Create process_fork() and process_exit() helpers to use this and make
      the builtins use it instead of the generic process_task(), ditched by
      this patch.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-z7n2y98ebjyrvmytaope4vdl@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f62d3f0f
  22. 11 9月, 2012 1 次提交
    • I
      perf tools: fix ALIGN redefinition in system headers · 9ac3e487
      Irina Tirdea 提交于
      On some systems (e.g. Android), ALIGN is defined in system headers as
      ALIGN(p).  The definition of ALIGN used in perf takes 2 parameters:
      ALIGN(x,a).  This leads to redefinition conflicts.
      
      Redefinition error on Android:
      In file included from util/include/linux/list.h:1:0,
      from util/callchain.h:5,
      from util/hist.h:6,
      from util/session.h:4,
      from util/build-id.h:4,
      from util/annotate.c:11:
      util/include/linux/kernel.h:11:0: error: "ALIGN" redefined [-Werror]
      bionic/libc/include/sys/param.h:38:0: note: this is the location of
      the previous definition
      
      Conflics with system defined ALIGN in Android:
      util/event.c: In function 'perf_event__synthesize_comm':
      util/event.c:115:32: error: macro "ALIGN" passed 2 arguments, but takes just 1
      util/event.c:115:9: error: 'ALIGN' undeclared (first use in this function)
      util/event.c:115:9: note: each undeclared identifier is reported only once for
      each function it appears in
      
      In order to avoid this redefinition, ALIGN is renamed to PERF_ALIGN.
      Signed-off-by: NIrina Tirdea <irina.tirdea@intel.com>
      Acked-by: NPekka Enberg <penberg@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Irina Tirdea <irina.tirdea@intel.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1347315303-29906-5-git-send-email-irina.tirdea@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9ac3e487
  23. 11 8月, 2012 1 次提交
    • J
      perf tools: Support user regs and stack in sample parsing · 0f6a3015
      Jiri Olsa 提交于
      Adding following info to be parsed out of the event sample:
       - user register set
       - user stack dump
      
      Both are global and specific to all events within the session.
      This info will be used in the unwind patches coming in shortly.
      
      Adding simple output printout (report -D) for both register and
      stack dumps.
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Original-patch-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-11-git-send-email-jolsa@redhat.com
      [ Use evsel->attr.sample_regs_user ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0f6a3015
  24. 02 8月, 2012 1 次提交
  25. 09 3月, 2012 1 次提交
  26. 12 12月, 2011 1 次提交
  27. 02 12月, 2011 1 次提交
  28. 28 11月, 2011 3 次提交
  29. 24 9月, 2011 1 次提交
    • D
      perf tool: Fix endianness handling of u32 data in samples · 936be503
      David Ahern 提交于
      Currently, analyzing PPC data files on x86 the cpu field is always 0 and
      the tid and pid are backwards. For example, analyzing a PPC file on PPC
      the pid/tid fields show:
      
              rsyslogd  1210/1212
      
      and analyzing the same PPC file using an x86 perf binary shows:
      
              rsyslogd  1212/1210
      
      The problem is that the swap_op method for samples is
      perf_event__all64_swap which assumes all elements in the sample_data
      struct are u64s. cpu, tid and pid are u32s and need to be handled
      individually. Given that the swap is done before the sample is parsed,
      the simplest solution is to undo the 64-bit swap of those elements when
      the sample is parsed and do the proper swap.
      
      The RAW data field is generic and perf cannot have programmatic knowledge
      of how to treat that data. Instead a warning is given to the user.
      
      Thanks to Anton Blanchard for providing a data file for a mult-CPU
      PPC system so I could verify the fix for the CPU fields.
      
      v3 -> v4:
      - fixed use of WARN_ONCE
      
      v2 -> v3:
      - used WARN_ONCE for message regarding raw data
      - removed struct wrapper around union
      - fixed whitespace issues
      
      v1 -> v2:
      - added a union for undoing the byte-swap on u64 and redoing swap on
        u32's to address compiler errors (see git commit 65014ab3)
      
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1315321946-16993-1-git-send-email-dsahern@gmail.comSigned-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      936be503
  30. 03 6月, 2011 1 次提交
    • A
      perf evlist: Don't die if sample_{id_all|type} is invalid · 56722381
      Arnaldo Carvalho de Melo 提交于
      Fixes two more cases where the python binding would not load:
      
      . Not finding die(), which it shouldn't anyway, not good to just stop the
        world because some particular perf.data file is invalid, just propagate
        the error to the caller.
      
      . Not finding perf_sample_size: fix it by moving it from event.c to evsel,
        where it belongs, as most cases are moving to operate on an evsel object.o
      
      One of the fixed problems:
      
      [root@emilia ~]# python
      >>> import perf
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
      >>>
      [root@emilia ~]#
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      56722381
  31. 02 6月, 2011 1 次提交
    • A
      perf evlist: Don't die if sample_{id_all|type} is invalid · c2a70653
      Arnaldo Carvalho de Melo 提交于
      Fixes two more cases where the python binding would not load:
      
      . Not finding die(), which it shouldn't anyway, not good to just stop the
        world because some particular perf.data file is invalid, just propagate
        the error to the caller.
      
      . Not finding perf_sample_size: fix it by moving it from event.c to evsel,
        where it belongs, as most cases are moving to operate on an evsel object.o
      
      One of the fixed problems:
      
      [root@emilia ~]# python
      >>> import perf
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
      >>>
      [root@emilia ~]#
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c2a70653
  32. 22 5月, 2011 1 次提交