1. 04 10月, 2013 2 次提交
  2. 12 8月, 2013 1 次提交
  3. 01 4月, 2013 1 次提交
    • A
      perf tools: Add support for weight v7 (modified) · 05484298
      Andi Kleen 提交于
      perf record has a new option -W that enables weightened sampling.
      
      Add sorting support in top/report for the average weight per sample and the
      total weight sum. This allows to both compare relative cost per event
      and the total cost over the measurement period.
      
      Add the necessary glue to perf report, record and the library.
      
      v2: Merge with new hist refactoring.
      v3: Fix manpage. Remove value check.
      Rename global_weight to weight and weight to local_weight.
      v4: Readd sort keys to manpage
      v5: Move weight to end
      v6: Move weight to template
      v7: Rename weight key.
      
      Original patch from Andi modified by Stephane Eranian <eranian@google.com>
      to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com
      [ committer note: changed to cope with fc5871ed and the hists_link perf test entry ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      05484298
  4. 03 3月, 2013 1 次提交
  5. 16 2月, 2013 1 次提交
  6. 07 2月, 2013 1 次提交
    • D
      perf evlist: Make event_copy local to mmaps · 0479b8b9
      David Ahern 提交于
      I am getting segfaults *after* the time sorting of perf samples where
      the event type is off the charts:
      
      (gdb) bt
      \#0  0x0807b1b2 in hists__inc_nr_events (hists=0x80a99c4, type=1163281902) at util/hist.c:1225
      \#1  0x08070795 in perf_session_deliver_event (session=0x80a9b90, event=0xf7a6aff8, sample=0xffffc318, tool=0xffffc520,
          file_offset=0) at util/session.c:884
      \#2  0x0806f9b9 in flush_sample_queue (s=0x80a9b90, tool=0xffffc520) at util/session.c:555
      \#3  0x0806fc53 in process_finished_round (tool=0xffffc520, event=0x0, session=0x80a9b90) at util/session.c:645
      
      This is bizarre because the event has already been processed once --
      before it was added to the samples queue -- and the event was found to
      be sane at that time.
      
      There seem to be 2 causes:
      
      1. perf_evlist__mmap_read updates the read location even though there
      are outstanding references to events sitting in the mmap buffers via the
      ordered samples queue.
      
      2. There is a single evlist->event_copy for all evlist entries.
      event_copy is used to handle an event wrapping at the mmap buffer
      boundary.
      
      This patch addresses the second problem - making event_copy local to
      each perf_mmap. With this change my highly repeatable use case no longer
      fails.
      
      The first problem is much more complicated and will be the subject of a
      future patch.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1360098762-61827-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0479b8b9
  7. 25 1月, 2013 2 次提交
  8. 20 11月, 2012 2 次提交
    • D
      perf: Make perf build for x86 with UAPI disintegration applied · d2709c7c
      David Howells 提交于
      Make perf build for x86 once the UAPI disintegration patches for that arch
      have been applied by adding the appropriate -I flags - in the right order -
      and then converting some #includes that use ../.. notation to find main kernel
      headerfiles to use <asm/foo.h> and <linux/foo.h> instead.
      
      Note that -Iarch/foo/include/uapi is present _before_ -Iarch/foo/include.
      This makes sure we get the userspace version of the pt_regs struct.  Ideally,
      we wouldn't have the latter -I flag at all, but unfortunately we want
      asm/svm.h and asm/vmx.h in builtin-kvm.c and these aren't part of the UAPI -
      at least not for x86.  I wonder if the bits outside of the __KERNEL__ guards
      *should* be transferred there.
      
      I note also that perf seems to do its dependency handling manually by listing
      all the header files it might want to use in LIB_H in the Makefile.  Can this
      be changed to use -MD?
      
      Note that to do make this work, we need to export and UAPI disintegrate
      linux/hw_breakpoint.h, which I think should've been exported previously so that
      perf can access the bits.  We have to do this in the same patch to maintain
      bisectability.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      d2709c7c
    • S
      perf powerpc: Use uapi/unistd.h to fix build error · f2d9cae9
      Sukadev Bhattiprolu 提交于
      Use the 'unistd.h' from arch/powerpc/include/uapi to build the perf tool.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Acked-by: NDavid Howells <dhowells@redhat.com>
      Cc: Anton Blanchard <anton@au1.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: linuxppc-dev@ozlabs.org
      Link: http://lkml.kernel.org/r/20121107191818.GA16211@us.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f2d9cae9
  9. 15 11月, 2012 3 次提交
  10. 02 11月, 2012 1 次提交
  11. 01 11月, 2012 1 次提交
  12. 29 10月, 2012 1 次提交
  13. 17 10月, 2012 1 次提交
  14. 15 10月, 2012 1 次提交
  15. 17 9月, 2012 1 次提交
  16. 12 8月, 2012 1 次提交
    • J
      perf tools: Support for DWARF mode callchain · 26d33022
      Jiri Olsa 提交于
      This patch enables perf to use the DWARF unwind code.
      
      It extends the perf record '-g' option with following arguments:
        'fp'           - provides framepointer based user
                         stack backtrace
        'dwarf[,size]' - provides DWARF (libunwind) based user stack
                         backtrace. The size specifies the size of the
                         user stack dump. If omitted it is 8192 by default.
      
      If libunwind is found during the perf build, then the 'dwarf' argument
      becomes available for record command. The 'fp' stays as default option
      in any case.
      
      Examples: (perf compiled with libunwind)
      
         perf record -g dwarf ls
            - provides dwarf unwind with 8192 as stack dump size
      
         perf record -g dwarf,4096 ls
            - provides dwarf unwind with 4096 as stack dump size
      
         perf record -g -- ls
         perf record -g fp ls
            - provides frame pointer unwind
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Original-patch-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-13-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      26d33022
  17. 26 5月, 2012 1 次提交
  18. 03 5月, 2012 2 次提交
  19. 14 3月, 2012 1 次提交
    • I
      perf tools, x86: Build perf on older user-space as well · eae7a755
      Ingo Molnar 提交于
      On ancient systems I get this build failure:
      
        util/../../../arch/x86/include/asm/unistd.h:67:29: error: asm/unistd_64.h: No such file or directory
        In file included from util/cache.h:7,
                         from builtin-test.c:8:
        util/../perf.h: In function ‘sys_perf_event_open’:In file included from util/../perf.h:16
        perf.h:170: error: ‘__NR_perf_event_open’ undeclared (first use in this function)
      
      The reason is that this old system does not have the split
      unistd.h headers yet, from which to pick up the syscall
      definitions.
      
      Add the syscall numbers to the already existing i386 and x86_64
      blocks in perf.h, and also provide empty include file stubs.
      
      With this patch perf builds and works fine on 5 years old
      user-space as well.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/n/tip-jctwg64le1w47tuaoeyftsg9@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      eae7a755
  20. 09 3月, 2012 2 次提交
  21. 03 3月, 2012 1 次提交
  22. 15 2月, 2012 2 次提交
  23. 14 2月, 2012 2 次提交
  24. 25 1月, 2012 1 次提交
  25. 20 12月, 2011 1 次提交
    • A
      perf record: Add ability to record event period · 3e76ac78
      Andrew Vagin 提交于
      The problem is that when SAMPLE_PERIOD is not set, the kernel generates
      a number of samples in proportion to an event's period. Number of these
      samples may be too big and the kernel throttles all samples above a
      defined limit.
      
      E.g.: I want to trace when a process sleeps. I created a process which
      sleeps for 1ms and for 4ms.  perf got 100 events in both cases.
      
      swapper 0 [000] 1141.371830: sched_stat_sleep: comm=foo pid=1801 delay=1386750 [ns]
      swapper 0 [000] 1141.369444: sched_stat_sleep: comm=foo pid=1801 delay=4499585 [ns]
      
      In the first case a kernel want to send 4499585 events and in the second
      case it wants to send 1386750 events.  perf-reports shows that process
      sleeps in both places equal time.
      
      Instead of this we can get only one sample with an attribute period. As
      result we have less data transferring between kernel and user-space and
      we avoid throttling of samples.
      
      The patch "events: Don't divide events if it has field period" added a
      kernel part of this functionality.
      Acked-by: NArun Sharma <asharma@fb.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: devel@openvz.org
      Link: http://lkml.kernel.org/r/1324391565-1369947-1-git-send-email-avagin@openvz.orgSigned-off-by: NAndrew Vagin <avagin@openvz.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3e76ac78
  26. 28 11月, 2011 4 次提交
  27. 13 10月, 2011 1 次提交
  28. 08 10月, 2011 1 次提交
    • S
      perf tools: Make perf.data more self-descriptive (v8) · fbe96f29
      Stephane Eranian 提交于
      The goal of this patch is to include more information about the host
      environment into the perf.data so it is more self-descriptive. Overtime,
      profiles are captured on various machines and it becomes hard to track
      what was recorded, on what machine and when.
      
      This patch provides a way to solve this by extending the perf.data file
      with basic information about the host machine. To add those extensions,
      we leverage the feature bits capabilities of the perf.data format.  The
      change is backward compatible with existing perf.data files.
      
      We define the following useful new extensions:
       - HEADER_HOSTNAME: the hostname
       - HEADER_OSRELEASE: the kernel release number
       - HEADER_ARCH: the hw architecture
       - HEADER_CPUDESC: generic CPU description
       - HEADER_NRCPUS: number of online/avail cpus
       - HEADER_CMDLINE: perf command line
       - HEADER_VERSION: perf version
       - HEADER_TOPOLOGY: cpu topology
       - HEADER_EVENT_DESC: full event description (attrs)
       - HEADER_CPUID: easy-to-parse low level CPU identication
      
      The small granularity for the entries is to make it easier to extend
      without breaking backward compatiblity. Many entries are provided as
      ASCII strings.
      
      Perf report/script have been modified to print the basic information as
      easy-to-parse ASCII strings. Extended information about CPU and NUMA
      topology may be requested with the -I option.
      
      Thanks to David Ahern for reviewing and testing the many versions of
      this patch.
      
       $ perf report --stdio
       # ========
       # captured on : Mon Sep 26 15:22:14 2011
       # hostname : quad
       # os release : 3.1.0-rc4-tip
       # perf version : 3.1.0-rc4
       # arch : x86_64
       # nrcpus online : 4
       # nrcpus avail : 4
       # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
       # cpuid : GenuineIntel,6,15,11
       # total memory : 8105360 kB
       # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
       # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
       # HEADER_CPU_TOPOLOGY info available, use -I to display
       # HEADER_NUMA_TOPOLOGY info available, use -I to display
       # ========
       #
       ...
      
       $ perf report --stdio -I
       # ========
       # captured on : Mon Sep 26 15:22:14 2011
       # hostname : quad
       # os release : 3.1.0-rc4-tip
       # perf version : 3.1.0-rc4
       # arch : x86_64
       # nrcpus online : 4
       # nrcpus avail : 4
       # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
       # cpuid : GenuineIntel,6,15,11
       # total memory : 8105360 kB
       # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
       # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
       # sibling cores   : 0-3
       # sibling threads : 0
       # sibling threads : 1
       # sibling threads : 2
       # sibling threads : 3
       # node0 meminfo  : total = 8320608 kB, free = 7571024 kB
       # node0 cpu list : 0-3
       # ========
       #
       ...
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Tested-by: NDavid Ahern <dsahern@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/20110930134040.GA5575@quadSigned-off-by: NStephane Eranian <eranian@google.com>
      [ committer notes: Use --show-info in the tools as was in the docs, rename
        perf_header_fprintf_info to perf_file_section__fprintf_info, fixup
        conflict with f69b64f7 "perf: Support setting the disassembler style" ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fbe96f29