1. 19 Jul, 2017 3 commits
    • perf tools: Add feature header record to pipe-mode · e9def1b2
      David Carrillo-Cisneros committed
      Add header record types to pipe-mode, reusing the functions
      used in file-mode and leveraging the new struct feat_fd.
      
      For alignment, check that synthesized events don't exceed
      pagesize.
      
      Add the perf_event__synthesize_feature event callback to
      process the new header records.
      
      Before this patch:
      
        $ perf record -o - -e cycles sleep 1 | perf report --stdio --header
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.000 MB - ]
        ...
      
      After this patch:
        $ perf record -o - -e cycles sleep 1 | perf report --stdio --header
        # ========
        # captured on: Mon May 22 16:33:43 2017
        # ========
        #
        # hostname : my_hostname
        # os release : 4.11.0-dbx-up_perf
        # perf version : 4.11.rc6.g6277c80
        # arch : x86_64
        # nrcpus online : 72
        # nrcpus avail : 72
        # cpudesc : Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz
        # cpuid : GenuineIntel,6,63,2
        # total memory : 263457192 kB
        # cmdline : /root/perf record -o - -e cycles -c 100000 sleep 1
        # HEADER_CPU_TOPOLOGY info available, use -I to display
        # HEADER_NUMA_TOPOLOGY info available, use -I to display
        # pmu mappings: intel_bts = 6, uncore_imc_4 = 22, uncore_sbox_1 = 47, uncore_cbox_5 = 33, uncore_ha_0 = 16, uncore_cbox
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.000 MB - ]
        ...
      
      Support added for the subcommands: report, inject, annotate and script.
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170718042549.145161-16-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf header: Add struct feat_fd for write · ccebbeb6
      David Carrillo-Cisneros committed
      Introduce struct feat_fd. This patch uses it as a wrapper around fd in
      write_* functions for feature headers. Next patches will extend its
      functionality to other feature header functions.
      
      This patch does not change behavior.
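
The wrapper idea described above can be sketched as follows. This is a minimal illustration, not the perf implementation: the single-member layout of `struct feat_fd` and the helper names `feat_write` and `write_counts` are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the wrapper; the real struct may hold more members. */
struct feat_fd {
	int fd; /* destination file descriptor */
};

/* Write one buffer through the wrapper; returns 0 on success, -1 otherwise. */
static int feat_write(struct feat_fd *ff, const void *buf, size_t size)
{
	ssize_t n;

	assert(ff != NULL);
	n = write(ff->fd, buf, size);
	return (n == (ssize_t)size) ? 0 : -1;
}

/* A write_*-style function (hypothetical) that takes the wrapper, not a raw fd. */
static int write_counts(struct feat_fd *ff, uint32_t avail, uint32_t online)
{
	if (feat_write(ff, &avail, sizeof(avail)) < 0)
		return -1;
	return feat_write(ff, &online, sizeof(online));
}
```

Because callers only see the wrapper, a later change could add members to it (for example an in-memory buffer) without touching every write_* signature again.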
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170718042549.145161-7-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf header: Revamp do_write() · 3b8f51a6
      David Carrillo-Cisneros committed
      Now that writen takes a const buffer, use it in do_write instead of
      duplicating its functionality.
      
      Export do_write to use it consistently in header.c and build_id.c.
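
A rough sketch of the pattern this message describes: a thin do_write-style wrapper that delegates to a write-everything loop instead of duplicating it. The names `write_all` and `do_write_sketch` are hypothetical, and the error convention (negative errno) is an assumption for the example, not a statement about perf's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call write() until all n bytes from a const buffer are out. */
static ssize_t write_all(int fd, const void *buf, size_t n)
{
	const char *p = buf;
	size_t left = n;

	while (left > 0) {
		ssize_t ret = write(fd, p, left);

		if (ret < 0 && errno == EINTR)
			continue;          /* interrupted, retry */
		if (ret < 0)
			return -1;         /* errno set by write() */
		if (ret == 0) {
			errno = EIO;       /* no progress, give up */
			return -1;
		}
		p += ret;
		left -= (size_t)ret;
	}
	return (ssize_t)n;
}

/* Wrapper: 0 on success, negative errno on failure. */
static int do_write_sketch(int fd, const void *buf, size_t size)
{
	assert(buf != NULL || size == 0);
	if (write_all(fd, buf, size) < 0)
		return -errno;
	return 0;
}
```

Keeping the retry loop in one helper means short writes and interrupted calls are handled in a single place for every caller.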
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170718042549.145161-6-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 04 Oct, 2016 1 commit
  3. 23 Mar, 2016 1 commit
  4. 17 Feb, 2016 1 commit
  5. 18 Dec, 2015 7 commits
  6. 14 Sep, 2015 1 commit
  7. 03 Sep, 2015 1 commit
    • perf tools: Store the cpu socket and core ids in the perf.data header · 2bb00d2f
      Kan Liang committed
      This patch stores the cpu socket_id and core_id in a perf.data header,
      and reads them into the perf_env struct when processing perf.data files.
      
      The change modifies the CPU_TOPOLOGY section, making sure it is
      backward/forward compatible.
      
      The patch checks the section size before reading the core and socket ids.
      
      It never reads data crossing the section boundary.  An old perf binary
      without this patch can also correctly read the perf.data from a new perf
      with this patch.
      
      Because the new info is added at the end of the cpu_topology section, an
      old perf tool ignores the extra data.
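
The compatibility rule described above (append new fields at the end, and check the remaining section size before reading them) can be sketched like this. The three-field layout and the names `struct topo` and `parse_topo` are invented for illustration; they are not the real CPU_TOPOLOGY encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical section layout, NOT the real perf.data encoding:
 *   u32 legacy_value             always present
 *   u32 core_id, u32 socket_id   appended by newer writers only
 */
struct topo {
	uint32_t legacy_value;
	bool     has_ids;   /* false when the file came from an old writer */
	uint32_t core_id;
	uint32_t socket_id;
};

/* Parse a section of `size` bytes; never reads past the section boundary. */
static int parse_topo(const unsigned char *sec, size_t size, struct topo *out)
{
	size_t off = 0;

	assert(sec != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	if (size - off < sizeof(uint32_t))
		return -1;                       /* truncated legacy part */
	memcpy(&out->legacy_value, sec + off, sizeof(uint32_t));
	off += sizeof(uint32_t);

	/* Old files end here: report that the ids are not available. */
	if (size - off < 2 * sizeof(uint32_t))
		return 0;

	memcpy(&out->core_id, sec + off, sizeof(uint32_t));
	off += sizeof(uint32_t);
	memcpy(&out->socket_id, sec + off, sizeof(uint32_t));
	out->has_ids = true;
	return 0;
}
```

An old reader corresponds to a parser that stops after the legacy part, which is why data appended at the end of the section is harmless to it.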
      
      Examples:
      
      1. New perf with this patch reads perf.data from an old perf without the
         patch:
      
        $ perf_new report -i perf_old.data --header-only -I
        ......
        # sibling threads : 33
        # sibling threads : 34
        # sibling threads : 35
        # Core ID and Socket ID information is not available
        # node0 meminfo  : total = 32823872 kB, free = 29315548 kB
        # node0 cpu list : 0-17,36-53
        ......
      
      2. Old perf without the patch reads perf.data from a new perf with the
         patch:
      
        $ perf_old report -i perf_new.data --header-only -I
        ......
        # sibling threads : 33
        # sibling threads : 34
        # sibling threads : 35
        # node0 meminfo  : total = 32823872 kB, free = 29190932 kB
        # node0 cpu list : 0-17,36-53
        ......
      
      3. New perf reads new perf.data:
      
        $ perf_new report -i perf_new.data --header-only -I
        ......
        # sibling threads : 33
        # sibling threads : 34
        # sibling threads : 35
        # CPU 0: Core ID 0, Socket ID 0
        # CPU 1: Core ID 1, Socket ID 0
        ......
        # CPU 61: Core ID 10, Socket ID 1
        # CPU 62: Core ID 11, Socket ID 1
        # CPU 63: Core ID 16, Socket ID 1
        # node0 meminfo  : total = 32823872 kB, free = 29190932 kB
        # node0 cpu list : 0-17,36-53
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1441115893-22006-2-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  8. 29 Aug, 2015 1 commit
  9. 22 Jul, 2015 1 commit
  10. 29 Apr, 2015 1 commit
  11. 05 Nov, 2014 1 commit
  12. 23 Jul, 2014 1 commit
  13. 02 May, 2014 1 commit
  14. 13 Jan, 2014 1 commit
    • perf header: Pack 'struct perf_session_env' · 3ba4d2e1
      Arnaldo Carvalho de Melo committed
      Initial struct:
      
      [acme@ssdandy linux]$ pahole -C perf_session_env ~/bin/perf
      struct perf_session_env {
      	char *                     hostname;             /*     0     8 */
      	char *                     os_release;           /*     8     8 */
      	char *                     version;              /*    16     8 */
      	char *                     arch;                 /*    24     8 */
      	int                        nr_cpus_online;       /*    32     4 */
      	int                        nr_cpus_avail;        /*    36     4 */
      	char *                     cpu_desc;             /*    40     8 */
      	char *                     cpuid;                /*    48     8 */
      	long long unsigned int     total_mem;            /*    56     8 */
      	/* --- cacheline 1 boundary (64 bytes) --- */
      	int                        nr_cmdline;           /*    64     4 */
      
      	/* XXX 4 bytes hole, try to pack */
      
      	char *                     cmdline;              /*    72     8 */
      	int                        nr_sibling_cores;     /*    80     4 */
      
      	/* XXX 4 bytes hole, try to pack */
      
      	char *                     sibling_cores;        /*    88     8 */
      	int                        nr_sibling_threads;   /*    96     4 */
      
      	/* XXX 4 bytes hole, try to pack */
      
      	char *                     sibling_threads;      /*   104     8 */
      	int                        nr_numa_nodes;        /*   112     4 */
      
      	/* XXX 4 bytes hole, try to pack */
      
      	char *                     numa_nodes;           /*   120     8 */
      	/* --- cacheline 2 boundary (128 bytes) --- */
      	int                        nr_pmu_mappings;      /*   128     4 */
      
      	/* XXX 4 bytes hole, try to pack */
      
      	char *                     pmu_mappings;         /*   136     8 */
      	int                        nr_groups;            /*   144     4 */
      
      	/* size: 152, cachelines: 3, members: 20 */
      	/* sum members: 128, holes: 5, sum holes: 20 */
      	/* padding: 4 */
      	/* last cacheline: 24 bytes */
      };
      [acme@ssdandy linux]$
      
      [acme@ssdandy linux]$ pahole -C perf_session_env --reorganize --show_reorg_steps ~/bin/perf | grep ^/ | grep -v Final
      /* Moving 'nr_sibling_cores' from after 'cmdline' to after 'nr_cmdline' */
      /* Moving 'nr_numa_nodes' from after 'sibling_threads' to after 'nr_sibling_threads' */
      /* Moving 'nr_groups' from after 'pmu_mappings' to after 'nr_pmu_mappings' */
      [acme@ssdandy linux]$
      
      Final struct stats:
      
      [acme@ssdandy linux]$ pahole -C perf_session_env --reorganize --show_reorg_steps ~/bin/perf | tail -4
      	/* --- cacheline 2 boundary (128 bytes) --- */
      
      	/* size: 128, cachelines: 2, members: 20 */
      };   /* saved 24 bytes and 1 cacheline! */
      [acme@ssdandy linux]$
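
The effect pahole reports above can be reproduced in miniature. The two structs below are hypothetical cut-down versions that borrow four member names from the listing; the exact sizes depend on the target ABI, so the sketch only asserts that reordering never makes the struct larger.

```c
#include <assert.h>
#include <stddef.h>

/* Each int is followed by a pointer, so pointer alignment can leave a hole after it. */
struct env_holes {
	int   nr_cmdline;
	char *cmdline;
	int   nr_sibling_cores;
	char *sibling_cores;
};

/* Same members, with the two ints moved next to each other. */
struct env_packed {
	int   nr_cmdline;
	int   nr_sibling_cores;
	char *cmdline;
	char *sibling_cores;
};

/* Compile-time check: grouping the ints must not grow the struct. */
static_assert(sizeof(struct env_packed) <= sizeof(struct env_holes),
	      "reordered struct should not be larger");

/* Bytes recovered by the reordering on the current target. */
static size_t bytes_saved(void)
{
	return sizeof(struct env_holes) - sizeof(struct env_packed);
}
```

On a typical x86_64 Linux target, where pointers are 8 bytes and ints are 4, one would expect the first struct to occupy 32 bytes and the second 24; on a 32-bit target both are likely the same size.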
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-3d9tshamloinzxcqeb7mtd1n@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  15. 18 Jul, 2013 4 commits
  16. 16 Jul, 2013 2 commits
  17. 13 Jul, 2013 1 commit
  18. 29 May, 2013 1 commit
  19. 01 Feb, 2013 1 commit
  20. 20 Nov, 2012 1 commit
    • perf: Make perf build for x86 with UAPI disintegration applied · d2709c7c
      David Howells committed
      Make perf build for x86 once the UAPI disintegration patches for that arch
      have been applied by adding the appropriate -I flags - in the right order -
      and then converting some #includes that use ../.. notation to find main kernel
      header files to use <asm/foo.h> and <linux/foo.h> instead.
      
      Note that -Iarch/foo/include/uapi is present _before_ -Iarch/foo/include.
      This makes sure we get the userspace version of the pt_regs struct.  Ideally,
      we wouldn't have the latter -I flag at all, but unfortunately we want
      asm/svm.h and asm/vmx.h in builtin-kvm.c and these aren't part of the UAPI -
      at least not for x86.  I wonder if the bits outside of the __KERNEL__ guards
      *should* be transferred there.
      
      I note also that perf seems to do its dependency handling manually by listing
      all the header files it might want to use in LIB_H in the Makefile.  Can this
      be changed to use -MD?
      
      Note that to make this work, we need to export and UAPI disintegrate
      linux/hw_breakpoint.h, which I think should've been exported previously so that
      perf can access the bits.  We have to do this in the same patch to maintain
      bisectability.
      Signed-off-by: David Howells <dhowells@redhat.com>
  21. 29 Oct, 2012 1 commit
  22. 15 Oct, 2012 1 commit
  23. 24 Sep, 2012 2 commits
  24. 21 Sep, 2012 1 commit
    • perf kvm: Events analysis tool · bcf6edcd
      Xiao Guangrong committed
      Add 'perf kvm stat' support to analyze kvm vmexit/mmio/ioport smartly
      
      Usage:
      - kvm stat
        runs a command and gathers performance counter statistics; it is an alias
        of perf stat
      
      - trace kvm events:
        perf kvm stat record, or, if other tracepoints are interesting as well, we
        can append the events like this:
        perf kvm stat record -e timer:* -a
      
        If many guests are running, we can track a specific guest by using -p or
        --pid; -a tracks events generated by all guests.
      
      - show the result:
        perf kvm stat report
      
      Example output:
      13005
      13059
      
      In total, 2 guests are running on the host.
      
      Then, track the guest whose pid is 13059:
      ^C[ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.253 MB perf.data.guest (~11065 samples) ]
      
      See the vmexit events:
      
      Analyze events for all VCPUs:
      
                   VM-EXIT    Samples  Samples%     Time%         Avg time
      
               APIC_ACCESS        460    70.55%     0.01%     22.44us ( +-   1.75% )
                       HLT         93    14.26%    99.98% 832077.26us ( +-  10.42% )
        EXTERNAL_INTERRUPT         64     9.82%     0.00%     35.35us ( +-  14.21% )
         PENDING_INTERRUPT         24     3.68%     0.00%      9.29us ( +-  31.39% )
                 CR_ACCESS          7     1.07%     0.00%      8.12us ( +-   5.76% )
            IO_INSTRUCTION          3     0.46%     0.00%     18.00us ( +-  11.79% )
             EXCEPTION_NMI          1     0.15%     0.00%      5.83us ( +-   -nan% )
      
      Total Samples:652, Total events handled time:77396109.80us.
      
      See the mmio events:
      
      Analyze events for all VCPUs:
      
               MMIO Access    Samples  Samples%     Time%         Avg time
      
              0xfee00380:W        387    84.31%    79.28%      8.29us ( +-   3.32% )
              0xfee00300:W         24     5.23%     9.96%     16.79us ( +-   1.97% )
              0xfee00300:R         24     5.23%     7.83%     13.20us ( +-   3.00% )
              0xfee00310:W         24     5.23%     2.93%      4.94us ( +-   3.84% )
      
      Total Samples:459, Total events handled time:4044.59us.
      
      See the ioport event:
      
      Analyze events for all VCPUs:
      
            IO Port Access    Samples  Samples%     Time%         Avg time
      
               0xc050:POUT          3   100.00%   100.00%     13.75us ( +-  10.83% )
      
      Total Samples:3, Total events handled time:41.26us.
      
      And, --vcpu is used to track the specified vcpu and --key is used to sort the
      result:
      
      Analyze events for VCPU 0:
      
                   VM-EXIT    Samples  Samples%     Time%         Avg time
      
                       HLT         27    13.85%    99.97% 405790.24us ( +-  12.70% )
        EXTERNAL_INTERRUPT         13     6.67%     0.00%     27.94us ( +-  22.26% )
               APIC_ACCESS        146    74.87%     0.03%     21.69us ( +-   2.91% )
            IO_INSTRUCTION          2     1.03%     0.00%     17.77us ( +-  20.56% )
                 CR_ACCESS          2     1.03%     0.00%      8.55us ( +-   6.47% )
         PENDING_INTERRUPT          5     2.56%     0.00%      6.27us ( +-   3.94% )
      
      Total Samples:195, Total events handled time:10959950.90us.
      Signed-off-by: Dong Hao <haodong@linux.vnet.ibm.com>
      Signed-off-by: Runzhen Wang <runzhen@linux.vnet.ibm.com>
      [ Dong Hao <haodong@linux.vnet.ibm.com>
        Runzhen Wang <runzhen@linux.vnet.ibm.com>:
           - rebase it on current acme's tree
           - fix the compiling-error on i386 ]
      Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: kvm@vger.kernel.org
      Cc: Runzhen Wang <runzhen@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1347870675-31495-4-git-send-email-haodong@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  25. 11 Sep, 2012 1 commit
    • perf tools: Back [vdso] DSO with real data · 7dbf4dcf
      Jiri Olsa committed
      Store data for the VDSO shared object, because we need it for post-unwind
      processing.
      
      The VDSO shared object is the same for all processes on a running system, so
      it makes no difference when we store it inside the tracer (perf).
      
      When [vdso] map memory is hit, we retrieve the [vdso] DSO image and store it
      in a temporary file.
      
      During the build-id processing phase, the [vdso] DSO image is stored in
      the build-id db, and a build-id reference is made inside perf.data. The
      build-id vdso file object is called '[vdso]'. We don't use the temporary
      file name, because that file is removed when the record is finished.
      
      During the report phase the vdso build-id object is treated like any other
      build-id DSO object.
      
      Add the following API for the vdso object:
      
        bool is_vdso_map(const char *filename)
          - returns true if the filename matches vdso map name
      
        struct dso *vdso__dso_findnew(struct list_head *head)
          - find/create proper vdso DSO object
      
        vdso__exit(void)
          - removes temporary VDSO image if there's any
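
As a small illustration of the first entry in the API list above, a name check of this kind can be written as a plain string comparison. This is a guess at the shape of such a helper, not the code from the patch; the macro `VDSO_MAP_NAME` and the function name `is_vdso_map_sketch` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Map name as it appears in the 'Shared Object' column of the reports. */
#define VDSO_MAP_NAME "[vdso]"

/* Returns true when the map's file name is the vdso pseudo-name. */
static bool is_vdso_map_sketch(const char *filename)
{
	assert(filename != NULL);
	return strcmp(filename, VDSO_MAP_NAME) == 0;
}
```

A caller would typically use such a check while processing mmap events, to decide whether a map needs the special vdso handling or is a regular file-backed DSO.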
      
      This change makes dwarf post-unwind backtraces possible from [vdso] maps.
      
      The current report of a [vdso] sample dwarf backtrace looks like this:
      
        # Overhead  Command      Shared Object                         Symbol
        # ........  .......  .................  .............................
        #
            99.52%       ex  [vdso]             [.] 0x00007fff3ace89af
                         |
                         --- 0x7fff3ace89af
      
      The new report of a [vdso] sample dwarf backtrace looks like this:
      
        # Overhead  Command      Shared Object                         Symbol
        # ........  .......  .................  .............................
        #
            99.52%       ex  [vdso]             [.] 0x00000000000009af
                         |
                         --- 0x7fff3ace89af
                             main
                             __libc_start_main
                             _start
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1347295819-23177-5-git-send-email-jolsa@redhat.com
      [ committer note: s/ALIGN/PERF_ALIGN/g to cope with the android build changes ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  26. 23 Aug, 2012 1 commit
    • perf tools: Add pmu mappings to header information · 50a9667c
      Robert Richter committed
      With dynamic pmu allocation there are also dynamically assigned pmu ids.
      These ids are used in event->attr.type to describe the pmu to be used
      for that event. The information is available in sysfs, e.g:
      
       /sys/bus/event_source/devices/breakpoint/type: 5
       /sys/bus/event_source/devices/cpu/type: 4
       /sys/bus/event_source/devices/ibs_fetch/type: 6
       /sys/bus/event_source/devices/ibs_op/type: 7
       /sys/bus/event_source/devices/software/type: 1
       /sys/bus/event_source/devices/tracepoint/type: 2
      
      These mappings are needed to know which samples belong to which pmu.  If
      a pmu is added dynamically like for ibs_fetch or ibs_op the type value
      may vary.
      
      Now, when decoding samples from perf.data this information in sysfs
      might be no longer available or may have changed. We need to store it in
      perf.data; the header is used for this. Now the header information created
      with perf report contains an additional section looking like this:
      
       # pmu mappings: ibs_op = 7, ibs_fetch = 6, cpu = 4, breakpoint = 5, tracepoint = 2, software = 1
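
For readers who want to see where these ids come from, the sysfs lookup described above can be sketched as below. The helper name `read_pmu_type` and its base-directory parameter are assumptions made for this example; the sketch only reads one `type` file and does not reproduce how perf collects or stores the mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Read the type id of one PMU from <base>/<name>/type.
 * Returns 0 and fills *type on success, -1 on any error.
 */
static int read_pmu_type(const char *base, const char *name, unsigned int *type)
{
	char path[256];
	FILE *f;
	int n, ok;

	assert(base != NULL && name != NULL && type != NULL);

	n = snprintf(path, sizeof(path), "%s/%s/type", base, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;                 /* path did not fit */

	f = fopen(path, "r");
	if (f == NULL)
		return -1;                 /* PMU absent or sysfs not mounted */

	ok = (fscanf(f, "%u", type) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

Calling it with `/sys/bus/event_source/devices` as the base and `ibs_op` as the name would read the same file as in the listing above. Since that value can differ between machines or boots, the recorded mapping in perf.data is what makes later decoding reliable.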
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1345144224-27280-9-git-send-email-robert.richter@amd.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  27. 17 Aug, 2012 1 commit