1. 03 10月, 2017 1 次提交
    • K
      perf top: Implement multithreading for perf_event__synthesize_threads · 340b47f5
      Kan Liang 提交于
      The proc files which is sorted with alphabetical order are evenly
      assigned to several synthesize threads to be processed in parallel.
      
      For 'perf top', the threads number hard code to online CPU number. The
      following patch will introduce an option to set it.
      
      For other perf tools, the thread number is 1. Because the process
      function is not ready for multithreading, e.g.
      process_synthesized_event.
      
      This patch series only support event synthesize multithreading for 'perf
      top'. For other tools, it can be done separately later.
      
      With multithread applied, the total processing time can get up to 1.56x
      speedup on Knights Mill for 'perf top'.
      
      For specific single event processing, the processing time could increase
      because of the lock contention. So proc_map_timeout may need to be
      increased. Otherwise some proc maps will be truncated.
      
      Based on my test, increasing the proc_map_timeout has small impact
      on the total processing time. The total processing time still get 1.49x
      speedup on Knights Mill after increasing the proc_map_timeout.
      The patch itself doesn't increase the proc_map_timeout.
      
      Doesn't need to implement multithreading for per task monitoring,
      perf_event__synthesize_thread_map. It doesn't have performance issue.
      
      Committer testing:
      
        # getconf _NPROCESSORS_ONLN
        4
        # perf trace --no-inherit -e clone -o /tmp/output perf top
        # tail -4 /tmp/bla
           0.124 ( 0.041 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3eb3a8f30, parent_tidptr: 0x7fc3eb3a99d0, child_tidptr: 0x7fc3eb3a99d0, tls: 0x7fc3eb3a9700) = 9548 (perf)
           0.246 ( 0.023 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3eaba7f30, parent_tidptr: 0x7fc3eaba89d0, child_tidptr: 0x7fc3eaba89d0, tls: 0x7fc3eaba8700) = 9549 (perf)
           0.286 ( 0.019 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3ea3a6f30, parent_tidptr: 0x7fc3ea3a79d0, child_tidptr: 0x7fc3ea3a79d0, tls: 0x7fc3ea3a7700) = 9550 (perf)
         246.540 ( 0.047 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3ea3a6f30, parent_tidptr: 0x7fc3ea3a79d0, child_tidptr: 0x7fc3ea3a79d0, tls: 0x7fc3ea3a7700) = 9551 (perf)
        #
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1506696477-146932-4-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      340b47f5
  2. 02 9月, 2017 1 次提交
  3. 19 7月, 2017 2 次提交
    • J
      perf util: Create branch.c/.h for common branch functions · 992c7e92
      Jin Yao 提交于
      Create new util/branch.c and util/branch.h to contain the common branch
      functions. Such as:
      
      branch_type_count(): Count the numbers of branch types
      branch_type_name() : Return the name of branch type
      branch_type_stat_display(): Display branch type statistics info
      branch_type_str(): Construct the branch type string.
      
      The branch type is saved in branch_flags.
      
      Change log:
      
      v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN.
      
      v7: Since the common branch type name is changed (e.g. JCC->COND),
          this patch is performed the modification accordingly.
      
      v6: Move that multiline conditional code inside {} brackets.
          Move branch_type_stat_display() from builtin-report.c to
            branch.c.
          Move branch_type_str() from callchain.c to branch.c.
      
      v5: It's a new patch in v5 patch series.
      Signed-off-by: NYao Jin <yao.jin@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1500379995-6449-6-git-send-email-yao.jin@linux.intel.com
      [ Don't use 'index' and 'stat' as names for variables, it shadows global decls in older distros ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      992c7e92
    • D
      perf tools: Add feature header record to pipe-mode · e9def1b2
      David Carrillo-Cisneros 提交于
      Add header record types to pipe-mode, reusing the functions
      used in file-mode and leveraging the new struct feat_fd.
      
      For alignment, check that synthesized events don't exceed
      pagesize.
      
      Add the perf_event__synthesize_feature event call back to
      process the new header records.
      
      Before this patch:
      
        $ perf record -o - -e cycles sleep 1 | perf report --stdio --header
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.000 MB - ]
        ...
      
      After this patch:
        $ perf record -o - -e cycles sleep 1 | perf report --stdio --header
        # ========
        # captured on: Mon May 22 16:33:43 2017
        # ========
        #
        # hostname : my_hostname
        # os release : 4.11.0-dbx-up_perf
        # perf version : 4.11.rc6.g6277c80
        # arch : x86_64
        # nrcpus online : 72
        # nrcpus avail : 72
        # cpudesc : Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz
        # cpuid : GenuineIntel,6,63,2
        # total memory : 263457192 kB
        # cmdline : /root/perf record -o - -e cycles -c 100000 sleep 1
        # HEADER_CPU_TOPOLOGY info available, use -I to display
        # HEADER_NUMA_TOPOLOGY info available, use -I to display
        # pmu mappings: intel_bts = 6, uncore_imc_4 = 22, uncore_sbox_1 = 47, uncore_cbox_5 = 33, uncore_ha_0 = 16, uncore_cbox
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.000 MB - ]
        ...
      
      Support added for the subcommands: report, inject, annotate and script.
      Signed-off-by: NDavid Carrillo-Cisneros <davidcc@google.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170718042549.145161-16-davidcc@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e9def1b2
  4. 30 6月, 2017 1 次提交
  5. 27 6月, 2017 1 次提交
    • A
      perf script: Add 'synth' event type for synthesized events · 1405720d
      Adrian Hunter 提交于
      Instruction trace decoders such as Intel PT may have additional information
      recorded in the trace. For example, Intel PT has power information and a
      there is a new instruction 'ptwrite' that can write a value into a PTWRITE
      trace packet.
      
      Such information may be associated with an IP and so can be treated as a
      sample (PERF_RECORD_SAMPLE). Custom data can be incorporated in the
      sample as raw_data (PERF_SAMPLE_RAW).
      
      However a means of identifying the raw data format is needed. That will
      be done by synthesizing an attribute for it.
      
      So add an attribute type for custom synthesized events.  Different
      synthesized events will be identified by the attribute 'config'.
      
      Committer notes:
      
      Start those PERF_TYPE_ after the PMU range, i.e. after (INT_MAX + 1U),
      i.e. after perf_pmu_register() -> idr_alloc(end=0).
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/1498040239-32418-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1405720d
  6. 04 5月, 2017 1 次提交
  7. 03 5月, 2017 1 次提交
    • A
      perf symbols: Accept symbols starting at address 0 · b843f62a
      Arnaldo Carvalho de Melo 提交于
      That is the case of _text on s390, and we have some functions that return an
      address, using address zero to report problems, oops.
      
      This would lead the symbol loading routines to not use "_text" as the reference
      relocation symbol, or the first symbol for the kernel, but use instead
      "_stext", that is at the same address on x86_64 and others, but not on s390:
      
        [acme@localhost perf-4.11.0-rc6]$ head -15 /proc/kallsyms
        0000000000000000 T _text
        0000000000000418 t iplstart
        0000000000000800 T start
        000000000000080a t .base
        000000000000082e t .sk8x8
        0000000000000834 t .gotr
        0000000000000842 t .cmd
        0000000000000846 t .parm
        000000000000084a t .lowcase
        0000000000010000 T startup
        0000000000010010 T startup_kdump
        0000000000010214 t startup_kdump_relocated
        0000000000011000 T startup_continue
        00000000000112a0 T _ehead
        0000000000100000 T _stext
        [acme@localhost perf-4.11.0-rc6]$
      
      Which in turn would make 'perf test vmlinux' to fail because it wouldn't find
      the symbols before "_stext" in kallsyms.
      
      Fix it by using the return value only for errors and storing the
      address, when the symbol is successfully found, in a provided pointer
      arg.
      
      Before this patch:
      
      After:
      
        [acme@localhost perf-4.11.0-rc6]$ tools/perf/perf test -v 1
         1: vmlinux symtab matches kallsyms            :
        --- start ---
        test child forked, pid 40693
        Looking at the vmlinux_path (8 entries long)
        Using /usr/lib/debug/lib/modules/3.10.0-654.el7.s390x/vmlinux for symbols
        ERR : 0: _text not on kallsyms
        ERR : 0x418: iplstart not on kallsyms
        ERR : 0x800: start not on kallsyms
        ERR : 0x80a: .base not on kallsyms
        ERR : 0x82e: .sk8x8 not on kallsyms
        ERR : 0x834: .gotr not on kallsyms
        ERR : 0x842: .cmd not on kallsyms
        ERR : 0x846: .parm not on kallsyms
        ERR : 0x84a: .lowcase not on kallsyms
        ERR : 0x10000: startup not on kallsyms
        ERR : 0x10010: startup_kdump not on kallsyms
        ERR : 0x10214: startup_kdump_relocated not on kallsyms
        ERR : 0x11000: startup_continue not on kallsyms
        ERR : 0x112a0: _ehead not on kallsyms
        <SNIP warnings>
        test child finished with -1
        ---- end ----
        vmlinux symtab matches kallsyms: FAILED!
        [acme@localhost perf-4.11.0-rc6]$
      
      After:
      
        [acme@localhost perf-4.11.0-rc6]$ tools/perf/perf test -v 1
         1: vmlinux symtab matches kallsyms            :
        --- start ---
        test child forked, pid 47160
        <SNIP warnings>
        test child finished with 0
        ---- end ----
        vmlinux symtab matches kallsyms: Ok
        [acme@localhost perf-4.11.0-rc6]$
      Reported-by: NMichael Petlan <mpetlan@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-9x9bwgd3btwdk1u51xie93fz@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b843f62a
  8. 26 4月, 2017 1 次提交
  9. 25 4月, 2017 1 次提交
  10. 17 3月, 2017 1 次提交
    • A
      perf tools: Handle partial AUX records and print a warning · 05a1f47e
      Alexander Shishkin 提交于
      This patch decodes the 'partial' flag in AUX records and prints
      a warning to the user, so that they don't have to guess why their
      PT traces contain gaps (or missing altogether):
      
        Warning:
        AUX data had gaps in it 8 times out of 8!
      
        Are you running a KVM guest in the background?
      
      Trying to be even more helpful, we will detect if the user's kvm driver sets up
      exclusive VMX root mode for the entire lifespan of the kvm process:
      
        Reloading kvm_intel module with vmm_exclusive=0
        will reduce the gaps to only guest's timeslices.
      
      Note however, that you'll still have gaps in cpu-wide traces even with
      vmm_exclusive=0, but the number of gaps will be below 100% (as opposed to the
      above example).
      
      Currently this is the only reason for partial records.
      Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vince@deater.net>
      Link: http://lkml.kernel.org/r/8760j941ig.fsf@ashishki-desk.ger.corp.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      05a1f47e
  11. 15 3月, 2017 1 次提交
    • H
      perf record: Synthesize namespace events for current processes · e907caf3
      Hari Bathini 提交于
      Synthesize PERF_RECORD_NAMESPACES events for processes that were running prior
      to invocation of perf record. The data for this is taken from /proc/$PID/ns.
      These changes make way for analyzing events with regard to namespaces.
      
      Committer notes:
      
      Check if 'tool' is NULL in perf_event__synthesize_namespaces(), as in the
      test__mmap_thread_lookup case, i.e. 'perf test Lookup mmap thread".
      
      Testing it:
      
        # ps axH > /tmp/allthreads
        # perf record -a --namespaces usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.169 MB perf.data (8 samples) ]
        # perf report -D | grep PERF_RECORD_NAMESPACES | wc -l
        602
        # wc -l /tmp/allthreads
        601 /tmp/allthreads
        # tail /tmp/allthreads
        16951 pts/4    T      0:00 git rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^
        16952 pts/4    T      0:00 /bin/sh /usr/libexec/git-core/git-rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^
        17176 pts/4    T      0:00 git commit --amend --no-post-rewrite
        17204 pts/4    T      0:00 vim /home/acme/git/linux/.git/COMMIT_EDITMSG
        18939 ?        S      0:00 [kworker/2:1]
        18947 ?        S      0:00 [kworker/3:0]
        18974 ?        S      0:00 [kworker/1:0]
        19047 ?        S      0:00 [kworker/0:1]
        19152 pts/6    S+     0:00 weechat
        19153 pts/7    R+     0:00 ps axH
        # perf report -D | grep PERF_RECORD_NAMESPACES | tail
        0 0 0x125068 [0xa0]: PERF_RECORD_NAMESPACES 17176/17176 - nr_namespaces: 7
        0 0 0x1255b8 [0xa0]: PERF_RECORD_NAMESPACES 17204/17204 - nr_namespaces: 7
        0 0 0x125df0 [0xa0]: PERF_RECORD_NAMESPACES 18939/18939 - nr_namespaces: 7
        0 0 0x125f00 [0xa0]: PERF_RECORD_NAMESPACES 18947/18947 - nr_namespaces: 7
        0 0 0x126010 [0xa0]: PERF_RECORD_NAMESPACES 18974/18974 - nr_namespaces: 7
        0 0 0x126120 [0xa0]: PERF_RECORD_NAMESPACES 19047/19047 - nr_namespaces: 7
        0 0 0x126230 [0xa0]: PERF_RECORD_NAMESPACES 19152/19152 - nr_namespaces: 7
        0 0 0x129330 [0xa0]: PERF_RECORD_NAMESPACES 19154/19154 - nr_namespaces: 7
        0 0 0x12a1f8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
        0 0 0x12b0b8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
        #
      
      Humm, investigate why we got two record for the 19155 pid/tid...
      Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891931111.25309.11073854609798681633.stgit@hbathini.in.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e907caf3
  12. 14 3月, 2017 1 次提交
    • H
      perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info · f3b3614a
      Hari Bathini 提交于
      Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
      by the kernel when fork, clone, setns or unshare are invoked. And update
      perf-record documentation with the new option to record namespace
      events.
      
      Committer notes:
      
      Combined it with a later patch to allow printing it via 'perf report -D'
      and be able to test the feature introduced in this patch. Had to move
      here also perf_ns__name(), that was introduced in another later patch.
      
      Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt:
      
        util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
           ret  += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
                                               ^
      Testing it:
      
        # perf record --namespaces -a
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
        #
        # perf report -D
        <SNIP>
        3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
                      [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                       4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
      
        0x1151e0 [0x30]: event: 9
        .
        . ... raw event: size 48 bytes
        .  0000:  09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00  ......0..q.h....
        .  0010:  a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00  .9...9...(.c....
        .  0020:  03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00  ................
        <SNIP>
              NAMESPACES events:          1
        <SNIP>
        #
      Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f3b3614a
  13. 24 10月, 2016 1 次提交
  14. 25 7月, 2016 1 次提交
  15. 13 7月, 2016 1 次提交
  16. 31 3月, 2016 1 次提交
    • A
      perf tools: Add time conversion event · 46bc29b9
      Adrian Hunter 提交于
      Intel PT uses the time members from the perf_event_mmap_page to convert
      between TSC and perf time.
      
      Due to a lack of foresight when Intel PT was implemented, those time
      members were recorded in the (implementation dependent) AUXTRACE_INFO
      event, the structure of which is generally inaccessible outside of the
      Intel PT decoder.  However now the conversion between TSC and perf time
      is needed when processing a jitdump file when Intel PT has been used for
      tracing.
      
      So add a user event to record the time members.  'perf record' will
      synthesize the event if the information is available.  And session
      processing will put a copy of the event on the session so that tools
      like 'perf inject' can easily access it.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1457426324-30158-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      46bc29b9
  17. 23 3月, 2016 3 次提交
  18. 18 12月, 2015 18 次提交
  19. 29 9月, 2015 1 次提交
  20. 23 9月, 2015 1 次提交
    • N
      perf record: Synthesize COMM event for a command line workload · e803cf97
      Namhyung Kim 提交于
      When perf creates a new child to profile, the events are enabled on
      exec().  And in this case, it doesn't synthesize any event for the
      child since they'll be generated during exec().  But there's an window
      between the enabling and the event generation.
      
      It used to be overcome since samples are only in kernel (so we always
      have the map) and the comm is overridden by a later COMM event.
      However it won't work if events are processed and displayed before the
      COMM event overrides like in 'perf script'.  This leads to those early
      samples (like native_write_msr_safe) not having a comm but pid (like
      ':15328').
      
      So it needs to synthesize COMM event for the child explicitly before
      enabling so that it can have a correct comm.  But at this time, the
      comm will be "perf" since it's not exec-ed yet.
      
      Committer note:
      
      Before this patch:
      
        # perf record usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
        # perf script --show-task-events
          :4429  4429 27909.079372:          1 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
          :4429  4429 27909.079375:          1 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
          :4429  4429 27909.079376:         10 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
          :4429  4429 27909.079377:        223 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
          :4429  4429 27909.079378:       6571 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
         usleep  4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429
         usleep  4429 27909.079381:     185403 cycles:  ffffffff810a72d3 flush_signal_handlers (/lib/modules/4.
         usleep  4429 27909.079444:    2241110 cycles:      7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so)
         usleep  4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429)
      
      After:
      
        # perf record usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
        # perf script --show-task
           perf     0     0.000000: PERF_RECORD_COMM: perf:8446/8446
           perf  8446 30154.038944:          1 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
           perf  8446 30154.038948:          1 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
           perf  8446 30154.038949:          9 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
           perf  8446 30154.038950:        230 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
           perf  8446 30154.038951:       6772 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
         usleep  8446 30154.038952: PERF_RECORD_COMM exec: usleep:8446/8446
         usleep  8446 30154.038954:     196923 cycles:  ffffffff81766440 _raw_spin_lock (/lib/modules/4.3.0-rc1
         usleep  8446 30154.039021:    2292130 cycles:      7f609a173dc4 memcpy (/usr/lib64/ld-2.20.so)
         usleep  8446 30154.039349: PERF_RECORD_EXIT(8446:8446):(8446:8446)
        #
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1442881495-2928-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e803cf97