1. 25 Jul, 2017 2 commits
    • T
      perf report: Fix kernel symbol adjustment for s390x · cf6383f7
      Thomas Richter committed
      On s390x the kernel text segment starts at address 0x0.  When perf
      report reads kernel symbols from vmlinux file it adds an offset of
      0x1000.
      
      For example see symbol set_reset_devices:
      
        [root@s8360047 linux-devel]# nm -A vmlinux| fgrep set_reset_devices
        vmlinux:0000000001379000 t set_reset_devices
        [root@s8360047 linux-devel]#
      
        [root@s8360047 linux-devel]# fgrep set_reset_devices /proc/kallsyms
        0000000001379000 t set_reset_devices
        [root@s8360047 linux-devel]#
      
      The kernel symbol table and the vmlinux file have the same address for
      symbol set_reset_devices namely 1379000.
      
      When perf report reads this symbol, it displays it with the address range:
      symbol__new: set_reset_devices 0x137a000-0x137a018
      
      There is a difference between perf report and vmlinux of 0x1000.
      
      The reason for the difference is at kernel symbol load time in function
      dso__load_sym(). The vmlinux file is investigated with its ELF header.
      Command readelf shows this:
      
        Section Headers:
        [Nr] Name              Type             Address           Offset
             Size              EntSize          Flags  Link  Info  Align
        [ 0]                   NULL             0000000000000000  00000000
             0000000000000000  0000000000000000           0     0     0
        [ 1] .text             PROGBITS         0000000000000000  00001000
             0000000000b0e0c2  0000000000000000  AX       0     0     128
      
      This leads to an invalid calculation of the symbol start address, see
      file util/symbol-elf.c line 974:
      
              /* Adjust symbol to map to file offset */
              if (adjust_kernel_syms)
                      sym.st_value -= shdr.sh_addr - shdr.sh_offset;
      
      With shdr.sh_addr set to 0x0 and shdr.sh_offset set to 0x1000, as read
      from the ELF .text section, 0x1000 is added to the symbol address.
      
      I would like to fix this by introducing an architecture-specific function
      named elf__needs_adjust_symbols(). This is the same approach as done by
      PowerPC.  The function currently does not exist for s390x and the
      default weak one is used.  The s390x specific one returns false when
      symsrc_init() is invoked for kernel symbols and results in variable
      adjust_kernel_syms being false.  This omits the adjustment and the
      correct address is displayed (when symbol resolution does not work).
      
      The s390x specific function returns false for kernel symbol adjustment
      and returns true for kernel modules, processes and shared libraries.
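
      The arithmetic and the gating described above can be reproduced in a
      small sketch. The names adjust_symbol(), s390x_needs_adjust() and
      enum obj_kind are hypothetical and exist only for this illustration;
      they are not the code in util/symbol-elf.c or the s390x arch directory.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical object kinds, named only for this illustration. */
enum obj_kind { OBJ_KERNEL, OBJ_MODULE, OBJ_PROCESS, OBJ_SHLIB };

/*
 * Rule as stated in the commit text for s390x: no adjustment for the
 * kernel image, adjustment for modules, processes and shared libraries.
 */
static bool s390x_needs_adjust(enum obj_kind kind)
{
	return kind != OBJ_KERNEL;
}

/* Same arithmetic as "sym.st_value -= shdr.sh_addr - shdr.sh_offset". */
static uint64_t adjust_symbol(uint64_t st_value, uint64_t sh_addr,
			      uint64_t sh_offset, bool adjust)
{
	if (adjust)
		st_value -= sh_addr - sh_offset;	/* unsigned wrap-around is intended */
	return st_value;
}
```

      With sh_addr 0x0 and sh_offset 0x1000, adjusting 0x1379000 gives
      0x137a000, the address perf report displayed; skipping the adjustment
      keeps 0x1379000, which matches vmlinux and /proc/kallsyms.
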
      Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      LPU-Reference: 20170713130252.6167-1-tmricht@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • T
      perf annotate stdio: Fix --show-total-period · 585d93c5
      Taeung Song committed
      We were showing the total number of samples, not the total period as
      asked by the user. Fix it.
      Reported-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Link: http://lkml.kernel.org/n/tip-lh2nh89rtqn5x5vbfthw6qml@git.kernel.org
      Fixes: 0c4a5bce ("perf annotate: Display total number of samples with --show-total-period")
      [ split from a larger patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 21 Jul, 2017 8 commits
  3. 20 Jul, 2017 15 commits
    • A
      perf trace: Introduce filter_loop_pids() · dd1a5037
      Arnaldo Carvalho de Melo committed
      No change in functionality, just to make clearer that what we want when
      filtering the tracer pid in a system wide tracing session is to avoid a
      feedback loop.
      
      This also paves the way for a more interesting loop avoidance algorithm,
      one that tries to figure out if we are in a ssh session, xterm, etc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-5fcttc5kdjkcyp9404ezkuy9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace beauty clone: Suppress unused args according to 'flags' arg · 15bed274
      Arnaldo Carvalho de Melo committed
      The 'parent_tidptr', 'child_tidptr' and 'tls' arguments to the 'clone'
      syscall are only used when certain flags are set in 'flags'; suppress
      them when those flags aren't there.
      
      E.g:
      
         9886.919 (0.236 ms): fetchmail/19298 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fe43f468590) = 19608 (fetchmail)
        12876.052 (0.249 ms): qemu-system-x8/21238 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f48117fc770, parent_tidptr: 0x7f48117ff9d0, child_tidptr: 0x7f48117ff9d0, tls: 0x7f48117ff700) = 19611 (qemu-system-x86)
        12876.555 (0.048 ms): worker/19611 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f480f7f8770, parent_tidptr: 0x7f480f7fb9d0, child_tidptr: 0x7f480f7fb9d0, tls: 0x7f480f7fb700) = 19612 (worker)
        16575.240 (0.469 ms): fetchmail/19298 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fe43f468590) = 19613 (fetchmail)
        20797.270 (0.335 ms): fetchmail/19298 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fe43f468590) = 19614 (fetchmail)
        21228.585 (0.501 ms): vim/19519 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fbad6ac27d0) = 19615 (vim)
        21232.193 (0.137 ms): bash/19615 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fad8bff49d0) = 19616 (bash)
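
      The flag-dependent suppression shown in the output above can be
      sketched as three predicates. The flag-to-argument mapping follows the
      clone(2) manual page; the helper names are hypothetical and are not
      the functions used in 'perf trace'.

```c
#include <linux/sched.h>	/* CLONE_* flag definitions (uapi header) */
#include <stdbool.h>

/* parent_tidptr is only meaningful with CLONE_PARENT_SETTID. */
static bool show_parent_tidptr(unsigned long flags)
{
	return (flags & CLONE_PARENT_SETTID) != 0;
}

/* child_tidptr is used by both CLONE_CHILD_SETTID and CLONE_CHILD_CLEARTID. */
static bool show_child_tidptr(unsigned long flags)
{
	return (flags & (CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID)) != 0;
}

/* tls is only meaningful with CLONE_SETTLS. */
static bool show_tls(unsigned long flags)
{
	return (flags & CLONE_SETTLS) != 0;
}
```

      For the fetchmail lines above (CHILD_CLEARTID|CHILD_SETTID|0x11) only
      child_tidptr survives, while the qemu and worker lines, which set
      SETTLS and PARENT_SETTID as well, keep all three arguments.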
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-0um93djul9knf239gwa5mpcb@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace beauty clone: Beautify syscall arguments · 33396a3a
      Arnaldo Carvalho de Melo committed
      Now, syswide tracing, selected entries:
      
        # trace -e clone
        24417.203 ( 0.158 ms): bash/11323 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, parent_tidptr: 0, child_tidptr: 0x7f0778e5c9d0, tls: 0x7f0778e5c700) = 11325 (bash)
                ? (     ?   ): bash/11325  ... [continued]: clone()) = 0
        24419.355 ( 0.093 ms): bash/10586 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, parent_tidptr: 0, child_tidptr: 0x7f0778e5c9d0, tls: 0x7f0778e5c700) = 11326 (bash)
                ? (     ?   ): bash/11326  ... [continued]: clone()) = 0
        24419.744 ( 0.102 ms): bash/11326 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, parent_tidptr: 0, child_tidptr: 0x7f0778e5c9d0, tls: 0x7f0778e5c700) = 11327 (bash)
                ? (     ?   ): bash/11327  ... [continued]: clone()) = 0
        24420.138 ( 0.105 ms): bash/11327 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, parent_tidptr: 0, child_tidptr: 0x7f0778e5c9d0, tls: 0x7f0778e5c700) = 11328 (bash)
                ? (     ?   ): bash/11328  ... [continued]: clone()) = 0
        35747.722 ( 0.044 ms): gpg-agent/18087 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ff0755f6ff0, parent_tidptr: 0x7ff0755f79d0, child_tidptr: 0x7ff0755f79d0, tls: 0x7ff0755f7700) = 11329 (gpg-agent)
                ? (     ?   ): gpg-agent/11329  ... [continued]: clone()) = 0
        35748.359 ( 0.022 ms): gpg-agent/18087 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ff075df7ff0, parent_tidptr: 0x7ff075df89d0, child_tidptr: 0x7ff075df89d0, tls: 0x7ff075df8700) = 11330 (gpg-agent)
                ? (     ?   ): gpg-agent/11330  ... [continued]: clone()) = 0
        35781.422 ( 0.452 ms): NetworkManager/1112 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f2f1fffedb0, parent_tidptr: 0x7f2f1ffff9d0, child_tidptr: 0x7f2f1ffff9d0, tls: 0x7f2f1ffff700) = 11331 (NetworkManager)
                ? (     ?   ): NetworkManager/11331  ... [continued]: clone()) = 0
      
      Need to improve the formatting of the second return, to the child; this
      cset only focused on the argument formatting.
      
      If we trace just one pid:
      
        # trace -e clone -p 19863
           0.349 ( 0.025 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb84eaac70, parent_tidptr: 0x7ffb84eab9d0, child_tidptr: 0x7ffb84eab9d0, tls: 0x7ffb84eab700) = 11637 (Chrome_IOThread)
           0.392 ( 0.013 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb664b8c70, parent_tidptr: 0x7ffb664b99d0, child_tidptr: 0x7ffb664b99d0, tls: 0x7ffb664b9700) = 11638 (Chrome_IOThread)
           0.573 ( 0.015 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb6046cc70, parent_tidptr: 0x7ffb6046d9d0, child_tidptr: 0x7ffb6046d9d0, tls: 0x7ffb6046d700) = 11639 (Chrome_IOThread)
           0.617 ( 0.014 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb730dcc70, parent_tidptr: 0x7ffb730dd9d0, child_tidptr: 0x7ffb730dd9d0, tls: 0x7ffb730dd700) = 11640 (Chrome_IOThread)
           4.350 ( 0.065 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb720d9c70, parent_tidptr: 0x7ffb720da9d0, child_tidptr: 0x7ffb720da9d0, tls: 0x7ffb720da700) = 11642 (Chrome_IOThread)
           5.642 ( 0.079 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb718d8c70, parent_tidptr: 0x7ffb718d99d0, child_tidptr: 0x7ffb718d99d0, tls: 0x7ffb718d9700) = 11643 (Chrome_IOThread)
      ^C#
      
      We'll also have to fix the argument ordering in different arches,
      probably having multiple syscall_fmt entries with each possible order
      and then use perf_evsel__env_arch() (if dealing with a perf.data file)
      or the current system info, for live sessions.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-am068uyubgj83snepolwhbfe@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      tools include uapi: Grab a copy of linux/sched.h · 450c86c9
      Arnaldo Carvalho de Melo committed
      So that we make sure we have recent enough defines for things
      such as 'perf trace' system call argument beautifiers.
      
      For instance, the 'clone' syscall argument 'flag' needs to use
      CLONE_NEWCGROUP, and that is not available in RHEL7.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-81sln0ng4a2lcxrth14vcov4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Allow specifying names to syscall arguments formatters · c51bdfec
      Arnaldo Carvalho de Melo committed
      This is for tracepointless syscalls, like clone; otherwise the names
      are taken from the tracepoint's /format file.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ml5qvv1w5k96ghwhxpzzsmm3@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Allow specifying number of syscall args for tracepointless syscalls · 332337da
      Arnaldo Carvalho de Melo committed
      When we don't have syscalls:sys_{enter,exit}_NAME, we had to resort to
      dumping all 6 syscall arguments. Fix it by providing that info for
      such syscalls, like 'clone'.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-dfq1jtrxj8dqvqoeqqpr3slu@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Ditch __syscall__arg_val() variant, not needed anymore · 325f5091
      Arnaldo Carvalho de Melo committed
      All callers now can use syscall__arg_val(arg, idx), be it to iterate
      thru the syscall arguments while taking into account alignment, or to
      get values for other arguments that affect how the current argument
      should be formatted (think of fcntl's 'cmd' and 'arg' arguments).
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-wm5b156d8kro1r4y3b33eyta@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Use the syscall_fmt formatters without a tracepoint · d032d79e
      Arnaldo Carvalho de Melo committed
      Previously we only used the syscall_fmt when we had sc->tp_format set,
      i.e. when we found the (enter, exit) pair in tracefs/events/syscalls/.
      
      But we really only need to use what is in sc->arg_fmt to apply the arg
      beautifiers to the syscall argument values, so do it.
      
      With this we will be able to provide formatters to the "clone" syscall,
      which doesn't have entries in tracefs/events/syscalls/.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-y41nl41jrayjo5ucnde2peix@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Allow allocating sc->arg_fmt even without the syscall tracepoint · 5e58fcfa
      Arnaldo Carvalho de Melo committed
      At least "clone" doesn't have (enter, exit) entries in tracefs/events/syscalls/,
      but we can provide a syscall_fmt and use it instead, as will be done for
      "clone" in the next cset.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-o12kejgcxddyovn2hlg4gbim@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace beauty mmap: Ignore 'fd' and 'offset' args for MAP_ANONYMOUS · d57da8c9
      Arnaldo Carvalho de Melo committed
      Just suppress them, not used by the kernel.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-atpt07y2x9a8ttlwja94ow3j@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Add missing ' = ' in the default formatting of syscall returns · 6f8fe61e
      Arnaldo Carvalho de Melo committed
      We lost it recently, put it back.
      
      Before:
      
        789.499 ( 0.001 ms): libvirtd/1175 lseek(fd: 22, whence: CUR) 4328
      
      After:
      
        789.499 ( 0.001 ms): libvirtd/1175 lseek(fd: 22, whence: CUR) = 4328
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 1f63139c ("perf trace beauty: Simplify syscall return formatting")
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • K
      perf intel-pt: Always set no branch for dummy event · 91a8c5b8
      Kan Liang committed
      An earlier kernel patch allowed enabling PT and LBR at the same time on
      Goldmont.
      
      commit ccbebba4 ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity
      if the core supports it")
      
      However, users still cannot use Intel PT and LBRs simultaneously:

        $ sudo perf record -e cycles,intel_pt//u -b -- sleep 1
        Error: PMU Hardware doesn't support sampling/overflow-interrupts.
      
      PT implicitly adds a dummy event in the perf tool. The dummy event is a
      software event, which doesn't support LBR.
      
      Always set no branch for the dummy event in Intel PT.
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170630141656.1626-2-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • K
      perf intel-pt: Set no_aux_samples for the tracking event · 69d8bd8a
      Kan Liang committed
      The reason for introducing the tracking event (a dummy software event) is
      to collect side-band information. Additional sampling is wasteful, so
      no_aux_samples should be set for the tracking event.
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170630141656.1626-1-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      Merge tag 'perf-core-for-mingo-4.13-20170718' of... · 510457ec
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo-4.13-20170718' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
      - Initial support for namespaces, using setns to access files in
        namespaces, grabbing their build-ids, etc. We still need to work
        more to deal with namespaces that vanish before we can get the
        needed data to do analysis, but this should be as good as what is
        in bcc now (Krister Johansen)
      
      - Add header record types to pipe-mode, now this command:
      
        $ perf record -o - -e cycles sleep 1 | perf report --stdio --header
      
        Will show the same as in non-pipe mode, i.e. involving a perf.data
        file (David Carrillo-Cisneros)
      
      - Implement a visual marker for fused x86 instructions in the annotate
        TUI browser, available now in 'perf report', more work needed to have
        it available as well in 'perf top' (Jin Yao)
      
        Further explanation from one of Jin's patches:
      
             │   ┌──cmpl   $0x0,argp_program_version_hook
       81.93 │   ├──je     20
             │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
             │   │↓ jne    29
             │   │↓ jmp    43
       11.47 │20:└─→cmpxch %esi,0x38a999(%rip)
      
        That means the cmpl+je is a fused instruction pair and they should be
        considered together.
      
      - Record the branch type and then show statistics and info about
        it in callchain entries (Jin Yao)
      
        Example from one of Jin's patches:
      
        # perf record -g -j any,save_type
        # perf report --branch-history --stdio --no-children
      
        38.50%  div.c:45                [.] main                    div
                |
                ---main div.c:42 (RET CROSS_2M cycles:2)
                   compute_flag div.c:28 (cycles:2)
                   compute_flag div.c:27 (RET CROSS_2M cycles:1)
                   rand rand.c:28 (cycles:1)
                   rand rand.c:28 (RET CROSS_2M cycles:1)
                   __random random.c:298 (cycles:1)
                   __random random.c:297 (COND_BWD CROSS_2M cycles:1)
                   __random random.c:295 (cycles:1)
                   __random random.c:295 (COND_BWD CROSS_2M cycles:1)
                   __random random.c:295 (cycles:1)
                   __random random.c:295 (RET CROSS_2M cycles:9)
      
      - Beautify the fcntl syscall, which is an interesting one in the sense
        that infrastructure had to be put in place to change the formatters of
        some arguments according to the value in a previous one, i.e. cmd
        dictates how arg and the syscall return will be formatted.
        (Arnaldo Carvalho de Melo)
      
      Infrastructure changes:
      
      - 'perf test attr' fixes (Jiri Olsa)
      
      Vendor events changes:
      
      - Add POWER9 PMU events (Sukadev Bhattiprolu)
      
      - Support additional POWER8+ PVR in PMU mapfile (Shriya)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • A
      perf/core: Fix scheduling regression of pinned groups · 3bda69c1
      Alexander Shishkin committed
      Vince Weaver reported:
      
      > I was tracking down some regressions in my perf_event_test testsuite.
      > Some of the tests broke in the 4.11-rc1 timeframe.
      >
      > I've bisected one of them, this report is about
      >	tests/overflow/simul_oneshot_group_overflow
      > This test creates an event group containing two sampling events, set
      > to overflow to a signal handler (which disables and then refreshes the
      > event).
      >
      > On a good kernel you get the following:
      > 	Event perf::instructions with period 1000000
      > 	Event perf::instructions with period 2000000
      > 		fd 3 overflows: 946 (perf::instructions/1000000)
      > 		fd 4 overflows: 473 (perf::instructions/2000000)
      > 	Ending counts:
      > 		Count 0: 946379875
      > 		Count 1: 946365218
      >
      > With the broken kernels you get:
      > 	Event perf::instructions with period 1000000
      > 	Event perf::instructions with period 2000000
      > 		fd 3 overflows: 938 (perf::instructions/1000000)
      > 		fd 4 overflows: 318 (perf::instructions/2000000)
      > 	Ending counts:
      > 		Count 0: 946373080
      > 		Count 1: 653373058
      
      The root cause of the bug is that the following commit:
      
        487f05e1 ("perf/core: Optimize event rescheduling on active contexts")
      
      erroneously assumed that event's 'pinned' setting determines whether the
      event belongs to a pinned group or not, but in fact, it's the group
      leader's pinned state that matters.
      
      This was discovered by Vince in the test case described above, where two instruction
      counters are grouped, the group leader is pinned, but the other event is not;
      in the regressed case the counters were off by 33% (the difference between events'
      periods), but should be the same within the error margin.
      
      Fix the problem by looking at the group leader's pinning.
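
      The distinction can be illustrated with a minimal sketch. struct event
      and both helpers below are simplified, hypothetical stand-ins and not
      the kernel's perf_event structures or the actual patch.

```c
#include <stdbool.h>

/* Simplified stand-in for an event that belongs to a group. */
struct event {
	struct event *group_leader;	/* a leader points to itself */
	bool pinned;			/* this event's own 'pinned' setting */
};

/* The mistaken assumption: trust the event's own setting. */
static bool pinned_by_own_flag(const struct event *ev)
{
	return ev->pinned;
}

/* The approach the fix describes: consult the group leader's setting. */
static bool pinned_by_leader(const struct event *ev)
{
	return ev->group_leader->pinned;
}
```

      For a pinned leader with a non-pinned sibling, as in Vince's test, the
      first helper classifies the sibling as not pinned while the second
      treats the whole group as pinned.
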
      Reported-by: Vince Weaver <vincent.weaver@maine.edu>
      Tested-by: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: 487f05e1 ("perf/core: Optimize event rescheduling on active contexts")
      Link: http://lkml.kernel.org/r/87lgnmvw7h.fsf@ashishki-desk.ger.corp.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 19 Jul, 2017 15 commits
    • J
      perf report: Show branch type in callchain entry · b851dd49
      Jin Yao committed
      Show branch type in callchain entry. The branch type is printed
      with other LBR information (such as cycles/abort/...).
      
      For example:
      
        perf record -g -j any,save_type
        perf report --branch-history --stdio --no-children
      
        38.50%  div.c:45                [.] main                    div
                |
                ---main div.c:42 (RET CROSS_2M cycles:2)
                   compute_flag div.c:28 (cycles:2)
                   compute_flag div.c:27 (RET CROSS_2M cycles:1)
                   rand rand.c:28 (cycles:1)
                   rand rand.c:28 (RET CROSS_2M cycles:1)
                   __random random.c:298 (cycles:1)
                   __random random.c:297 (COND_BWD CROSS_2M cycles:1)
                   __random random.c:295 (cycles:1)
                   __random random.c:295 (COND_BWD CROSS_2M cycles:1)
                   __random random.c:295 (cycles:1)
                   __random random.c:295 (RET CROSS_2M cycles:9)
      
      Change log
      
      v6: Remove the branch_type_str() since it's moved to branch.c.
      
      v5: Rewrite the branch info print code in util/callchain.c.
      
      v4: Comparing to previous version, the major changes are:
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1500379995-6449-8-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf report: Show branch type statistics for stdio mode · 2d78b189
      Jin Yao committed
      Show the branch type statistics at the end of perf report --stdio.
      
      For example:
      
        perf report --stdio
      
        COND_FWD:  28.5%
        COND_BWD:   9.4%
        CROSS_4K:   0.7%
        CROSS_2M:  14.1%
            COND:  37.9%
          UNCOND:   0.2%
             IND:   6.7%
            CALL:  26.5%
             RET:  28.7%
          SYSRET:   0.0%
      
        The branch types are:
      
         COND_FWD: conditional forward
         COND_BWD: conditional backward
             COND: conditional branch
           UNCOND: unconditional branch
              IND: indirect
             CALL: function call
         IND_CALL: indirect function call
              RET: function return
          SYSCALL: syscall
           SYSRET: syscall return
        COND_CALL: conditional function call
         COND_RET: conditional function return
      
      CROSS_4K and CROSS_2M:
      
      These metrics check for branches that cross 4K or 2MB page boundaries.
      This is an approximation: we don't know whether the area uses 4K or
      2MB pages, so both are always computed.
      
      To keep the output simple, if a branch crosses a 2M area, CROSS_4K
      will not be incremented.
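
      A minimal sketch of this counting, assuming the check simply compares
      the size-aligned 'from' and 'to' addresses. struct branch_stats and
      count_branch() are hypothetical names for illustration and are not
      claimed to match the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define AREA_4K 4096ULL
#define AREA_2M (2ULL * 1024 * 1024)

/* Hypothetical counters, named after the tags in the report output. */
struct branch_stats {
	uint64_t cond_fwd, cond_bwd, cross_4k, cross_2m;
};

/* True when the two addresses fall into different size-aligned areas. */
static bool cross_area(uint64_t a, uint64_t b, uint64_t size)
{
	return (a & ~(size - 1)) != (b & ~(size - 1));
}

static void count_branch(struct branch_stats *st, uint64_t from,
			 uint64_t to, bool conditional)
{
	/* Direction of a conditional branch comes from the two addresses. */
	if (conditional) {
		if (to > from)
			st->cond_fwd++;
		else
			st->cond_bwd++;
	}

	/* A 2M crossing is not counted as a 4K crossing as well. */
	if (cross_area(from, to, AREA_2M))
		st->cross_2m++;
	else if (cross_area(from, to, AREA_4K))
		st->cross_4k++;
}
```

      A branch from 0x1ff000 to 0x200010 increments only CROSS_2M, while one
      from 0x3000 back to 0x2ff8 increments CROSS_4K and counts as backward.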
      
      Change log
      
      v7: Since the common branch type definitions are changed, some
          tags/strings are updated accordingly.
      
      v6: Remove branch_type_stat_display() since it's moved to branch.c.
      
      v5: Remove the unnecessary sort__mode checking in
          hist_iter__branch_callback().
      
      v4: Comparing to previous version, the major changes are:
      
      Add the computing of JCC forward/JCC backward and cross page checking
      by using the from and to addresses.
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1500379995-6449-7-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf util: Create branch.c/.h for common branch functions · 992c7e92
      Jin Yao committed
      Create new util/branch.c and util/branch.h to contain the common branch
      functions, such as:
      
      branch_type_count(): Count the numbers of branch types
      branch_type_name() : Return the name of branch type
      branch_type_stat_display(): Display branch type statistics info
      branch_type_str(): Construct the branch type string.
      
      The branch type is saved in branch_flags.
      
      Change log:
      
      v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN.
      
      v7: Since the common branch type name is changed (e.g. JCC->COND),
          this patch is modified accordingly.
      
      v6: Move that multiline conditional code inside {} brackets.
          Move branch_type_stat_display() from builtin-report.c to
            branch.c.
          Move branch_type_str() from callchain.c to branch.c.
      
      v5: It's a new patch in v5 patch series.
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1500379995-6449-6-git-send-email-yao.jin@linux.intel.com
      [ Don't use 'index' and 'stat' as names for variables, it shadows global decls in older distros ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf report: Refactor the branch info printing code · 8d51735f
      Jin Yao committed
      The branch info such as predicted/cycles/... are printed at the
      callchain entries.
      
      For example: perf report --branch-history --no-children --stdio
      
          --1.07%--main div.c:39 (predicted:52.4% cycles:1 iterations:17)
                    main div.c:44 (predicted:52.4% cycles:1)
                    main div.c:42 (cycles:2)
                    compute_flag div.c:28 (cycles:2)
                    compute_flag div.c:27 (cycles:1)
                    rand rand.c:28 (cycles:1)
                    rand rand.c:28 (cycles:1)
                    __random random.c:298 (cycles:1)
                    __random random.c:297 (cycles:1)
                    __random random.c:295 (cycles:1)
                    __random random.c:295 (cycles:1)
                    __random random.c:295 (cycles:1)
      
      But the current code is difficult to maintain and extend. This patch
      refactors the code for easy maintenance.
      
      Change log:
      
      v6: 1. Put the multiline condition code into {} brackets in
             counts_str_build()
      
          2. Keep the original display order, that is:
             predicted, abort, cycles, iterations
      
      v5: It's a new patch in v5 patch series.
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1500379995-6449-5-git-send-email-yao.jin@linux.intel.com
      [ Don't use 'index' as a name for a variable, it shadows a global decl in older distros ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8d51735f
    • J
      perf record: Create a new option save_type in --branch-filter · 60f83fa6
      Committed by Jin Yao
      The option tells the kernel to save the branch type during sampling.
      
      One example:
      
        perf record -g --branch-filter any,save_type <command>
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1500379995-6449-4-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      60f83fa6
    • J
      perf/x86/intel: Record branch type · d5c7f9dc
      Committed by Jin Yao
      Perf already has support for disassembling the branch instruction
      and using the branch type for filtering. The patch just records
      the branch type in perf_branch_entry.
      
      Before recording, the patch converts the x86 branch type to
      common branch type.
      
      Change log:
      
      v10: Make the branch_map array static. The previous version kept
           it on the stack, which made the compiler rebuild it every
           time the function was called.
      
      v9: Use __ffs() to find the first set bit in type in
          common_branch_type(), which makes the code clearer.
      
      v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN.
      
      v7: Just convert following x86 branch types to common branch types.
      
      X86_BR_CALL      -> PERF_BR_CALL
      X86_BR_RET       -> PERF_BR_RET
      X86_BR_JCC       -> PERF_BR_COND
      X86_BR_JMP       -> PERF_BR_UNCOND
      X86_BR_IND_CALL  -> PERF_BR_IND_CALL
      X86_BR_ZERO_CALL -> PERF_BR_CALL
      X86_BR_IND_JMP   -> PERF_BR_IND
      X86_BR_SYSCALL   -> PERF_BR_SYSCALL
      X86_BR_SYSRET    -> PERF_BR_SYSRET
      
      Others are set to PERF_BR_NONE
      
      v6: Not changed.
      
      v5: Just fix the merge error. No other update.
      
      v4: Comparing to previous version, the major changes are:
      
      1. Uses a lookup table to convert x86 branch type to common branch
         type.
      
      2. Move the JCC forward/JCC backward and cross page computing to
         user space.
      
      3. Initialize branch type to 0 in intel_pmu_lbr_read_32 and
         intel_pmu_lbr_read_64
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Link: http://lkml.kernel.org/r/1500379995-6449-3-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d5c7f9dc
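The conversion described in this commit (a static lookup table indexed by the first set bit of the x86 type) can be sketched in user space as follows. This is only an illustration under stated assumptions: the `X_*` bit positions and the `BR_*` names are hypothetical, not the kernel's actual values, and `__builtin_ctz()` (a GCC/Clang builtin) stands in for the kernel's `__ffs()`.

```c
#include <assert.h>
#include <stddef.h>

/* Common branch types, in the order listed by the perf/core patch. */
enum common_br {
	BR_UNKNOWN, BR_COND, BR_UNCOND, BR_IND, BR_CALL,
	BR_IND_CALL, BR_RET, BR_SYSCALL, BR_SYSRET
};

/* Hypothetical one-bit-per-type encoding of the x86 branch classes. */
enum x86_br {
	X_CALL      = 1 << 0,
	X_RET       = 1 << 1,
	X_JCC       = 1 << 2,
	X_JMP       = 1 << 3,
	X_IND_CALL  = 1 << 4,
	X_ZERO_CALL = 1 << 5,
	X_IND_JMP   = 1 << 6,
	X_SYSCALL   = 1 << 7,
	X_SYSRET    = 1 << 8,
	X_OTHER     = 1 << 9	/* stands in for any type without a mapping */
};

/* static: built once instead of on every call (the v10 change). */
static const int branch_map[] = {
	BR_CALL,	/* X_CALL */
	BR_RET,		/* X_RET */
	BR_COND,	/* X_JCC */
	BR_UNCOND,	/* X_JMP */
	BR_IND_CALL,	/* X_IND_CALL */
	BR_CALL,	/* X_ZERO_CALL */
	BR_IND,		/* X_IND_JMP */
	BR_SYSCALL,	/* X_SYSCALL */
	BR_SYSRET	/* X_SYSRET */
};

/* Map an x86 branch type bit to a common branch type. */
static int common_branch_type(unsigned int type)
{
	unsigned int bit;

	if (!type)
		return BR_UNKNOWN;

	/* Index of the lowest set bit; user-space stand-in for __ffs(). */
	bit = (unsigned int)__builtin_ctz(type);
	if (bit < sizeof(branch_map) / sizeof(branch_map[0]))
		return branch_map[bit];

	return BR_UNKNOWN;
}
```

With this layout, `common_branch_type(X_JCC)` yields `BR_COND`, and any type outside the table falls back to `BR_UNKNOWN`.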
    • J
      perf/core: Define the common branch type classification · eb0baf8a
      Committed by Jin Yao
      It is often useful to know the branch types while analyzing branch data.
      For example, a call is very different from a conditional branch.
      
      Currently we have to look it up in the binary, but the binary may
      no longer be available later, and even when it is, the lookup takes
      the user some time. It is very useful to be able to check the
      branch type directly in perf report.
      
      Perf already has support for disassembling the branch instruction to get
      the x86 branch type.
      
      To keep the kernel and userspace consistent and make the
      classification more general, the patch adds the common branch type
      classification to perf_event.h.
      
      The patch only defines a minimum but most common set of branch types.
      
      PERF_BR_UNKNOWN         : unknown
      PERF_BR_COND            : conditional
      PERF_BR_UNCOND          : unconditional
      PERF_BR_IND             : indirect
      PERF_BR_CALL            : function call
      PERF_BR_IND_CALL        : indirect function call
      PERF_BR_RET             : function return
      PERF_BR_SYSCALL         : syscall
      PERF_BR_SYSRET          : syscall return
      PERF_BR_COND_CALL       : conditional function call
      PERF_BR_COND_RET        : conditional function return
      
      The patch also adds a new field type (4 bits) in perf_branch_entry
      to record the branch type.
      
      Since the disassembling of branch instruction needs some overhead,
      a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
      needs to disassemble the branch instruction and record the branch
      type.
      
      Change log:
      
      v10: Not changed.
      
      v9: Not changed.
      
      v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN.
          No other change.
      
      v7: Just keep the most common branch types.
          Others are removed.
      
      v6: Not changed.
      
      v5: Not changed. The v5 patch series just change the userspace.
      
      v4: Comparing to previous version, the major changes are:
      
      1. Remove the PERF_BR_JCC_FWD/PERF_BR_JCC_BWD, they will be
         computed later in userspace.
      
      2. Remove the "cross" field in perf_branch_entry. The cross page
         computing will be done later in userspace.
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Link: http://lkml.kernel.org/r/1500379995-6449-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eb0baf8a
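As a rough illustration of the classification above, the sketch below lists the eleven types in the order given in the commit message and stores one in a 4-bit field. The `SK_`/`sketch_` names and the struct layout are assumptions made for this example; only the type list and the 4-bit width come from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Branch types in the order the commit message lists them. */
enum sketch_br_type {
	SK_BR_UNKNOWN = 0,
	SK_BR_COND,
	SK_BR_UNCOND,
	SK_BR_IND,
	SK_BR_CALL,
	SK_BR_IND_CALL,
	SK_BR_RET,
	SK_BR_SYSCALL,
	SK_BR_SYSRET,
	SK_BR_COND_CALL,
	SK_BR_COND_RET,
	SK_BR_MAX
};

/* Every value must be representable in the 4-bit type field. */
_Static_assert(SK_BR_MAX <= 16, "branch types must fit in 4 bits");

static const char *const sketch_br_names[SK_BR_MAX] = {
	"unknown", "conditional", "unconditional", "indirect",
	"function call", "indirect function call", "function return",
	"syscall", "syscall return", "conditional function call",
	"conditional function return"
};

/* Hypothetical entry layout; only the 4-bit width is from the commit. */
struct sketch_branch_entry {
	uint64_t from;
	uint64_t to;
	unsigned int type : 4;
};

/* Return the description for a type, or "unknown" if out of range. */
static const char *sketch_br_name(unsigned int type)
{
	return type < SK_BR_MAX ? sketch_br_names[type] : "unknown";
}

/* Store a type in the 4-bit field and read it back. */
static unsigned int sketch_store_type(enum sketch_br_type t)
{
	struct sketch_branch_entry e = {
		.from = 0, .to = 0, .type = (unsigned int)t & 0xf
	};

	return e.type;
}
```

Eleven values leave five spare encodings in the 4-bit field.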
    • D
      perf header: Add event desc to pipe-mode header · f9ebdccf
      Committed by David Carrillo-Cisneros
      Add event descriptor to perf header output in pipe-mode.
      
      After this patch:
      
        $ perf record -e cycles sleep 1 | perf report --header
        # ========
        # captured on: Mon Jun  5 22:52:13 2017
        # ========
        #
        # hostname : lphh20
        # os release : 4.3.5-smp-801.43.0.0
        # perf version : 4.12.rc2.g439987
        # arch : x86_64
        # nrcpus online : 72
        # nrcpus avail : 72
        # cpudesc : Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz
        # cpuid : GenuineIntel,6,63,2
        # total memory : 264134144 kB
        # cmdline : /root/perf record -e cycles sleep 1
        # event : name = cycles, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1
        # CPU_TOPOLOGY info available, use -I to display
        # NUMA_TOPOLOGY info available, use -I to display
        # pmu mappings: intel_bts = 6, cpu = 4, msr = 49, uncore_cbox_10 = 36, uncore_cbox_11 = 37, uncore_cbox_12 = 38, uncore_cbox_13 = 39, uncore_cbox_14 = 40, uncore_cbox_15 = 41, uncore_cbox_16 = 42, uncore_cbox_17 = 43, software = 1, power = 7, uncore_irp = 24, uncore_pcu = 48, tracepoint = 2, uncore_imc_0 = 16, uncore_imc_1 = 17, uncore_imc_2 = 18, uncore_imc_3 = 19, uncore_imc_4 = 20, uncore_imc_5 = 21, uncore_imc_6 = 22, uncore_imc_7 = 23, uncore_qpi_0 = 8, uncore_qpi_1 = 9, uncore_cbox_0 = 26, uncore_cbox_1 = 27, uncore_cbox_2 = 28, uncore_cbox_3 = 29, uncore_cbox_4 = 30, uncore_cbox_5 = 31, uncore_cbox_6 = 32, uncore_cbox_7 = 33, uncore_cbox_8 = 34, uncore_cbox_9 = 35, uncore_r2pcie = 13, uncore_r3qpi_0 = 10, uncore_r3qpi_1 = 11, uncore_r3qpi_2 = 12, uncore_sbox_0 = 44, uncore_sbox_1 = 45, uncore_sbox_2 = 46, uncore_sbox_3 = 47, breakpoint = 5, uncore_ha_0 = 14, uncore_ha_1 = 15, uncore_ubox = 25
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.000 MB (null) ]
      
      Prior to this patch, event was not printed.
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170718042549.145161-17-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f9ebdccf
    • D
      perf tools: Add feature header record to pipe-mode · e9def1b2
      Committed by David Carrillo-Cisneros
      Add header record types to pipe-mode, reusing the functions
      used in file-mode and leveraging the new struct feat_fd.
      
      For alignment, check that synthesized events don't exceed
      pagesize.
      
      Add the perf_event__synthesize_feature event call back to
      process the new header records.
      
      Before this patch:
      
        $ perf record -o - -e cycles sleep 1 | perf report --stdio --header
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.000 MB - ]
        ...
      
      After this patch:
        $ perf record -o - -e cycles sleep 1 | perf report --stdio --header
        # ========
        # captured on: Mon May 22 16:33:43 2017
        # ========
        #
        # hostname : my_hostname
        # os release : 4.11.0-dbx-up_perf
        # perf version : 4.11.rc6.g6277c80
        # arch : x86_64
        # nrcpus online : 72
        # nrcpus avail : 72
        # cpudesc : Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz
        # cpuid : GenuineIntel,6,63,2
        # total memory : 263457192 kB
        # cmdline : /root/perf record -o - -e cycles -c 100000 sleep 1
        # HEADER_CPU_TOPOLOGY info available, use -I to display
        # HEADER_NUMA_TOPOLOGY info available, use -I to display
        # pmu mappings: intel_bts = 6, uncore_imc_4 = 22, uncore_sbox_1 = 47, uncore_cbox_5 = 33, uncore_ha_0 = 16, uncore_cbox
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.000 MB - ]
        ...
      
      Support added for the subcommands: report, inject, annotate and script.
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170718042549.145161-16-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e9def1b2
    • D
      perf tool: Add show_feature_header to perf_tool · 114f709e
      Committed by David Carrillo-Cisneros
      Add show_feat_hdr to control the level of printed information of feature
      headers.
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170718042549.145161-15-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      114f709e
    • D
      perf header: Change FEAT_OP* macros · a4d8c985
      Committed by David Carrillo-Cisneros
      There are three FEAT_OP* macros:
        - FEAT_OPA: for features without process record.
        - FEAT_OPP: for features with process record.
        - FEAT_OPF: like FEAT_OPP but to show only if the show_full_info
          flag is set.
      
      To add pipe-mode headers we need yet another variation of the macros
      (one to specify whether a feature generates an auxiliary record).
      
      Instead, we redefine macros so that:
        - show_full_info is specified as an argument (to remove the
        FEAT_OPF variation) and,
        - it always sets "process" handler (to remove the FEAT_OPA variation).
        Individual process handlers can be NULLed individually.
      
      This allows defining only two variations:
        - FEAT_OPR: synthesizes an auxiliary event record.
        - FEAT_OPN: doesn't synthesize an auxiliary event record.
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170718042549.145161-14-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a4d8c985
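The two-macro scheme from this commit could look roughly like the sketch below. It is a simplified illustration, not the perf source: the `struct feat_ops` fields, the `FEAT_*` ids, the handler names, and the choice of which feature synthesizes a record are all assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature ids used only by this sketch. */
enum { FEAT_HOSTNAME, FEAT_BUILD_ID, FEAT_MAX };

struct feat_ops {
	const char *name;
	int (*process)(void);	/* always set; may be NULLed per feature */
	bool full_only;		/* show only when full info is requested */
	bool synthesize;	/* emit an auxiliary event record in pipe-mode */
};

/* Placeholder handlers so the table has something to point at. */
static int process_hostname(void) { return 0; }
static int process_build_id(void) { return 0; }

/* FEAT_OPR: the feature synthesizes an auxiliary event record. */
#define FEAT_OPR(id, func, full)			\
	[FEAT_##id] = {					\
		.name       = #id,			\
		.process    = process_##func,		\
		.full_only  = (full),			\
		.synthesize = true			\
	}

/* FEAT_OPN: the feature does not synthesize an auxiliary event record. */
#define FEAT_OPN(id, func, full)			\
	[FEAT_##id] = {					\
		.name       = #id,			\
		.process    = process_##func,		\
		.full_only  = (full),			\
		.synthesize = false			\
	}

static const struct feat_ops feat_ops[FEAT_MAX] = {
	FEAT_OPR(HOSTNAME, hostname, false),
	FEAT_OPN(BUILD_ID, build_id, false),
};
```

Passing the full-info flag as an argument and always setting the process handler is what removes the need for separate FEAT_OPF and FEAT_OPA variants.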
    • D
      perf header: Add a buffer to struct feat_fd · 0b3d3410
      Committed by David Carrillo-Cisneros
      Extend struct feat_fd to use a temporary buffer in pipe-mode, instead of
      perf.data's file descriptor.
      
      The header features build_id and aux_trace already have printing
      logic for file-mode that relies heavily on lseek() over the file.
      For now, leave these features inactive in pipe-mode and print a
      warning if their functions are called in pipe-mode.
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170718042549.145161-13-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0b3d3410
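The idea of one write helper that targets either a file descriptor or an in-memory buffer can be sketched as below. This is a minimal illustration under assumptions: `struct sketch_ff`, its fields, and the convention that `fd == -1` selects buffer mode are invented for the example and are not the actual struct feat_fd definition.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for a header-feature output target. */
struct sketch_ff {
	int fd;		/* file-mode target; -1 selects buffer (pipe) mode */
	char *buf;	/* pipe-mode: growable in-memory buffer */
	size_t offset;	/* bytes written into buf so far */
	size_t size;	/* allocated capacity of buf */
};

/* Write len bytes to the fd in file-mode, or append to buf in pipe-mode. */
static int sketch_write(struct sketch_ff *ff, const void *data, size_t len)
{
	if (ff->fd >= 0) {
		ssize_t n = write(ff->fd, data, len);

		return n == (ssize_t)len ? 0 : -1;
	}

	if (ff->offset + len > ff->size) {
		size_t new_size = ff->size ? ff->size : 64;
		char *p;

		/* Double the capacity until the new data fits. */
		while (new_size < ff->offset + len)
			new_size *= 2;
		p = realloc(ff->buf, new_size);
		if (!p)
			return -1;
		ff->buf = p;
		ff->size = new_size;
	}

	memcpy(ff->buf + ff->offset, data, len);
	ff->offset += len;
	return 0;
}

/* Usage: append two chunks in buffer mode and report the total length. */
static size_t sketch_demo(void)
{
	struct sketch_ff ff = { .fd = -1, .buf = NULL, .offset = 0, .size = 0 };
	size_t total = 0;

	if (!sketch_write(&ff, "perf", 4) && !sketch_write(&ff, "data", 4))
		total = ff.offset;
	free(ff.buf);
	return total;
}
```

Because callers only see `sketch_write()`, code that never seeks works unchanged in both modes, while anything that depends on lseek has to stay file-mode only.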
    • D
      perf header: Make write_pmu_mappings pipe-mode friendly · a02c395c
      Committed by David Carrillo-Cisneros
      In pipe-mode, we will operate over a buffer instead of a file descriptor
      but write_pmu_mappings uses lseek to move over the perf.data file.
      
      Refactor write_pmu_mappings to avoid the usage of lseek and allow
      reusing the same logic in pipe-mode (next patch).
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170718042549.145161-12-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a02c395c
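One way to avoid seeking back to patch a count field is to count the entries first and then write the count followed by the entries in a single forward pass. The sketch below illustrates that approach under assumptions: the `sketch_*` types, the fixed-size output buffer, and the record layout are hypothetical, and the commit message does not spell out how the real refactoring removes lseek.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical PMU descriptor and forward-only output buffer. */
struct sketch_pmu { const char *name; uint32_t type; };
struct sketch_out { unsigned char data[256]; size_t len; };

/* Append n bytes; fails instead of overflowing the buffer. */
static int sketch_put(struct sketch_out *o, const void *p, size_t n)
{
	if (n > sizeof(o->data) - o->len)
		return -1;
	memcpy(o->data + o->len, p, n);
	o->len += n;
	return 0;
}

/* Pass 1 counts named PMUs; pass 2 writes the count, then each entry. */
static int sketch_write_pmu_mappings(struct sketch_out *o,
				     const struct sketch_pmu *pmus, size_t n)
{
	uint32_t count = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (pmus[i].name)
			count++;

	if (sketch_put(o, &count, sizeof(count)))
		return -1;

	for (i = 0; i < n; i++) {
		if (!pmus[i].name)
			continue;
		if (sketch_put(o, &pmus[i].type, sizeof(pmus[i].type)) ||
		    sketch_put(o, pmus[i].name, strlen(pmus[i].name) + 1))
			return -1;
	}
	return 0;
}

/* Usage: two named PMUs and one unnamed; returns the count written. */
static uint32_t sketch_demo_count(void)
{
	static const struct sketch_pmu pmus[] = {
		{ "cpu", 4 }, { NULL, 0 }, { "software", 1 }
	};
	struct sketch_out o = { .len = 0 };
	uint32_t count = 0;

	if (sketch_write_pmu_mappings(&o, pmus, 3))
		return UINT32_MAX;
	memcpy(&count, o.data, sizeof(count));
	return count;
}
```

Since nothing is rewritten after it has been emitted, the same routine can target a file or a pipe-mode buffer.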
    • D
      perf header: Use struct feat_fd in read header records · 48e5fcea
      Committed by David Carrillo-Cisneros
      As preparation for using header records in pipe-mode, replace int fd
      with struct feat_fd ff in read functions for all header record types.
      
      This patch does not change behavior.
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170718042549.145161-11-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      48e5fcea
    • D
      perf header: Don't pass struct perf_file_section to process_##_feat · 62552457
      Committed by David Carrillo-Cisneros
      struct perf_file_section is used in process_##_feat as a container for
      the size and offset in the file descriptor. These attributes are
      meaningful in pipe-mode but struct perf_file_section is not.
      
      Add offset and size variables to struct feat_fd to store
      perf_file_section's values in file-mode. Later on, the same variables
      can be reused for pipe-mode.
      
      This patch does not change behavior.
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170718042549.145161-10-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      62552457