1. 12 1月, 2017 3 次提交
  2. 04 1月, 2017 3 次提交
  3. 03 1月, 2017 1 次提交
    • M
      perf probe: Fix to get correct modname from elf header · 1f2ed153
      Masami Hiramatsu 提交于
      Since 'perf probe' supports cross-arch probes, it is possible to analyze
      different arch kernel image which has different bits-per-long.
      
      In that case, it fails to get the module name because it uses the
      MOD_NAME_OFFSET macro based on the host machine bits-per-long, instead
      of the target arch bits-per-long.
      
      This fixes above issue by changing modname-offset based on the target
      archs bit width. This is ok because linux kernel uses LP64 model on
      64bit arch.
      
      E.g. without this (on x86_64, and target module is arm32):
      
        $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
        p:probe/configfs_lookup :configfs_lookup+0
                                ^-Here is an empty module name.
      
      With this fix, you can see correct module name:
      
        $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
        p:probe/configfs_lookup configfs:configfs_lookup+0
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/148337043836.6752.383495516397005695.stgit@devboxSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1f2ed153
  4. 20 12月, 2016 2 次提交
    • K
      perf diff: Do not overwrite valid build id · ed6c166c
      Kan Liang 提交于
      Fixes a perf diff regression issue which was introduced by commit
      5baecbcd ("perf symbols: we can now read separate debug-info files
      based on a build ID")
      
      The binary name could be same when perf diff different binaries. Build
      id is used to distinguish between them.
      However, the previous patch assumes the same binary name has same build
      id. So it overwrites the build id according to the binary name,
      regardless of whether the build id is set or not.
      
      Check the has_build_id in dso__load. If the build id is already set, use
      it.
      
      Before the fix:
      
        $ perf diff 1.perf.data 2.perf.data
        # Event 'cycles'
        #
        # Baseline    Delta  Shared Object     Symbol
        # ........  .......  ................  .............................
        #
          99.83%  -99.80%  tchain_edit       [.] f2
           0.12%  +99.81%  tchain_edit       [.] f3
           0.02%   -0.01%  [ixgbe]           [k] ixgbe_read_reg
      
        After the fix:
        $ perf diff 1.perf.data 2.perf.data
        # Event 'cycles'
        #
        # Baseline    Delta  Shared Object     Symbol
        # ........  .......  ................  .............................
        #
          99.83%   +0.10%  tchain_edit       [.] f3
           0.12%   -0.08%  tchain_edit       [.] f2
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      CC: Dima Kogan <dima@secretsauce.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Fixes: 5baecbcd ("perf symbols: we can now read separate debug-info files based on a build ID")
      Link: http://lkml.kernel.org/r/1481642984-13593-1-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ed6c166c
    • R
      perf annotate: Don't throw error for zero length symbols · edee44be
      Ravi Bangoria 提交于
      'perf report --tui' exits with error when it finds a sample of zero
      length symbol (i.e. addr == sym->start == sym->end). Actually these are
      valid samples. Don't exit TUI and show report with such symbols.
      Reported-and-Tested-by: NAnton Blanchard <anton@samba.org>
      Link: https://lkml.org/lkml/2016/10/8/189Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Riyder <chris.ryder@arm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@kernel.org # v4.9+
      Link: http://lkml.kernel.org/r/1479804050-5028-1-git-send-email-ravi.bangoria@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      edee44be
  5. 16 12月, 2016 5 次提交
    • R
      perf annotate: Fix jump target outside of function address range · e216874c
      Ravi Bangoria 提交于
      If jump target is outside of function range, perf is not handling it
      correctly. Especially when target address is lesser than function start
      address, target offset will be negative. But, target address declared to
      be unsigned, converts negative number into 2's complement. See below
      example. Here target of 'jumpq' instruction at 34cf8 is 34ac0 which is
      lesser than function start address(34cf0).
      
              34ac0 - 34cf0 = -0x230 = 0xfffffffffffffdd0
      
      Objdump output:
      
        0000000000034cf0 <__sigaction>:
        __GI___sigaction():
          34cf0: lea    -0x20(%rdi),%eax
          34cf3: cmp    -bashx1,%eax
          34cf6: jbe    34d00 <__sigaction+0x10>
          34cf8: jmpq   34ac0 <__GI___libc_sigaction>
          34cfd: nopl   (%rax)
          34d00: mov    0x386161(%rip),%rax        # 3bae68 <_DYNAMIC+0x2e8>
          34d07: movl   -bashx16,%fs:(%rax)
          34d0e: mov    -bashxffffffff,%eax
          34d13: retq
      
      perf annotate before applying patch:
      
        __GI___sigaction  /usr/lib64/libc-2.22.so
                 lea    -0x20(%rdi),%eax
                 cmp    -bashx1,%eax
              v  jbe    10
              v  jmpq   fffffffffffffdd0
                 nop
          10:    mov    _DYNAMIC+0x2e8,%rax
                 movl   -bashx16,%fs:(%rax)
                 mov    -bashxffffffff,%eax
                 retq
      
      perf annotate after applying patch:
      
        __GI___sigaction  /usr/lib64/libc-2.22.so
                 lea    -0x20(%rdi),%eax
                 cmp    -bashx1,%eax
              v  jbe    10
              ^  jmpq   34ac0 <__GI___libc_sigaction>
                 nop
          10:    mov    _DYNAMIC+0x2e8,%rax
                 movl   -bashx16,%fs:(%rax)
                 mov    -bashxffffffff,%eax
                 retq
      Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chris Riyder <chris.ryder@arm.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/1480953407-7605-3-git-send-email-ravi.bangoria@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e216874c
    • R
      perf annotate: Support jump instruction with target as second operand · 3ee2eb6d
      Ravi Bangoria 提交于
      Architectures like PowerPC have jump instructions that includes a target
      address as a second operand. For example, 'bne cr7,0xc0000000000f6154'.
      Add support for such instruction in perf annotate.
      
      objdump o/p:
        c0000000000f6140:   ld     r9,1032(r31)
        c0000000000f6144:   cmpdi  cr7,r9,0
        c0000000000f6148:   bne    cr7,0xc0000000000f6154
        c0000000000f614c:   ld     r9,2312(r30)
        c0000000000f6150:   std    r9,1032(r31)
        c0000000000f6154:   ld     r9,88(r31)
      
      Corresponding perf annotate o/p:
      
      Before patch:
               ld     r9,1032(r31)
               cmpdi  cr7,r9,0
            v  bne    3ffffffffff09f2c
               ld     r9,2312(r30)
               std    r9,1032(r31)
        74:    ld     r9,88(r31)
      
      After patch:
               ld     r9,1032(r31)
               cmpdi  cr7,r9,0
            v  bne    74
               ld     r9,2312(r30)
               std    r9,1032(r31)
        74:    ld     r9,88(r31)
      Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chris Riyder <chris.ryder@arm.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/1480953407-7605-2-git-send-email-ravi.bangoria@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3ee2eb6d
    • J
      perf evsel: Allow to ignore missing pid · a359c17a
      Jiri Olsa 提交于
      Adding perf_evsel::ignore_missing_cpu_thread bool.
      
      When set true, it allows perf to ignore error of missing pid of perf
      event syscall.
      
      We remove missing thread id from the thread_map, so the rest of the
      processing like ioctl and mmap won't get disturbed with -1 fd.
      
      The reason for supporting this is to ease up monitoring group of pids,
      that 'disappear' before perf opens their event. This currently leads
      perf to report error and exit and makes perf record's -u option unusable
      under certain setup.
      
      With this change we will allow this race and ignore such failure with
      following warning:
      
        WARNING: Ignored open failure for pid 8605
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20161213074622.GA3084@kravaSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a359c17a
    • J
      perf thread_map: Add thread_map__remove function · 38af91f0
      Jiri Olsa 提交于
      Add thread_map__remove function to remove thread from thread map.
      
      Add automated test also.
      
      Committer notes:
      
      Testing it:
      
        # perf test "Remove thread map"
        39: Remove thread map                          : Ok
        # perf test -v "Remove thread map"
        39: Remove thread map                          :
        --- start ---
        test child forked, pid 4483
        2 threads: 4482, 4483
        1 thread: 4483
        0 thread:
        test child finished with 0
        ---- end ----
        Remove thread map: Ok
        #
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1481538943-21874-4-git-send-email-jolsa@kernel.org
      [ Added stdlib.h, to get the free() declaration ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      38af91f0
    • J
      perf evsel: Use variable instead of repeating lengthy FD macro · 83c2e4f3
      Jiri Olsa 提交于
      It's more readable and will ease up following patches.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1481538943-21874-3-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      83c2e4f3
  6. 07 12月, 2016 1 次提交
  7. 06 12月, 2016 9 次提交
    • R
      perf annotate: Show raw form for jump instruction with indirect target · bec60e50
      Ravi Bangoria 提交于
      For jump instructions that does not include target address as direct operand,
      show the original disassembled line for them. This is needed for certain
      powerpc jump instructions that use target address in a register (such as bctr,
      btar, ...).
      
      Before:
           ld     r12,32088(r12)
           mtctr  r12
        v  bctr   ffffffffffffca2c
           std    r2,24(r1)
           addis  r12,r2,-1
      
      After:
           ld     r12,32088(r12)
           mtctr  r12
        v  bctr
           std    r2,24(r1)
           addis  r12,r2,-1
      
      Committer notes:
      
      Testing it using a perf.data file and vmlinux for powerpc64,
      cross-annotating it on a x86_64 workstation:
      
      Before:
      
        .__bpf_prog_run  vmlinux.powerpc
               │        std    r10,512(r9)                      ▒
               │        lbz    r9,0(r31)                        ▒
               │        rldicr r9,r9,3,60                       ▒
               │        ldx    r9,r30,r9                        ▒
               │        mtctr  r9                               ▒
        100.00 │      ↓ bctr   3fffffffffe01510                 ▒
               │        lwa    r10,4(r31)                       ▒
               │        lwz    r9,0(r31)                        ▒
        <SNIP>
        Invalid jump offset: 3fffffffffe01510
      
      After:
      
        .__bpf_prog_run  vmlinux.powerpc
               │        std    r10,512(r9)                      ▒
               │        lbz    r9,0(r31)                        ▒
               │        rldicr r9,r9,3,60                       ▒
               │        ldx    r9,r30,r9                        ▒
               │        mtctr  r9                               ▒
        100.00 │      ↓ bctr                                    ▒
               │        lwa    r10,4(r31)                       ▒
               │        lwz    r9,0(r31)                        ▒
        <SNIP>
        Invalid jump offset: 3fffffffffe01510
      
      This, in turn, uncovers another problem with jumps without operands, the
      ENTER/-> operation, to jump to the target, still continues using the bogus
      target :-)
      
      BTW, this was the file used for the above tests:
      
        [acme@jouet ravi_bangoria]$ perf report --header-only -i perf.data.f22vm.powerdev
        # ========
        # captured on: Thu Nov 24 12:40:38 2016
        # hostname : pdev-f22-qemu
        # os release : 4.4.10-200.fc22.ppc64
        # perf version : 4.9.rc1.g6298ce
        # arch : ppc64
        # nrcpus online : 48
        # nrcpus avail : 48
        # cpudesc : POWER7 (architected), altivec supported
        # cpuid : 74,513
        # total memory : 4158976 kB
        # cmdline : /home/ravi/Workspace/linux/tools/perf/perf record -a
        # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, disabled = 1, inherit = 1, mmap = 1, c
        # HEADER_CPU_TOPOLOGY info available, use -I to display
        # HEADER_NUMA_TOPOLOGY info available, use -I to display
        # pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
        # missing features: HEADER_TRACING_DATA HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT HEADER_CACHE
        # ========
        #
        [acme@jouet ravi_bangoria]$
      Suggested-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chris Riyder <chris.ryder@arm.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/1480953407-7605-1-git-send-email-ravi.bangoria@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      bec60e50
    • W
      perf clang: Compile BPF script using builtin clang support · edd695b0
      Wang Nan 提交于
      After this patch, perf utilizes builtin clang support to build BPF
      script, no longer depend on external clang, but fallbacking to it
      if for some reason the builtin compiling framework fails.
      
      Test:
      
        $ type clang
        -bash: type: clang: not found
        $ cat ~/.perfconfig
        $ echo '#define LINUX_VERSION_CODE 0x040700' > ./test.c
        $ cat ./tools/perf/tests/bpf-script-example.c >> ./test.c
        $ ./perf record -v --dry-run -e ./test.c 2>&1 | grep builtin
        bpf: successfull builtin compilation
        $
      
      Can't pass cflags so unable to include kernel headers now. Will be fixed
      by following commits.
      
      Committer notes:
      
      Make sure '-v' comes before the '-e ./test.c' in the command line otherwise the
      'verbose' variable will not be set when the bpf event is parsed and thus the
      pr_debug indicating a 'successfull builtin compilation' will not be output, as
      the debug level (1) will be less than what 'verbose' has at that point (0).
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Joe Stringer <joe@ovn.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/20161126070354.141764-16-wangnan0@huawei.com
      [ Spell check/reflow successfull pr_debug string ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      edd695b0
    • W
      perf clang: Support compile IR to BPF object and add testcase · 5e08a765
      Wang Nan 提交于
      getBPFObjectFromModule() is introduced to compile LLVM IR(Module)
      to BPF object. Add new testcase for it.
      
      Test result:
        $ ./buildperf/perf test -v clang
        51: builtin clang support                               :
        51.1: builtin clang compile C source to IR              :
        --- start ---
        test child forked, pid 21822
        test child finished with 0
        ---- end ----
        builtin clang support subtest 0: Ok
        51.2: builtin clang compile C source to ELF object      :
        --- start ---
        test child forked, pid 21823
        test child finished with 0
        ---- end ----
        builtin clang support subtest 1: Ok
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Joe Stringer <joe@ovn.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/20161126070354.141764-15-wangnan0@huawei.com
      [ Remove redundant "Test" from entry descriptions ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5e08a765
    • W
      perf clang: Update test case to use real BPF script · e67d52d4
      Wang Nan 提交于
      Allow C++ code to use util.h and tests/llvm.h. Let 'perf test' compile a
      real BPF script.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Joe Stringer <joe@ovn.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/20161126070354.141764-14-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e67d52d4
    • W
      perf clang: Allow passing CFLAGS to builtin clang · a9495fe9
      Wang Nan 提交于
      Improve getModuleFromSource() API to accept a cflags list. This feature
      will be used to pass LINUX_VERSION_CODE and -I flags.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Joe Stringer <joe@ovn.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/20161126070354.141764-13-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a9495fe9
    • W
      perf clang: Use real file system for #include · 77dfa84a
      Wang Nan 提交于
      Utilize clang's OverlayFileSystem facility, allow CompilerInstance to
      access real file system.
      
      With this patch the '#include' directive can be used.
      
      Add a new getModuleFromSource for real file.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Joe Stringer <joe@ovn.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/20161126070354.141764-12-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      77dfa84a
    • W
      perf clang: Add builtin clang support ant test case · 00b86691
      Wang Nan 提交于
      Add basic clang support in clang.cpp and test__clang() testcase. The
      first testcase checks if builtin clang is able to generate LLVM IR.
      
      tests/clang.c is a proxy. Real testcase resides in
      utils/c++/clang-test.cpp in c++ and exports C interface to perf test
      subsystem.
      
      Test result:
      
         $ perf test -v clang
         51: builtin clang support                               :
         51.1: Test builtin clang compile C source to IR              :
         --- start ---
         test child forked, pid 13215
         test child finished with 0
         ---- end ----
         Test builtin clang support subtest 0: Ok
      
      Committer note:
      
      Make sure you've enabled CLANG and LLVM builtin support by setting
      the LIBCLANGLLVM variable on the make command line, e.g.:
      
        make LIBCLANGLLVM=1 O=/tmp/build/perf -C tools/perf install-bin
      
      Otherwise you'll get this when trying to do the 'perf test' call above:
      
        # perf test clang
        51: builtin clang support                      : Skip (not compiled in)
        #
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Joe Stringer <joe@ovn.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/20161126070354.141764-11-wangnan0@huawei.com
      [ Removed "Test" from descriptions, redundant and already removed from all the other entries ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      00b86691
    • W
      perf llvm: Extract helpers in llvm-utils.c · 2bd42de0
      Wang Nan 提交于
      The following commits will use builtin clang to compile BPF scripts.
      
      llvm__get_kbuild_opts() and llvm__get_nr_cpus() are extracted to help
      building '-DKERNEL_VERSION_CODE' and '-D__NR_CPUS__' macros.
      
      Doing object dumping in bpf loader, so further builtin clang compiling
      needn't consider it.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Joe Stringer <joe@ovn.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/20161126070354.141764-7-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2bd42de0
    • W
      perf tools: Pass context to perf hook functions · 8ad85e9e
      Wang Nan 提交于
      Pass a pointer to perf hook functions so they receive context
      information during setup.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Joe Stringer <joe@ovn.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/20161126070354.141764-6-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8ad85e9e
  8. 02 12月, 2016 4 次提交
    • K
      perf annotate: AArch64 support · 0fcb1da4
      Kim Phillips 提交于
      This is a regex converted version from the original:
      
      	https://lkml.org/lkml/2016/5/19/461
      
      Add basic support to recognise AArch64 assembly. This allows perf to
      identify AArch64 instructions that branch to other parts within the
      same function, thereby properly annotating them.
      
      Rebased onto new cross-arch annotation bits:
      
      	https://lkml.org/lkml/2016/11/25/546
      
      Sample output:
      
      security_file_permission  vmlinux
        5.80 │    ← ret                                                  ▒
             │70:   ldr    w0, [x21,#68]                                 ▒
        4.44 │    ↓ tbnz   d0                                            ▒
             │      mov    w0, #0x24                       // #36        ▒
        1.37 │      ands   w0, w22, w0                                   ▒
             │    ↑ b.eq   60                                            ▒
        1.37 │    ↓ tbnz   e4                                            ▒
             │      mov    w19, #0x20000                   // #131072    ▒
        1.02 │    ↓ tbz    ec                                            ▒
             │90:┌─→ldr    x3, [x21,#24]                                 ▒
        1.37 │   │  add    x21, x21, #0x10                               ▒
             │   │  mov    w2, w19                                       ▒
        1.02 │   │  mov    x0, x21                                       ▒
             │   │  mov    x1, x3                                        ▒
        1.71 │   │  ldr    x20, [x3,#48]                                 ▒
             │   │→ bl     __fsnotify_parent                             ▒
        0.68 │   │↑ cbnz   60                                            ▒
             │   │  mov    x2, x21                                       ▒
        1.37 │   │  mov    w1, w19                                       ▒
             │   │  mov    x0, x20                                       ▒
        0.68 │   │  mov    w5, #0x0                        // #0         ▒
             │   │  mov    x4, #0x0                        // #0         ▒
        1.71 │   │  mov    w3, #0x1                        // #1         ▒
             │   │→ bl     fsnotify                                      ▒
        1.37 │   │↑ b      60                                            ▒
             │d0:│  mov    w0, #0x0                        // #0         ▒
             │   │  ldp    x19, x20, [sp,#16]                            ▒
             │   │  ldp    x21, x22, [sp,#32]                            ▒
             │   │  ldp    x29, x30, [sp],#48                            ▒
             │   │← ret                                                  ▒
             │e4:│  mov    w19, #0x10000                   // #65536     ▒
             │   └──b      90                                            ◆
             │ec:   brk    #0x800                                        ▒
      Press 'h' for help on key bindings
      Signed-off-by: NKim Phillips <kim.phillips@arm.com>
      Signed-off-by: NChris Ryder <chris.ryder@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/20161130092344.012e18e3e623bea395162f95@arm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0fcb1da4
    • K
      perf annotate: Use arch->objdump.comment_char in dec__parse() · 859afa6c
      Kim Phillips 提交于
      Presume neglected in commit 786c1b51 "perf annotate: Start supporting
      cross arch annotation".  This doesn't fix a bug since none of the
      affected arches support parsing dec/inc instructions yet.
      Signed-off-by: NKim Phillips <kim.phillips@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chris Ryder <chris.ryder@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/20161130092333.1cca5dd2c77e1790d61c1e9c@arm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      859afa6c
    • D
      perf tools: Move parse_nsec_time to time-utils.c · c284d669
      David Ahern 提交于
      Code move only; no functional change intended.
      
      Committer notes:
      
      Fix the build on Ubuntu 16.04 x86-64 cross-compiling to S/390, with this
      set of auto-detected features:
      
        ...                         dwarf: [ on  ]
        ...            dwarf_getlocations: [ on  ]
        ...                         glibc: [ on  ]
        ...                          gtk2: [ OFF ]
        ...                      libaudit: [ OFF ]
        ...                        libbfd: [ OFF ]
        ...                        libelf: [ on  ]
        ...                       libnuma: [ OFF ]
        ...        numa_num_possible_cpus: [ OFF ]
        ...                       libperl: [ OFF ]
        ...                     libpython: [ OFF ]
        ...                      libslang: [ OFF ]
        ...                     libcrypto: [ OFF ]
        ...                     libunwind: [ OFF ]
        ...            libdw-dwarf-unwind: [ on  ]
        ...                          zlib: [ on  ]
        ...                          lzma: [ OFF ]
        ...                     get_cpuid: [ OFF ]
        ...                           bpf: [ on  ]
      
      Where it was failing with:
      
          CC       /tmp/build/perf/util/time-utils.o
        util/time-utils.c: In function 'parse_nsec_time':
        util/time-utils.c:17:13: error: implicit declaration of function 'strtoul' [-Werror=implicit-function-declaration]
          time_sec = strtoul(str, &end, 10);
                     ^
        util/time-utils.c:17:2: error: nested extern declaration of 'strtoul' [-Werror=nested-externs]
          time_sec = strtoul(str, &end, 10);
          ^
        util/time-utils.c: In function 'perf_time__parse_str':
        util/time-utils.c:93:2: error: implicit declaration of function 'free' [-Werror=implicit-function-declaration]
          free(str);
          ^
        util/time-utils.c:93:2: error: incompatible implicit declaration of built-in function 'free' [-Werror]
        util/time-utils.c:93:2: note: include '<stdlib.h>' or provide a declaration of 'free'
      
      Do as suggested and add a '#include <stdlib.h>' to get the free() and strtoul()
      declarations and fix the build.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1480439746-42695-3-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c284d669
    • D
      perf tools: Add time-based utility functions · fdf9dc4b
      David Ahern 提交于
      Add function to parse a user time string of the form <start>,<stop>
      where start and stop are time in sec.nsec format. Both start and stop
      times are optional.
      
      Add function to determine if a sample time is within a given time
      time window of interest.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1480439746-42695-2-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fdf9dc4b
  9. 30 11月, 2016 1 次提交
    • D
      perf script: Add option to stop printing callchain · 64eff7d9
      David Ahern 提交于
      Allow user to specify list of symbols which cause the dump of callchains
      to stop at that symbol.
      
      Committer notes:
      
      Testing it:
      
        # perf record -ag usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.177 MB perf.data (33 samples) ]
        #
        # # Without it:
        #
        # perf script
        swapper   0 [000]  9693.370039:          1 cycles:ppp:
                        2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                       137ffeb start_kernel ([kernel.vmlinux].init.text)
                       137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
                       137f419 x86_64_start_kernel ([kernel.vmlinux].init.text)
      
        swapper   0 [000]  9693.370044:          1 cycles:ppp:
                        20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                       137ffeb start_kernel ([kernel.vmlinux].init.text)
                       137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
        #
        # # Using it to see just what are the calls from the 'remote_function' function:
        #
        # perf script --stop-bt remote_function
        swapper   0 [000]  9693.370039:          1 cycles:ppp:
                        2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
      
        swapper   0 [000]  9693.370044:          1 cycles:ppp:
                        20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                        3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1480104021-36275-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      64eff7d9
  10. 29 11月, 2016 1 次提交
    • W
      perf tools: Introduce perf hooks · a074865e
      Wang Nan 提交于
      Perf hooks allow hooking user code at perf events. They can be used for
      manipulation of BPF maps, taking snapshot and reporting results. In this
      patch two perf hook points are introduced: record_start and record_end.
      
      To avoid buggy user actions, a SIGSEGV signal handler is introduced into
      'perf record'. It turns off perf hook if it causes a segfault and report
      an error to help debugging.
      
      A test case for perf hook is introduced.
      
      Test result:
        $ ./buildperf/perf test -v hook
        50: Test perf hooks                                          :
        --- start ---
        test child forked, pid 10311
        SIGSEGV is observed as expected, try to recover.
        Fatal error (SEGFAULT) in perf hook 'test'
        test child finished with 0
        ---- end ----
        Test perf hooks: Ok
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Joe Stringer <joe@ovn.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/20161126070354.141764-5-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a074865e
  11. 25 11月, 2016 10 次提交
    • W
      perf tools: Add missing struct definition in probe_event.h · d6be1671
      Wang Nan 提交于
      Commit 0b3c2264 ("perf symbols: Fix kallsyms perf test on ppc64le")
      refers struct symbol in probe_event.h, but forgets to include its
      definition.  Gcc will complain about it when that definition is not
      added, by sheer luck, by some other header included before
      probe_event.h.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/20161115040617.69788-4-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d6be1671
    • W
      perf record: Fix segfault when running with suid and kptr_restrict is 1 · 3dbe46c5
      Wang Nan 提交于
      Before this patch perf panics if kptr_restrict is set to 1 and perf is
      owned by root with suid set:
      
        $ whoami
        wangnan
        $ ls -l ./perf
        -rwsr-xr-x 1 root root 19781908 Sep 21 19:29 /home/wangnan/perf
        $ cat /proc/sys/kernel/kptr_restrict
        1
        $ cat /proc/sys/kernel/perf_event_paranoid
        -1
        $ ./perf record -a
        Segmentation fault (core dumped)
        $
      
      The reason is that perf assumes it is allowed to read kptr from
      /proc/kallsyms when euid is root, but in fact the kernel doesn't allow
      reading kptr when euid and uid do not match with each other:
      
        $ cp /bin/cat .
        $ sudo chown root:root ./cat
        $ sudo chmod u+s ./cat
        $ cat /proc/kallsyms | grep do_fork
        0000000000000000 T _do_fork          <--- kptr is hidden even euid is root
        $ sudo cat /proc/kallsyms | grep do_fork
        ffffffff81080230 T _do_fork
      
      See lib/vsprintf.c for kernel side code.
      
      This patch fixes this problem by checking both uid and euid.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/20161115040617.69788-3-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3dbe46c5
    • W
      perf tools: Fix kernel version error in ubuntu · d18acd15
      Wang Nan 提交于
      On ubuntu the internal kernel version code is different from what can
      be retrived from uname:
      
       $ uname -r
       4.4.0-47-generic
       $ cat /lib/modules/`uname -r`/build/include/generated/uapi/linux/version.h
       #define LINUX_VERSION_CODE 263192
       #define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))
       $ cat /lib/modules/`uname -r`/build/include/generated/utsrelease.h
       #define UTS_RELEASE "4.4.0-47-generic"
       #define UTS_UBUNTU_RELEASE_ABI 47
       $ cat /proc/version_signature
       Ubuntu 4.4.0-47.68-generic 4.4.24
      
      The macro LINUX_VERSION_CODE is set to 4.4.24 (263192 == 0x40418), but
      `uname -r` reports 4.4.0.
      
      This mismatch causes LINUX_VERSION_CODE macro passed to BPF script become
      an incorrect value, results in magic failure in BPF loading:
      
       $ sudo ./buildperf/perf record -e ./tools/perf/tests/bpf-script-example.c ls
       event syntax error: './tools/perf/tests/bpf-script-example.c'
                            \___ Failed to load program for unknown reason
      
      According to Ubuntu document (https://wiki.ubuntu.com/Kernel/FAQ), the
      correct kernel version can be retrived through /proc/version_signature, which
      is ubuntu specific.
      
      This patch checks the existance of /proc/version_signature, and returns
      version number through parsing this file instead of uname. Version string
      is untouched (value returns from uname) because `uname -r` is required
      to be consistence with path of kbuild directory in /lib/module.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/20161115040617.69788-2-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d18acd15
    • N
      perf sched timehist: Mark schedule function in callchains · cdeb01bf
      Namhyung Kim 提交于
      The sched_switch event always captured from the scheduler function.  So
      it'd be great omit them from the callchain.  This patch marks the
      functions to be omitted by later patch.
      
      Committer notes:
      
      Testing it:
      
      Before:
      
        [root@jouet experimental]# perf sched record -g ls
        Dockerfile  perf.data  x-mips64
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.355 MB perf.data (29 samples) ]
        [root@jouet experimental]# perf sched timehist
            time  cpu  task name         wait time sch delay run time
                       [tid/pid]             (msec) (msec) (msec)
        ----------- -----  ----------------- ------ ------ ------
        6.494998 [001] <idle>                0.000  0.000  0.000
        6.495027 [002] perf[519]             0.000  0.000  0.000 __schedule <- schedule <- schedule_hrtimeout_range_clock <- schedule_hrtimeou
        6.495096 [003] <idle>                0.000  0.000  0.000
        6.495100 [003] rcuos/0[9]            0.000  0.005  0.003 __schedule <- schedule <- rcu_nocb_kthread <- kthread <- ret_from_fork
        6.495113 [001] perf[520]             0.000  0.008  0.114 __schedule <- preempt_schedule_common <- _cond_resched <- wait_for_completion
        6.495121 [000] <idle>                0.000  0.000  0.000
        6.495129 [001] migration/1[17]       0.000  0.003  0.016 __schedule <- schedule <- smpboot_thread_fn <- kthread <- ret_from_fork
        6.496085 [002] <idle>                0.000  0.000  1.057
        6.496096 [002] kworker/u16:1[31169]  0.000  0.004  0.011 __schedule <- schedule <- worker_thread <- kthread <- ret_from_fork
        6.496096 [003] <idle>                0.003  0.000  0.996
        6.496169 [002] <idle>                0.011  0.000  0.072
        6.496171 [000] ls[520]               0.008  0.000  1.049 __schedule <- schedule <- do_exit <- do_group_exit <- [unknown]
        6.496172 [003] gnome-terminal-[4391] 0.000  0.003  0.076 __schedule <- schedule <- schedule_hrtimeout_range_clock <- schedule_hrtimeo
      
      After:
      
        [root@jouet experimental]# perf sched timehist
            time  cpu  task name         wait time sch delay run time
                       [tid/pid]            (msec)  (msec)  (msec)
        ----------- -----  ----------------- -----  -----  ------
        6.494998 [001] <idle>                0.000  0.000  0.000
        6.495027 [002] perf[519]             0.000  0.000  0.000 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_t
        6.495096 [003] <idle>                0.000  0.000  0.000
        6.495100 [003] rcuos/0[9]            0.000  0.005  0.003 rcu_nocb_kthread <- kthread <- ret_from_fork
        6.495113 [001] perf[520]             0.000  0.008  0.114 preempt_schedule_common <- _cond_resched <- wait_for_completion <- stop_one_c
        6.495121 [000] <idle>                0.000  0.000  0.000
        6.495129 [001] migration/1[17]       0.000  0.003  0.016 smpboot_thread_fn <- kthread <- ret_from_fork
        6.496085 [002] <idle>                0.000  0.000  1.057
        6.496096 [002] kworker/u16:1[31169]  0.000  0.004  0.011 worker_thread <- kthread <- ret_from_fork
        6.496096 [003] <idle>                0.003  0.000  0.996
        6.496169 [002] <idle>                0.011  0.000  0.072
        6.496171 [000] ls[520]               0.008  0.000  1.049 do_exit <- do_group_exit <- [unknown]
        6.496172 [003] gnome-terminal-[4391] 0.000  0.003  0.076 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_
        [root@jouet experimental]#
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20161124011114.7102-1-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cdeb01bf
    • N
      perf callchain: Add option to skip ignore symbol when printing callchains · 2d9bbf6e
      Namhyung Kim 提交于
      For tracepoint events, callchains always contain certain functions.
      Sometimes it'd be better to skip those functions as they have no value.
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20161124011114.7102-2-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2d9bbf6e
    • R
      perf annotate: Initial PowerPC support · dbdebdc5
      Ravi Bangoria 提交于
      Support the PowerPC architecture using the ins_ops association
      method.
      
      Committer notes:
      
      Testing it with a perf.data file collected on a PowerPC machine and
      cross-annotated on a x86_64 workstation, using the associated vmlinux
      file:
      
      $ perf report -i perf.data.f22vm.powerdev --vmlinux vmlinux.powerpc
        .ktime_get  vmlinux.powerpc
              │      clrldi r9,r28,63
         8.57 │   ┌──bne    e0                   <- TUI cursor positioned here
              │54:│  lwsync
         2.86 │   │  std    r2,40(r1)
              │   │  ld     r9,144(r31)
              │   │  ld     r3,136(r31)
              │   │  ld     r30,184(r31)
              │   │  ld     r10,0(r9)
              │   │  mtctr  r10
              │   │  ld     r2,8(r9)
         8.57 │   │→ bctrl
              │   │  ld     r2,40(r1)
              │   │  ld     r10,160(r31)
              │   │  ld     r5,152(r31)
              │   │  lwz    r7,168(r31)
              │   │  ld     r9,176(r31)
         8.57 │   │  lwz    r6,172(r31)
              │   │  lwsync
         2.86 │   │  lwz    r8,128(r31)
              │   │  cmpw   cr7,r8,r28
         2.86 │   │↑ bne    48
              │   │  subf   r10,r10,r3
              │   │  mr     r3,r29
              │   │  and    r10,r10,r5
         2.86 │   │  mulld  r10,r10,r7
              │   │  add    r9,r10,r9
              │   │  srd    r9,r9,r6
              │   │  add    r9,r9,r30
              │   │  std    r9,0(r29)
              │   │  addi   r1,r1,144
              │   │  ld     r0,16(r1)
              │   │  ld     r28,-32(r1)
              │   │  ld     r29,-24(r1)
              │   │  ld     r30,-16(r1)
              │   │  mtlr   r0
              │   │  ld     r31,-8(r1)
              │   │← blr
         5.71 │e0:└─→mr     r1,r1
        11.43 │      mr     r2,r2
        11.43 │      lwz    r28,128(r31)
        Press 'h' for help on key bindings
      
        $ perf report -i perf.data.f22vm.powerdev --header-only
        # ========
        # captured on: Thu Nov 24 12:40:38 2016
        # hostname : pdev-f22-qemu
        # os release : 4.4.10-200.fc22.ppc64
        # perf version : 4.9.rc1.g6298ce
        # arch : ppc64
        # nrcpus online : 48
        # nrcpus avail : 48
        # cpudesc : POWER7 (architected), altivec supported
        # cpuid : 74,513
        # total memory : 4158976 kB
        # cmdline : /home/ravi/Workspace/linux/tools/perf/perf record -a
        # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1
        # HEADER_CPU_TOPOLOGY info available, use -I to display
        # HEADER_NUMA_TOPOLOGY info available, use -I to display
        # pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
        # missing features: HEADER_TRACING_DATA HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT HEADER_CACHE
        # ========
        #
        $
      Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Link: http://lkml.kernel.org/n/tip-tbjnp40ddoxxl474uvhwi6g4@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      dbdebdc5
    • A
      perf annotate: Improve support for ARM · acc9bfb5
      Arnaldo Carvalho de Melo 提交于
      By using arch->init() to set up some regular expressions to associate
      ins_ops to ARM instructions, ditching that old table that has
      instructions not present on ARM.
      
      Take advantage of having an arch->init() to hide more arm specific stuff
      from the common code, like the objdump details.
      
      The regular expressions comes from a patch written by Kim Phillips.
      Reviewed-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chris Riyder <chris.ryder@arm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-77m7lufz9ajjimkrebtg5ead@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      acc9bfb5
    • A
      perf annotate: Allow arches to have a init routine and a priv area · 0781ea92
      Arnaldo Carvalho de Melo 提交于
      Arches like ARM will want to use regular expressions when deciding what
      instructions to associate with what ins_ops, provide infrastructure for
      that.
      Reviewed-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chris Riyder <chris.ryder@arm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-7dmnk9el2ipu3nxog092k9z5@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0781ea92
    • A
      perf annotate: Introduce alternative method of keeping instructions table · 2a1ff812
      Arnaldo Carvalho de Melo 提交于
      Some arches may want to dynamically populate the table using regular
      expressions on the instruction names to associate them with a set of
      parsing/formatting/etc functions (struct ins_ops), so provide a fallback
      for when the ins__find() method fails.
      
      That fall back will be able to resize the arch->instructions, setting
      arch->nr_instructions appropriately, helper functions to associate an
      ins_ops to an instruction name, growing the arch->instructions if needed
      and resorting it are provided, all the arch specific callback needs to
      do is to decide if the missing instruction should be added to
      arch->instructions with a ins_ops association.
      Reviewed-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chris Riyder <chris.ryder@arm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-auu13yradxf7g5dgtpnzt97a@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2a1ff812
    • A
      perf annotate: Remove duplicate 'name' field from disasm_line · 75b49202
      Arnaldo Carvalho de Melo 提交于
      The disasm_line::name field is always equal to ins::name, being used
      just to locate the instruction's ins_ops from the per-arch instructions
      table.
      
      Eliminate this duplication, nuking that field and instead make
      ins__find() return an ins_ops, store it in disasm_line::ins.ops, and
      keep just in disasm_line::ins.name what was in disasm_line::name, this
      way we end up not keeping a reference to entries in the per-arch
      instructions table.
      
      This in turn will help supporting multiple ways to manage the per-arch
      instructions table, allowing resorting that array, for instance, when
      the entries will move after references to its addresses were made. The
      same problem is avoided when one grows the array with realloc.
      
      So architectures simply keeping a constant array will work as well as
      architectures building the table using regular expressions or other
      logic that involves resorting the table.
      Reviewed-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chris Riyder <chris.ryder@arm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-vr899azvabnw9gtuepuqfd9t@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      75b49202