1. 12 Apr, 2018 7 commits
    • J
      perf script: Use HAVE_LIBXXX_SUPPORT to replace NO_LIBXXX · 90ce61b9
      Jin Yao committed
      In Makefile.config, we define the conditional compilation variables
      HAVE_LIBPERL_SUPPORT and HAVE_LIBPYTHON_SUPPORT.
      
      To make the C code more consistent, this patch replaces
      NO_LIBPERL/NO_LIBPYTHON in C code with HAVE_LIBPERL_SUPPORT/
      HAVE_LIBPYTHON_SUPPORT.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Suggested-by: Ingo Molnar <mingo@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1523269609-28824-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
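As a rough illustration of the convention this commit describes, a guard written against the positive HAVE_* macros might look like the sketch below. Only the macro names HAVE_LIBPERL_SUPPORT and HAVE_LIBPYTHON_SUPPORT come from the commit message; the function name and the bit flags are hypothetical and are not taken from the perf sources.

```c
#include <assert.h>

/* Hypothetical bit flags for this sketch; not taken from the perf sources. */
#define LANG_PERL   (1u << 0)
#define LANG_PYTHON (1u << 1)

/*
 * Positive-logic guards: a feature is compiled in when the build system
 * defines HAVE_LIBPERL_SUPPORT / HAVE_LIBPYTHON_SUPPORT, rather than
 * being compiled out when NO_LIBPERL / NO_LIBPYTHON is defined.
 */
static unsigned int script_langs_available(void)
{
	unsigned int langs = 0;

#ifdef HAVE_LIBPERL_SUPPORT
	langs |= LANG_PERL;
#endif
#ifdef HAVE_LIBPYTHON_SUPPORT
	langs |= LANG_PYTHON;
#endif
	/* Only the two known bits can ever be set. */
	assert((langs & ~(LANG_PERL | LANG_PYTHON)) == 0);
	return langs;
}
```

Testing for a HAVE_* macro keeps the C code in line with the names already defined in Makefile.config, so a reader does not have to invert a NO_* condition mentally.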
    • A
      perf tests bpf: Remove unused ptrace.h include from LLVM test · c13009c1
      Arnaldo Carvalho de Melo committed
      The bpf-script-test-kbuild.c script, used in one of the LLVM subtests,
      includes ptrace.h unnecessarily, and that ends up making it include a
      header that uses asm(_ASM_SP), a feature that is not supported by clang
      <= 4.0, breaking that 'perf test' entry.
      
      This ended up leading to the ca26cffa ("x86/asm: Allow again using
      asm.h when building for the 'bpf' clang target"), adding an ifndef
      __BPF__ to the arch/x86/include/asm/asm.h file.
      
      Newer clang versions accept that asm(_ASM_SP) construct, so just remove
      the ptrace.h include, which paves the way for reverting ca26cffa
      ("x86/asm: Allow again using asm.h when building for the 'bpf' clang
      target").
      Suggested-by: Yonghong Song <yhs@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Link: https://lkml.kernel.org/r/613f0a0d-c433-8f4d-dcc1-c9889deae39e@fb.com
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-clbcnzbakdp18ibme4wt43ib@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf jvmti: Give hints about package names needed to build · e14b733c
      Arnaldo Carvalho de Melo committed
      Give examples of package names to install so that this gets built on
      Fedora and Debian, to help the user a bit.
      
      The part from 'e.g.:' onwards:
      
        No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel
      
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-edbi4r2pvzn7no6ebxbtczng@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf annotate browser: Allow showing offsets in more than just jump targets · 51f39603
      Arnaldo Carvalho de Melo committed
      Jesper wanted to see offsets at callq sites when doing some performance
      investigation related to retpolines, so save him some time by providing
      an 'O' hotkey that shows offsets from the function start at call
      instructions or at all instructions. Keep pressing 'O' until the
      offsets you need appear.
      
      Example:
      
      Starts with:
      
      Samples: 64  of event 'cycles:ppp', 100000 Hz, Event count (approx.): 318963
      ixgbe_read_reg  /proc/kcore
      Percent│    ↑ je     2a
             │   ┌──cmp    $0xffffffff,%r13d
             │   ├──je     d0
             │   │  mov    $0x53e3,%edi
             │   │→ callq  __const_udelay
             │   │  sub    $0x1,%r15d
             │   │↑ jne    83
             │   │  mov    0x8(%rbp),%rax
             │   │  testb  $0x20,0x1799(%rax)
             │   │↑ je     2a
             │   │  mov    0x200(%rax),%rdi
             │   │  mov    %r13d,%edx
             │   │  mov    $0xffffffffc02595d8,%rsi
             │   │→ callq  netdev_warn
             │   │↑ jmpq   2a
             │d0:└─→mov    0x8(%rbp),%rsi
             │      mov    %rbp,%rdi
             │      mov    %eax,0x4(%rsp)
             │    → callq  ixgbe_remove_adapter.isra.77
             │      mov    0x4(%rsp),%eax
      Press 'h' for help on key bindings
      ============================================================================
      
      Press 'O':
      
      Samples: 64  of event 'cycles:ppp', 100000 Hz, Event count (approx.): 318963
      ixgbe_read_reg  /proc/kcore
      Percent│    ↑ je     2a
             │   ┌──cmp    $0xffffffff,%r13d
             │   ├──je     d0
             │   │  mov    $0x53e3,%edi
             │99:│→ callq  __const_udelay
             │   │  sub    $0x1,%r15d
             │   │↑ jne    83
             │   │  mov    0x8(%rbp),%rax
             │   │  testb  $0x20,0x1799(%rax)
             │   │↑ je     2a
             │   │  mov    0x200(%rax),%rdi
             │   │  mov    %r13d,%edx
             │   │  mov    $0xffffffffc02595d8,%rsi
             │c6:│→ callq  netdev_warn
             │   │↑ jmpq   2a
             │d0:└─→mov    0x8(%rbp),%rsi
             │      mov    %rbp,%rdi
             │      mov    %eax,0x4(%rsp)
             │db: → callq  ixgbe_remove_adapter.isra.77
             │      mov    0x4(%rsp),%eax
      Press 'h' for help on key bindings
      ============================================================================
      
      Press 'O' again:
      
      Samples: 64  of event 'cycles:ppp', 100000 Hz, Event count (approx.): 318963
      ixgbe_read_reg  /proc/kcore
      Percent│8c: ↑ je     2a
             │8e:┌──cmp    $0xffffffff,%r13d
             │92:├──je     d0
             │94:│  mov    $0x53e3,%edi
             │99:│→ callq  __const_udelay
             │9e:│  sub    $0x1,%r15d
             │a2:│↑ jne    83
             │a4:│  mov    0x8(%rbp),%rax
             │a8:│  testb  $0x20,0x1799(%rax)
             │af:│↑ je     2a
             │b5:│  mov    0x200(%rax),%rdi
             │bc:│  mov    %r13d,%edx
             │bf:│  mov    $0xffffffffc02595d8,%rsi
             │c6:│→ callq  netdev_warn
             │cb:│↑ jmpq   2a
             │d0:└─→mov    0x8(%rbp),%rsi
             │d4:   mov    %rbp,%rdi
             │d7:   mov    %eax,0x4(%rsp)
             │db: → callq  ixgbe_remove_adapter.isra.77
             │e0:   mov    0x4(%rsp),%eax
      Press 'h' for help on key bindings
      ============================================================================
      
      Press 'O' again and it will show just jump target offsets.
      Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-upp6pfdetwlsx18ec2uf1od4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
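The hotkey behaviour shown in the listings for this commit (jump targets only, then jump targets plus calls, then every instruction, then back to jump targets) can be sketched as a small wrap-around state machine. The enum and function names below are hypothetical and chosen for illustration; they are not claimed to match the identifiers used in the perf sources.

```c
#include <assert.h>

/* Hypothetical names for the three display levels described above. */
enum offset_level {
	OFFSET_JUMP_TARGETS = 1, /* default: offsets only on jump targets */
	OFFSET_CALLS,            /* jump targets plus call instructions */
	OFFSET_ALL,              /* every instruction */
};

/* Advance to the next level, wrapping back to the default after the last. */
static enum offset_level next_offset_level(enum offset_level cur)
{
	assert(cur >= OFFSET_JUMP_TARGETS && cur <= OFFSET_ALL);
	return cur == OFFSET_ALL ? OFFSET_JUMP_TARGETS
				 : (enum offset_level)(cur + 1);
}
```

Each press of the key would then map to one call of `next_offset_level()`, so three presses return the display to its starting state.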
    • A
      perf annotate: Allow showing offsets in more than just jump targets · 592c10e2
      Arnaldo Carvalho de Melo committed
      Jesper wanted to see offsets at callq sites when doing some performance
      investigation related to retpolines, so save him some time by providing
      a 'struct annotation_options' to control where offsets should appear:
      just on jump targets? That + call instructions? All?
      
      This puts in place the logic to show the offsets. Next we need to wire
      this up in the TUI browser (next patch) and in the 'perf annotate --stdio2'
      interface, where we need a more general mechanism to set up the
      'annotation_options' struct from the command line.
      Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-m3jc9c3swobye9tj08gnh5i7@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
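The three questions in this commit message (just jump targets? jump targets plus calls? all?) amount to a per-line decision that depends on one option value. A minimal sketch of such a predicate follows; the enum, the `line_info` struct and `show_offset()` are hypothetical stand-ins for illustration, not the layout of the real 'struct annotation_options'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option values, ordered from least to most verbose. */
enum offset_level {
	OFFSET_JUMP_TARGETS = 1, /* offsets only on jump targets */
	OFFSET_CALLS,            /* jump targets plus call instructions */
	OFFSET_ALL,              /* every instruction */
};

/* Hypothetical per-line facts a renderer would already know. */
struct line_info {
	bool is_jump_target;
	bool is_call;
};

/* Decide whether the offset column is printed for one disassembly line. */
static bool show_offset(enum offset_level level, const struct line_info *line)
{
	assert(line != NULL);
	if (line->is_jump_target)
		return true;                  /* shown at every level */
	if (line->is_call)
		return level >= OFFSET_CALLS; /* shown from the second level on */
	return level >= OFFSET_ALL;           /* plain instructions: last level only */
}
```

Keeping the decision in a single function like this would let the TUI browser and a stdio renderer share the same rule while setting the level in different ways.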
    • K
      perf tests: Run dwarf unwind test on arm32 · af72cfb8
      Kim Phillips committed
      Enable the unwind test on arm32:
      
        $ perf test unwind
        58: DWARF unwind                                          : Ok
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Brian Robbins <brianrob@microsoft.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180410191624.a3a468670dd4548c66d3d094@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf stat: Enable 1ms interval for printing event counters values · 9dc9a95f
      Alexey Budankov committed
      Currently the print interval for performance counter values is
      limited to a minimum of 10ms, so the tool cannot read the values at
      frequencies higher than 100Hz.
      
      This change makes perf stat -I usable at frequencies up to 1KHz and,
      to some extent, puts perf stat -I on par with perf record
      sampling profiling.
      
      When running perf stat -I to monitor e.g. PCIe uncore counters while
      at the same time profiling some I/O workload with perf record, e.g. for
      cpu-cycles and context switches, it is then possible to observe a
      consolidated CPU/OS/IO (uncore) performance picture for that workload.
      
      The tool overhead warning printed when specifying the -v option can be
      missed due to screen scrolling when output goes to the console, so the
      message is moved into the help available by running perf stat -h.
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/b842ad6a-d606-32e4-afe5-974071b5198e@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
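The relation between the -I interval and the printing frequency in this commit is simple arithmetic: 10 ms corresponds to 100 Hz and 1 ms to 1 KHz. The sketch below spells that out together with a lower-bound check. The function names and the choice to reject an interval of 0 are assumptions made for illustration and do not describe how perf stat validates its arguments.

```c
#include <assert.h>

/* Lower bound described in the commit message; it was 10 ms before. */
#define MIN_INTERVAL_MS 1u

/* Return 0 when the interval is accepted, -1 when it is too small. */
static int check_interval_ms(unsigned int interval_ms)
{
	return interval_ms >= MIN_INTERVAL_MS ? 0 : -1;
}

/* Printing frequency in Hz implied by an interval in milliseconds. */
static unsigned int interval_to_hz(unsigned int interval_ms)
{
	assert(interval_ms > 0);
	return 1000u / interval_ms;
}
```

With these helpers, `interval_to_hz(10)` gives the old ceiling of 100 Hz and `interval_to_hz(1)` gives the new ceiling of 1000 Hz.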
  2. 09 Apr, 2018 4 commits
    • S
      perf tests clang: Fix function name for clang IR test · fcbd8fa4
      Sandipan Das committed
      As stated in tests/llvm-src-base.c, the name of the bpf function should
      be "bpf_func__SyS_epoll_pwait" but this clang test fails as it tries to
      look up "bpf_func__SyS_epoll_wait".
      
      Before applying patch:
      
      55: builtin clang support                                 :
      55.1: builtin clang compile C source to IR                : FAILED!
      55.2: builtin clang compile C source to ELF object        : Skip
      
      After applying patch:
      
      55: builtin clang support                                 :
      55.1: builtin clang compile C source to IR                : Ok
      55.2: builtin clang compile C source to ELF object        : Ok
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Fixes: e67d52d4 ("perf clang: Update test case to use real BPF script")
      Link: http://lkml.kernel.org/r/20180404180419.19056-3-sandipan@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • S
      perf clang: Add support for recent clang versions · 7854e499
      Sandipan Das committed
      The clang API calls used by perf have changed in recent releases and
      builds succeed with libclang-3.9 only. This introduces compatibility
      with libclang-4.0 and above.
      
      Without this patch, we will see the following compilation errors with
      libclang-4.0+:
      
       util/c++/clang.cpp: In function ‘clang::CompilerInvocation* perf::createCompilerInvocation(llvm::opt::ArgStringList, llvm::StringRef&, clang::DiagnosticsEngine&)’:
       util/c++/clang.cpp:62:33: error: ‘IK_C’ was not declared in this scope
         Opts.Inputs.emplace_back(Path, IK_C);
                                        ^~~~
       util/c++/clang.cpp: In function ‘std::unique_ptr<llvm::Module> perf::getModuleFromSource(llvm::opt::ArgStringList, llvm::StringRef, llvm::IntrusiveRefCntPtr<clang::vfs::FileSystem>)’:
       util/c++/clang.cpp:75:26: error: no matching function for call to ‘clang::CompilerInstance::setInvocation(clang::CompilerInvocation*)’
         Clang.setInvocation(&*CI);
                                 ^
       In file included from util/c++/clang.cpp:14:0:
       /usr/include/clang/Frontend/CompilerInstance.h:231:8: note: candidate: void clang::CompilerInstance::setInvocation(std::shared_ptr<clang::CompilerInvocation>)
          void setInvocation(std::shared_ptr<CompilerInvocation> Value);
               ^~~~~~~~~~~~~
      
      Committer testing:
      
      Tested on Fedora 27 after installing the clang-devel and llvm-devel
      packages, versions:
      
        # rpm -qa | egrep llvm\|clang
        llvm-5.0.1-6.fc27.x86_64
        clang-libs-5.0.1-5.fc27.x86_64
        clang-5.0.1-5.fc27.x86_64
        clang-tools-extra-5.0.1-5.fc27.x86_64
        llvm-libs-5.0.1-6.fc27.x86_64
        llvm-devel-5.0.1-6.fc27.x86_64
        clang-devel-5.0.1-5.fc27.x86_64
        #
      
      Make sure you don't have some older version lying around in /usr/local,
      etc, then:
      
        $ make LIBCLANGLLVM=1 -C tools/perf install-bin
      
      And in the end perf will be linked against these libraries:
      
        # ldd ~/bin/perf | egrep -i llvm\|clang
      	libclangAST.so.5 => /lib64/libclangAST.so.5 (0x00007f8bb2eb4000)
      	libclangBasic.so.5 => /lib64/libclangBasic.so.5 (0x00007f8bb29e3000)
      	libclangCodeGen.so.5 => /lib64/libclangCodeGen.so.5 (0x00007f8bb23f7000)
      	libclangDriver.so.5 => /lib64/libclangDriver.so.5 (0x00007f8bb2060000)
      	libclangFrontend.so.5 => /lib64/libclangFrontend.so.5 (0x00007f8bb1d06000)
      	libclangLex.so.5 => /lib64/libclangLex.so.5 (0x00007f8bb1a3e000)
      	libclangTooling.so.5 => /lib64/libclangTooling.so.5 (0x00007f8bb17d4000)
      	libclangEdit.so.5 => /lib64/libclangEdit.so.5 (0x00007f8bb15c5000)
      	libclangSema.so.5 => /lib64/libclangSema.so.5 (0x00007f8bb0cc9000)
      	libclangAnalysis.so.5 => /lib64/libclangAnalysis.so.5 (0x00007f8bb0a23000)
      	libclangParse.so.5 => /lib64/libclangParse.so.5 (0x00007f8bb0725000)
      	libclangSerialization.so.5 => /lib64/libclangSerialization.so.5 (0x00007f8bb039a000)
      	libLLVM-5.0.so => /lib64/libLLVM-5.0.so (0x00007f8bace98000)
      	libclangASTMatchers.so.5 => /lib64/../lib64/libclangASTMatchers.so.5 (0x00007f8bab735000)
      	libclangFormat.so.5 => /lib64/../lib64/libclangFormat.so.5 (0x00007f8bab4b2000)
      	libclangRewrite.so.5 => /lib64/../lib64/libclangRewrite.so.5 (0x00007f8bab2a1000)
      	libclangToolingCore.so.5 => /lib64/../lib64/libclangToolingCore.so.5 (0x00007f8bab08e000)
        #
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Fixes: 00b86691 ("perf clang: Add builtin clang support ant test case")
      Link: http://lkml.kernel.org/r/20180404180419.19056-2-sandipan@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • S
      perf tools: Fix perf builds with clang support · c2fb54a1
      Sandipan Das committed
      For libclang, some distro packages provide static libraries (.a) while
      some provide shared libraries (.so). Currently, perf code can only be
      linked with static libraries. This makes perf build possible for both
      cases.
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Fixes: d58ac0bf ("perf build: Add clang and llvm compile and linking support")
      Link: http://lkml.kernel.org/r/20180404180419.19056-1-sandipan@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf tools: No need to include namespaces.h in util.h · ad0902e0
      Arnaldo Carvalho de Melo committed
      The only thing that is needed there is a forward declaration for 'struct
      nsinfo', so disentangle this, which in turn allows built-in clang
      builds, i.e. 'make LIBCLANGLLVM=1 -C tools/perf'.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-vq26rsuwq1cqylpcyvq89c84@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 06 Apr, 2018 4 commits
  4. 05 Apr, 2018 4 commits
  5. 04 Apr, 2018 4 commits
    • C
      perf trace: Remove redundant ')' · 51125a29
      Changbin Du committed
      There is a redundant ')' at the tail of each event. So remove it.
      
      $ sudo perf trace --no-syscalls -e 'kmem:*' -a
         899.342 kmem:kfree:(vfs_writev+0xb9) call_site=ffffffff9c453979 ptr=(nil))
         899.344 kmem:kfree:(___sys_recvmsg+0x188) call_site=ffffffff9c9b8b88 ptr=(nil))
      Signed-off-by: Changbin Du <changbin.du@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1520937601-24952-1-git-send-email-changbin.du@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf annotate stdio2: Print more descriptive event information header · 520d3f01
      Arnaldo Carvalho de Melo committed
      To match the recently added event header information to --tui, e.g.:
      
        # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
        Samples: 128  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 48617682
        _raw_spin_lock_irqsave() /proc/kcore
          0.78        nop
          7.03        push   %rbx
          3.12        pushfq
          6.25        pop    %rax
                      nop
                      mov    %rax,%rbx
          3.12        cli
                      nop
                      xor    %eax,%eax
                      mov    $0x1,%edx
         79.69        lock   cmpxchg %edx,(%rdi)
                      test   %eax,%eax
                    ↓ jne    2b
                      mov    %rbx,%rax
                      pop    %rbx
                    ← retq
                2b:   mov    %eax,%esi
                    → callq  *ffffffffb30eaed0
                      mov    %rbx,%rax
                      pop    %rbx
                    ← retq
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-ujy46x7cldyhyxelyf2b9quy@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf annotate browser: Show extra title line with event information · 6920e285
      Arnaldo Carvalho de Melo committed
      So at the top we'll have two lines, like this, from 'perf report':
      
        # perf report --group --ignore-vmlinux
      =====================================================================================================
      Samples: 46  of events 'cycles', 4000 Hz, Event count (approx.): 5154895
      _raw_spin_lock_irqsave  /proc/kcore
      Percent              │      nop
                           │      push   %rbx
        0.00  14.29   0.00 │      pushfq
        9.09   0.00   0.00 │      pop    %rax
        9.09   0.00  20.00 │      nop
                           │      mov    %rax,%rbx
                           │      cli
        4.55   7.14   0.00 │      nop
                           │      xor    %eax,%eax
                           │      mov    $0x1,%edx
                           │      lock   cmpxchg %edx,(%rdi)
       77.27  78.57  70.00 │      test   %eax,%eax
                           │    ↓ jne    2b
                           │      mov    %rbx,%rax
        0.00   0.00  10.00 │      pop    %rbx
                           │    ← retq
                           │2b:   mov    %eax,%esi
                           │    → callq  queued_spin_lock_slowpath
                           │      mov    %rbx,%rax
                           │      pop    %rbx
      Press 'h' for help on key bindings
      =====================================================================================================
      
       9.09 + 9.09 + 4.55 + 77.27 = 100
      14.29 + 7.14 + 78.57 = 100
      20 + 70 + 10 = 100
      
      We can do the math by using 't' to toggle from 'percent' to period (nr_events):
      
      =====================================================================================================
      Samples: 46  of events 'cycles', 4000 Hz, Event count (approx.): 5154895
      _raw_spin_lock_irqsave  /proc/kcore
      Period                              │      nop
                                          │      push   %rbx
                0       79273           0 │      pushfq
           190455           0           0 │      pop    %rax
           198038           0        3045 │      nop
                                          │      mov    %rax,%rbx
                                          │      cli
           217233       32562           0 │      nop
                                          │      xor    %eax,%eax
                                          │      mov    $0x1,%edx
                                          │      lock   cmpxchg %edx,(%rdi)
          3421649      979174       28273 │      test   %eax,%eax
                                          │    ↓ jne    2b
                                          │      mov    %rbx,%rax
                0           0        5193 │      pop    %rbx
                                          │    ← retq
                                          │2b:   mov    %eax,%esi
                                          │    → callq  queued_spin_lock_slowpath
                                          │      mov    %rbx,%rax
                                          │      pop    %rbx
      Press 'h' for help on key bindings
      =====================================================================================================
      
      79273 + 190455 + 198038 + 3045 + 217233 + 32562 + 3421649 + 979174 + 28273 + 5193 = 5154895
      
      Or number of samples:
      
      =====================================================================================================
      Samples: 46  of events 'cycles', 4000 Hz, Event count (approx.): 5154895
      _raw_spin_lock_irqsave  /proc/kcore
      Samples              │      nop
                           │      push   %rbx
           0      2      0 │      pushfq
           2      0      0 │      pop    %rax
           2      0      2 │      nop
                           │      mov    %rax,%rbx
                           │      cli
           1      1      0 │      nop
                           │      xor    %eax,%eax
                           │      mov    $0x1,%edx
                           │      lock   cmpxchg %edx,(%rdi)
          17     11      7 │      test   %eax,%eax
                           │    ↓ jne    2b
                           │      mov    %rbx,%rax
           0      0      1 │      pop    %rbx
                           │    ← retq
                           │2b:   mov    %eax,%esi
                           │    → callq  queued_spin_lock_slowpath
                           │      mov    %rbx,%rax
                           │      pop    %rbx
      Press 'h' for help on key bindings
      =====================================================================================================
      
      2 + 2 + 2 + 2 + 1 + 1 + 17 + 11 + 7 + 1 = 46
      Suggested-by: Martin Liška <mliska@suse.cz>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935
      Link: https://lkml.kernel.org/n/tip-ezccyxld50wtwyt66np6aomo@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
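The numbers in this commit's listings appear consistent with each percentage being a line's sample count divided by the total for its event column (22, 14 and 10 samples in the three columns, 46 overall). The sketch below reproduces that arithmetic; the helper names are hypothetical and the code is not claimed to be how perf computes the column.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the per-line sample counts of one event column. */
static unsigned int sum_samples(const unsigned int *counts, size_t n)
{
	unsigned int total = 0;

	for (size_t i = 0; i < n; i++)
		total += counts[i];
	return total;
}

/* Share of one line within its event column, in percent. */
static double line_percent(unsigned int line_samples, unsigned int column_samples)
{
	assert(column_samples > 0);
	return 100.0 * line_samples / column_samples;
}
```

For the first column, the non-zero counts 2, 2, 1 and 17 sum to 22, and 17 out of 22 is roughly 77.27 percent, matching the value shown next to 'test %eax,%eax'.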
    • A
      perf annotate: Introduce annotation__scnprintf_samples_period() method · b213eac2
      Arnaldo Carvalho de Melo committed
      To print a string using the total period (nr_events) and the number of
      samples for a given annotation, i.e. for a given symbol, the counterpart
      to hists__scnprintf_samples_period(), that is for all the samples in a
      session (be it a live session, think 'perf top' or a perf.data file,
      think 'perf report').
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935
      Link: https://lkml.kernel.org/n/tip-goj2wu4fxutc8vd46mw3yg14@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  6. 03 Apr, 2018 9 commits
  7. 02 Apr, 2018 2 commits
    • A
      perf trace: Show only failing syscalls · 0a6545bd
      Arnaldo Carvalho de Melo committed
      For instance:
      
        # perf probe "vfs_getname=getname_flags:72 pathname=result->name:string"
        Added new event:
          probe:vfs_getname    (on getname_flags:72 with pathname=result->name:string)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe:vfs_getname -aR sleep 1
      
        # perf trace --failure sleep 1
           0.043 ( 0.010 ms): sleep/10978 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
      
      For reference, here are all the syscalls in this case:
      
        # perf trace sleep 1
               ? (         ): sleep/10976  ... [continued]: execve()) = 0
             0.027 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000
             0.044 ( 0.010 ms): sleep/10976 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
             0.057 ( 0.006 ms): sleep/10976 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
             0.064 ( 0.002 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b370) = 0
             0.067 ( 0.003 ms): sleep/10976 mmap(len: 111457, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec8615000
             0.071 ( 0.001 ms): sleep/10976 close(fd: 3) = 0
             0.080 ( 0.007 ms): sleep/10976 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
             0.088 ( 0.002 ms): sleep/10976 read(fd: 3, buf: 0x7fffac22b538, count: 832) = 832
             0.092 ( 0.001 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b3d0) = 0
             0.094 ( 0.002 ms): sleep/10976 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7feec8613000
             0.099 ( 0.004 ms): sleep/10976 mmap(len: 3889792, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3) = 0x7feec8057000
             0.104 ( 0.007 ms): sleep/10976 mprotect(start: 0x7feec8203000, len: 2097152) = 0
             0.112 ( 0.005 ms): sleep/10976 mmap(addr: 0x7feec8403000, len: 24576, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 1753088) = 0x7feec8403000
             0.120 ( 0.003 ms): sleep/10976 mmap(addr: 0x7feec8409000, len: 14976, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS|FIXED) = 0x7feec8409000
             0.128 ( 0.001 ms): sleep/10976 close(fd: 3) = 0
             0.139 ( 0.001 ms): sleep/10976 arch_prctl(option: 4098, arg2: 140663540761856) = 0
             0.186 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8403000, len: 16384, prot: READ) = 0
             0.204 ( 0.003 ms): sleep/10976 mprotect(start: 0x55bdc0ec3000, len: 4096, prot: READ) = 0
             0.209 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8631000, len: 4096, prot: READ) = 0
             0.214 ( 0.010 ms): sleep/10976 munmap(addr: 0x7feec8615000, len: 111457) = 0
             0.269 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000
             0.271 ( 0.002 ms): sleep/10976 brk(brk: 0x55bdc2d25000) = 0x55bdc2d25000
             0.274 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d25000
             0.278 ( 0.007 ms): sleep/10976 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
             0.288 ( 0.001 ms): sleep/10976 fstat(fd: 3</usr/lib/locale/locale-archive>, statbuf: 0x7feec8408aa0) = 0
             0.290 ( 0.003 ms): sleep/10976 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec1488000
             0.297 ( 0.001 ms): sleep/10976 close(fd: 3</usr/lib/locale/locale-archive>) = 0
             0.325 (1000.193 ms): sleep/10976 nanosleep(rqtp: 0x7fffac22c0b0) = 0
          1000.560 ( 0.006 ms): sleep/10976 close(fd: 1) = 0
          1000.573 ( 0.005 ms): sleep/10976 close(fd: 2) = 0
          1000.596 (         ): sleep/10976 exit_group()
        #
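Comparing the '--failure' run with the full listing above, the filter keeps only the one entry whose return value is an error (access() returning -1 ENOENT). A minimal sketch of that selection follows. The `syscall_exit` struct and both functions are hypothetical, and treating any negative return value as a failure is a simplifying assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one completed syscall; not perf's own type. */
struct syscall_exit {
	const char *name;
	long ret; /* negative values are treated as -errno in this sketch */
};

/* With failure_only set, print an entry only if the syscall failed. */
static bool should_print(const struct syscall_exit *sc, bool failure_only)
{
	assert(sc != NULL);
	return !failure_only || sc->ret < 0;
}

/* Count how many of the n entries would be printed. */
static size_t count_printed(const struct syscall_exit *v, size_t n,
			    bool failure_only)
{
	size_t printed = 0;

	for (size_t i = 0; i < n; i++)
		if (should_print(&v[i], failure_only))
			printed++;
	return printed;
}
```

Applied to a trace where only access() fails, the filtered count is one while the unfiltered count is the full number of syscalls, which mirrors the two outputs above.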
      
      And can be done systemwide, etc, with backtraces:
      
        # perf trace --max-stack=16 --failure sleep 1
           0.048 ( 0.015 ms): sleep/11092 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
                                             __access (inlined)
                                             dl_main (/usr/lib64/ld-2.26.so)
        #
      
      Or for some specific syscalls:
      
        # perf trace --max-stack=16 -e openat --failure cat /tmp/rien
        cat: /tmp/rien: No such file or directory
             0.251 ( 0.012 ms): cat/11106 openat(dfd: CWD, filename: /tmp/rien) = -1 ENOENT No such file or directory
                                               __libc_open64 (inlined)
                                               main (/usr/bin/cat)
                                               __libc_start_main (/usr/lib64/libc-2.26.so)
                                               _start (/usr/bin/cat)
        #
      
      Look for inotify* syscalls that fail, system wide, for 2 seconds, with backtraces:
      
        # perf trace -a --max-stack=16 --failure -e inotify* sleep 2
         819.165 ( 0.058 ms): gmain/1724 inotify_add_watch(fd: 8<anon_inode:inotify>, pathname: /home/acme/~, mask: 16789454) = -1 ENOENT No such file or directory
                                             __GI_inotify_add_watch (inlined)
                                             _ik_watch (/usr/lib64/libgio-2.0.so.0.5400.3)
                                             _ip_start_watching (/usr/lib64/libgio-2.0.so.0.5400.3)
                                             im_scan_missing (/usr/lib64/libgio-2.0.so.0.5400.3)
                                             g_timeout_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3)
                                             g_main_context_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3)
                                             g_main_context_iterate.isra.23 (/usr/lib64/libglib-2.0.so.0.5400.3)
                                             g_main_context_iteration (/usr/lib64/libglib-2.0.so.0.5400.3)
                                             glib_worker_main (/usr/lib64/libglib-2.0.so.0.5400.3)
                                             g_thread_proxy (/usr/lib64/libglib-2.0.so.0.5400.3)
                                             start_thread (/usr/lib64/libpthread-2.26.so)
                                             __GI___clone (inlined)
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-8f7d3mngaxvi7tlzloz3n7cs@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0a6545bd
    • K
      perf tools: Add a "dso_size" sort order · b74d12d5
      Kim Phillips committed
      Add DSO size to perf report/top sort output list.
      
      This includes adding a map__size() function to map.h. The map size
      it returns is approximately equal to the DSO data file_size:
      
        DSO                           file size  map (end-start)  file / (end-start)
        libwebkit2gtk-4.0.so.37.24.9   43260072         41295872                 95%
        libglib-2.0.so.0.5400.1         1125680          1118208                 99%
        libc-2.26.so                    1960656          1925120                101%
        libdbus-1.so.3.14.13             309456           303104                102%
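
A minimal sketch of the "map (end-start)" computation, for illustration only. The struct and function names here (`example_map`, `example_map_size`) are hypothetical stand-ins and not the actual perf definitions; the only thing assumed from the text above is that the size is the end address minus the start address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for a memory map: only the start
 * and end addresses of the mapped range are kept.
 */
struct example_map {
	uint64_t start;	/* first address covered by the mapping */
	uint64_t end;	/* one past the last address covered */
};

/* Size of the mapped range, i.e. the "map (end-start)" column. */
static inline size_t example_map_size(const struct example_map *map)
{
	assert(map->end >= map->start);
	return (size_t)(map->end - map->start);
}
```

A mapping that starts at 0x1000 and ends at 0x1000 + 303104 would report a size of 303104 bytes, the value shown for libdbus-1.so.3.14.13 in the table.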
      
      Sample output:
      
        $ ./perf report -s dso_size,dso
        Samples: 2K of event 'cycles:uppp', Event count (approx.): 128373340
        Overhead  DSO size  Shared Object
          90.62%   unknown  [unknown]
           2.87%   1118208  libglib-2.0.so.0.5400.1
           1.92%    303104  libdbus-1.so.3.14.13
           1.42%   1925120  libc-2.26.so
           0.77%  41295872  libwebkit2gtk-4.0.so.37.24.9
           0.61%    335872  libgobject-2.0.so.0.5400.1
           0.41%   1052672  libgdk-3.so.0.2200.25
           0.36%    106496  libpthread-2.26.so
           0.29%    221184  dbus-daemon
           0.17%    159744  ld-2.26.so
           0.13%     49152  libwayland-client.so.0.3.0
           0.12%   1642496  libgio-2.0.so.0.5400.1
           0.09%   7327744  libgtk-3.so.0.2200.25
           0.09%  12324864  libmozjs-52.so.0.0.0
           0.05%   4796416  perf
           0.04%    843776  libgjs.so.0.0.0
           0.03%   1409024  libmutter-clutter-1.so
      
      Committer testing:
      
      To sort by DSO size, use:
      
        # perf report -F dso_size,dso,overhead -s dso_size
        <SNIP>
           3465216  libdns-export.so.174.0.1   0.00%
           3522560  libgc.so.1.0.3             0.00%
           3538944  libbfd-2.29-13.fc27.so     0.59%
           3670016  libunistring.so.2.1.0      0.00%
           3723264  libguile-2.0.so.22.8.1     0.00%
           3776512  libgio-2.0.so.0.5400.3     0.00%
           3891200  libc-2.26.so               0.96%
           3944448  libmozjs-17.0.so           0.00%
           4218880  libperl.so.5.26.1          0.18%
           4452352  libpython2.7.so.1.0        0.02%
           4472832  perf                       0.02%
           4603904  git                        0.01%
           4751360  libcrypto.so.1.1.0g        0.00%
           5005312  libslang.so.2.3.1          0.00%
           7315456  libgtk-3.so.0.2200.26      0.09%
           8818688  i965_dri.so                2.46%
           8818688  i965_dri.so (deleted)      1.26%
          12414976  libmozjs-52.so.0.0.0       0.03%
          23642112  cc1                        2.02%
          27889664  [kernel.kallsyms]         25.41%
          80834560  libxul.so (deleted)       15.68%
          98078720  chrome                    32.03%
        1056964608  [kernel.kallsyms]          1.59%
        #
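
The ascending order in the listing above can be sketched with a plain size comparator. This is an illustrative example under stated assumptions, not the perf sort code: `dso_row`, `dso_row_size_cmp` and `dso_rows_sort_by_size` are hypothetical names, and the rows stand in for whatever entries the tool actually sorts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical row: a DSO name and its mapped size in bytes. */
struct dso_row {
	const char *name;
	size_t size;
};

/*
 * qsort() comparator: ascending by size. Comparing instead of
 * subtracting avoids overflow when the sizes do not fit in an int.
 */
static int dso_row_size_cmp(const void *a, const void *b)
{
	const struct dso_row *l = a, *r = b;

	return (l->size > r->size) - (l->size < r->size);
}

/* Sort rows in place, smallest DSO first. */
static inline void dso_rows_sort_by_size(struct dso_row *rows, size_t nr)
{
	assert(rows != NULL || nr == 0);
	qsort(rows, nr, sizeof(*rows), dso_row_size_cmp);
}
```

Sorting rows for chrome (98078720), perf (4472832) and cc1 (23642112) this way would place perf first and chrome last, matching their relative order in the listing.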
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180327060956.1c01ebe67a2a941bb4468c6f@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b74d12d5
  8. 28 Mar, 2018 6 commits