1. 29 Sep, 2017 2 commits
  2. 25 Sep, 2017 4 commits
    • A
      perf tools: Fix syscalltbl build failure · 090657c9
      Akemi Yagi committed
      The build of kernel v4.14-rc1 for i686 fails on RHEL 6 with the
      following error in tools/perf:
      
        util/syscalltbl.c:157: error: expected ';', ',' or ')' before '__maybe_unused'
        mv: cannot stat `util/.syscalltbl.o.tmp': No such file or directory
      
      Fix it by moving:

        #include <linux/compiler.h>

      outside of the #ifdef HAVE_SYSCALL_TABLE block.
      Signed-off-by: Akemi Yagi <toracat@elrepo.org>
      Cc: Alan Bartlett <ajb@elrepo.org>
      Link: http://lkml.kernel.org/r/oq41r8$1v9$1@blaine.gmane.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      090657c9
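A minimal sketch of why the include in 090657c9 has to sit outside the conditional block: the fallback stubs built when HAVE_SYSCALL_TABLE is undefined also use `__maybe_unused`. The local macro below stands in for what `<linux/compiler.h>` provides, and `struct table` and `table_id()` are hypothetical names, not perf's.

```c
#include <string.h>

/* Stand-in for <linux/compiler.h>: must be visible to BOTH branches below. */
#define __maybe_unused __attribute__((unused))

struct table {                  /* hypothetical type for illustration */
	const char *names[2];
};

#ifdef HAVE_SYSCALL_TABLE
/* Real lookup: every parameter is used. */
int table_id(const struct table *tbl, const char *name)
{
	int i;

	for (i = 0; i < 2; i++)
		if (tbl->names[i] && strcmp(tbl->names[i], name) == 0)
			return i;
	return -1;
}
#else
/*
 * Fallback stub: the parameters are unused, so they carry __maybe_unused.
 * If the macro were only defined inside the #ifdef branch above, this
 * declaration would fail to parse, as in the reported error.
 */
int table_id(const struct table *tbl __maybe_unused,
	     const char *name __maybe_unused)
{
	return -1;
}
#endif
```

Built without -DHAVE_SYSCALL_TABLE, only the stub is compiled and every lookup returns -1.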
    • M
      perf report: Fix debug messages with --call-graph option · 9789e7e9
      Mengting Zhang committed
      With the --call-graph option, perf report can display call chains using
      type, min percent threshold, optional print limit and order. The
      default call-graph parameter is 'graph,0.5,caller,function,percent'.
      
      Before this patch, 'perf report --call-graph' shows incorrect debug
      messages as below:
      
        # perf report --call-graph
        Invalid callchain mode: 0.5
        Invalid callchain order: 0.5
        Invalid callchain sort key: 0.5
        Invalid callchain config key: 0.5
        Invalid callchain mode: caller
        Invalid callchain mode: function
        Invalid callchain order: function
        Invalid callchain mode: percent
        Invalid callchain order: percent
        Invalid callchain sort key: percent
      
      That is because in function __parse_callchain_report_opt(), each field of
      the call-graph parameter is passed to parse_callchain_{mode,order,
      sort_key,value} in turn until it meets the matching value.

      For example, the order field "caller" is passed to
      parse_callchain_mode() first, and it does not match any mode
      field. Therefore parse_callchain_mode() shows the debug message
      "Invalid callchain mode: caller", which could confuse users.

      The patch fixes this issue by moving the warning out of the functions
      parse_callchain_{mode,order,sort_key,value}.
      Signed-off-by: Mengting Zhang <zhangmengting@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Krister Johansen <kjlx@templeofstupid.com>
      Cc: Li Bin <huawei.libin@huawei.com>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Link: http://lkml.kernel.org/r/1506154694-39691-1-git-send-email-zhangmengting@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9789e7e9
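The approach described in 9789e7e9 can be sketched as follows: the per-category parsers stay silent and only report match or no match, and the caller warns once when no parser accepts a field. All function names and the accepted keyword lists here are illustrative assumptions, not perf's actual code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Each parser returns 0 on a match and -1 otherwise, without printing. */
static int parse_mode(const char *tok)
{
	return (!strcmp(tok, "graph") || !strcmp(tok, "flat") ||
		!strcmp(tok, "fractal") || !strcmp(tok, "folded")) ? 0 : -1;
}

static int parse_order(const char *tok)
{
	return (!strcmp(tok, "caller") || !strcmp(tok, "callee")) ? 0 : -1;
}

static int parse_sort_key(const char *tok)
{
	return (!strcmp(tok, "function") || !strcmp(tok, "address")) ? 0 : -1;
}

static int parse_value(const char *tok)
{
	return (!strcmp(tok, "percent") || !strcmp(tok, "period") ||
		!strcmp(tok, "count")) ? 0 : -1;
}

static int parse_number(const char *tok)
{
	char *end;

	strtod(tok, &end);
	return (end != tok && *end == '\0') ? 0 : -1;
}

/* Returns the number of fields that no parser recognized. */
int parse_report_opt(const char *arg)
{
	char buf[128];
	char *tok = buf;
	int bad = 0;

	snprintf(buf, sizeof(buf), "%s", arg);
	while (tok) {
		char *comma = strchr(tok, ',');

		if (comma)
			*comma = '\0';
		/* Try every category; warn only when all of them reject. */
		if (parse_mode(tok) && parse_order(tok) &&
		    parse_sort_key(tok) && parse_value(tok) &&
		    parse_number(tok)) {
			fprintf(stderr, "Invalid callchain option: %s\n", tok);
			bad++;
		}
		tok = comma ? comma + 1 : NULL;
	}
	return bad;
}
```

With this structure, the default parameter string produces no message at all, while an unknown field produces exactly one.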
    • A
      perf evsel: Fix attr.exclude_kernel setting for default cycles:p · f1e52f14
      Arnaldo Carvalho de Melo committed
      Yet another fix for probing the max attr.precise_ip setting: it is not
      enough to set attr.exclude_kernel for !root users, as they _can_
      profile the kernel if the kernel.perf_event_paranoid sysctl is set to
      -1, so check that as well.
      
      Testing it:
      
      As non root:
      
        $ sysctl kernel.perf_event_paranoid
        kernel.perf_event_paranoid = 2
        $ perf record sleep 1
        $ perf evlist -v
        cycles:uppp: ..., exclude_kernel: 1, ... precise_ip: 3, ...
      
      Now as non-root, but with kernel.perf_event_paranoid set to the
      most permissive value, -1:
      
        $ sysctl kernel.perf_event_paranoid
        kernel.perf_event_paranoid = -1
        $ perf record sleep 1
        $ perf evlist -v
        cycles:ppp: ..., exclude_kernel: 0, ... precise_ip: 3, ...
        $
      
      I.e. non-root, default kernel.perf_event_paranoid: :uppp modifier = not allowed to sample the kernel,
           non-root, most permissive kernel.perf_event_paranoid: :ppp = allowed to sample the kernel.
      
      In both cases, use the highest available precision: attr.precise_ip = 3.
      Reported-and-Tested-by: Ingo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: d37a3697 ("perf evsel: Fix attr.exclude_kernel setting for default cycles:p")
      Link: http://lkml.kernel.org/n/tip-nj2qkf75xsd6pw6hhjzfqqdx@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f1e52f14
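The rule stated in f1e52f14 can be sketched as a pure function. The function names are hypothetical, and the effective uid and the sysctl value are passed in as parameters here (instead of being read from the system) so the logic is easy to check in isolation.

```c
#include <stdbool.h>
#include <sys/types.h>

/*
 * Root may always profile the kernel; other users only when
 * kernel.perf_event_paranoid is set to -1.
 */
bool can_profile_kernel(uid_t euid, int paranoid)
{
	return euid == 0 || paranoid == -1;
}

/* Value to store in attr.exclude_kernel for the default cycles event. */
int default_exclude_kernel(uid_t euid, int paranoid)
{
	return !can_profile_kernel(euid, paranoid);
}
```

For a non-root user this gives exclude_kernel = 1 with paranoid level 2 and exclude_kernel = 0 with level -1, matching the two `perf evlist -v` outputs in the message.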
    • A
      perf tools: Get all of tools/{arch,include}/ in the MANIFEST · 89975bd3
      Arnaldo Carvalho de Melo committed
      I'm switching the container builds from a local volume pointing to the
      kernel repository with the perf sources to a detached tarball, so that
      a container cluster can be used. Some places broke because I forgot to
      put some of the required files in tools/perf/MANIFEST, namely some
      bitsperlong.h files.
      
      So, to fix it do the same as for tools/build/ and pack the whole
      tools/arch/ directory.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-wmenpjfjsobwdnfde30qqncj@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      89975bd3
  3. 12 Sep, 2017 7 commits
    • M
      perf stat: Wait for the correct child · dfc9eec7
      Milian Wolff committed
      When packaging the perf userland application into an AppImage, the
      wait() call in perf stat returned too early. It turned out that some
      other child process exited, but not the one perf stat launched:
      
        $ sudo strace -e fork,execve,clone,wait4 -f ./perf-x86_64.AppImage stat sleep 1
        execve("./perf-git.3a73b7f9-x86_64.AppImage", ["./perf-git.3a73b7f9-x86_64.AppIm"..., "stat", "sleep", "1"], 0x7ffec1bbf050 /* 18 vars */) = 0
        clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f6a6e7efe50) = 3912
        strace: Process 3912 attached
        [pid  3912] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f6a6e7efe50) = 3914
        strace: Process 3914 attached
        [pid  3912] +++ exited with 0 +++
        [pid  3911] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3912, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
        [pid  3914] clone(strace: Process 3915 attached
        child_stack=0x7f6a6d9fefb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f6a6d9ff9d0, tls=0x7f6a6d9ff700, child_tidptr=0x7f6a6d9ff9d0) = 3915
        [pid  3911] execve("/tmp/.mount_perf-g6VYMpl/AppRun", ["./perf-git.3a73b7f9-x86_64.AppIm"..., "stat", "sleep", "1"], 0x14aab70 /* 21 vars */) = 0
        [pid  3911] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f4ae113c4d0) = 3916
        strace: Process 3916 attached
        [pid  3911] wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 3912
        [pid  3916] execve("/usr/libexec/perf-core/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/tmp/./sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/home/milian/.bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/usr/lib/icecream/libexec/icecc/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/ssd2/milian/projects/compiled/other/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/home/milian/.bin/kf5/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/ssd2/milian/projects/compiled/kf5/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/home/milian/projects/compiled/other/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/home/milian/projects/compiled/kf5/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/usr/local/sbin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/usr/local/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/usr/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */
         Performance counter stats for 'sleep 1':
      
             <not counted>	task-clock
             <not counted>	context-switches
             <not counted>	cpu-migrations
             <not counted>	page-faults
             <not counted>	cycles
             <not counted>	instructions
             <not counted>      branches
             <not counted>      branch-misses
      
               0.000047194 seconds time elapsed
      
        [pid  3916] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=3911, si_uid=0} ---
        [pid  3916] +++ killed by SIGTERM +++
        [pid  3911] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=3916, si_uid=0, si_status=SIGTERM, si_utime=0, si_stime=0} ---
        [pid  3915] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=3914, si_uid=0} ---
        [pid  3911] +++ exited with 0 +++
        [pid  3915] --- SIGHUP {si_signo=SIGHUP, si_code=SI_USER, si_pid=3914, si_uid=0} ---
        [pid  3915] +++ exited with 0 +++
        +++ exited with 0 +++
      
      This patch uses waitpid instead to ensure the call waits for the
      debuggee application launched by 'perf stat'. This fixes 'perf stat'
      when launched from an AppImage:
      
        $ ./perf-x86_64.AppImage stat sleep 1
      
         Performance counter stats for 'sleep 1':
      
                0.357235      task-clock (msec)         #    0.000 CPUs utilized
                       1      context-switches          #    0.003 M/sec
                       0      cpu-migrations            #    0.000 K/sec
                      50      page-faults               #    0.140 M/sec
                 1269602      cycles                    #    3.554 GHz
                  654278      instructions              #    0.52  insn per cycle
                  129963      branches                  #  363.803 M/sec
                    7082      branch-misses             #    5.45% of all branches
      
             1.000633420 seconds time elapsed
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170912152523.4497-1-milian.wolff@kdab.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      dfc9eec7
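The difference between wait() and waitpid() that dfc9eec7 relies on can be reproduced in a small standalone program. This is a sketch under assumptions, not the perf stat code: the function name is hypothetical, and the "unrelated" child simulates the extra process the AppImage runtime forked in the strace output.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork an unrelated short-lived child plus a "workload" child, then wait
 * for the workload specifically. Returns the workload's exit code, or -1.
 */
int run_and_wait_for_workload(void)
{
	pid_t other, workload;
	int status;

	other = fork();
	if (other == 0)
		_exit(0);               /* unrelated child exits at once */
	if (other < 0)
		return -1;

	workload = fork();
	if (workload == 0) {
		sleep(1);               /* outlive the unrelated child */
		_exit(7);
	}
	if (workload < 0) {
		waitpid(other, NULL, 0);
		return -1;
	}

	/*
	 * wait(&status) could return the unrelated child here, which exits
	 * first. waitpid() with an explicit pid only returns for the workload.
	 */
	if (waitpid(workload, &status, 0) != workload)
		return -1;

	waitpid(other, NULL, 0);        /* reap the unrelated child too */
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The function returns the workload's exit code (7) even though another child terminated earlier.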
    • M
      perf tools: Support running perf binaries with a dash in their name · 3192f1ed
      Milian Wolff committed
      Previously the part behind "perf-" was interpreted as an internal perf
      command. If the suffix could not be handled, the execution was stopped.
      This made it impossible to launch perf binaries that got renamed to
      have the `perf-` prefix. This is the case for AppImages (e.g.
      "perf-x86_64.AppImage"), but also applies to all other scenarios
      where users symlink or rename perf themselves.
      
      Status quo with the broken behavior:
      
        $ ln -s ./perf ./perf-custom-suffix
        $ ./perf-custom-suffix list
        cannot handle custom-suffix internally$
      
      Also note the missing newline at the end of the error message.
      
      With this patch applied, the above works properly:
      
        $ ./perf-custom-suffix list
      
        List of pre-defined events (to be used in -e):
        ...
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170911111422.31903-1-milian.wolff@kdab.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3192f1ed
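One way to picture the behavior after 3192f1ed: a "perf-<suffix>" binary name selects an internal command only when the suffix is a known command, and otherwise the binary behaves like plain perf. The function name and the command table below are hypothetical, and the message does not show how the actual patch makes this decision.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative subset of internal commands, not perf's full list. */
static const char *const builtins[] = { "list", "record", "report", "stat" };

/*
 * Return the internal command named by a "perf-<cmd>" binary name, or
 * NULL when the name has no "perf-" prefix or the suffix is unknown.
 */
const char *builtin_from_argv0(const char *argv0)
{
	const char *base = strrchr(argv0, '/');
	size_t i;

	base = base ? base + 1 : argv0;
	if (strncmp(base, "perf-", 5) != 0)
		return NULL;

	for (i = 0; i < sizeof(builtins) / sizeof(builtins[0]); i++)
		if (strcmp(base + 5, builtins[i]) == 0)
			return builtins[i];

	/* e.g. "perf-custom-suffix": fall through and act as plain perf. */
	return NULL;
}
```

A NULL result means the caller continues with normal argument parsing instead of aborting.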
    • T
      perf config: Check not only section->from_system_config but also item's · cba225d6
      Taeung Song committed
      Currently section->from_system_config is being checked multiple times.
      item->from_system_config should be checked instead when iterating through
      the items in a section. Fix it.
      Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1504754325-9724-1-git-send-email-treeze.taeung@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cba225d6
    • J
      perf ui progress: Fix progress update · a82bfd04
      Jiri Olsa committed
      We currently advance the 'next' variable by only a single step value.
      But the 'adv' update may be bigger than a single 'step' value.
      This would leave the 'next' value undercounted and force unnecessary
      ui_progress__ops->update calls.

      Calculate the number of steps needed for the 'adv' update and increase
      'next' by that number of steps.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170908120510.22515-3-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a82bfd04
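The step calculation described in a82bfd04 can be sketched like this. The struct is a hypothetical stand-in modeled on the variable names in the message ('next', 'step', 'adv'), and the `updates` counter stands in for a call to ui_progress__ops->update. The sketch assumes `step` is greater than zero.

```c
#include <stdint.h>

struct progress {               /* hypothetical stand-in */
	uint64_t curr, next, step, total;
	unsigned int updates;   /* counts redraws */
};

void progress_update(struct progress *p, uint64_t adv)
{
	p->curr += adv;
	if (p->curr >= p->next) {
		/* Whole steps needed to cover 'adv', rounded up. */
		uint64_t nr = (adv + p->step - 1) / p->step;

		p->next += nr * p->step;
		p->updates++;
	}
}
```

With step = 10 and next = 10, one update of adv = 95 moves 'next' to 110, so a following small update does not trigger another redraw. Advancing 'next' by a single step would leave it at 20 and redraw on every later call until it caught up.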
    • J
      perf ui progress: Make sure we always define step value · 4d286c89
      Jiri Olsa committed
      Unlikely, but we could have ui_progress__init being called with total <
      16, which would set the next and step variables to 0. That would force
      unnecessary ui_progress__ops->update calls because 'next' would never
      increase.

      Force the next and step values to always be > 0.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170908120510.22515-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4d286c89
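A minimal sketch of the guard described in 4d286c89, assuming the step is derived as total / 16 (as the "total < 16" condition in the message suggests). The struct and function names are hypothetical.

```c
#include <stdint.h>

struct progress_state {         /* hypothetical stand-in */
	uint64_t curr, next, step, total;
};

void progress_init(struct progress_state *p, uint64_t total)
{
	uint64_t step = total / 16;

	if (step == 0)
		step = 1;       /* total < 16 would otherwise give step == 0 */

	p->curr = 0;
	p->next = p->step = step;
	p->total = total;
}
```

For total = 10 this yields step = next = 1 instead of 0, so 'next' can move forward as progress is reported.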
    • J
      perf tools: Open perf.data with O_CLOEXEC flag · cd6379eb
      Jiri Olsa committed
      Do not carry the perf.data file descriptor into the workload process and
      close it when perf executes the workload.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170908084621.31595-2-jolsa@kernel.org
      [ Add definitions for O_CLOEXEC for older systems ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cd6379eb
    • M
      perf tests: Fix compile when libunwind's unwind.h is available · df90cc41
      Milian Wolff committed
      When cross compiling perf and I want to link against a self-compiled
      libunwind, I usually make the custom path where the libunwind headers
      exist visible by adding the libunwind prefix to the include path when
      compiling perf, i.e.:
      
      ~~~~~
      $ ls $HOME/projects/compiled/other/include/
      libunwind-coredump.h  libunwind.h         libunwind-x86_64.h
      libunwind-common.h  libunwind-dynamic.h   libunwind-ptrace.h
      unwind.h
      $ make EXTRA_CFLAGS="-I$HOME/projects/compiled/other/include/"
      ~~~~~~
      
      Note the `unwind.h` header from libunwind which leads to compile
      errors when compiling tests/dwarf-unwind.c, since it shadows perf's
      util/unwind.h:
      
      ~~~~~
      tests/dwarf-unwind.c:41:32: error: ‘struct unwind_entry’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
       static int unwind_entry(struct unwind_entry *entry, void *arg)
                                      ^~~~~~~~~~~~
      tests/dwarf-unwind.c: In function ‘unwind_entry’:
      tests/dwarf-unwind.c:44:22: error: dereferencing pointer to incomplete type ‘struct unwind_entry’
        char *symbol = entry->sym ? entry->sym->name : NULL;
                            ^~
      tests/dwarf-unwind.c: In function ‘unwind_thread’:
      tests/dwarf-unwind.c:92:8: error: implicit declaration of function ‘unwind__get_entries’; did you mean ‘unwind_entry’? [-Werror=implicit-function-declaration]
        err = unwind__get_entries(unwind_entry, &cnt, thread,
              ^~~~~~~~~~~~~~~~~~~
              unwind_entry
      tests/dwarf-unwind.c:92:8: error: nested extern declaration of ‘unwind__get_entries’ [-Werror=nested-externs]
      ~~~~~~
      
      Fix this compile error by specifying an explicit include of perf's
      unwind.h in the util folder.
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170906150209.12579-1-milian.wolff@kdab.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      df90cc41
  4. 02 Sep, 2017 12 commits
  5. 30 Aug, 2017 1 commit
    • J
      perf report: Calculate the average cycles of iterations · c4ee0625
      Jin Yao committed
      The branch history code has a loop detection function. With this, we can
      get the number of iterations by calculating the removed loops.

      It would also be useful to know the average cycles per iteration.
      This patch adds up the cycles in the branch entries of removed loops and
      saves the result to the next branch entry (e.g. branch entry A).

      Finally it displays the iteration number and average cycles at the
      "from" of branch entry A.
      
      For example:
      perf record -g -j any,save_type ./div
      perf report --branch-history --no-children --stdio
      
      --22.63%--main div.c:42 (RET CROSS_2M)
                compute_flag div.c:28 (cycles:2 iter:173115 avg_cycles:2)
                |
                 --10.73%--compute_flag div.c:27 (RET CROSS_2M)
                           rand rand.c:28 (cycles:1)
                           rand rand.c:28 (RET CROSS_2M)
                           __random random.c:298 (cycles:1)
                           __random random.c:297 (COND_BWD CROSS_2M)
                           __random random.c:295 (cycles:1)
                           __random random.c:295 (COND_BWD CROSS_2M)
                           __random random.c:295 (cycles:1)
                           __random random.c:295 (RET CROSS_2M)
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1502111115-18305-1-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c4ee0625
  6. 29 Aug, 2017 11 commits
    • L
      perf symbols: Fix plt entry calculation for ARM and AARCH64 · b2f76050
      Li Bin committed
      On x86, the plt header size is the same as the plt entry size, and can be
      identified from shdr's sh_entsize of the plt.

      But we can't assume that the sh_entsize of the plt shdr is always the
      plt entry size on all architectures, and the plt header size may differ
      from the plt entry size on some architectures.

      On ARM, the plt header size is 20 bytes and the plt entry size is 12
      bytes (ignoring the FOUR_WORD_PLT case), according to the binutils
      implementation. The plt section is as follows:
      
      Disassembly of section .plt:
      000004a0 <__cxa_finalize@plt-0x14>:
       4a0:   e52de004        push    {lr}            ; (str lr, [sp, #-4]!)
       4a4:   e59fe004        ldr     lr, [pc, #4]    ; 4b0 <_init+0x1c>
       4a8:   e08fe00e        add     lr, pc, lr
       4ac:   e5bef008        ldr     pc, [lr, #8]!
       4b0:   00008424        .word   0x00008424
      
      000004b4 <__cxa_finalize@plt>:
       4b4:   e28fc600        add     ip, pc, #0, 12
       4b8:   e28cca08        add     ip, ip, #8, 20  ; 0x8000
       4bc:   e5bcf424        ldr     pc, [ip, #1060]!        ; 0x424
      
      000004c0 <printf@plt>:
       4c0:   e28fc600        add     ip, pc, #0, 12
       4c4:   e28cca08        add     ip, ip, #8, 20  ; 0x8000
       4c8:   e5bcf41c        ldr     pc, [ip, #1052]!        ; 0x41c
      
      On AARCH64, the plt header size is 32 bytes and the plt entry size is 16
      bytes.  The plt section is as follows:
      
      Disassembly of section .plt:
      0000000000000560 <__cxa_finalize@plt-0x20>:
       560:   a9bf7bf0        stp     x16, x30, [sp,#-16]!
       564:   90000090        adrp    x16, 10000 <__FRAME_END__+0xf8a8>
       568:   f944be11        ldr     x17, [x16,#2424]
       56c:   9125e210        add     x16, x16, #0x978
       570:   d61f0220        br      x17
       574:   d503201f        nop
       578:   d503201f        nop
       57c:   d503201f        nop
      
      0000000000000580 <__cxa_finalize@plt>:
       580:   90000090        adrp    x16, 10000 <__FRAME_END__+0xf8a8>
       584:   f944c211        ldr     x17, [x16,#2432]
       588:   91260210        add     x16, x16, #0x980
       58c:   d61f0220        br      x17
      
      0000000000000590 <__gmon_start__@plt>:
       590:   90000090        adrp    x16, 10000 <__FRAME_END__+0xf8a8>
       594:   f944c611        ldr     x17, [x16,#2440]
       598:   91262210        add     x16, x16, #0x988
       59c:   d61f0220        br      x17
      
      NOTES:
      
      In addition to ARM and AARCH64, other architectures, such as
      s390/alpha/mips/parisc/powerpc/sh/sparc/xtensa, also need to consider
      this issue.
      Signed-off-by: Li Bin <huawei.libin@huawei.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
      Cc: David Tolnay <dtolnay@gmail.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: zhangmengting@huawei.com
      Link: http://lkml.kernel.org/r/1496622849-21877-1-git-send-email-huawei.libin@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b2f76050
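The sizes quoted in b2f76050 translate into a simple address calculation. The sketch below uses hypothetical names and only the numbers given in the message. The default branch applies the x86 assumption that header and entry size both equal sh_entsize, which the message says may not hold on other architectures.

```c
#include <elf.h>
#include <stdint.h>

struct plt_layout {
	uint64_t header_size;
	uint64_t entry_size;
};

/* Pick plt header/entry sizes per machine type (sizes from the message). */
struct plt_layout plt_layout_for(uint16_t e_machine, uint64_t sh_entsize)
{
	struct plt_layout l;

	switch (e_machine) {
	case EM_ARM:
		l.header_size = 20;
		l.entry_size = 12;
		break;
	case EM_AARCH64:
		l.header_size = 32;
		l.entry_size = 16;
		break;
	default:        /* x86 assumption: both equal sh_entsize */
		l.header_size = sh_entsize;
		l.entry_size = sh_entsize;
		break;
	}
	return l;
}

/* Address of the idx-th plt entry, counting from 0 after the header. */
uint64_t plt_entry_addr(uint64_t plt_start, struct plt_layout l,
			unsigned int idx)
{
	return plt_start + l.header_size + (uint64_t)idx * l.entry_size;
}
```

Applied to the disassembly above, the ARM plt starting at 0x4a0 gives 0x4b4 and 0x4c0 for the first two entries, and the AARCH64 plt starting at 0x560 gives 0x580 and 0x590.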
    • L
      perf probe: Fix kprobe blacklist checking condition · 2c29461e
      Li Bin committed
      Since commit 9aaf5a5f ("perf probe: Check kprobes blacklist when
      adding new events"), 'perf probe' supports checking the blacklist of
      functions which cannot be probed. But the checking condition is wrong:
      the end_addr of a symbol, which is the start_addr of the next
      symbol, must not be included.
      
      Committer notes:
      
      IOW make it match its kernel counterpart in kernel/kprobes.c:
      
        bool within_kprobe_blacklist(unsigned long addr)
      
      Each entry has as its end address not its last address, but the first
      address _outside_ that symbol, which for related functions is the first
      address of the next symbol, like these from kernel/trace/trace_probe.c:
      
      0xffffffffbd198df0-0xffffffffbd198e40	print_type_u8
      0xffffffffbd198e40-0xffffffffbd198e90	print_type_u16
      0xffffffffbd198e90-0xffffffffbd198ee0	print_type_u32
      0xffffffffbd198ee0-0xffffffffbd198f30	print_type_u64
      0xffffffffbd198f30-0xffffffffbd198f80	print_type_s8
      0xffffffffbd198f80-0xffffffffbd198fd0	print_type_s16
      0xffffffffbd198fd0-0xffffffffbd199020	print_type_s32
      0xffffffffbd199020-0xffffffffbd199070	print_type_s64
      0xffffffffbd199070-0xffffffffbd1990c0	print_type_x8
      0xffffffffbd1990c0-0xffffffffbd199110	print_type_x16
      0xffffffffbd199110-0xffffffffbd199160	print_type_x32
      0xffffffffbd199160-0xffffffffbd1991b0	print_type_x64
      
      But not always:
      
      0xffffffffbd1997b0-0xffffffffbd1997c0	fetch_kernel_stack_address (kernel/trace/trace_probe.c)
      0xffffffffbd1c57f0-0xffffffffbd1c58b0	__context_tracking_enter   (kernel/context_tracking.c)
      Signed-off-by: Li Bin <huawei.libin@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: zhangmengting@huawei.com
      Fixes: 9aaf5a5f ("perf probe: Check kprobes blacklist when adding new events")
      Link: http://lkml.kernel.org/r/1504011443-7269-1-git-send-email-huawei.libin@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2c29461e
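The corrected condition in 2c29461e amounts to a half-open range check. A minimal sketch with hypothetical type and function names, using the addresses listed in the committer notes:

```c
#include <stdbool.h>
#include <stdint.h>

struct addr_range {
	uint64_t start;
	uint64_t end;   /* first address outside the symbol */
};

/* Half-open check: start is included, end is not. */
bool within_range(const struct addr_range *r, uint64_t addr)
{
	return r->start <= addr && addr < r->end;
}
```

With the print_type_u8 range 0xffffffffbd198df0-0xffffffffbd198e40, the address 0xffffffffbd198e40 is rejected, because it is the first address of print_type_u16.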
    • A
      perf trace beauty: Beautify pkey_{alloc,free,mprotect} arguments · 83bc9c37
      Arnaldo Carvalho de Melo committed
      Reuse 'mprotect' beautifiers for 'pkey_mprotect'.
      
      System wide tracing pkey_alloc, pkey_free and pkey_mprotect calls, with
      backtraces:
      
        # perf trace -e pkey_alloc,pkey_mprotect,pkey_free --max-stack=5
           0.000 ( 0.011 ms): pkey/7818 pkey_alloc(init_val: DISABLE_ACCESS|DISABLE_WRITE) = -1 EINVAL Invalid argument
                                             syscall (/usr/lib64/libc-2.25.so)
                                             pkey_alloc (/home/acme/c/pkey)
           0.022 ( 0.003 ms): pkey/7818 pkey_mprotect(start: 0x7f28c3890000, len: 4096, prot: READ|WRITE, pkey: -1) = 0
                                             syscall (/usr/lib64/libc-2.25.so)
                                             pkey_mprotect (/home/acme/c/pkey)
           0.030 ( 0.002 ms): pkey/7818 pkey_free(pkey: -1                               ) = -1 EINVAL Invalid argument
                                             syscall (/usr/lib64/libc-2.25.so)
                                             pkey_free (/home/acme/c/pkey)
      
      The tools/include/uapi/asm-generic/mman-common.h file is used to find
      the access rights defines for the pkey_alloc syscall second argument.
      
      Since we have the detector of changes for the tools/include header files
      versus its kernel origin (include/uapi/asm-generic/mman-common.h), we'll
      get whatever new flag appears for that argument automatically.
      
      This method should be used in other cases where it is easy to generate
      those flags tables because the header has properly namespaced defines
      like PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-3xq5312qlks7wtfzv2sk3nct@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      83bc9c37
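A flags table of the kind 83bc9c37 describes can be turned into the "DISABLE_ACCESS|DISABLE_WRITE" text seen in the trace output with a short loop. This sketch writes the table by hand, whereas the message describes generating it from the header. The struct and function names are hypothetical, and the two flag values are the ones defined in mman-common.h.

```c
#include <stdio.h>
#include <string.h>

#ifndef PKEY_DISABLE_ACCESS
#define PKEY_DISABLE_ACCESS 0x1
#endif
#ifndef PKEY_DISABLE_WRITE
#define PKEY_DISABLE_WRITE  0x2
#endif

struct flag_name {
	unsigned long bit;
	const char *name;       /* name with the PKEY_ prefix stripped */
};

static const struct flag_name pkey_flags[] = {
	{ PKEY_DISABLE_ACCESS, "DISABLE_ACCESS" },
	{ PKEY_DISABLE_WRITE,  "DISABLE_WRITE"  },
};

/*
 * Write "A|B" style names for the set bits into buf. Bits without a
 * table entry are appended in hex. Returns buf.
 */
char *pkey_flags_to_str(unsigned long val, char *buf, size_t size)
{
	size_t i, len = 0;

	if (size == 0)
		return buf;
	buf[0] = '\0';

	for (i = 0; i < sizeof(pkey_flags) / sizeof(pkey_flags[0]); i++) {
		if (!(val & pkey_flags[i].bit))
			continue;
		len += snprintf(buf + len, size - len, "%s%s",
				len ? "|" : "", pkey_flags[i].name);
		if (len >= size)
			return buf;     /* output was truncated */
		val &= ~pkey_flags[i].bit;
	}
	if (val)
		snprintf(buf + len, size - len, "%s%#lx",
			 len ? "|" : "", val);
	return buf;
}
```

Because the names come from a table, a new properly namespaced define only needs a new table row, which is what makes generating the table from the header practical.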
    • D
      perf tools: Pass full path of FEATURES_DUMP · 70ff7c6c
      David Carrillo-Cisneros committed
      When building with an external FEATURES_DUMP, bpf complains
      that the features dump file is not found. Fix it by passing the full
      file path.
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20170827075442.108534-7-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      70ff7c6c
    • D
      perf tools: Robustify detection of clang binary · 3866058e
      David Carrillo-Cisneros committed
      Prior to this patch, make scripts tested for CLANG with ifeq ($(CC),
      clang), failing to detect CLANG binaries with different names. Fix it by
      testing for the existence of __clang__ macro in the list of compiler
      defined macros.
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20170827075442.108534-5-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3866058e
    • D
      perf tools: Allow external definition of flex and bison binary names · 39a59f1e
      David Carrillo-Cisneros committed
      Allow user to define flex and bison binary names by passing FLEX and
      BISON variables.
      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20170827075442.108534-3-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      39a59f1e
    • J
      perf report: Group stat values on global event id · 9933183e
      Jiri Olsa committed
      There's no big value in displaying counts for every event ID, which is
      one per CPU. Instead, display the whole sum for the
      event.
      
        $ perf record -c 100000 -e cycles:u -s test
        $ perf report -T
      
      Before:
        #  PID   TID  cycles:u  cycles:u  cycles:u  cycles:u  ... [20 more columns of 'cycles:u']
          3339  3339         0         0         0         0
          3340  3340         0         0         0         0
          3341  3341         0         0         0         0
          3342  3342         0         0         0         0
      
      Now:
        #  PID   TID  cycles:u
          3339  3339     19678
          3340  3340     18744
          3341  3341     17335
          3342  3342     26414
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170824162737.7813-10-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9933183e
    • J
      perf values: Zero value buffers · a1834fc9
      Jiri Olsa committed
      We need to make sure the array of value pointers is zero initialized,
      because we use them in realloc later on, and an uninitialized non-zero
      value will cause an allocation error and aborted execution.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170824162737.7813-9-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a1834fc9
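The reason a1834fc9 zeroes the pointer array can be shown with a small standalone table. The struct and function names below are hypothetical. The key point is that realloc() on a NULL pointer behaves like malloc(), while realloc() on an uninitialized pointer is undefined behavior.

```c
#include <stdlib.h>

struct value_table {            /* hypothetical stand-in */
	unsigned long **rows;
	size_t nr_rows;
};

int value_table_init(struct value_table *t, size_t nr_rows)
{
	/* calloc zero-fills, so every row pointer starts out as NULL. */
	t->rows = calloc(nr_rows, sizeof(*t->rows));
	if (!t->rows)
		return -1;
	t->nr_rows = nr_rows;
	return 0;
}

int value_table_grow_row(struct value_table *t, size_t row, size_t n)
{
	unsigned long *p;

	if (row >= t->nr_rows)
		return -1;

	/* Safe on first use only because the slot was initialized to NULL. */
	p = realloc(t->rows[row], n * sizeof(*p));
	if (!p)
		return -1;
	t->rows[row] = p;
	return 0;
}

void value_table_free(struct value_table *t)
{
	size_t i;

	for (i = 0; i < t->nr_rows; i++)
		free(t->rows[i]);       /* free(NULL) is a no-op */
	free(t->rows);
	t->rows = NULL;
	t->nr_rows = 0;
}
```

Had the array come from malloc() without clearing, the first realloc() on a row could receive a garbage pointer and abort.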
    • J
      perf values: Fix allocation check · f4ef3b7c
      Jiri Olsa committed
      Bail out when the allocation failed, not the other way round.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170824162737.7813-8-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f4ef3b7c
    • J
      perf values: Fix thread index bug · 64eed1de
      Jiri Olsa committed
      We are taking the wrong index (+1) for the first thread, which leaves the
      thread with index 0 unused and uninitialized.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170824162737.7813-7-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      64eed1de
    • J
      perf report: Add dump_read function · dac7f6b7
      Jiri Olsa committed
      Add a dump_read function to gather all the dump output of the read
      function. Add output of the enabled and running times and the id, if
      enabled (3 new lines with '...' prefix below).
      
        $ perf record -s ...
        $ perf report -D
      
        958358311769 0x91f8 [0x40]: PERF_RECORD_READ: 3339 3339 cycles:u 0
        ... time enabled : 958358313731
        ... time running : 958358313731
        ... id           : 80
      
      Committer note:
      
      Do not use 'read' as a variable name as it breaks the build on older
      systems, such as RHEL6:
      
          CC       /tmp/build/perf/util/session.o
        cc1: warnings being treated as errors
        util/session.c: In function 'dump_read':
        util/session.c:1132: error: declaration of 'read' shadows a global declaration
        /usr/include/bits/unistd.h:35: error: shadowed declaration is here
        mv: cannot stat `/tmp/build/perf/util/.session.o.tmp': No such file or directory
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170824162737.7813-6-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      dac7f6b7
  7. 28 Aug, 2017 3 commits