1. 02 Mar, 2015 8 commits
  2. 28 Feb, 2015 12 commits
    • I
      Merge tag 'perf-core-for-mingo' of... · 788b94ba
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Fix SIGBUS failures due to misaligned accesses in Sparc64 (David Ahern)
      
        - Fix branch stack mode in 'perf report' (He Kuang)
      
        - Fix a 'perf probe' operator precedence bug (He Kuang)
      
        - Fix support for different binaries with the same name in 'perf diff' (Kan Liang)
      
        - Check kprobes blacklist when adding new events via 'perf probe' (Masami Hiramatsu)
      
        - Add --purge FILE to remove all caches of FILE in 'perf buildid-cache' (Masami Hiramatsu)
      
        - Show usage with some incorrect params (Masami Hiramatsu)
      
        - Add new buildid cache if update target is not cached in 'buildid-cache' (Masami Hiramatsu)
      
        - Allow listing events with 'tracepoint' prefix in 'perf list' (Yunlong Song)
      
        - Sort the output of 'perf list' (Yunlong Song)
      
        - Fix bash completion of 'perf --' (Yunlong Song)
      
      Developer Zone:
      
        - Handle strdup() failure path in 'perf probe' (Arnaldo Carvalho de Melo)
      
        - Fix get_real_path to free allocated memory in error path in 'perf probe' (Masami Hiramatsu)
      
        - Use pr_debug instead of verbose && pr_info in 'perf buildid-cache' (Masami Hiramatsu)
      
        - Fix building of 'perf data' with some gcc versions due to incorrect array struct
          entry (Yunlong Song)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      788b94ba
    • H
      perf report: Fix branch stack mode cannot be set · fefd2d96
      He Kuang committed
      When a perf.data file is recorded with 'perf record -b', 'perf report'
      should use branch stack mode to generate its output. This is broken by
      an improper comparison between a boolean and the constant -1.
      
      Before this patch:
      
        $ perf report -b -i perf.data
        Samples: 16  of event 'cycles', Event count (approx.): 3171896
        Overhead  Command  Shared Object      Symbol
          13.59%  ls       [kernel.kallsyms]  [k] prio_tree_remove
          13.16%  ls       [kernel.kallsyms]  [k] change_pte_range
          12.09%  ls       [kernel.kallsyms]  [k] page_fault
          12.02%  ls       [kernel.kallsyms]  [k] zap_pte_range
        ...
      
      After this patch:
      
        $ perf report -b -i perf.data
        Samples: 256  of event 'cycles', Event count (approx.): 256
        Overhead  Command  Source Shared Object  Source Symbol                               Target Shared Object  Target Symbol
           9.38%  ls       [unknown]             [k] 0000000000000000                        [unknown]             [k] 0000000000000000
           6.25%  ls       libc-2.19.so          [.] _dl_addr                                libc-2.19.so          [.] _dl_addr
           6.25%  ls       [kernel.kallsyms]     [k] zap_pte_range                           [kernel.kallsyms]     [k] zap_pte_range
           6.25%  ls       [kernel.kallsyms]     [k] change_pte_range                        [kernel.kallsyms]     [k] change_pte_range
           0.39%  ls       [kernel.kallsyms]     [k] prio_tree_remove                        [kernel.kallsyms]     [k] prio_tree_remove
        ...
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1423967617-28879-1-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fefd2d96
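The commit message above blames a comparison between a boolean and the constant -1. The patch itself is not shown here, so the following is only a minimal sketch of that class of bug: once a value is stored in a `bool` it can no longer equal -1, so an "unset" marker has to live in an `int`. The function names and the tri-state convention are hypothetical.

```c
#include <stdbool.h>

/* Store a runtime int in a bool and read it back (hypothetical helper). */
int roundtrip_through_bool(int value)
{
	bool flag = (bool)value;	/* any nonzero value becomes 1 */

	return (int)flag;
}

/*
 * Hypothetical tri-state alternative: keep the option as an int so that
 * -1 ("not set on the command line") survives and can be tested for.
 */
int resolve_branch_mode(int requested, bool data_has_branch_stack)
{
	if (requested == -1)		/* unset: follow what perf.data contains */
		return data_has_branch_stack ? 1 : 0;
	return requested ? 1 : 0;	/* an explicit user choice wins */
}
```

With this sketch, `roundtrip_through_bool(-1)` yields 1, which is why a later test against -1 can never succeed.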
    • M
      perf buildid-cache: Show usage with incorrect params · 0497d0a8
      Masami Hiramatsu committed
      Show usage if no action is specified or an unexpected parameter is given.
      In other words, be more user friendly.
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20150227045030.1999.44006.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0497d0a8
    • M
      perf buildid-cache: Use pr_debug instead of verbose && pr_info · cc169c7c
      Masami Hiramatsu committed
      Use pr_debug instead of the combination of verbose and pr_info.
      
      "if (verbose) pr_info(...)" is the same as "pr_debug(...)", so replace it.
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Suggested-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20150227045028.1999.93137.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cc169c7c
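The equivalence claimed above ("if (verbose) pr_info(...)" versus "pr_debug(...)") can be illustrated with a level-gated logger. This is a sketch only: `log_at_level` and the macro definitions below are stand-ins written for this example, not perf's actual implementation.

```c
#include <stdarg.h>
#include <stdio.h>

int verbose;	/* stand-in for a tool-wide verbosity counter */

/*
 * Print only when the current verbosity reaches `level`.
 * Returns the number of characters written, or 0 when suppressed.
 */
int log_at_level(int level, const char *fmt, ...)
{
	va_list ap;
	int n = 0;

	if (verbose >= level) {
		va_start(ap, fmt);
		n = vfprintf(stderr, fmt, ap);
		va_end(ap);
	}
	return n;
}

/* Assumed mapping: info is always shown, debug needs verbose >= 1. */
#define pr_info(...)  log_at_level(0, __VA_ARGS__)
#define pr_debug(...) log_at_level(1, __VA_ARGS__)
```

Under these assumptions the verbosity check lives inside the logger, so callers do not need to wrap the call in their own `if (verbose)`.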
    • M
      perf buildid-cache: Add --purge FILE to remove all caches of FILE · 8d8c8e4c
      Masami Hiramatsu committed
      Add --purge FILE to remove all caches of FILE.
      
      The current --remove FILE removes only the cache whose build-id
      matches that of the given FILE. Because the command takes a
      FILE path, this can confuse a user who expects it to remove
      the cache for that path.
      
        -----
        # ./perf buildid-cache -v --add ./perf
        Adding 133b7b5486d987a5ab5c3ebf4ea14941f45d4d4f ./perf: Ok
        # (update the ./perf binary)
        # ./perf buildid-cache -v --remove ./perf
        Removing 305bbd1be68f66eca7e2d78db294653031edfa79 ./perf: FAIL
        ./perf wasn't in the cache
        -----
      Actually, --remove's FAIL is not shown; it just fails silently.
      
      So this patch adds a --purge FILE action for this use case.
      
      perf buildid-cache --purge FILE removes all caches that have the same
      FILE path.
      
      In other words, it removes all caches, including those of old binaries.
      
        -----
        # ./perf buildid-cache -v --add ./perf
        Adding 133b7b5486d987a5ab5c3ebf4ea14941f45d4d4f ./perf: Ok
        # (update the ./perf binary)
        # ./perf buildid-cache -v --purge ./perf
        Removing 133b7b5486d987a5ab5c3ebf4ea14941f45d4d4f ./perf: Ok
        -----
      
      BTW, if you want to purge all the caches, remove ~/.debug/* .
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20150227045026.1999.64084.stgit@localhost.localdomain
      [ s/dirname/dir_name/g to fix build on fedora14, where dirname is a global ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8d8c8e4c
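The difference between --remove (keyed on the build-id of the file as it is now) and --purge (keyed on the path) can be sketched with a tiny in-memory cache. The struct and function names are hypothetical, and the real cache lives on disk under ~/.debug rather than in an array.

```c
#include <string.h>

struct cache_entry {
	const char *build_id;
	const char *path;
	int live;	/* 1 while the entry is still in the cache */
};

/* --remove style: drop only entries whose build-id matches. */
int remove_by_build_id(struct cache_entry *c, int n, const char *build_id)
{
	int removed = 0;

	for (int i = 0; i < n; i++) {
		if (c[i].live && strcmp(c[i].build_id, build_id) == 0) {
			c[i].live = 0;
			removed++;
		}
	}
	return removed;
}

/* --purge style: drop every entry recorded for the given path. */
int purge_by_path(struct cache_entry *c, int n, const char *path)
{
	int removed = 0;

	for (int i = 0; i < n; i++) {
		if (c[i].live && strcmp(c[i].path, path) == 0) {
			c[i].live = 0;
			removed++;
		}
	}
	return removed;
}
```

After the binary is rebuilt its build-id changes, so a lookup by the new build-id finds nothing, while a lookup by path still finds the stale entry.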
    • Y
      perf tools: Fix the bash completion problem of 'perf --*' · 7335399a
      Yunlong Song committed
      The perf-completion.sh uses a predefined string '--help --version
      --exec-path --html-path --paginate --no-pager --perf-dir --work-tree
      --debugfs-dir' for the bash completion of 'perf --*', which has two
      problems:
      
       Problem 1: If the options of perf are changed (see handle_options() in
       perf.c), the perf-completion.sh has to be changed at the same time. If
       not, the bash completion of 'perf --*' and the options which perf
       really supports will be inconsistent.
      
       Problem 2: When typing another single character after 'perf --', e.g.
       'h', and hitting the TAB key to complete 'perf --h', the character 'h'
       disappears at once. This is not what we want; bash completion should
       return '--help --html-path' so that we can continue to choose one.
      
       To solve these problems, we add '--list-opts' to perf, so that
       'perf --list-opts' is supported directly and its result can be used
       for bash completion.
      
      Example:
      
       Before this patch:
      
       $ perf --h                 <-- hit TAB key after character 'h'
       $ perf --                  <-- 'h' disappears and no required result
      
       After this patch:
      
       $ perf --h                 <-- hit TAB key after character 'h'
       --help       --html-path   <-- the required result
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-8-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7335399a
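The behaviour wanted for 'perf --h' above is plain prefix filtering over whatever option list the tool reports. A minimal sketch of that filtering step, written in C for illustration (the actual completion logic is a shell script, and `complete_prefix` is a hypothetical name):

```c
#include <string.h>

/*
 * Copy into `out` every candidate that starts with `typed` and return
 * how many were found. `out` must have room for `n` pointers.
 */
int complete_prefix(const char **candidates, int n,
		    const char *typed, const char **out)
{
	size_t len = strlen(typed);
	int found = 0;

	for (int i = 0; i < n; i++) {
		if (strncmp(candidates[i], typed, len) == 0)
			out[found++] = candidates[i];
	}
	return found;
}
```

Feeding it the option list from the commit message and the prefix `--h` leaves `--help` and `--html-path`, matching the "after this patch" example.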
    • Y
      perf list: Extend raw-dump to certain kind of events · 5ef803ee
      Yunlong Song committed
      Extend 'perf list --raw-dump' to
      'perf list --raw-dump [hw|sw|cache|tracepoint|pmu|event_glob]' in order
      to show the raw dump of a certain kind of events rather than all of the
      events.
      
      Example:
      
      Before this patch:
      
       $ perf list --raw-dump hw
       branch-instructions branch-misses bus-cycles cache-misses
       cache-references cpu-cycles instructions stalled-cycles-backend
       stalled-cycles-frontend
       alignment-faults context-switches cpu-clock cpu-migrations
       emulation-faults major-faults minor-faults page-faults task-clock
       ...
       ...
       writeback:writeback_thread_start writeback:writeback_thread_stop
       writeback:writeback_wait_iff_congested
       writeback:writeback_wake_background writeback:writeback_wake_thread
      
      As shown above, all of the events are printed.
      
      After this patch:
      
       $ perf list --raw-dump hw
       branch-instructions branch-misses bus-cycles cache-misses
       cache-references cpu-cycles instructions stalled-cycles-backend
       stalled-cycles-frontend
      
      As shown above, only the hw events are printed.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-5-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5ef803ee
    • Y
      perf list: Clean up the printing functions of hardware/software events · 705750f2
      Yunlong Song committed
      print_events_type and __print_events_type are not needed for listing
      hw/sw events; let print_symbol_events do the job instead. Moreover,
      print_symbol_events can also handle event_glob and name_only.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-4-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      705750f2
    • Y
      perf tools: Remove the '--(null)' long_name for --list-opts · 3ef1e65c
      Yunlong Song committed
      If the long_name of a 'struct option' is defined as NULL, --list-opts
      will incorrectly print '--(null)' in its output. As a result, '--(null)'
      will finally appear in the case of bash completion, e.g. 'perf record
      --'.
      
      Example:
      
      Before this patch:
      
       $ perf record --list-opts
      
       --event --filter --pid --tid --realtime --no-buffering --raw-samples
       --all-cpus --cpu --count --output --no-inherit --freq --mmap-pages
       --group --(null) --call-graph --verbose --quiet --stat --data
       --timestamp --period --no-samples --no-buildid-cache --no-buildid
       --cgroup --delay --uid --branch-any --branch-filter --weight
       --transaction --per-thread --intr-regs
      
      After this patch:
      
       $ perf record --list-opts
      
       --event --filter --pid --tid --realtime --no-buffering --raw-samples
       --all-cpus --cpu --count --output --no-inherit --freq --mmap-pages
       --group --call-graph --verbose --quiet --stat --data --timestamp
       --period --no-samples --no-buildid-cache --no-buildid --cgroup --delay
       --uid --branch-any --branch-filter --weight --transaction --per-thread
       --intr-regs
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-7-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3ef1e65c
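The '--(null)' output described above is what tends to appear when a NULL long name is handed to a "%s" conversion. A minimal sketch of the guard, assuming a simplified stand-in for 'struct option' (the field names and `list_long_opts` are hypothetical, not perf's code):

```c
#include <stdio.h>
#include <string.h>

struct opt {			/* simplified stand-in for 'struct option' */
	int short_name;
	const char *long_name;	/* NULL when only a short form exists */
};

/*
 * Write "--name " into buf for every option that has a long name.
 * Returns the number of options listed.
 */
int list_long_opts(const struct opt *opts, int n, char *buf, size_t size)
{
	size_t used = 0;
	int listed = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';
	for (int i = 0; i < n; i++) {
		int w;

		if (!opts[i].long_name)
			continue;	/* skip instead of printing "--(null)" */
		w = snprintf(buf + used, size - used, "--%s ", opts[i].long_name);
		if (w < 0 || (size_t)w >= size - used) {
			buf[used] = '\0';	/* drop a partially written entry */
			break;
		}
		used += (size_t)w;
		listed++;
	}
	return listed;
}
```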
    • Y
      perf list: Avoid confusion of perf output and the next command prompt · ed457520
      Yunlong Song committed
      Separate the output of 'perf list --list-opts' or 'perf --list-cmds'
      from the next command prompt. The same problem also occurs in other
      cases (e.g. record, report ...).
      
      Example:
      
      Before this patch:
      
       $perf list --list-opts
       --raw-dump $          <-- the output and the next command prompt are at
                                 the same line
      
      After this patch:
      
       $perf list --list-opts
       --raw-dump
       $                     <-- the new line
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-6-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ed457520
    • Y
      perf list: Allow listing events with 'tracepoint' prefix · 16114951
      Yunlong Song committed
      If somebody happens to give an event a name that begins with
      'tracepoint' (e.g. tracepoint_foo), it will never be shown by
      'perf list event_glob'. We therefore parse the 'tracepoint' argument
      more carefully.
      
      Example:
      
      Before this patch:
      
       $ perf list tracepoint_foo:*
      
         jbd2:jbd2_start_commit                             [Tracepoint event]
         jbd2:jbd2_commit_locking                           [Tracepoint event]
         jbd2:jbd2_run_stats                                [Tracepoint event]
         block:block_rq_issue                               [Tracepoint event]
         block:block_bio_complete                           [Tracepoint event]
         block:block_bio_backmerge                          [Tracepoint event]
         block:block_getrq                                  [Tracepoint event]
         ...                                                ...
      
      As shown above, all of the tracepoint events are printed. In fact, the
      command's real intention is to print the events of tracepoint_foo.
      
      After this patch:
      
       $ perf list tracepoint_foo:*
      
         tracepoint_foo:tp_foo_enter                        [Tracepoint event]
         tracepoint_foo:tp_foo_exit                         [Tracepoint event]
      
      As shown above, only the events of tracepoint_foo are printed.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-3-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      16114951
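The commit message above does not show the parsing code, so the following is only one plausible illustration of the symptom: a prefix test treats 'tracepoint_foo:*' as the keyword 'tracepoint', while an exact test does not. Both function names are hypothetical.

```c
#include <string.h>

/* Prefix test: also true for "tracepoint_foo:*" (the false positive). */
int is_tracepoint_keyword_loose(const char *arg)
{
	return strncmp(arg, "tracepoint", strlen("tracepoint")) == 0;
}

/* Exact test: only the bare keyword selects "all tracepoints". */
int is_tracepoint_keyword_strict(const char *arg)
{
	return strcmp(arg, "tracepoint") == 0;
}
```

With the strict test, an argument such as `tracepoint_foo:*` falls through to the event_glob path and is matched as a glob.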
    • Y
      perf list: Sort the output of 'perf list' to view more clearly · ab0e4800
      Yunlong Song committed
      Sort the output in ASCII order (using strcmp), which handles both
      numeric and alphabetic sequences.
      
      Example:
      
      Before this patch:
      
       $ perf list
      
       List of pre-defined events (to be used in -e):
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         cache-references                                   [Hardware event]
         cache-misses                                       [Hardware event]
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         ...                                                ...
      
         jbd2:jbd2_start_commit                             [Tracepoint event]
         jbd2:jbd2_commit_locking                           [Tracepoint event]
         jbd2:jbd2_run_stats                                [Tracepoint event]
         block:block_rq_issue                               [Tracepoint event]
         block:block_bio_complete                           [Tracepoint event]
         block:block_bio_backmerge                          [Tracepoint event]
         block:block_getrq                                  [Tracepoint event]
         ...                                                ...
      
      After this patch:
      
       $ perf list
      
       List of pre-defined events (to be used in -e):
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         cache-misses                                       [Hardware event]
         cache-references                                   [Hardware event]
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         ...                                                ...
      
         block:block_bio_backmerge                          [Tracepoint event]
         block:block_bio_complete                           [Tracepoint event]
         block:block_getrq                                  [Tracepoint event]
         block:block_rq_issue                               [Tracepoint event]
         jbd2:jbd2_commit_locking                           [Tracepoint event]
         jbd2:jbd2_run_stats                                [Tracepoint event]
         jbd2:jbd2_start_commit                             [Tracepoint event]
         ...                                                ...
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-2-git-send-email-yunlong.song@huawei.com
      [ Don't forget closedir({sys,evt}_dir) when handling errors ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ab0e4800
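Sorting names with strcmp, as described above, is commonly done with qsort and a thin comparator. A minimal sketch under that assumption (`sort_event_names` is a hypothetical name, and the event names come from the example output):

```c
#include <stdlib.h>
#include <string.h>

/* qsort hands the comparator pointers to the array elements. */
static int cmp_string(const void *a, const void *b)
{
	const char *const *as = a;
	const char *const *bs = b;

	return strcmp(*as, *bs);
}

/* Sort an array of event names in place, in strcmp (byte) order. */
void sort_event_names(const char **names, size_t n)
{
	qsort(names, n, sizeof(*names), cmp_string);
}
```

Because strcmp compares bytes, digits sort before uppercase letters and uppercase before lowercase.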
  3. 27 Feb, 2015 5 commits
    • Y
      perf data: Fix sentinel setting for data_cmds array · 1f924c29
      Yunlong Song committed
      The recent patch "perf tools: Add new 'perf data' command" (commit
      2245bf14 in acme's git repo perf/core) caused a build error when
      compiling perf:
      
       cc1: warnings being treated as errors
       builtin-data.c:89: error: missing initializer
       builtin-data.c:89: error: (near initialization for ‘data_cmds[1].summary’)
       make[2]: *** [builtin-data.o] Error 1
       make[2]: *** Waiting for unfinished jobs....
         LD       bench/perf-in.o
         LD       tests/perf-in.o
       make[1]: *** [perf-in.o] Error 2
       make: *** [all] Error 2
      
      This patch fixes the build error above.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425038026-27604-1-git-send-email-yunlong.song@huawei.com
      [ .name == NULL ends the loop, use it instead of setting all fields to NULL ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1f924c29
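The committer's note above says that `.name == NULL` ends the loop. A minimal sketch of that sentinel idiom, using a simplified and hypothetical version of the command table (the struct layout and the summary text are illustrative):

```c
#include <stddef.h>
#include <string.h>

struct data_cmd {		/* simplified, hypothetical layout */
	const char *name;
	const char *summary;
};

static const struct data_cmd cmds[] = {
	{ .name = "convert", .summary = "example summary text" },
	{ .name = NULL },	/* sentinel: only .name is named explicitly */
};

/* Walk the table until the sentinel; return the match or NULL. */
const struct data_cmd *find_cmd(const char *name)
{
	for (const struct data_cmd *c = cmds; c->name; c++) {
		if (strcmp(c->name, name) == 0)
			return c;
	}
	return NULL;
}
```

A designated initializer leaves the remaining fields zero-initialized, so the sentinel does not need to spell out every member.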
    • H
      perf probe: Fix a precedence bug · f56847c2
      He Kuang committed
      The minus operator has higher precedence than ?:, so add parentheses
      around the ?: expression to fix this.
      
      Before this patch:
      
        $ echo 'p:myprobe do_sys_open' > /sys/kernel/debug/tracing/kprobe_events
        $ perf probe -l -k ../vmlinux
          kprobes:myprobe      (on do_sys_open)
      
      After this patch:
      
        $ echo 'p:myprobe do_sys_open' > /sys/kernel/debug/tracing/kprobe_events
        $ perf probe -l -k ../vmlinux
          kprobes:myprobe      (on do_sys_open@linux.git/fs/open.c)
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425034373-14511-1-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f56847c2
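The precedence rule cited above can be made concrete with two small functions. The expression in the actual patch is not shown, so the operands here are hypothetical; the first function spells out how C groups `base - cond ? x : y` when no parentheses are written.

```c
/* How the compiler groups `base - cond ? x : y`: subtraction first. */
int grouped_by_precedence(int base, int cond, int x, int y)
{
	return (base - cond) ? x : y;
}

/* What is usually intended: subtract the result of the conditional. */
int grouped_as_intended(int base, int cond, int x, int y)
{
	return base - (cond ? x : y);
}
```

For `base = 10, cond = 1, x = 2, y = 3` the first form yields 2 (because 10 - 1 is nonzero), while the second yields 8.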
    • K
      perf diff: Support for different binaries · 94ba462d
      Kan Liang committed
      Currently, perf diff only works with identical binaries, because it
      compares symbol start addresses. It does not work if the perf.data
      files come from different binaries. This patch matches symbols by name
      instead.
      
      Actually, perf diff was once intended to compare symbol names. The
      commit below made it look for a pair by name:
      
      604c5c92 (perf diff: Change the default sort order to "dso,symbol")
      However, at that time, perf diff used a global list of dsos. That means
      binaries with the same name could only be loaded once, which is a
      problem when comparing different binaries.
      
      For example, take an old binary and an updated binary. They very likely
      have the same name and share most of their functions, so only the dsos
      from the old binary are loaded. When processing the data from the
      updated binary, perf still uses the symbol information from the old
      binary, which is wrong.
      
      Then the commit below replaced the symbol name with the IP:
      9c443dfd ("perf diff: Fix support for all --sort combinations")
      From that time on, perf diff compared symbol addresses.
      
      The global dsos list was dropped by a patch in 2010:
      a1645ce1 ("perf: 'perf kvm' tool for monitoring guest performance
      from host")
      However, by then perf diff already compared by address, so it still
      could not work for different binaries.
      
      This patch rolls perf diff back to its original design. The
      documentation is also updated, so everybody knows the original design
      is to compare symbol names.
      
      Here are some examples:
      
      The only difference between example_v1.c and example_v2.c is the
      location of f2 and f3. There is no change in behavior, but the previous
      perf diff displays the wrong differential profile.
      
      example_v1.c
      noinline void f3(void)
      {
              volatile int i;
              for (i = 0; i < 10000;) {
      
                      if(i%2)
                              i++;
                      else
                              i++;
              }
      }
      
      noinline void f2(void)
      {
              volatile int a = 100, b, c;
              for (b = 0; b < 10000; b++)
                      c = a * b;
      
      }
      
      noinline void f1(void)
      {
                      f2();
                      f3();
      }
      
      int main()
      {
              int i;
              for (i = 0; i < 100000; i++)
                      f1();
      }
      
      example_v2.c
      noinline void f2(void)
      {
              volatile int a = 100, b, c;
              for (b = 0; b < 10000; b++)
                      c = a * b;
      }
      
      noinline void f3(void)
      {
              volatile int i;
              for (i = 0; i < 10000;) {
                      if(i%2)
                              i++;
                      else
                              i++;
              }
      }
      
      noinline void f1(void)
      {
                      f2();
                      f3();
      }
      
      int main()
      {
              int i;
              for (i = 0; i < 100000; i++)
                      f1();
      }
      
      [lk@localhost perf_diff]$ gcc example_v1.c -o example
      [lk@localhost perf_diff]$ perf record -o example_v1.data ./example
      [ perf record: Woken up 4 times to write data ]
      [ perf record: Captured and wrote 0.813 MB example_v1.data (~35522 samples) ]
      
      [lk@localhost perf_diff]$ gcc example_v2.c -o example
      [lk@localhost perf_diff]$ perf record -o example_v2.data ./example
      [ perf record: Woken up 4 times to write data ]
      [ perf record: Captured and wrote 0.824 MB example_v2.data (~36015 samples) ]
      
      Old perf diff result:
      
      [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
       Event 'cycles'
       Baseline    Delta  Shared Object     Symbol
       ........  .......  ................  ...............................
      
                           [kernel.vmlinux]  [k] __perf_event_task_sched_out
           0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                           [kernel.vmlinux]  [k] idle_cpu
                           [kernel.vmlinux]  [k] intel_pstate_timer_func
                           [kernel.vmlinux]  [k] native_read_msr_safe
           0.00%           [kernel.vmlinux]  [k] native_read_tsc
           0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                           [kernel.vmlinux]  [k] ntp_tick_length
           0.00%           [kernel.vmlinux]  [k] rb_erase
           0.00%           [kernel.vmlinux]  [k] tick_sched_timer
           0.00%           [kernel.vmlinux]  [k] unmap_single_vma
           0.00%           [kernel.vmlinux]  [k] update_wall_time
           0.00%           example           [.] f1
          46.24%           example           [.] f2
          53.71%   -7.55%  example           [.] f3
                  +53.81%  example           [.] f3
           0.02%           example           [.] main
      
      New perf diff result:
      
      [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
                           [kernel.vmlinux]  [k] __perf_event_task_sched_out
           0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                           [kernel.vmlinux]  [k] idle_cpu
                           [kernel.vmlinux]  [k] intel_pstate_timer_func
                           [kernel.vmlinux]  [k] native_read_msr_safe
           0.00%           [kernel.vmlinux]  [k] native_read_tsc
           0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                           [kernel.vmlinux]  [k] ntp_tick_length
           0.00%           [kernel.vmlinux]  [k] rb_erase
           0.00%           [kernel.vmlinux]  [k] tick_sched_timer
           0.00%           [kernel.vmlinux]  [k] unmap_single_vma
           0.00%           [kernel.vmlinux]  [k] update_wall_time
           0.00%           example           [.] f1
          46.24%   -0.08%  example           [.] f2
          53.71%   +0.11%  example           [.] f3
           0.02%           example           [.] main
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/1423460384-11645-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      94ba462d
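The f2/f3 example above comes down to how a symbol from one profile is paired with a symbol from the other. A minimal sketch of the two pairing strategies, with a hypothetical `struct sym` and made-up addresses (perf's real data structures are considerably richer):

```c
#include <stddef.h>
#include <string.h>

struct sym {
	const char *name;
	unsigned long start;	/* start address inside the binary */
};

/* Pair by start address: breaks when functions move between builds. */
const struct sym *pair_by_addr(const struct sym *other, size_t n,
			       const struct sym *s)
{
	for (size_t i = 0; i < n; i++) {
		if (other[i].start == s->start)
			return &other[i];
	}
	return NULL;
}

/* Pair by name: independent of where the function ended up. */
const struct sym *pair_by_name(const struct sym *other, size_t n,
			       const struct sym *s)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(other[i].name, s->name) == 0)
			return &other[i];
	}
	return NULL;
}
```

If f3 occupies in the first build the address that f2 occupies in the second, address pairing matches f3 with f2, which is the kind of mismatch visible in the "old perf diff result" table.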
    • M
      perf buildid-cache: Add new buildid cache if update target is not cached · a50d11a1
      Masami Hiramatsu committed
      Add a new buildid cache if the update target file is not cached.
      
      This can happen when an old binary is replaced by a new one after the
      old one has been cached. In this case, the user just sees the operation
      fail.
      
      That is not intuitive, since the user passes only the binary's "path",
      not its "build-id".
      
        ----
        # ./perf buildid-cache --add ./perf
        (update ./perf to new binary)
        # ./perf buildid-cache --update ./perf
        ./perf wasn't in the cache
        #
        ----
      
      This patch adds the given new binary to the cache if it is not yet
      cached, so the above error no longer appears.
      
        ----
        # ./perf buildid-cache --add ./perf
        (update ./perf to new binary)
        # ./perf buildid-cache --update ./perf
        #
        ----
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20150226065440.23912.1494.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a50d11a1
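The "update falls back to add" behaviour described above can be sketched as follows. The fixed-size in-memory cache and all names are hypothetical; the real tool works on files and build-ids read from the binaries.

```c
#include <string.h>

#define CACHE_MAX 8

struct id_cache {
	const char *ids[CACHE_MAX];
	int n;
};

/* Return 1 if the build-id is already cached, 0 otherwise. */
int cache_has(const struct id_cache *c, const char *id)
{
	for (int i = 0; i < c->n; i++) {
		if (strcmp(c->ids[i], id) == 0)
			return 1;
	}
	return 0;
}

/* Append a build-id; 0 on success, -1 when the cache is full. */
int cache_add(struct id_cache *c, const char *id)
{
	if (c->n == CACHE_MAX)
		return -1;
	c->ids[c->n++] = id;
	return 0;
}

/* Update: add the entry when it is missing instead of failing. */
int cache_update(struct id_cache *c, const char *id)
{
	if (!cache_has(c, id))
		return cache_add(c, id);
	return 0;	/* already cached: a real tool would refresh it here */
}
```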
    • A
      perf probe: Handle strdup() failure · 38ae502b
      Arnaldo Carvalho de Melo committed
      We could end up returning 0 (Ok) with a NULL raw_path. Fix it.
      Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naohiro Aota <naota@elisp.net>
      Link: http://lkml.kernel.org/n/tip-l0kcbcg5f4nnzqt01cv42vec@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      38ae502b
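The failure mode above (returning success while the duplicated string is NULL) is avoided by checking the result of the duplication before reporting success. A minimal sketch, with the allocator passed in so the failure path can be exercised; every name here is hypothetical and the real function does more than copy a path.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef char *(*dup_fn)(const char *);

/* Portable strdup equivalent; returns NULL when malloc fails. */
char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Test double that always reports an allocation failure. */
char *failing_dup(const char *s)
{
	(void)s;
	return NULL;
}

/* Copy `path` into *out; 0 on success, -ENOMEM when duplication fails. */
int copy_path(const char *path, char **out, dup_fn dup)
{
	char *copy = dup(path);

	if (!copy) {
		*out = NULL;
		return -ENOMEM;	/* never report success with a NULL result */
	}
	*out = copy;
	return 0;
}
```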
  4. 26 Feb, 2015 11 commits
    • M
      perf probe: Fix get_real_path to free allocated memory in error path · eb47cb2e
      Masami Hiramatsu committed
      Fix get_real_path to free allocated memory when comp_dir is used to
      complete the path and an error occurs.
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naohiro Aota <naota@elisp.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150226082504.28125.74506.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eb47cb2e
    • M
      perf probe: Check kprobes blacklist when adding new events · 9aaf5a5f
      Masami Hiramatsu committed
      Recent Linux kernels provide a blacklist of functions that cannot
      be probed. perf probe can now check this blacklist before setting new
      events and give users a better error message.
      
      Without this patch,
        ----
        # perf probe --add vmalloc_fault
        Added new event:
        Failed to write event: Invalid argument
          Error: Failed to add events.
        ----
      With this patch
        ----
        # perf probe --add vmalloc_fault
        Added new event:
        Warning: Skipped probing on blacklisted function: vmalloc_fault
        ----
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150219143113.14434.5387.stgit@localhost.localdomain
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9aaf5a5f
    • D
      perf trace: Fix SIGBUS failures due to misaligned accesses · 55d43bca
      David Ahern committed
      On Sparc64 perf-trace is failing in many spots due to extended load
      instructions being used on misaligned accesses.
      
      (gdb) run trace ls
      Starting program: /tmp/perf/perf trace ls
      [Thread debugging using libthread_db enabled]
      Detaching after fork from child process 169460.
      
      <ls output removed>
      
      Program received signal SIGBUS, Bus error.
      0x000000000014f4dc in tp_field__u64 (field=0x4cc700, sample=0x7feffffa098) at builtin-trace.c:61
      warning: Source file is more recent than executable.
      61      TP_UINT_FIELD(64);
      
      (gdb) bt
       0  0x000000000014f4dc in tp_field__u64 (field=0x4cc700, sample=0x7feffffa098) at builtin-trace.c:61
       1  0x0000000000156ad4 in trace__sys_exit (trace=0x7feffffc268, evsel=0x4cc580, event=0xfffffc0104912000,
          sample=0x7feffffa098) at builtin-trace.c:1701
       2  0x0000000000158c14 in trace__run (trace=0x7feffffc268, argc=1, argv=0x7fefffff360) at builtin-trace.c:2160
       3  0x000000000015b78c in cmd_trace (argc=1, argv=0x7fefffff360, prefix=0x0) at builtin-trace.c:2609
       4  0x0000000000107d94 in run_builtin (p=0x4549c8, argc=2, argv=0x7fefffff360) at perf.c:341
       5  0x0000000000108140 in handle_internal_command (argc=2, argv=0x7fefffff360) at perf.c:400
       6  0x0000000000108308 in run_argv (argcp=0x7feffffef2c, argv=0x7feffffef20) at perf.c:444
       7  0x0000000000108728 in main (argc=2, argv=0x7fefffff360) at perf.c:559
      
      (gdb) p *sample
      $1 = {ip = 4391276, pid = 169472, tid = 169472, time = 6303014583281250, addr = 0, id = 72082,
        stream_id = 18446744073709551615, period = 1, weight = 0, transaction = 0, cpu = 73, raw_size = 36,
        data_src = 84410401, flags = 0, insn_len = 0, raw_data = 0xfffffc010491203c, callchain = 0x0,
        branch_stack = 0x0, user_regs = {abi = 0, mask = 0, regs = 0x0, cache_regs = 0x7feffffa098, cache_mask = 0},
        intr_regs = {abi = 0, mask = 0, regs = 0x0, cache_regs = 0x7feffffa098, cache_mask = 0}, user_stack = {
          offset = 0, size = 0, data = 0x0}, read = {time_enabled = 0, time_running = 0, {group = {nr = 0,
              values = 0x0}, one = {value = 0, id = 0}}}}
      (gdb) p *field
      $2 = {offset = 16, {integer = 0x14f4a8 <tp_field__u64>, pointer = 0x14f4a8 <tp_field__u64>}}
      
      sample->raw_data is guaranteed to not be 8-byte aligned because it is preceded
      by the size as a u32. So accessing raw data with an extended load instruction causes
      the SIGBUS. Resolve by using memcpy to a temporary variable of appropriate size.
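
The memcpy approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the helper names read_u64_unaligned and demo_misaligned_read are hypothetical, and the record layout (a 4-byte size followed by the payload) is only an approximation of what the message describes.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not the actual perf code): read a 64-bit value
 * from a possibly misaligned location by copying it into an aligned
 * temporary instead of dereferencing a uint64_t pointer directly.
 */
static uint64_t read_u64_unaligned(const void *raw_data, size_t offset)
{
	uint64_t value;

	/* memcpy makes no alignment assumption about its source. */
	memcpy(&value, (const unsigned char *)raw_data + offset, sizeof(value));
	return value;
}

/*
 * Assumed layout for illustration: a 4-byte size field followed by the
 * raw payload, so the payload is 4-byte but not 8-byte aligned.
 * Stores `expected` at payload offset 8 and reads it back.
 */
static uint64_t demo_misaligned_read(uint64_t expected)
{
	uint64_t storage[3] = { 0 };	/* 8-byte aligned backing store */
	unsigned char *record = (unsigned char *)storage;
	unsigned char *raw_data = record + sizeof(uint32_t);

	memcpy(raw_data + 8, &expected, sizeof(expected));
	return read_u64_unaligned(raw_data, 8);
}
```

On architectures that tolerate misaligned loads a compiler will typically turn the memcpy into a plain load, so the pattern costs little where it is not strictly needed.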
      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1424376022-140608-1-git-send-email-david.ahern@oracle.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      55d43bca
    • I
      Merge tag 'perf-core-for-mingo' of... · 0afb1704
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      New user selectable features:
      
        - Support recording running/enabled time in 'perf record' (Andi Kleen)
      
        - New tool: 'perf data' for converting perf.data to other formats,
          initially for the CTF (Common Trace Format) from LTTng (Jiri Olsa, Sebastian Siewior)
      
      User visible changes:
      
        - Only insert blank duration bracket when tracing syscalls in 'perf trace' (Arnaldo Carvalho de Melo)
      
        - Filter out the trace pid when no threads are specified in 'perf trace' (Arnaldo Carvalho de Melo)
      
        - Add 'perf trace' man page entry for --event (Arnaldo Carvalho de Melo)
      
        - Dump stack on segfaults in 'perf trace' (Arnaldo Carvalho de Melo)
      
      Infrastructure changes:
      
        - Introduce set_filter_pid and set_filter_pids methods in the evlist class (Arnaldo Carvalho de Melo)
      
        - Some perf_session untanglement patches, removing the need to pass a
          perf_session instance for things that are related to evlists, so that
          tools that don't deal with perf.data files like trace in live mode can
          make use of the ordered_events class (Arnaldo Carvalho de Melo)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0afb1704
    • I
      e9e4e443
    • D
      perf tools: Make sparc64 arch point to sparc · 4861f87c
      David Ahern committed
      The recent build changes cause perf to not compile for sparc64 since the
      arch/sparc64/Build file does not exist:
      
      /home/dahern/kernels/linux.git/tools/build/Makefile.build:40: arch/sparc64/Build: No such file or directory
      
      Fix by converting the sparc64 RAW_ARCH to sparc ARCH -- similar to what
      is done for x86_64.
      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1424306222-96843-1-git-send-email-david.ahern@oracle.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4861f87c
    • D
      perf symbols: Define EM_AARCH64 for older OSes · e370a3d5
      David Ahern committed
      4886f2ca added an arm-64 check, but the EM_AARCH64 macro is not
      defined in older releases (e.g., RHEL6). Define it if it is not already defined.
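
A fallback definition of this kind usually looks like the sketch below. The value 183 is the ELF machine number assigned to AArch64; the helper is_aarch64_machine is a hypothetical name used only for illustration and is not taken from the patch.

```c
#include <elf.h>

/* Older <elf.h> headers may lack this macro, so define it only if missing. */
#ifndef EM_AARCH64
#define EM_AARCH64 183	/* ARM 64 bit */
#endif

/* Hypothetical helper: check an ELF header's e_machine value. */
static int is_aarch64_machine(unsigned int e_machine)
{
	return e_machine == EM_AARCH64;
}
```

Guarding the definition with #ifndef keeps the build working on newer systems, where the header already provides the macro.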
      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Cc: Victor Kamensky <victor.kamensky@linaro.org>
      Link: http://lkml.kernel.org/r/1424306017-96797-1-git-send-email-david.ahern@oracle.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e370a3d5
    • D
      perf top: Fix SIGBUS on sparc64 · a73b6c19
      David Ahern committed
      perf-top is terminating due to SIGBUS on sparc64. git bisect points to:
      
          commit 82396986
          Author: Arnaldo Carvalho de Melo <acme@redhat.com>
          Date:   Mon Sep 8 13:26:35 2014 -0300
      
              perf evlist: Refcount mmaps
      
              We need to know how many fds are using a perf mmap via
              PERF_EVENT_IOC_SET_OUTPUT, so that we can know when to ditch an mmap,
              refcount it.
      
      This commit added 'int refcnt' to struct perf_mmap and the addition makes the
      event_copy element no longer 8-byte aligned.
      
      Fix by adding __attribute__((aligned(8))) to the event_copy struct
      member.
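
The effect of the attribute can be sketched with a simplified stand-in for the structure. The struct names and field list below are assumptions for illustration and do not reproduce the real struct perf_mmap; __attribute__((aligned(8))) is a GCC/Clang extension.

```c
#include <stddef.h>

/*
 * Simplified stand-in: after a 4-byte refcnt is added, the buffer that
 * follows the integer fields can land on an offset that is not a
 * multiple of 8 on a 64-bit target.
 */
struct mmap_plain {
	void		*base;
	int		mask;
	int		refcnt;		/* the newly added field */
	unsigned int	prev;
	char		event_copy[64];
};

/* Same layout, but the buffer is forced onto an 8-byte boundary. */
struct mmap_aligned {
	void		*base;
	int		mask;
	int		refcnt;
	unsigned int	prev;
	char		event_copy[64] __attribute__((aligned(8)));
};

/* Offset of the buffer in the aligned variant; always a multiple of 8. */
static size_t aligned_event_copy_offset(void)
{
	return offsetof(struct mmap_aligned, event_copy);
}
```

Compared with inserting an explicit padding member, the attribute keeps the guarantee in place even if further fields are added ahead of the buffer later.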
      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Link: http://lkml.kernel.org/r/1424304198-92028-1-git-send-email-david.ahern@oracle.com
      [ Switched from 'int pad;' to using __attribute__, David tested/acked that ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a73b6c19
    • A
      perf tools: Fix probing for PERF_FLAG_FD_CLOEXEC flag · 48536c91
      Adrian Hunter committed
      Commit f6edb53c converted the probe to
      a CPU wide event first (pid == -1). For kernels that do not support
      the PERF_FLAG_FD_CLOEXEC flag the probe fails with EINVAL. Since this
      errno is not handled, pid is not reset to 0, and the subsequent use of
      pid = -1 as an argument brings in an additional failure path if
      perf_event_paranoid > 0:
      
      $ perf record -- sleep 1
      perf_event_open(..., 0) failed unexpectedly with error 13 (Permission denied)
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.007 MB /tmp/perf.data (11 samples) ]
      
      Also, ensure the fd of the confirmation check is closed and comment why
      pid = -1 is used.
      
      Needs to go to 3.18 stable tree as well.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Based-on-patch-by: David Ahern <david.ahern@oracle.com>
      Acked-by: David Ahern <david.ahern@oracle.com>
      Cc: David Ahern <dsahern@gmail.com>
      Link: http://lkml.kernel.org/r/54EC610C.8000403@intel.com
      Cc: stable@vger.kernel.org  # v3.18+
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      48536c91
    • S
      perf data: Add a 'perf' prefix to the generic fields · 54cf776a
      Sebastian Andrzej Siewior committed
      Some of the tracers bring their own id or pid fields and we can end up
      having two of them. This patch adds a "perf_" prefix to the 'generic'
      fields so we avoid a clash of the member names.
      
      The change is visible in the babeltrace output:
      
      Before:
        $ babeltrace ./ctf-data/
        [03:19:13.962131936] (+0.000001935) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 8 }
        [03:19:13.962133732] (+0.000001796) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 114 }
        ...
      
      Now:
        $ babeltrace ./ctf-data/
        [03:19:13.962131936] (+0.000001935) cycles: { }, { perf_ip = 0xFFFFFFFF8105443A, perf_tid = 20714, perf_pid = 20714, perf_period = 8 }
        [03:19:13.962133732] (+0.000001796) cycles: { }, { perf_ip = 0xFFFFFFFF8105443A, perf_tid = 20714, perf_pid = 20714, perf_period = 114 }
        ...
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jeremie Galarneau <jgalar@efficios.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1424470628-5969-5-git-send-email-jolsa@kernel.org
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      54cf776a
    • J
      perf data: Add perf data to CTF conversion support · edbe9817
      Jiri Olsa committed
      Add 'perf data convert' to convert a perf data file into a different
      format. This patch adds support for CTF format conversion.
      
      To convert perf.data into CTF run:
        $ perf data convert --to-ctf=./ctf-data/
        [ perf data convert: Converted 'perf.data' into CTF data './ctf-data/' ]
        [ perf data convert: Converted and wrote 11.268 MB (100230 samples) ]
      
      The command will create CTF metadata out of the perf.data file (or one
      specified via the -i option) and then convert all sample events into a
      single CTF stream.

      Each sample_type bit is translated into a separate CTF event field, apart
      from the following exceptions:
      
        PERF_SAMPLE_RAW          - added in next patch
        PERF_SAMPLE_READ         - TODO
        PERF_SAMPLE_CALLCHAIN    - TODO
        PERF_SAMPLE_BRANCH_STACK - TODO
        PERF_SAMPLE_REGS_USER    - TODO
        PERF_SAMPLE_STACK_USER   - TODO
      
        $ perf --debug=data-convert=2 data convert ...
      
      The converted CTF data can be analyzed by CTF tools, like babeltrace
      or tracecompass [1].
      
        $ babeltrace ./ctf-data/
        [03:19:13.962125533] (+?.?????????) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 1 }
        [03:19:13.962130001] (+0.000004468) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 1 }
        [03:19:13.962131936] (+0.000001935) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 8 }
        [03:19:13.962133732] (+0.000001796) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 114 }
        [03:19:13.962135557] (+0.000001825) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 2087 }
        [03:19:13.962137627] (+0.000002070) cycles: { }, { ip = 0xFFFFFFFF81361938, tid = 20714, pid = 20714, period = 37582 }
        [03:19:13.962161091] (+0.000023464) cycles: { }, { ip = 0xFFFFFFFF8124218F, tid = 20714, pid = 20714, period = 600246 }
        [03:19:13.962517569] (+0.000356478) cycles: { }, { ip = 0xFFFFFFFF811A75DB, tid = 20714, pid = 20714, period = 1325731 }
        [03:19:13.969518008] (+0.007000439) cycles: { }, { ip = 0x34080917B2, tid = 20714, pid = 20714, period = 1144298 }
      
      The following members were added to the ctf-environment to
      distinguish and specify perf CTF data:
      
        - domain
      
          It says "kernel" because it contains a kernel trace (not to be
          confused with a user space trace, such as those lttng-ust produces)
      
        - tracer_name
      
          It says perf. This can be used to distinguish between lttng and perf
          CTF based trace.
      
        - version
      
          The kernel version from stream. In addition to release, this is what
          it looks like on a Debian kernel:
      
            release = "3.14-1-amd64";
            version = "3.14.0";
      
      [1] http://projects.eclipse.org/projects/tools.tracecompass
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jeremie Galarneau <jgalar@efficios.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1424470628-5969-4-git-send-email-jolsa@kernel.org
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      edbe9817
  5. 25 Feb, 2015 4 commits