1. 12 Mar, 2015 5 commits
  2. 11 Mar, 2015 2 commits
  3. 03 Mar, 2015 1 commit
    • perf tools: Reference count struct thread · f3b623b8
      Submitted by Arnaldo Carvalho de Melo
      We need to do that to stop accumulating entries in the dead_threads
      linked list, i.e. we were keeping references to threads in struct hists
      that continue to exist even after a thread exited and was removed from
      the machine threads rbtree.
      
      We still keep the dead_threads list, but just for debugging, allowing us
      to iterate at any given point over the threads that are still referenced
      by things like struct hist_entry.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-3ejvfyed0r7ue61dkurzjux4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f3b623b8
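The reference counting described above can be sketched as a plain get/put counter. This is a minimal single-threaded sketch, not the code from the patch: struct thread_sketch and the thread_sketch__* helpers are hypothetical names, and a counter shared between threads would need atomic operations.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for perf's struct thread; only what the sketch needs. */
struct thread_sketch {
	int tid;
	int refcnt;	/* number of live references to this object */
};

/* Allocate a thread object; the caller owns the initial reference. */
struct thread_sketch *thread_sketch__new(int tid)
{
	struct thread_sketch *t = calloc(1, sizeof(*t));

	if (t != NULL) {
		t->tid = tid;
		t->refcnt = 1;
	}
	return t;
}

/* Take an extra reference, e.g. when a histogram entry points at the thread. */
struct thread_sketch *thread_sketch__get(struct thread_sketch *t)
{
	assert(t != NULL && t->refcnt > 0);
	t->refcnt++;
	return t;
}

/* Drop a reference; returns 1 if this was the last one and the object was freed. */
int thread_sketch__put(struct thread_sketch *t)
{
	assert(t != NULL && t->refcnt > 0);
	if (--t->refcnt == 0) {
		free(t);
		return 1;
	}
	return 0;
}
```

With this shape, a thread that has exited stays allocated only while something such as a hist entry still holds a reference, and is freed on the last put.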
  4. 02 Mar, 2015 4 commits
    • perf probe: Remove bias offset to find probe point by address · 0104fe69
      Submitted by Masami Hiramatsu
      Remove bias offset to find probe point by address.
      
      Without this patch, probe points in the kernel and in executables are
      shown correctly, but probe points in libraries are not:
      
        # ./perf probe -l
          probe:do_fork        (on do_fork@kernel/fork.c)
          probe_libc:malloc    (on malloc in /usr/lib64/libc-2.17.so)
          probe_perf:strlist__new (on strlist__new@util/strlist.c in /home/mhiramat/ksrc/linux-3/tools/perf/perf)
      
      Removing the bias lets perf probe show the real location:
      
        # ./perf probe -l
          probe:do_fork        (on do_fork@kernel/fork.c)
          probe_libc:malloc    (on __libc_malloc@malloc/malloc.c in /usr/lib64/libc-2.17.so)
          probe_perf:strlist__new (on strlist__new@util/strlist.c in /home/mhiramat/ksrc/linux-3/tools/perf/perf)
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naohiro Aota <naota@elisp.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150302124946.9191.64085.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0104fe69
    • perf probe: Warn if given uprobe event accesses memory on older kernel · 79702f61
      Submitted by Masami Hiramatsu
      Warn if the given uprobe event accesses memory on an older kernel.
      
      Until 3.14, uprobe events supported only register access, so this warns
      the user to upgrade the kernel if uprobe-event returns -EINVAL and an
      argument of the event accesses memory ($stack, @+offset, or +|-offs()
      syntax).
      
      With this patch (on 3.10.0-123.13.2.el7.x86_64):
        -----
        # ./perf probe -x ./perf warn_uprobe_event_compat stack=-0\(%sp\)
        Added new event:
        Failed to write event: Invalid argument
        Please upgrade your kernel to at least 3.14 to have access to feature -0(%sp)
          Error: Failed to add events.
        -----
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20150228025329.32106.70581.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      79702f61
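The check for a memory-accessing argument could look roughly like the following. This is a simplified sketch under stated assumptions: arg_accesses_memory_sketch is a hypothetical name, it only recognizes the three forms named in the message ($stack, @..., and +|-offs(...)), and the argument parsing in perf itself is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if a probe argument expression uses one
 * of the memory-accessing forms listed in the commit message.
 */
bool arg_accesses_memory_sketch(const char *arg)
{
	assert(arg != NULL);

	/* $stack and $stackN read from the stack */
	if (strncmp(arg, "$stack", strlen("$stack")) == 0)
		return true;
	/* @ADDR and @+OFFSET read from a memory address */
	if (arg[0] == '@')
		return true;
	/* +offs(...) and -offs(...) dereference memory at an offset */
	if ((arg[0] == '+' || arg[0] == '-') && strchr(arg, '(') != NULL)
		return true;
	/* anything else, e.g. a plain register such as %ax */
	return false;
}
```

A caller would only print the upgrade hint when the write fails with -EINVAL and at least one argument matches, so register-only events keep the plain error message.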
    • perf tools: Fix FORK after COMM when synthesizing records for pre-existing threads · 4aa5f4f7
      Submitted by Arnaldo Carvalho de Melo
      In this commit:
      
        commit 363b785f
        Author: Don Zickus <dzickus@redhat.com>
        Date:   Fri Mar 14 10:43:44 2014 -0400
      
            perf tools: Speed up thread map generation
      
      We ended up emitting PERF_RECORD_FORK events after their corresponding
      PERF_RECORD_COMM, so the code below removes the "existing thread" and
      then recreates it, unnecessarily:
      
        [root@ssdandy ~]# perf probe -x ~/bin/perf -L machine__process_fork_event
        <machine__process_fork_event@/home/acme/git/linux/tools/perf/util/machine.c:0>
            0  int machine__process_fork_event(struct machine *machine, union perf_event *event,
                                              struct perf_sample *sample)
            2  {
            3         struct thread *thread = machine__find_thread(machine,
                                                                   event->fork.pid,
                                                                   event->fork.tid);
            6         struct thread *parent = machine__findnew_thread(machine,
                                                                      event->fork.ppid,
                                                                      event->fork.ptid);
      
                      /* if a thread currently exists for the thread id remove it */
                      if (thread != NULL)
           12                 machine__remove_thread(machine, thread);
      
           14         thread = machine__findnew_thread(machine, event->fork.pid,
                                                       event->fork.tid);
           16         if (dump_trace)
           17                 perf_event__fprintf_task(event, stdout);
      
           19         if (thread == NULL || parent == NULL ||
           20             thread__fork(thread, parent, sample->time) < 0) {
           21                 dump_printf("problem processing PERF_RECORD_FORK, skipping event.\n");
           22                 return -1;
                      }
      
           25         return 0;
           26  }
      
        [root@ssdandy ~]# perf probe -x ~/bin/perf fork_after_comm=machine__process_fork_event:12
        Added new event:
          probe_perf:fork_after_comm (on machine__process_fork_event:12 in /home/acme/bin/perf)
      
        You can now use it in all perf tools, such as:
      
      	perf record -e probe_perf:fork_after_comm -aR sleep 1
      
        [root@ssdandy ~]#
      
        [root@ssdandy ~]# perf record -g -e probe_perf:* trace -o /tmp/bla
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.021 MB perf.data (30 samples) ]
        Terminated
        [root@ssdandy ~]#
      
        [root@ssdandy ~]# perf report --no-children --show-total-period --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Samples: 30  of event 'probe_perf:fork_after_comm'
        # Event count (approx.): 30
        #
        # Overhead        Period  Command  Shared Object  Symbol
        # ........  ............  .......  .............  ...............................
        #
           100.00%            30  trace    trace          [.] machine__process_fork_event
                      |
                      ---machine__process_fork_event
                         __event__synthesize_thread.part.2
                         perf_event__synthesize_threads
                         cmd_trace
                         main
                         __libc_start_main
      
        [root@ssdandy ~]#
      
        And looking at 'perf report -D' output we see it:
      
        0 0 0x8698 [0x30]: PERF_RECORD_COMM: auditd:703/707
        0 0 0x86c8 [0x38]: PERF_RECORD_FORK(703:707):(703:703)
      
      Fix it by more closely mimicking how the kernel generates those records
      when a new fork happens, i.e. first a PERF_RECORD_FORK, then a
      PERF_RECORD_COMM.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-h0emvymi2t3mw8dlqd6d6z73@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4aa5f4f7
    • perf tools: Only include tsc file for x86 · ecefde62
      Submitted by David Ahern
      The perf_time_to_tsc and tsc_to_perf_time functions are only used for x86.
      
      Make inclusion of tsc.c dependent on x86 as well.
      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <david.ahern@oracle.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1424370153-128274-1-git-send-email-david.ahern@oracle.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ecefde62
  5. 28 Feb, 2015 5 commits
    • perf buildid-cache: Add --purge FILE to remove all caches of FILE · 8d8c8e4c
      Submitted by Masami Hiramatsu
      Add --purge FILE to remove all caches of FILE.
      
      The current --remove FILE removes only the cache whose build-id matches
      that of the given FILE. Because the command takes a FILE path, this can
      confuse a user who expects it to remove the caches for that path.
      
        -----
        # ./perf buildid-cache -v --add ./perf
        Adding 133b7b5486d987a5ab5c3ebf4ea14941f45d4d4f ./perf: Ok
        # (update the ./perf binary)
        # ./perf buildid-cache -v --remove ./perf
        Removing 305bbd1be68f66eca7e2d78db294653031edfa79 ./perf: FAIL
        ./perf wasn't in the cache
        -----
      Actually, the FAIL from --remove is not shown; it just fails silently.
      
      So this patch adds a --purge FILE action for such a use case.
      
      perf buildid-cache --purge FILE removes all caches which have the same
      FILE path.
      
      In other words, it removes all caches, including those of old binaries.
      
        -----
        # ./perf buildid-cache -v --add ./perf
        Adding 133b7b5486d987a5ab5c3ebf4ea14941f45d4d4f ./perf: Ok
        # (update the ./perf binary)
        # ./perf buildid-cache -v --purge ./perf
        Removing 133b7b5486d987a5ab5c3ebf4ea14941f45d4d4f ./perf: Ok
        -----
      
      BTW, if you want to purge all the caches, remove ~/.debug/*.
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20150227045026.1999.64084.stgit@localhost.localdomain
      [ s/dirname/dir_name/g to fix build on fedora14, where dirname is a global ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8d8c8e4c
    • perf list: Clean up the printing functions of hardware/software events · 705750f2
      Submitted by Yunlong Song
      There is no need for print_events_type or __print_events_type when
      listing hw/sw events; let print_symbol_events do the job instead.
      Moreover, print_symbol_events can also handle event_glob and name_only.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-4-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      705750f2
    • perf tools: Remove the '--(null)' long_name for --list-opts · 3ef1e65c
      Submitted by Yunlong Song
      If the long_name of a 'struct option' is defined as NULL, --list-opts
      will incorrectly print '--(null)' in its output. As a result, '--(null)'
      will finally appear in the case of bash completion, e.g. 'perf record
      --'.
      
      Example:
      
      Before this patch:
      
       $ perf record --list-opts
      
       --event --filter --pid --tid --realtime --no-buffering --raw-samples
       --all-cpus --cpu --count --output --no-inherit --freq --mmap-pages
       --group --(null) --call-graph --verbose --quiet --stat --data
       --timestamp --period --no-samples --no-buildid-cache --no-buildid
       --cgroup --delay --uid --branch-any --branch-filter --weight
       --transaction --per-thread --intr-regs
      
      After this patch:
      
       $ perf record --list-opts
      
       --event --filter --pid --tid --realtime --no-buffering --raw-samples
       --all-cpus --cpu --count --output --no-inherit --freq --mmap-pages
       --group --call-graph --verbose --quiet --stat --data --timestamp
       --period --no-samples --no-buildid-cache --no-buildid --cgroup --delay
       --uid --branch-any --branch-filter --weight --transaction --per-thread
       --intr-regs
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-7-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3ef1e65c
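The fix amounts to skipping options that have no long name when building the --list-opts output. The following is a minimal sketch of that idea, not the patched perf code: struct option_sketch and list_long_opts_sketch are hypothetical names, and the output goes to a caller-supplied buffer instead of stdout so it can be checked.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, reduced version of an option table entry. */
struct option_sketch {
	const char *long_name;	/* NULL for short-only options */
	char short_name;
};

/*
 * Write "--name " for every option that has a long name into buf.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int list_long_opts_sketch(const struct option_sketch *opts, size_t nr,
			  char *buf, size_t size)
{
	size_t used = 0;

	assert(opts != NULL && buf != NULL);
	if (size == 0)
		return -1;
	buf[0] = '\0';

	for (size_t i = 0; i < nr; i++) {
		int n;

		/* Skip short-only options instead of printing a NULL name. */
		if (opts[i].long_name == NULL)
			continue;

		n = snprintf(buf + used, size - used, "--%s ", opts[i].long_name);
		if (n < 0 || (size_t)n >= size - used)
			return -1;	/* output would be truncated */
		used += (size_t)n;
	}
	return (int)used;
}
```

Passing a NULL pointer to a "%s" conversion is undefined behavior in C; glibc happens to print "(null)", which is where the stray '--(null)' entry came from.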
    • perf list: Avoid confusion of perf output and the next command prompt · ed457520
      Submitted by Yunlong Song
      Separate the output of 'perf list --list-opts' or 'perf --list-cmds'
      from the next command prompt. The same problem also occurs in other
      cases (e.g. record, report ...).
      
      Example:
      
      Before this patch:
      
       $ perf list --list-opts
       --raw-dump $          <-- the output and the next command prompt are on
                                 the same line
      
      After this patch:
      
       $ perf list --list-opts
       --raw-dump
       $                     <-- the new line
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-6-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ed457520
    • perf list: Sort the output of 'perf list' to view more clearly · ab0e4800
      Submitted by Yunlong Song
      Sort the output in ASCII order (using strcmp), which handles both
      digits and letters.
      
      Example:
      
      Before this patch:
      
       $ perf list
      
       List of pre-defined events (to be used in -e):
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         cache-references                                   [Hardware event]
         cache-misses                                       [Hardware event]
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         ...                                                ...
      
         jbd2:jbd2_start_commit                             [Tracepoint event]
         jbd2:jbd2_commit_locking                           [Tracepoint event]
         jbd2:jbd2_run_stats                                [Tracepoint event]
         block:block_rq_issue                               [Tracepoint event]
         block:block_bio_complete                           [Tracepoint event]
         block:block_bio_backmerge                          [Tracepoint event]
         block:block_getrq                                  [Tracepoint event]
         ...                                                ...
      
      After this patch:
      
       $ perf list
      
       List of pre-defined events (to be used in -e):
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         cache-misses                                       [Hardware event]
         cache-references                                   [Hardware event]
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         ...                                                ...
      
         block:block_bio_backmerge                          [Tracepoint event]
         block:block_bio_complete                           [Tracepoint event]
         block:block_getrq                                  [Tracepoint event]
         block:block_rq_issue                               [Tracepoint event]
         jbd2:jbd2_commit_locking                           [Tracepoint event]
         jbd2:jbd2_run_stats                                [Tracepoint event]
         jbd2:jbd2_start_commit                             [Tracepoint event]
         ...                                                ...
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-2-git-send-email-yunlong.song@huawei.com
      [ Don't forget closedir({sys,evt}_dir) when handling errors ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ab0e4800
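Sorting by strcmp can be sketched with qsort from the C standard library. This is an illustration only: sort_event_names_sketch and cmp_string_sketch are hypothetical names, and how perf list collects the names before sorting is not shown in the message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: each element is a pointer to a C string. */
static int cmp_string_sketch(const void *a, const void *b)
{
	const char *const *as = a;
	const char *const *bs = b;

	return strcmp(*as, *bs);
}

/* Sort an array of event names in place, in strcmp (byte value) order. */
void sort_event_names_sketch(const char **names, size_t nr)
{
	assert(names != NULL || nr == 0);
	qsort(names, nr, sizeof(*names), cmp_string_sketch);
}
```

Because strcmp compares byte values, digits sort before uppercase letters and uppercase before lowercase, which is enough to group the tracepoint names by subsystem as in the "after" listing.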
  6. 27 Feb, 2015 4 commits
    • perf probe: Fix a precedence bug · f56847c2
      Submitted by He Kuang
      The minus operator has higher precedence than ?:. Add parentheses
      around the ?: expression to fix this.
      
      Before this patch:
      
        $ echo 'p:myprobe do_sys_open' > /sys/kernel/debug/tracing/kprobe_events
        $ perf probe -l -k ../vmlinux
          kprobes:myprobe      (on do_sys_open)
      
      After this patch:
      
        $ echo 'p:myprobe do_sys_open' > /sys/kernel/debug/tracing/kprobe_events
        $ perf probe -l -k ../vmlinux
          kprobes:myprobe      (on do_sys_open@linux.git/fs/open.c)
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425034373-14511-1-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f56847c2
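The class of bug can be shown with a small, self-contained example. The function names and the values are hypothetical and are not the expression changed by the patch; the buggy variant is written with explicit parentheses to show how the compiler groups `base - use_offset ? offset : 0`.

```c
#include <assert.h>

/*
 * How "base - use_offset ? offset : 0" is actually parsed: the subtraction
 * binds tighter, so its result becomes the condition of ?:.
 */
unsigned long adjust_buggy_sketch(unsigned long base, int use_offset,
				  unsigned long offset)
{
	return (base - use_offset) ? offset : 0;
}

/* Intended meaning: subtract either offset or 0 from base. */
unsigned long adjust_fixed_sketch(unsigned long base, int use_offset,
				  unsigned long offset)
{
	return base - (use_offset ? offset : 0);
}
```

For base 100 and offset 8, the fixed version yields 92 or 100 depending on use_offset, while the buggy grouping yields 8 in both cases because the condition 100 - use_offset is nonzero either way.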
    • perf diff: Support for different binaries · 94ba462d
      Submitted by Kan Liang
      Currently, perf diff only works when both profiles come from the same
      binary, because it compares symbol start addresses. It does not work if
      the perf.data files come from different binaries. This patch matches
      symbols by name instead.
      
      perf diff was originally intended to compare symbol names. The commit
      below made it look for a pair by name:
      
      604c5c92 (perf diff: Change the default sort order to "dso,symbol")
      
      However, at that time perf diff used a global list of dsos, so binaries
      with the same name could only be loaded once. That is a problem when
      comparing different binaries.
      
      For example, take an old binary and an updated binary. They very likely
      have the same name and share most of their functions, so only the dsos
      from the old binary are loaded. When processing the data from the
      updated binary, perf still uses the symbol information from the old
      binary, which is wrong.
      
      The commit below then replaced the symbol name with the IP:
      
      9c443dfd ("perf diff: Fix support for all --sort combinations")
      
      From then on, perf diff compared symbol addresses.
      
      The global dsos list was discarded by a patch in 2010:
      
      a1645ce1 ("perf: 'perf kvm' tool for monitoring guest performance
      from host")
      
      However, by then perf diff already compared by address, so it still
      could not work with different binaries.
      
      This patch rolls perf diff back to its original design. The
      documentation is also updated, so that everybody knows the original
      design is to compare symbol names.
      
      Here are some examples:
      
      The only difference between example_v1.c and example_v2.c is the
      location of f2 and f3. There is no change in behavior, but the previous
      perf diff displays the wrong differential profile.
      
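      example_v1.c
      /* Assumption: the commit message does not show how noinline is defined. */
      #define noinline __attribute__((noinline))
      
      noinline void f3(void)
      {
              volatile int i;
      
              /* Both branches increment i; the test only adds a branch. */
              for (i = 0; i < 10000;) {
                      if (i % 2)
                              i++;
                      else
                              i++;
              }
      }
      
      noinline void f2(void)
      {
              volatile int a = 100, b, c;
      
              for (b = 0; b < 10000; b++)
                      c = a * b;
      }
      
      noinline void f1(void)
      {
              f2();
              f3();
      }
      
      int main(void)
      {
              int i;
      
              for (i = 0; i < 100000; i++)
                      f1();
              return 0;
      }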
      
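      example_v2.c
      /* Assumption: the commit message does not show how noinline is defined. */
      #define noinline __attribute__((noinline))
      
      noinline void f2(void)
      {
              volatile int a = 100, b, c;
      
              for (b = 0; b < 10000; b++)
                      c = a * b;
      }
      
      noinline void f3(void)
      {
              volatile int i;
      
              /* Both branches increment i; the test only adds a branch. */
              for (i = 0; i < 10000;) {
                      if (i % 2)
                              i++;
                      else
                              i++;
              }
      }
      
      noinline void f1(void)
      {
              f2();
              f3();
      }
      
      int main(void)
      {
              int i;
      
              for (i = 0; i < 100000; i++)
                      f1();
              return 0;
      }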
      
      [lk@localhost perf_diff]$ gcc example_v1.c -o example
      [lk@localhost perf_diff]$ perf record -o example_v1.data ./example
      [ perf record: Woken up 4 times to write data ]
      [ perf record: Captured and wrote 0.813 MB example_v1.data (~35522 samples) ]
      
      [lk@localhost perf_diff]$ gcc example_v2.c -o example
      [lk@localhost perf_diff]$ perf record -o example_v2.data ./example
      [ perf record: Woken up 4 times to write data ]
      [ perf record: Captured and wrote 0.824 MB example_v2.data (~36015 samples) ]
      
      Old perf diff result:
      
      [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
       Event 'cycles'
       Baseline    Delta  Shared Object     Symbol
       ........  .......  ................  ...............................
      
                           [kernel.vmlinux]  [k] __perf_event_task_sched_out
           0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                           [kernel.vmlinux]  [k] idle_cpu
                           [kernel.vmlinux]  [k] intel_pstate_timer_func
                           [kernel.vmlinux]  [k] native_read_msr_safe
           0.00%           [kernel.vmlinux]  [k] native_read_tsc
           0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                           [kernel.vmlinux]  [k] ntp_tick_length
           0.00%           [kernel.vmlinux]  [k] rb_erase
           0.00%           [kernel.vmlinux]  [k] tick_sched_timer
           0.00%           [kernel.vmlinux]  [k] unmap_single_vma
           0.00%           [kernel.vmlinux]  [k] update_wall_time
           0.00%           example           [.] f1
          46.24%           example           [.] f2
          53.71%   -7.55%  example           [.] f3
                  +53.81%  example           [.] f3
           0.02%           example           [.] main
      
      New perf diff result:
      
      [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
                           [kernel.vmlinux]  [k] __perf_event_task_sched_out
           0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                           [kernel.vmlinux]  [k] idle_cpu
                           [kernel.vmlinux]  [k] intel_pstate_timer_func
                           [kernel.vmlinux]  [k] native_read_msr_safe
           0.00%           [kernel.vmlinux]  [k] native_read_tsc
           0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                           [kernel.vmlinux]  [k] ntp_tick_length
           0.00%           [kernel.vmlinux]  [k] rb_erase
           0.00%           [kernel.vmlinux]  [k] tick_sched_timer
           0.00%           [kernel.vmlinux]  [k] unmap_single_vma
           0.00%           [kernel.vmlinux]  [k] update_wall_time
           0.00%           example           [.] f1
          46.24%   -0.08%  example           [.] f2
          53.71%   +0.11%  example           [.] f3
           0.02%           example           [.] main
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/1423460384-11645-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      94ba462d
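The difference between pairing by address and pairing by name can be sketched as follows. This is an illustration under stated assumptions, not the perf diff implementation: struct sym_sketch, both lookup helpers, and the addresses in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced symbol record: a name and a start address. */
struct sym_sketch {
	const char *name;
	unsigned long start;
};

/* Old behavior: pair a symbol with the one at the same start address. */
const struct sym_sketch *find_pair_by_addr_sketch(const struct sym_sketch *sym,
						  const struct sym_sketch *others,
						  size_t nr)
{
	assert(sym != NULL);
	for (size_t i = 0; i < nr; i++)
		if (others[i].start == sym->start)
			return &others[i];
	return NULL;
}

/* New behavior: pair a symbol with the one that has the same name. */
const struct sym_sketch *find_pair_by_name_sketch(const struct sym_sketch *sym,
						  const struct sym_sketch *others,
						  size_t nr)
{
	assert(sym != NULL);
	for (size_t i = 0; i < nr; i++)
		if (strcmp(others[i].name, sym->name) == 0)
			return &others[i];
	return NULL;
}
```

If f3 starts at some address in the first build and f2 lands at that same address in the second build, the address lookup pairs f3 with f2, which matches the misattributed f3 rows in the old output; the name lookup pairs f3 with f3 regardless of where it was placed.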
    • perf buildid-cache: Add new buildid cache if update target is not cached · a50d11a1
      Submitted by Masami Hiramatsu
      Add a new buildid cache entry if the update target file is not cached.
      
      This can happen when an old binary is replaced by a new one after the
      old one was cached. In this case, the user just sees the operation fail.
      
      That is not intuitive, since the user passes the binary "path", not a
      "build-id".
      
        ----
        # ./perf buildid-cache --add ./perf
        (update ./perf to new binary)
        # ./perf buildid-cache --update ./perf
        ./perf wasn't in the cache
        #
        ----
      
      This patch adds the given new binary to the cache if it is not already
      cached, so the above error no longer appears.
      
        ----
        # ./perf buildid-cache --add ./perf
        (update ./perf to new binary)
        # ./perf buildid-cache --update ./perf
        #
        ----
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20150226065440.23912.1494.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a50d11a1
    • perf probe: Handle strdup() failure · 38ae502b
      Submitted by Arnaldo Carvalho de Melo
      We could end up returning 0 (Ok) with a NULL raw_path. Fix it.
      Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naohiro Aota <naota@elisp.net>
      Link: http://lkml.kernel.org/n/tip-l0kcbcg5f4nnzqt01cv42vec@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      38ae502b
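The pattern behind this kind of fix is to check the strdup() result before reporting success. A minimal sketch, assuming a hypothetical helper set_raw_path_sketch that is not taken from the patch:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* strdup() is POSIX, not ISO C */
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: store a copy of path in *raw_path.
 * Returns 0 on success or -ENOMEM if the copy could not be allocated,
 * so the caller never sees "Ok" together with a NULL raw_path.
 */
int set_raw_path_sketch(char **raw_path, const char *path)
{
	char *copy;

	assert(raw_path != NULL && path != NULL);

	copy = strdup(path);
	if (copy == NULL)
		return -ENOMEM;	/* propagate the allocation failure */

	*raw_path = copy;
	return 0;
}
```

The caller owns the copy and releases it with free() once the path is no longer needed.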
  7. 26 Feb, 2015 7 commits
  8. 25 Feb, 2015 3 commits
  9. 23 Feb, 2015 8 commits
  10. 19 Feb, 2015 1 commit
    • perf tools: Construct LBR call chain · 384b6055
      Submitted by Kan Liang
      The LBR call stack only contains user-space callchains. It is output in
      the PERF_SAMPLE_BRANCH_STACK data format. Kernel callchains are still
      in the form of PERF_SAMPLE_CALLCHAIN.
      
      The perf tool has to handle both data sources to construct a
      complete callstack.
      
      For the "perf report -D" option, both lbr and fp information will be
      displayed.
      
      A new call chain recording option "lbr" is introduced into the perf
      tool for LBR call stack. The user can use --call-graph lbr to get
      the call stack information from hardware.
      
      Here are some examples.
      
      When profiling bc(1) on Fedora 19:
      
        echo 'scale=2000; 4*a(1)' > cmd; perf record --call-graph lbr bc -l < cmd
      
      With LBR enabled, perf report output looks like:
      
          50.36%       bc  bc                 [.] bc_divide
                       |
                       --- bc_divide
                           execute
                           run_code
                           yyparse
                           main
                           __libc_start_main
                           _start
          33.66%       bc  bc                 [.] _one_mult
                       |
                       --- _one_mult
                           bc_divide
                           execute
                           run_code
                           yyparse
                           main
                           __libc_start_main
                           _start
           7.62%       bc  bc                 [.] _bc_do_add
                       |
                       --- _bc_do_add
                          |
                          |--99.89%-- 0x2000186a8
                           --0.11%-- [...]
           6.83%       bc  bc                 [.] _bc_do_sub
                       |
                       --- _bc_do_sub
                          |
                          |--99.94%-- bc_add
                          |          execute
                          |          run_code
                          |          yyparse
                          |          main
                          |          __libc_start_main
                          |          _start
                           --0.06%-- [...]
           0.46%       bc  libc-2.17.so       [.] __memset_sse2
                       |
                       --- __memset_sse2
                          |
                          |--54.13%-- bc_new_num
                          |          |
                          |          |--51.00%-- bc_divide
                          |          |          execute
                          |          |          run_code
                          |          |          yyparse
                          |          |          main
                          |          |          __libc_start_main
                          |          |          _start
                          |          |
                          |          |--30.46%-- _bc_do_sub
                          |          |          bc_add
                          |          |          execute
                          |          |          run_code
                          |          |          yyparse
                          |          |          main
                          |          |          __libc_start_main
                          |          |          _start
                          |          |
                          |           --18.55%-- _bc_do_add
                          |                     bc_add
                          |                     execute
                          |                     run_code
                          |                     yyparse
                          |                     main
                          |                     __libc_start_main
                          |                     _start
                          |
                           --45.87%-- bc_divide
                                     execute
                                     run_code
                                     yyparse
                                     main
                                     __libc_start_main
                                     _start
      
      If using FP, perf report output looks like:
      
        echo 'scale=2000; 4*a(1)' > cmd; perf record --call-graph fp bc -l < cmd
      
          50.49%       bc  bc                 [.] bc_divide
                       |
                       --- bc_divide
          33.57%       bc  bc                 [.] _one_mult
                       |
                       --- _one_mult
           7.61%       bc  bc                 [.] _bc_do_add
                       |
                       --- _bc_do_add
                           0x2000186a8
           6.88%       bc  bc                 [.] _bc_do_sub
                       |
                       --- _bc_do_sub
           0.42%       bc  libc-2.17.so       [.] __memcpy_ssse3_back
                       |
                       --- __memcpy_ssse3_back
      
      If using LBR, perf report -D output looks like:
      
      3458145275743 0x2fd750 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x2): 9748/9748: 0x408ea8 period: 609644 addr: 0
      ... LBR call chain: nr:8
      .....  0: fffffffffffffe00
      .....  1: 0000000000408e50
      .....  2: 000000000040a458
      .....  3: 000000000040562e
      .....  4: 0000000000408590
      .....  5: 00000000004022c0
      .....  6: 00000000004015dd
      .....  7: 0000003d1cc21b43
      ... FP chain: nr:2
      .....  0: fffffffffffffe00
      .....  1: 0000000000408ea8
       ... thread: bc:9748
       ...... dso: /usr/bin/bc
      
      The LBR call stack has the following known limitations:
      
       - Zero length calls are not filtered out by the hardware
      
       - Exception handling such as setjmp/longjmp causes calls/returns to
         mismatch
      
       - Pushing a different return address onto the stack causes
         calls/returns to mismatch
      
       - If the call stack is deeper than the LBR, only the last entries are
         captured
      Tested-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1420482185-29830-3-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      384b6055