1. 30 5月, 2016 1 次提交
    • A
      perf tools: Per event max-stack settings · 792d48b4
      Arnaldo Carvalho de Melo 提交于
      The tooling counterpart, now it is possible to do:
      
        # perf record -e sched:sched_switch/max-stack=10/ -e cycles/call-graph=dwarf,max-stack=4/ -e cpu-cycles/call-graph=dwarf,max-stack=1024/ usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.052 MB perf.data (5 samples) ]
        # perf evlist -v
        sched:sched_switch: type: 2, size: 112, config: 0x110, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, sample_max_stack: 10
        cycles/call-graph=dwarf,max-stack=4/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|REGS_USER|STACK_USER|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, sample_regs_user: 0xff0fff, sample_stack_user: 8192, sample_max_stack: 4
        cpu-cycles/call-graph=dwarf,max-stack=1024/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|REGS_USER|STACK_USER|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, sample_regs_user: 0xff0fff, sample_stack_user: 8192, sample_max_stack: 1024
        # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
      
      Using just /max-stack=N/ means /call-graph=fp,max-stack=N/, that should
      be further configurable by means of some .perfconfig knob.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      792d48b4
  2. 18 4月, 2016 2 次提交
  3. 14 4月, 2016 1 次提交
  4. 13 4月, 2016 1 次提交
  5. 12 4月, 2016 5 次提交
  6. 31 3月, 2016 1 次提交
    • A
      perf tools: Add time conversion event · 46bc29b9
      Adrian Hunter 提交于
      Intel PT uses the time members from the perf_event_mmap_page to convert
      between TSC and perf time.
      
      Due to a lack of foresight when Intel PT was implemented, those time
      members were recorded in the (implementation dependent) AUXTRACE_INFO
      event, the structure of which is generally inaccessible outside of the
      Intel PT decoder.  However now the conversion between TSC and perf time
      is needed when processing a jitdump file when Intel PT has been used for
      tracing.
      
      So add a user event to record the time members.  'perf record' will
      synthesize the event if the information is available.  And session
      processing will put a copy of the event on the session so that tools
      like 'perf inject' can easily access it.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1457426324-30158-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      46bc29b9
  7. 23 3月, 2016 1 次提交
  8. 08 3月, 2016 1 次提交
  9. 16 1月, 2016 1 次提交
    • R
      perf kvm record/report: 'unprocessable sample' error while recording/reporting guest data · 3caeaa56
      Ravi Bangoria 提交于
      While recording guest samples in host using perf kvm record, it will
      populate unprocessable sample error, though samples will be recorded
      properly. While generating report using perf kvm report, no samples will
      be processed and same error will populate. We have seen this behaviour
      with upstream perf(4.4-rc3) on x86 and ppc64 hardware.
      
      Reason behind this failure is, when it tries to fetch machine from
      rb_tree of machines, it fails. As a part of tracing a bug, we figured
      out that this code was incorrectly refactored in commit 54245fdc
      ("perf session: Remove wrappers to machines__find").
      
      This patch will change the functionality such that if it can't fetch
      machine in first trial, it will create one node of machine and add that to
      rb_tree. So next time when it tries to fetch same machine from rb_tree,
      it won't fail. Actually it was the case before refactoring of code in
      aforementioned commit.
      
      This patch is generated from acme perf/core branch.
      
      Below I've mention an example that demonstrate the behaviour before and
      after applying patch.
      
      Before applying patch:
      [Note: One needs to run guest before recording data in host]
      
        ravi@ravi-bangoria:~$ ./perf kvm record -a
        Warning:
        5903 unprocessable samples recorded.
        Do you have a KVM guest running and not using 'perf kvm'?
        [ perf record: Captured and wrote 1.409 MB perf.data.guest (285 samples) ]
      
        ravi@ravi-bangoria:~$ ./perf kvm report --stdio
        Warning:
        5903 unprocessable samples recorded.
        Do you have a KVM guest running and not using 'perf kvm'?
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Total Lost Samples: 0
        #
        # Samples: 285  of event 'cycles'
        # Event count (approx.): 88715406
        #
        # Overhead  Command  Shared Object  Symbol
        # ........  .......  .............  ......
        #
      
        # (For a higher level overview, try: perf report --sort comm,dso)
        #
      
      After applying patch:
      
        ravi@ravi-bangoria:~$ ./perf kvm record -a
        [ perf record: Captured and wrote 1.188 MB perf.data.guest (17 samples) ]
      
        ravi@ravi-bangoria:~$ ./perf kvm report --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Total Lost Samples: 0
        #
        # Samples: 17  of event 'cycles'
        # Event count (approx.): 700746
        #
        # Overhead  Command  Shared Object     Symbol
        # ........  .......  ................  ......................
        #
            34.19%  :5758    [unknown]         [g] 0xffffffff818682ab
            22.79%  :5758    [unknown]         [g] 0xffffffff812dc7f8
            22.79%  :5758    [unknown]         [g] 0xffffffff818650d0
            14.83%  :5758    [unknown]         [g] 0xffffffff8161a1b6
             2.49%  :5758    [unknown]         [g] 0xffffffff818692bf
             0.48%  :5758    [unknown]         [g] 0xffffffff81869253
             0.05%  :5758    [unknown]         [g] 0xffffffff81869250
      Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org # v3.19+
      Fixes: 54245fdc ("perf session: Remove wrappers to machines__find")
      Link: http://lkml.kernel.org/r/1449471302-11283-1-git-send-email-ravi.bangoria@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3caeaa56
  10. 18 12月, 2015 8 次提交
  11. 11 12月, 2015 1 次提交
    • M
      perf tools: Make perf_session__register_idle_thread drop the refcount · 9d8b172f
      Masami Hiramatsu 提交于
      Note that since the thread was already inserted to the session
      list, it will be released when the session is released.
      Also, in perf_session__register_idle_thread() failure path,
      the thread should be put before returning.
      
      Refcnt debugger shows that the perf_session__register_idle_thread
      gets the returned thread, but the caller (__cmd_top) does not
      put the returned idle thread.
      
        ----
        ==== [0] ====
        Unreclaimed thread@0x24e6240
        Refcount +1 => 0 at
          ./perf(thread__new+0xe5) [0x4c8a75]
          ./perf(machine__findnew_thread+0x9a) [0x4bbdba]
          ./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
          ./perf(cmd_top+0xd7d) [0x43cf6d]
          ./perf() [0x47ba35]
          ./perf(main+0x617) [0x4225b7]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
          ./perf() [0x42272d]
        Refcount +1 => 1 at
          ./perf(thread__get+0x2c) [0x4c8bcc]
          ./perf(machine__findnew_thread+0xee) [0x4bbe0e]
          ./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
          ./perf(cmd_top+0xd7d) [0x43cf6d]
          ./perf() [0x47ba35]
          ./perf(main+0x617) [0x4225b7]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
          ./perf() [0x42272d]
        Refcount +1 => 2 at
          ./perf(thread__get+0x2c) [0x4c8bcc]
          ./perf(machine__findnew_thread+0x112) [0x4bbe32]
          ./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
          ./perf(cmd_top+0xd7d) [0x43cf6d]
          ./perf() [0x47ba35]
          ./perf(main+0x617) [0x4225b7]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
          ./perf() [0x42272d]
        ----
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151209021122.10245.69707.stgit@localhost.localdomain
      [ Drop the refcount in perf_session__register_idle_thread() ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9d8b172f
  12. 12 11月, 2015 1 次提交
  13. 01 10月, 2015 1 次提交
  14. 29 9月, 2015 2 次提交
  15. 18 9月, 2015 1 次提交
    • M
      perf record: Avoid infinite loop at buildid processing with no samples · 381c02f6
      Mark Rutland 提交于
      If a session contains no events, we can get stuck in an infinite loop in
      __perf_session__process_events, with a non-zero file_size and data_offset, but
      a zero data_size.
      
      In this case, we can mmap the entirety of the file (consisting of the file and
      attribute headers), and fetch_mmaped_event will correctly refuse to read any
      (unmapped and non-existent) event headers. This causes
      __perf_session__process_events to unmap the file and retry with the exact same
      parameters, getting stuck in an infinite loop.
      
      This has been observed to result in an exit-time hang when counting
      rare/unschedulable events with perf record, and can be triggered artificially
      with the script below:
      
        ----
        #!/bin/sh
        printf "REPRO: launching perf\n";
        ./perf record -e software/config=9/ sleep 1 &
        PERF_PID=$!;
        sleep 0.002;
        kill -2 $PERF_PID;
        printf "REPRO: waiting for perf (%d) to exit...\n" "$PERF_PID";
        wait $PERF_PID;
        printf "REPRO: perf exited\n";
        ----
      
      To avoid this, have __perf_session__process_events bail out early when
      the file has no data (i.e. it has no events).
      
      Commiter note:
      
      I only managed to reproduce this when setting
      /proc/sys/kernel/kptr_restrict to '1' and changing the code to
      purposefully not process any samples and no synthesized samples, i.e.
      kptr_restrict prevents 'record' from synthesizing the kernel mmaps for
      vmlinux + modules and since it is a workload started from perf, we don't
      synthesize mmap/comm records for existing threads.
      
      Adrian Hunter managed to reproduce it in his environment tho.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@kernel.org>
      Tested-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1442423929-12253-1-git-send-email-mark.rutland@arm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      381c02f6
  16. 14 9月, 2015 2 次提交
  17. 04 9月, 2015 1 次提交
  18. 03 9月, 2015 1 次提交
    • K
      perf tools: Store the cpu socket and core ids in the perf.data header · 2bb00d2f
      Kan Liang 提交于
      This patch stores the cpu socket_id and core_id in a perf.data header,
      and reads them into the perf_env struct when processing perf.data files.
      
      The changes modifies the CPU_TOPOLOGY section, making sure it is
      backward/forward compatible.
      
      The patch checks the section size before reading the core and socket ids.
      
      It never reads data crossing the section boundary.  An old perf binary
      without this patch can also correctly read the perf.data from a new perf
      with this patch.
      
      Because the new info is added at the end of the cpu_topology section, an
      old perf tool ignores the extra data.
      
      Examples:
      
      1. New perf with this patch read perf.data from an old perf without the
         patch:
      
        $ perf_new report -i perf_old.data --header-only -I
        ......
        # sibling threads : 33
        # sibling threads : 34
        # sibling threads : 35
        # Core ID and Socket ID information is not available
        # node0 meminfo  : total = 32823872 kB, free = 29315548 kB
        # node0 cpu list : 0-17,36-53
        ......
      
      2. Old perf without the patch reads perf.data from a new perf with the
         patch:
      
        $ perf_old report -i perf_new.data --header-only -I
        ......
        # sibling threads : 33
        # sibling threads : 34
        # sibling threads : 35
        # node0 meminfo  : total = 32823872 kB, free = 29190932 kB
        # node0 cpu list : 0-17,36-53
        ......
      
      3. New perf read new perf.data:
      
        $ perf_new report -i perf_new.data --header-only -I
        ......
        # sibling threads : 33
        # sibling threads : 34
        # sibling threads : 35
        # CPU 0: Core ID 0, Socket ID 0
        # CPU 1: Core ID 1, Socket ID 0
        ......
        # CPU 61: Core ID 10, Socket ID 1
        # CPU 62: Core ID 11, Socket ID 1
        # CPU 63: Core ID 16, Socket ID 1
        # node0 meminfo  : total = 32823872 kB, free = 29190932 kB
        # node0 cpu list : 0-17,36-53
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1441115893-22006-2-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2bb00d2f
  19. 29 8月, 2015 1 次提交
  20. 07 8月, 2015 1 次提交
  21. 29 7月, 2015 1 次提交
    • A
      perf session env: Rename exit method · 4c7de49a
      Arnaldo Carvalho de Melo 提交于
      The semantic associated in tools/perf/ with foo__delete(instance) is to
      release all resources referenced by 'instance' members and then release
      the memory for 'instance' itself.
      
      The perf_session_env__delete() function isn't doing this, it just does
      the first part, but the space used by 'instance' itself isn't freed, as
      it is embedded in a larger structure, that will be freed at other stage.
      
      For these cases we se foo__exit(), i.e. the usage is:
      
       void foo__delete(foo)
       {
               if (foo) {
                       foo__exit(foo);
                       free(foo);
               }
       }
      
      But when we have something like:
      
       struct bar {
               struct foo foo;
               . . .
       }
      
      Then we can't really call foo__delete(&bar.foo), we must have this
      instead:
      
       void bar__exit(bar)
       {
               foo__exit(&bar.foo);
               /* free other bar-> resources */
       }
      
       void bar__delete(bar)
       {
               if (bar) {
      		bar__exit(bar);
                      free(bar);
               }
       }
      
      So just rename perf_session_env__delete() to perf_session_env__exit().
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-djbgpcfo5udqptx3q0flwtmk@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4c7de49a
  22. 24 7月, 2015 1 次提交
  23. 22 7月, 2015 1 次提交
  24. 26 6月, 2015 1 次提交
  25. 24 6月, 2015 2 次提交