1. 07 11月, 2019 1 次提交
  2. 07 10月, 2019 1 次提交
  3. 25 9月, 2019 2 次提交
  4. 21 9月, 2019 1 次提交
  5. 20 9月, 2019 4 次提交
  6. 01 9月, 2019 3 次提交
  7. 30 8月, 2019 4 次提交
  8. 29 8月, 2019 3 次提交
  9. 27 8月, 2019 3 次提交
  10. 20 8月, 2019 1 次提交
    • A
      perf report: Dump LBR callstack data by -D jointly with thread stack · d2720c3d
      Alexey Budankov 提交于
      Make perf report -D command print captured LBR callstack chain when it is
      collected together with raw thread stack data:
      
        2752673087247083 0x5d10 [0x548]: PERF_RECORD_SAMPLE(IP, 0x4002): 5841/5841: 0x40121f period: 1543862 addr: 0
        ... FP chain: nr:0
        ... branch callstack: nr:3
        .....  0: 00000000004011d0
        .....  1: 00007f393c388411
        .....  2: 0000000000401098
        ... user regs: mask 0xff0fff ABI 64-bit
        .... AX    0x34e7
        .... BX    0x7fff5f6dd3c0
        .... CX    0xffffffff
        .... DX    0x34e6
        .... SI    0x7f393c5268d0
        .... DI    0x0
        .... BP    0x401260
        .... SP    0x7fff5f6dd3c0
        .... IP    0x40121f
        .... FLAGS 0x29f
        .... CS    0x33
        .... SS    0x2b
        .... R8    0x7f393c526800
        .... R9    0x7f393c525da0
        .... R10   0xfffffffffffff70a
        .... R11   0x246
        .... R12   0x401070
        .... R13   0x7fff5f6ddcb0
        .... R14   0x0
        .... R15   0x0
        ... ustack: size 1024, offset 0x130
         . data_src: 0x5080021
         ... thread: stack_test:5841
         ...... dso: /root/abudanko/stacks/stack_test
      
      Committer testing:
      
        # perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.042 MB perf.data (10 samples) ]
        #
      
      Before:
      
        # perf report -D |& grep PERF_RECORD_SAMPLE -A28 | tail -29
        67538909824483 0xa7a0 [0x560]: PERF_RECORD_SAMPLE(IP, 0x4002): 9721/9721: 0x7f441b2b1e20 period: 1376095 addr: 0
        ... FP chain: nr:0
        ... user regs: mask 0xff0fff ABI 64-bit
        .... AX    0x7f441b2b1000
        .... BX    0x7f441b55b970
        .... CX    0x7fff6e2db218
        .... DX    0x7fff6e2db218
        .... SI    0x7fff6e2db208
        .... DI    0x1
        .... BP    0x1
        .... SP    0x7fff6e2db178
        .... IP    0x7f441b2b1e20
        .... FLAGS 0x20a
        .... CS    0x33
        .... SS    0x2b
        .... R8    0x1
        .... R9    0x7f441b371c18
        .... R10   0x7f441b5a5f10
        .... R11   0x202
        .... R12   0x7fff6e2db208
        .... R13   0x7fff6e2db218
        .... R14   0x7f441b5a7150
        .... R15   0x0
        ... ustack: size 1024, offset 0x148
         . data_src: 0x5080021
         ... thread: ls:9721
         ...... dso: /usr/lib64/libpthread-2.29.so
      
        0xad00 [0x60]: event: 10
        #
      
      After:
      
        # perf report -D |& grep PERF_RECORD_SAMPLE -A31 | tail -32
        67538909824483 0xa7a0 [0x560]: PERF_RECORD_SAMPLE(IP, 0x4002): 9721/9721: 0x7f441b2b1e20 period: 1376095 addr: 0
        ... FP chain: nr:0
        ... branch callstack: nr:4
        .....  0: 00007f441b2b1e20
        .....  1: 00007f441b58af1a
        .....  2: 00007f441b58b0e1
        .....  3: 00007f441b57c145
        ... user regs: mask 0xff0fff ABI 64-bit
        .... AX    0x7f441b2b1000
        .... BX    0x7f441b55b970
        .... CX    0x7fff6e2db218
        .... DX    0x7fff6e2db218
        .... SI    0x7fff6e2db208
        .... DI    0x1
        .... BP    0x1
        .... SP    0x7fff6e2db178
        .... IP    0x7f441b2b1e20
        .... FLAGS 0x20a
        .... CS    0x33
        .... SS    0x2b
        .... R8    0x1
        .... R9    0x7f441b371c18
        .... R10   0x7f441b5a5f10
        .... R11   0x202
        .... R12   0x7fff6e2db208
        .... R13   0x7fff6e2db218
        .... R14   0x7f441b5a7150
        .... R15   0x0
        ... ustack: size 1024, offset 0x148
         . data_src: 0x5080021
         ... thread: ls:9721
         ...... dso: /usr/lib64/libpthread-2.29.so
        #
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/aa82e5dd-def2-0ca8-a064-db9e2e8ad076@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d2720c3d
  11. 13 8月, 2019 1 次提交
  12. 30 7月, 2019 6 次提交
  13. 23 7月, 2019 1 次提交
  14. 09 7月, 2019 3 次提交
  15. 28 5月, 2019 1 次提交
  16. 16 5月, 2019 2 次提交
    • A
      perf report: Implement perf.data record decompression · cb62c6f1
      Alexey Budankov 提交于
      zstd_init(, comp_level = 0) initializes decompression part of API only
      hat now consists of zstd_decompress_stream() function.
      
      The perf.data PERF_RECORD_COMPRESSED records are decompressed using
      zstd_decompress_stream() function into a linked list of mmaped memory
      regions of mmap_comp_len size (struct decomp).
      
      After decompression of one COMPRESSED record its content is iterated and
      fetched for usual processing. The mmaped memory regions with
      decompressed events are kept in the linked list till the tool process
      termination.
      
      When dumping raw records (e.g., perf report -D --header) file offsets of
      events from compressed records are printed as zero.
      
      Committer notes:
      
      Since now we have support for processing PERF_RECORD_COMPRESSED, we see
      none, in raw form, like we saw in the previous patch commiter notes,
      they were decompressed into the usual PERF_RECORD_{FORK,MMAP,COMM,etc}
      records, we only see the stats for those PERF_RECORD_COMPRESSED events,
      and since I used the file generated in the commiter notes for the
      previous patch, there they are, 2 compressed records:
      
        $ perf report --header-only | grep cmdline
        # cmdline : /home/acme/bin/perf record -z2 sleep 1
        $ perf report -D | grep COMPRESS
              COMPRESSED events:          2
              COMPRESSED events:          0
        $ perf report --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 15  of event 'cycles:u'
        # Event count (approx.): 962227
        #
        # Overhead  Command  Shared Object     Symbol
        # ........  .......  ................  ...........................
        #
            46.99%  sleep    libc-2.28.so      [.] _dl_addr
            29.24%  sleep    [unknown]         [k] 0xffffffffaea00a67
            16.45%  sleep    libc-2.28.so      [.] __GI__IO_un_link.part.1
             5.92%  sleep    ld-2.28.so        [.] _dl_setup_hash
             1.40%  sleep    libc-2.28.so      [.] __nanosleep
             0.00%  sleep    [unknown]         [k] 0xffffffffaea00163
      
        #
        # (Tip: To see callchains in a more compact form: perf report -g folded)
        #
        $
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/304b0a59-942c-3fe1-da02-aa749f87108b@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cb62c6f1
    • A
      perf report: Add stub processing of compressed events for -D · 61a7773c
      Alexey Budankov 提交于
      Committer note:
      
      Split from a larger patch, this only dumps PERF_RECORD_COMPRESSED as
      unhandled, so that when we introduce the record part in the next patch,
      we don't see unhandled events when using 'perf record -D'.
      
      Changed it so that we dump the event if the handler is just a stub, i.e.
      for the case where we don't have ZSTD linked but we're processing a
      perf.data file generated by a tool with that linked.
      
      Also when failing to decompress we can't just dump the uncompressed
      event and return 0, we have to propagate the error.
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/304b0a59-942c-3fe1-da02-aa749f87108b@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      61a7773c
  17. 03 5月, 2019 1 次提交
    • T
      perf report: Report OOM in status line in the GTK UI · 167e418f
      Thomas Richter 提交于
      An -ENOMEM error is not reported in the GTK GUI.  Instead this error
      message pops up on the screen:
      
      [root@m35lp76 perf]# ./perf  report -i perf.data.error68-1
      
      	Processing events... [974K/3M]
      	Error:failed to process sample
      
      	0xf4198 [0x8]: failed to process type: 68
      
      However when I use the same perf.data file with --stdio it works:
      
      [root@m35lp76 perf]# ./perf  report -i perf.data.error68-1 --stdio \
      		| head -12
      
        # Total Lost Samples: 0
        #
        # Samples: 76K of event 'cycles'
        # Event count (approx.): 99056160000
        #
        # Overhead  Command          Shared Object      Symbol
        # ........  ...............  .................  .........
        #
           8.81%  find             [kernel.kallsyms]  [k] ftrace_likely_update
           8.74%  swapper          [kernel.kallsyms]  [k] ftrace_likely_update
           8.34%  sshd             [kernel.kallsyms]  [k] ftrace_likely_update
           2.19%  kworker/u512:1-  [kernel.kallsyms]  [k] ftrace_likely_update
      
      The sample precentage is a bit low.....
      
      The GUI always fails in the FINISHED_ROUND event (68) and does not
      indicate the reason why.
      
      When happened is the following. Perf report calls a lot of functions and
      down deep when a FINISHED_ROUND event is processed, these functions are
      called:
      
        perf_session__process_event()
        + perf_session__process_user_event()
          + process_finished_round()
            + ordered_events__flush()
              + __ordered_events__flush()
      	  + do_flush()
      	    + ordered_events__deliver_event()
      	      + perf_session__deliver_event()
      	        + machine__deliver_event()
      	          + perf_evlist__deliver_event()
      	            + process_sample_event()
      	              + hist_entry_iter_add() --> only called in GUI case!!!
      	                + hist_iter__report__callback()
      	                  + symbol__inc_addr_sample()
      
      	                    Now this functions runs out of memory and
      			    returns -ENOMEM. This is reported all the way up
      			    until function
      
      perf_session__process_event() returns to its caller, where -ENOMEM is
      changed to -EINVAL and processing stops:
      
       if ((skip = perf_session__process_event(session, event, head)) < 0) {
            pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n",
      	     head, event->header.size, event->header.type);
            err = -EINVAL;
            goto out_err;
       }
      
      This occurred in the FINISHED_ROUND event when it has to process some
      10000 entries and ran out of memory.
      
      This patch indicates the root cause and displays it in the status line
      of ther perf report GUI.
      
      Output before (on GUI status line):
      
        0xf4198 [0x8]: failed to process type: 68
      
      Output after:
      
        0xf4198 [0x8]: failed to process type: 68 [not enough memory]
      
      Committer notes:
      
      the 'skip' variable needs to be initialized to -EINVAL, so that when the
      size is less than sizeof(struct perf_event_attr) we avoid this valid
      compiler warning:
      
        util/session.c: In function ‘perf_session__process_events’:
        util/session.c:1936:7: error: ‘skip’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
           err = skip;
           ~~~~^~~~~~
        util/session.c:1874:6: note: ‘skip’ was declared here
          s64 skip;
              ^~~~
        cc1: all warnings being treated as errors
      Signed-off-by: NThomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: NHendrik Brueckner <brueckner@linux.ibm.com>
      Reviewed-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20190423105303.61683-1-tmricht@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      167e418f
  18. 20 3月, 2019 1 次提交
    • S
      perf bpf: Save bpf_prog_info in a rbtree in perf_env · e4378f0c
      Song Liu 提交于
      bpf_prog_info contains information necessary to annotate bpf programs.
      
      This patch saves bpf_prog_info for bpf programs loaded in the system.
      
      Some big picture of the next few patches:
      
      To fully annotate BPF programs with source code mapping, 4 different
      informations are needed:
      
          1) PERF_RECORD_KSYMBOL
          2) PERF_RECORD_BPF_EVENT
          3) bpf_prog_info
          4) btf
      
      Before this set, 1) and 2) in the list are already saved to perf.data
      file. For BPF programs that are already loaded before perf run, 1) and 2)
      are synthesized by perf_event__synthesize_bpf_events(). For short living
      BPF programs, 1) and 2) are generated by kernel.
      
      This set handles 3) and 4) from the list. Again, it is necessary to handle
      existing BPF program and short living program separately.
      
      This patch handles 3) for exising BPF programs while synthesizing 1) and
      2) in perf_event__synthesize_bpf_events(). These data are stored in
      perf_env. The next patch saves these data from perf_env to perf.data as
      headers.
      
      Similarly, the two patches after the next saves 4) of existing BPF
      programs to perf_env and perf.data.
      
      Another patch later will handle 3) and 4) for short living BPF programs
      by monitoring 1) and 2) in a dedicate thread.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: kernel-team@fb.com
      Link: http://lkml.kernel.org/r/20190312053051.2690567-7-songliubraving@fb.com
      [ set env->bpf_progs.infos_cnt to zero in perf_env__purge_bpf() as noted by jolsa ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e4378f0c
  19. 11 3月, 2019 1 次提交