1. 25 9月, 2019 2 次提交
  2. 23 9月, 2019 1 次提交
    • A
      perf record: Move restricted maps check to after a possible fallback to not collect kernel samples · c8b567c8
      Arnaldo Carvalho de Melo 提交于
      Before:
      
        [acme@quaco ~]$ perf record -b -e cycles date
        WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
        check /proc/sys/kernel/kptr_restrict and /proc/sys/kernel/perf_event_paranoid.
      
        Samples in kernel functions may not be resolved if a suitable vmlinux
        file is not found in the buildid cache or in the vmlinux path.
      
        Samples in kernel modules won't be resolved at all.
      
        If some relocation was applied (e.g. kexec) symbols may be misresolved
        even with a suitable vmlinux or kallsyms file.
      
        Mon 23 Sep 2019 11:00:59 AM -03
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.005 MB perf.data (14 samples) ]
        [acme@quaco ~]$
      
      But we did a fallback and exclude_kernel was set, so no need for
      resolving kernel symbols:
      
        $ perf evlist -v
        cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
        $
      
      After:
      
        [acme@quaco ~]$ perf record -b -e cycles date
        Mon 23 Sep 2019 11:07:18 AM -03
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.007 MB perf.data (16 samples) ]
        [acme@quaco ~]$ perf evlist -v
        cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
        [acme@quaco ~]$
      
      No needless warning is emitted.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lkml.kernel.org/n/tip-5yqnr8xcqwhr15xktj2097ac@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c8b567c8
  3. 21 9月, 2019 1 次提交
  4. 20 9月, 2019 2 次提交
  5. 01 9月, 2019 1 次提交
  6. 30 8月, 2019 1 次提交
  7. 29 8月, 2019 2 次提交
  8. 26 8月, 2019 1 次提交
  9. 14 8月, 2019 1 次提交
  10. 30 7月, 2019 16 次提交
  11. 09 7月, 2019 1 次提交
  12. 11 6月, 2019 1 次提交
    • Y
      perf record: Add support to collect callchains from kernel or user space only · 53651b28
      yuzhoujian 提交于
      One can just record callchains in the kernel or user space with this new
      options.
      
      We can use it together with "--all-kernel" options.
      
      This two options is used just like print_stack(sys) or print_ustack(usr)
      for systemtap.
      
      Shown below is the usage of this new option combined with "--all-kernel"
      options:
      
      1. Configure all used events to run in kernel space and just collect
         kernel callchains.
      
        $ perf record -a -g --all-kernel --kernel-callchains
      
      2. Configure all used events to run in kernel space and just collect
         user callchains.
      
        $ perf record -a -g --all-kernel --user-callchains
      
      Committer notes:
      
      Improved documentation to state that asking for kernel callchains really
      is asking for excluding user callchains, and vice versa.
      
      Further mentioned that using both won't get both, but nothing, as both
      will be excluded.
      Signed-off-by: Nyuzhoujian <yuzhoujian@didichuxing.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1559222962-22891-1-git-send-email-ufo19890607@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      53651b28
  13. 16 5月, 2019 8 次提交
    • K
      perf parse-regs: Split parse_regs · aeea9062
      Kan Liang 提交于
      The available registers for --int-regs and --user-regs may be different,
      e.g. XMM registers.
      
      Split parse_regs into two dedicated functions for --int-regs and
      --user-regs respectively.
      
      Modify the warning message. "--user-regs=?" should be applied to show
      the available registers for --user-regs.
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Tested-by: NRavi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1557865174-56264-1-git-send-email-kan.liang@linux.intel.com
      [ Changed docs as suggested by Ravi and agreed by Kan ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      aeea9062
    • A
      perf record: Implement -z,--compression_level[=<n>] option · 504c1ad1
      Alexey Budankov 提交于
      Implemented -z,--compression_level[=<n>] option that enables compression
      of mmaped kernel data buffers content in runtime during perf record mode
      collection. Default option value is 1 (fastest compression).
      
      Compression overhead has been measured for serial and AIO streaming when
      profiling matrix multiplication workload:
      
            -------------------------------------------------------------
            | SERIAL			  | AIO-1                       |
        ----------------------------------------------------------------|
        |-z | OVH(x) | ratio(x) size(MiB) | OVH(x) | ratio(x) size(MiB) |
        |---------------------------------------------------------------|
        | 0 | 1,00   | 1,000    179,424   | 1,00   | 1,000    187,527   |
        | 1 | 1,04   | 8,427    181,148   | 1,01   | 8,474    188,562   |
        | 2 | 1,07   | 8,055    186,953   | 1,03   | 7,912    191,773   |
        | 3 | 1,04   | 8,283    181,908   | 1,03   | 8,220    191,078   |
        | 5 | 1,09   | 8,101    187,705   | 1,05   | 7,780    190,065   |
        | 8 | 1,05   | 9,217    179,191   | 1,12   | 6,111    193,024   |
        -----------------------------------------------------------------
      
      OVH = (Execution time with -z N) / (Execution time with -z 0)
      
      ratio - compression ratio
      size  - number of bytes that was compressed
      
      	size ~= trace size x ratio
      
      Committer notes:
      
      Testing it I noticed that it failed to disable build id processing when
      compression is enabled, and as we'd have to uncompress everything to
      look for the PERF_RECORD_{MMAP,SAMPLE,etc} to figure out which build ids
      to read from DSOs, we better disable build id processing when
      compression is enabled, logging with pr_debug() when doing so:
      
      Original patch:
      
        # perf record -z2
        ^C[ perf record: Woken up 1 times to write data ]
        0x1746e0 [0x76]: failed to process type: 81 [Invalid argument]
        [ perf record: Captured and wrote 1.568 MB perf.data, compressed (original 0.452 MB, ratio is 3.995) ]
        #
      
      After auto-disabling build id processing when compression is enabled:
      
        $ perf record -z2 sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.292) ]
        $ perf record -v -z2 sleep 1
        Compression enabled, disabling build id collection at the end of the session.
        <SNIP extra -v pr_debug() messages>
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.305) ]
        $
      
      Also, with parts of the patch originally after this one moved to just
      before this one we get:
      
        $ perf record -z2 sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.371) ]
        $ perf report -D | grep COMPRESS
        0 0x1b8 [0x155]: PERF_RECORD_COMPRESSED: unhandled!
        0 0x30d [0x80]: PERF_RECORD_COMPRESSED: unhandled!
              COMPRESSED events:          2
              COMPRESSED events:          0
        $
      
      I.e. when faced with PERF_RECORD_COMPRESSED that we still have no code
      to process, we just show it as not being handled, skip them and
      continue, while before we had:
      
        $ perf report -D | grep COMPRESS
        0x1b8 [0x169]: failed to process type: 81 [Invalid argument]
        Error:
        failed to process sample
        0 0x1b8 [0x169]: PERF_RECORD_COMPRESSED
        $
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/9ff06518-ae63-a908-e44d-5d9e56dd66d9@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      504c1ad1
    • A
      perf record: Implement compression for AIO trace streaming · ef781128
      Alexey Budankov 提交于
      Compression is implemented using the functions from zstd.c. As the memory
      to operate on the compression uses mmap->aio.data[] buffers. If Zstd
      streaming compression API fails for some reason the data to be compressed
      are just copied into the memory buffers using plain memcpy().
      
      Compressed trace frame consists of an array of PERF_RECORD_COMPRESSED
      records. Each element of the array is not longer that PERF_SAMPLE_MAX_SIZE
      and consists of perf_event_header followed by the compressed chunk
      that is decompressed on the loading stage.
      
      perf_mmap__aio_push() is replaced by perf_mmap__push() which is now used
      in the both serial and AIO streaming cases. perf_mmap__push() is extended
      with positive return values to signify absence of data ready for
      processing.
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/77db2b2c-5d03-dbb0-aeac-c4dd92129ab9@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ef781128
    • A
      perf record: Implement compression for serial trace streaming · 5d7f4116
      Alexey Budankov 提交于
      Compression is implemented using the functions from zstd.c. As the
      memory to operate on the compression uses mmap->data buffer.
      
      If Zstd streaming compression API fails for some reason the data to be
      compressed are just copied into the memory buffers using plain memcpy().
      
      Compressed trace frame consists of an array of PERF_RECORD_COMPRESSED
      records. Each element of the array is not longer that
      PERF_SAMPLE_MAX_SIZE and consists of perf_event_header followed by the
      compressed chunk that is decompressed on the loading stage.
      
      Comitter notes:
      
      Undo some unnecessary line breaks, remove some unnecessary () around
      zstd_data to then just get its address, and fix conflicts with
      BPF_PROG_INFO/BPF_BTF patchkits.
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/744df43f-3932-2594-ddef-1e99a3cad03a@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5d7f4116
    • A
      perf mmap: Implement dedicated memory buffer for data compression · 51255a8a
      Alexey Budankov 提交于
      Implemented mmap data buffer that is used as the memory to operate
      on when compressing data in case of serial trace streaming.
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/49b31321-0f70-392b-9a4f-649d3affe090@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      51255a8a
    • A
      perf record: Implement COMPRESSED event record and its attributes · 42e1fd80
      Alexey Budankov 提交于
      Implemented PERF_RECORD_COMPRESSED event, related data types, header
      feature and functions to write, read and print feature attributes from
      the trace header section.
      
      comp_mmap_len preserves the size of mmaped kernel buffer that was used
      during collection. comp_mmap_len size is used on loading stage as the
      size of decomp buffer for decompression of COMPRESSED events content.
      
      Committer notes:
      
      Fixed up conflict with BPF_PROG_INFO and BTF_BTF header features.
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/ebbaf031-8dda-3864-ebc6-7922d43ee515@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      42e1fd80
    • A
      perf session: Define 'bytes_transferred' and 'bytes_compressed' metrics · d3c8c08e
      Alexey Budankov 提交于
      Define 'bytes_transferred' and 'bytes_compressed' metrics to calculate
      ratio in the end of the data collection:
      
      	compression ratio = bytes_transferred / bytes_compressed
      
      The 'bytes_transferred' metric accumulates the amount of bytes that was
      extracted from the mmaped kernel buffers for compression, while
      'bytes_compressed' accumulates the amount of bytes that was received
      after applying compression.
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1d4bf499-cb03-26dc-6fc6-f14fec7622ce@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d3c8c08e
    • A
      perf record: Fix suggestion to get list of registers usable with --user-regs and --intr-regs · 8e5bc76f
      Arnaldo Carvalho de Melo 提交于
        $ perf record -h -I
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -I, --intr-regs[=<any register>]
                                  sample selected machine registers on interrupt, use -I ? to list register names
      
        $ m
        $ perf record -I ?
        Workload failed: No such file or directory
        $
      
        After:
      
        $ perf record -h -I
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -I, --intr-regs[=<any register>]
                                  sample selected machine registers on interrupt, use '-I?' to list register names
      
        $
        $ perf record -I?
        available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -I, --intr-regs[=<any register>]
                                  sample selected machine registers on interrupt, use '-I?' to list register names
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Fixes: bcc84ec6 ("perf record: Add ability to name registers to record")
      Link: https://lkml.kernel.org/n/tip-r0xhfhy5radmkhhcbcfs5izf@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8e5bc76f
  14. 02 4月, 2019 1 次提交
    • A
      perf record: Implement --mmap-flush=<number> option · 470530bb
      Alexey Budankov 提交于
      Implement a --mmap-flush option that specifies minimal number of bytes
      that is extracted from mmaped kernel buffer to store into a trace. The
      default option value is 1 byte what means every time trace writing
      thread finds some new data in the mmaped buffer the data is extracted,
      possibly compressed and written to a trace.
      
        $ tools/perf/perf record --mmap-flush 1024 -e cycles -- matrix.gcc
        $ tools/perf/perf record --aio --mmap-flush 1K -e cycles -- matrix.gcc
      
      The option is independent from -z setting, doesn't vary with compression
      level and can serve two purposes.
      
      The first purpose is to increase the compression ratio of a trace data.
      Larger data chunks are compressed more effectively so the implemented
      option allows specifying data chunk size to compress. Also at some cases
      executing more write syscalls with smaller data size can take longer
      than executing less write syscalls with bigger data size due to syscall
      overhead so extracting bigger data chunks specified by the option value
      could additionally decrease runtime overhead.
      
      The second purpose is to avoid self monitoring live-lock issue in system
      wide (-a) profiling mode. Profiling in system wide mode with compression
      (-a -z) can additionally induce data into the kernel buffers along with
      the data from monitored processes. If performance data rate and volume
      from the monitored processes is high then trace streaming and
      compression activity in the tool is also high. High tool process
      activity can lead to subtle live-lock effect when compression of single
      new byte from some of mmaped kernel buffer leads to generation of the
      next single byte at some mmaped buffer. So perf tool process ends up in
      endless self monitoring.
      
      Implemented synch parameter is the mean to force data move independently
      from the specified flush threshold value. Despite the provided flush
      value the tool needs capability to unconditionally drain memory buffers,
      at least in the end of the collection.
      
      Committer testing:
      
      Running with the default value, i.e. as soon as there is something to
      read go on consuming, we first write the synthesized events, small
      chunks of about 128 bytes:
      
        # perf trace -m 2048 --call-graph dwarf -e write -- perf record
        <SNIP>
           101.142 ( 0.004 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x210db60, count: 120) = 120
                                               __libc_write (/usr/lib64/libpthread-2.28.so)
                                               ion (/home/acme/bin/perf)
                                               record__write (inlined)
                                               process_synthesized_event (/home/acme/bin/perf)
                                               perf_tool__process_synth_event (inlined)
                                               perf_event__synthesize_mmap_events (/home/acme/bin/perf)
      
      Then we move to reading the mmap buffers consuming the events put there
      by the kernel perf infrastructure:
      
           107.561 ( 0.005 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc02000, count: 336) = 336
                                               __libc_write (/usr/lib64/libpthread-2.28.so)
                                               ion (/home/acme/bin/perf)
                                               record__write (inlined)
                                               record__pushfn (/home/acme/bin/perf)
                                               perf_mmap__push (/home/acme/bin/perf)
                                               record__mmap_read_evlist (inlined)
                                               record__mmap_read_all (inlined)
                                               __cmd_record (inlined)
                                               cmd_record (/home/acme/bin/perf)
           12919.953 ( 0.136 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc83150, count: 184984) = 184984
        <SNIP same backtrace as in the 107.561 timestamp>
           12920.094 ( 0.155 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc02150, count: 261816) = 261816
        <SNIP same backtrace as in the 107.561 timestamp>
           12920.253 ( 0.093 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befb81120, count: 170832) = 170832
        <SNIP same backtrace as in the 107.561 timestamp>
      
      If we limit it to write only when more than 16MB are available for
      reading, it throttles that to a quarter of the --mmap-pages set for
      'perf record', which by default get to 528384 bytes, found out using
      'record -v':
      
        mmap flush: 132096
        mmap size 528384B
      
      With that in place all the writes coming from
      record__mmap_read_evlist(), i.e. from the mmap buffers setup by the
      kernel perf infrastructure were at least 132096 bytes long.
      
      Trying with a bigger mmap size:
      
         perf trace -e write perf record -v -m 2048 --mmap-flush 16M
         74982.928 ( 2.471 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff94a6cc000, count: 3580888) = 3580888
         74985.406 ( 2.353 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff949ecb000, count: 3453256) = 3453256
         74987.764 ( 2.629 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9496ca000, count: 3859232) = 3859232
         74990.399 ( 2.341 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff948ec9000, count: 3769032) = 3769032
         74992.744 ( 2.064 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9486c8000, count: 3310520) = 3310520
         74994.814 ( 2.619 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff947ec7000, count: 4194688) = 4194688
         74997.439 ( 2.787 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9476c6000, count: 4029760) = 4029760
      
      Was again limited to a quarter of the mmap size:
      
        mmap flush: 2098176
        mmap size 8392704B
      
      A warning about that would be good to have but can be added later,
      something like:
      
        "max flush is a quarter of the mmap size, if wanting to bump the mmap
         flush further, bump the mmap size as well using -m/--mmap-pages"
      
      Also rename the 'sync' parameters to 'synch' to keep tools/perf building
      with older glibcs:
      
        cc1: warnings being treated as errors
        builtin-record.c: In function 'record__mmap_read_evlist':
        builtin-record.c:775: warning: declaration of 'sync' shadows a global declaration
        /usr/include/unistd.h:933: warning: shadowed declaration is here
        builtin-record.c: In function 'record__mmap_read_all':
        builtin-record.c:856: warning: declaration of 'sync' shadows a global declaration
        /usr/include/unistd.h:933: warning: shadowed declaration is here
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/f6600d72-ecfa-2eb7-7e51-f6954547d500@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      470530bb
  15. 21 3月, 2019 1 次提交
    • S
      perf tools: Save bpf_prog_info and BTF of new BPF programs · d56354dc
      Song Liu 提交于
      To fully annotate BPF programs with source code mapping, 4 different
      information are needed:
      
          1) PERF_RECORD_KSYMBOL
          2) PERF_RECORD_BPF_EVENT
          3) bpf_prog_info
          4) btf
      
      This patch handles 3) and 4) for BPF programs loaded after 'perf
      record|top'.
      
      For timely process of these information, a dedicated event is added to
      the side band evlist.
      
      When PERF_RECORD_BPF_EVENT is received via the side band event, the
      polling thread gathers 3) and 4) vis sys_bpf and store them in perf_env.
      
      This information is saved to perf.data at the end of 'perf record'.
      
      Committer testing:
      
      The 'wakeup_watermark' member in 'struct perf_event_attr' is inside a
      unnamed union, so can't be used in a struct designated initialization
      with older gccs, get it out of that, isolating as 'attr.wakeup_watermark
      = 1;' to work with all gcc versions.
      
      We also need to add '--no-bpf-event' to the 'perf record'
      perf_event_attr tests in 'perf test', as the way that that test goes is
      to intercept the events being setup and looking if they match the fields
      described in the control files, since now it finds first the side band
      event used to catch the PERF_RECORD_BPF_EVENT, they all fail.
      
      With these issues fixed:
      
      Same scenario as for testing BPF programs loaded before 'perf record' or
      'perf top' starts, only start the BPF programs after 'perf record|top',
      so that its information get collected by the sideband threads, the rest
      works as for the programs loaded before start monitoring.
      
      Add missing 'inline' to the bpf_event__add_sb_event() when
      HAVE_LIBBPF_SUPPORT is not defined, fixing the build in systems without
      binutils devel files installed.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Link: http://lkml.kernel.org/r/20190312053051.2690567-16-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d56354dc