1. 23 Apr, 2020 1 commit
    • perf record: Add num-synthesize-threads option · d99c22ea
      Committed by Stephane Eranian
      Add an option to control the degree of parallelism of the
      synthesize_mmap() code, which scans /proc/PID/task/PID/maps and can be
      time consuming. This mimics the way perf top handles the option.
      If not specified, it defaults to 1 thread, i.e. the behavior before
      this option was added.
      
      On a desktop computer the processing of /proc/PID/task/PID/maps isn't
      slow enough to warrant parallel processing and the thread creation has
      some cost - hence the default of 1. On a loaded server with
      >100 cores it is possible to see synthesis times in the order of
      seconds and in this case having the option is desirable.
      
      As the processing is a synchronization point, it is legitimate to worry if
      Amdahl's law will apply to this patch. Profiling with this patch in
      place:
      https://lore.kernel.org/lkml/20200415054050.31645-4-irogers@google.com/
      shows:
      ...
            - 32.59% __perf_event__synthesize_threads
               - 32.54% __event__synthesize_thread
                  + 22.13% perf_event__synthesize_mmap_events
                  + 6.68% perf_event__get_comm_ids.constprop.0
                  + 1.49% process_synthesized_event
                  + 1.29% __GI___readdir64
                  + 0.60% __opendir
      ...
      That is, the processing is 1.49% of execution time and there is plenty
      left to make parallel. This is shown in the benchmark in this patch:
      
      https://lore.kernel.org/lkml/20200415054050.31645-2-irogers@google.com/
      
        Computing performance of multi threaded perf event synthesis by
        synthesizing events on CPU 0:
         Number of synthesis threads: 1
           Average synthesis took: 127729.000 usec (+- 3372.880 usec)
           Average num. events: 21548.600 (+- 0.306)
           Average time per event 5.927 usec
         Number of synthesis threads: 2
           Average synthesis took: 88863.500 usec (+- 385.168 usec)
           Average num. events: 21552.800 (+- 0.327)
           Average time per event 4.123 usec
         Number of synthesis threads: 3
           Average synthesis took: 83257.400 usec (+- 348.617 usec)
           Average num. events: 21553.200 (+- 0.327)
           Average time per event 3.863 usec
         Number of synthesis threads: 4
           Average synthesis took: 75093.000 usec (+- 422.978 usec)
           Average num. events: 21554.200 (+- 0.200)
           Average time per event 3.484 usec
         Number of synthesis threads: 5
           Average synthesis took: 64896.600 usec (+- 353.348 usec)
           Average num. events: 21558.000 (+- 0.000)
           Average time per event 3.010 usec
         Number of synthesis threads: 6
           Average synthesis took: 59210.200 usec (+- 342.890 usec)
           Average num. events: 21560.000 (+- 0.000)
           Average time per event 2.746 usec
         Number of synthesis threads: 7
           Average synthesis took: 54093.900 usec (+- 306.247 usec)
           Average num. events: 21562.000 (+- 0.000)
           Average time per event 2.509 usec
         Number of synthesis threads: 8
           Average synthesis took: 48938.700 usec (+- 341.732 usec)
           Average num. events: 21564.000 (+- 0.000)
           Average time per event 2.269 usec
      
      The average time per synthesized event goes from 5.927 usec with 1
      thread to 2.269 usec with 8. This isn't a linear speedup, as not all of
      the synthesis code has been made parallel. If the synthesis time were
      about 10 seconds, using 8 threads might bring it down to less than 4.
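The speedup claim above can be checked with a little arithmetic. The sketch below uses only the 1-thread and 8-thread averages quoted in the benchmark and, as an assumption of this example rather than something the patch states, fits Amdahl's law to estimate the fraction of synthesis that runs in parallel.

```python
# Average synthesis times (usec) for 1 and 8 threads, copied from the
# benchmark output quoted above.
synthesis_usec = {1: 127729.000, 8: 48938.700}


def speedup(threads):
    """Observed speedup relative to the single-threaded run."""
    return synthesis_usec[1] / synthesis_usec[threads]


def amdahl_parallel_fraction(observed_speedup, threads):
    """Solve S = 1 / ((1 - p) + p / n) for the parallel fraction p.

    Fitting Amdahl's law to a single data point is an illustrative
    assumption of this sketch, not part of the original analysis.
    """
    return (1 - 1 / observed_speedup) / (1 - 1 / threads)


s8 = speedup(8)
p = amdahl_parallel_fraction(s8, 8)
projected_seconds = 10.0 / s8  # a 10 second synthesis scaled by the same speedup

print(f"speedup with 8 threads: {s8:.2f}x")
print(f"estimated parallel fraction: {p:.2f}")
print(f"10 s of synthesis would take about {projected_seconds:.2f} s")
```

Under these assumptions the 8-thread speedup is roughly 2.6x, which is consistent with the statement that a 10 second synthesis may drop below 4 seconds.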
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tony Jones <tonyj@suse.de>
      Cc: yuzhoujian <yuzhoujian@didichuxing.com>
      Link: http://lore.kernel.org/lkml/20200422155038.9380-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 03 Apr, 2020 1 commit
    • perf record: Add --all-cgroups option · 8fb4b679
      Committed by Namhyung Kim
      The --all-cgroups option enables cgroup profiling support.  It
      tells the kernel to record CGROUP events in the ring buffer so that perf
      report can identify the task/cgroup association later.
      
        [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.042 MB perf.data (558 samples) ]
        [root@seventh ~]# perf report --stdio -s cgroup_id,cgroup,pid
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 558  of event 'cycles'
        # Event count (approx.): 458017341
        #
        # Overhead  cgroup id (dev/inode)  Cgroup          Pid:Command
        # ........  .....................  ..........  ...............
        #
            33.15%  4/0xeffffffb           /sub           9615:looper0
            32.83%  4/0xf00002f5           /sub/cgrp2     9620:looper2
            32.79%  4/0xf00002f4           /sub/cgrp1     9619:looper1
             0.35%  4/0xf00002f5           /sub/cgrp2     9618:cgtest
             0.34%  4/0xf00002f4           /sub/cgrp1     9617:cgtest
             0.32%  4/0xeffffffb           /              9615:looper0
             0.11%  4/0xeffffffb           /sub           9617:cgtest
             0.10%  4/0xeffffffb           /sub           9618:cgtest
      
        #
        # (Tip: Sample related events with: perf record -e '{cycles,instructions}:S')
        #
        [root@seventh ~]#
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-8-namhyung@kernel.org
      Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org
      [ Extracted the HAVE_FILE_HANDLE from the followup patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 26 Mar, 2020 1 commit
  4. 11 Mar, 2020 1 commit
  5. 22 Nov, 2019 2 commits
  6. 07 Nov, 2019 2 commits
    • perf record: Add support for limit perf output file size · 6d575816
      Committed by Jiwei Sun
      This patch adds a new option to limit the output file size. Based on
      it, we can create a wrapper around the perf command that uses the option
      to keep an unaware user from exhausting the disk space.
      
      To keep perf.data parsable, only the sample data size is limited. Since
      perf.data also contains headers and other data besides the sample data,
      the actual size of the recorded file will be bigger than the configured
      value.
      
      Testing it:
      
        # ./perf record -a -g --max-size=10M
        Couldn't synthesize bpf events.
        [ perf record: perf size limit reached (10249 KB), stopping session ]
        [ perf record: Woken up 32 times to write data ]
        [ perf record: Captured and wrote 10.133 MB perf.data (71964 samples) ]
      
        # ls -lh perf.data
        -rw------- 1 root root 11M Oct 22 14:32 perf.data
      
        # ./perf record -a -g --max-size=10K
        [ perf record: perf size limit reached (10 KB), stopping session ]
        Couldn't synthesize bpf events.
        [ perf record: Woken up 0 times to write data ]
        [ perf record: Captured and wrote 1.546 MB perf.data (69 samples) ]
      
        # ls -l perf.data
        -rw------- 1 root root 1626952 Oct 22 14:36 perf.data
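The behavior seen in the test run, where the session stops slightly past the limit, can be sketched as follows. This is a minimal illustration, not the code from builtin-record.c: the suffix table, `parse_size`, `limit_reached` and `record` are hypothetical names, and the 1024-based multipliers are an assumption consistent with `--max-size=10M` stopping at about 10240 KB.

```python
# Hypothetical suffix table (assumption: binary multipliers).
UNITS = {"B": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Turn a string such as '10M' or '10K' into a byte count."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)


def limit_reached(bytes_written, max_size):
    """A max_size of 0 is treated here as 'no limit'."""
    return max_size > 0 and bytes_written >= max_size


def record(chunks, max_size):
    """Write sample chunks until the limit is hit; return the bytes written.

    The check runs after each write, so the total can overshoot the limit
    by up to one chunk, similar to the 10249 KB reported for a 10M limit.
    """
    written = 0
    for chunk in chunks:
        written += chunk
        if limit_reached(written, max_size):
            break
    return written


print(parse_size("10M"))                        # limit in bytes
print(record([4096] * 10, parse_size("10K")))   # stops after the third chunk
```

Only the sample data counts against the limit in this sketch, which mirrors the explanation that the final file ends up larger than the configured value.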
      
      Committer notes:
      
      Fixed the build in multiple distros by using PRIu64 to print u64 struct
      members, fixing this:
      
        builtin-record.c: In function 'record__write':
        builtin-record.c:150:5: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=]
             rec->bytes_written >> 10);
             ^
          CC       /tmp/build/pe
      Signed-off-by: Jiwei Sun <jiwei.sun@windriver.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Danter <richard.danter@windriver.com>
      Link: http://lore.kernel.org/lkml/20191022080901.3841-1-jiwei.sun@windriver.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Put a copy of kcore into the perf.data directory · eeb399b5
      Committed by Adrian Hunter
      Add a new 'perf record' option '--kcore' which will put a copy of
      /proc/kcore, kallsyms and modules into a perf.data directory. Note that
      without the --kcore option, output goes to a file as before.  The
      tools' -o and -i options work with either a file name or a directory name.
      
      Example:
      
        $ sudo perf record --kcore uname
      
        $ sudo tree perf.data
        perf.data
        ├── kcore_dir
        │   ├── kallsyms
        │   ├── kcore
        │   └── modules
        └── data
      
        $ sudo perf script -v
        build id event received for vmlinux: 1eaa285996affce2d74d8e66dcea09a80c9941de
        build id event received for [vdso]: 8bbaf5dc62a9b644b4d4e4539737e104e4a84541
        Samples for 'cycles' event do not have CPU attribute set. Skipping 'cpu' field.
        Using CPUID GenuineIntel-6-8E-A
        Using perf.data/kcore_dir/kcore for kernel data
        Using perf.data/kcore_dir/kallsyms for symbols
                   perf 19058 506778.423729:          1 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
                   perf 19058 506778.423733:          1 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
                   perf 19058 506778.423734:          7 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
                   perf 19058 506778.423736:        117 cycles:  ffffffffa2caa54a native_write_msr+0xa (vmlinux)
                   perf 19058 506778.423738:       2092 cycles:  ffffffffa2c9b7b0 native_apic_msr_write+0x0 (vmlinux)
                   perf 19058 506778.423740:      37380 cycles:  ffffffffa2f121d0 perf_event_addr_filters_exec+0x0 (vmlinux)
                  uname 19058 506778.423751:     582673 cycles:  ffffffffa303a407 propagate_protected_usage+0x147 (vmlinux)
                  uname 19058 506778.423892:    2241841 cycles:  ffffffffa2cae0c9 unwind_next_frame.part.5+0x79 (vmlinux)
                  uname 19058 506778.424430:    2457397 cycles:  ffffffffa3019232 check_memory_region+0x52 (vmlinux)
      
      Committer testing:
      
        # rm -rf perf.data*
        # perf record sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ]
        # ls -l perf.data
        -rw-------. 1 root root 34772 Oct 21 11:08 perf.data
        # perf record --kcore uname
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ]
        # ls -lad perf.data*
        drwx------. 3 root root  4096 Oct 21 11:08 perf.data
        -rw-------. 1 root root 34772 Oct 21 11:08 perf.data.old
        # perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
        # perf evlist -v -i perf.data/data
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
        #
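The file-versus-directory layout described above can be sketched as a small path resolver. This is an illustration only: `resolve_perf_data` is a hypothetical helper, and the entry names (`data`, `kcore_dir/kcore`, `kcore_dir/kallsyms`, `kcore_dir/modules`) are taken from the `tree` output in the example, not from the perf sources.

```python
import os
import tempfile


def resolve_perf_data(path):
    """Return the paths a reader would open for a perf.data file or directory."""
    if os.path.isdir(path):
        kcore_dir = os.path.join(path, "kcore_dir")
        return {
            "data": os.path.join(path, "data"),
            "kcore": os.path.join(kcore_dir, "kcore"),
            "kallsyms": os.path.join(kcore_dir, "kallsyms"),
            "modules": os.path.join(kcore_dir, "modules"),
        }
    # Without --kcore the output is a plain file, as before.
    return {"data": path}


# Demonstrate with a throwaway directory that mimics the --kcore layout.
with tempfile.TemporaryDirectory() as tmp:
    perf_dir = os.path.join(tmp, "perf.data")
    os.makedirs(os.path.join(perf_dir, "kcore_dir"))
    layout = resolve_perf_data(perf_dir)
    print(sorted(layout))
```

This matches the committer test, where `perf evlist -v -i perf.data/data` reads the same event attributes as the directory form.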
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lore.kernel.org/lkml/20191004083121.12182-6-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  7. 14 Aug, 2019 2 commits
  8. 11 Jun, 2019 1 commit
    • perf record: Add support to collect callchains from kernel or user space only · 53651b28
      Committed by yuzhoujian
      With these new options one can record callchains from only the kernel
      or only user space.
      
      They can be used together with the "--all-kernel" option.
      
      The two options are used just like print_stack(sys) or print_ustack(usr)
      in systemtap.
      
      Shown below is the usage of the new options combined with the
      "--all-kernel" option:
      
      1. Configure all used events to run in kernel space and just collect
         kernel callchains.
      
        $ perf record -a -g --all-kernel --kernel-callchains
      
      2. Configure all used events to run in kernel space and just collect
         user callchains.
      
        $ perf record -a -g --all-kernel --user-callchains
      
      Committer notes:
      
      Improved documentation to state that asking for kernel callchains really
      is asking for excluding user callchains, and vice versa.
      
      Further mentioned that using both won't get both, but nothing, as both
      will be excluded.
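The committer note about the exclusion semantics can be made concrete with a short sketch. The helper names are hypothetical, and the mapping of the two options onto the `exclude_callchain_user` and `exclude_callchain_kernel` attribute bits is an assumption based on the note above, not a copy of the perf implementation.

```python
def callchain_excludes(kernel_callchains=False, user_callchains=False):
    """Asking for one side is modeled as excluding the other side."""
    return {
        "exclude_callchain_user": kernel_callchains,
        "exclude_callchain_kernel": user_callchains,
    }


def collected_callchains(kernel_callchains=False, user_callchains=False):
    """List which callchain sides remain after applying the exclusions."""
    excludes = callchain_excludes(kernel_callchains, user_callchains)
    sides = (
        ("kernel", excludes["exclude_callchain_kernel"]),
        ("user", excludes["exclude_callchain_user"]),
    )
    return [name for name, excluded in sides if not excluded]


print(collected_callchains())                          # neither option given
print(collected_callchains(kernel_callchains=True))    # --kernel-callchains
print(collected_callchains(user_callchains=True))      # --user-callchains
print(collected_callchains(True, True))                # both: nothing is left
```

The last call shows why passing both options yields no callchains at all: each option removes the other side.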
      Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1559222962-22891-1-git-send-email-ufo19890607@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  9. 16 May, 2019 2 commits
    • perf parse-regs: Split parse_regs · aeea9062
      Committed by Kan Liang
      The available registers for --int-regs and --user-regs may be different,
      e.g. XMM registers.
      
      Split parse_regs into two dedicated functions for --int-regs and
      --user-regs respectively.
      
      Modify the warning message: "--user-regs=?" should be used to show
      the available registers for --user-regs.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1557865174-56264-1-git-send-email-kan.liang@linux.intel.com
      [ Changed docs as suggested by Ravi and agreed by Kan ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Implement -z,--compression_level[=<n>] option · 504c1ad1
      Committed by Alexey Budankov
      Implement the -z,--compression_level[=<n>] option, which enables
      compression of the content of mmaped kernel data buffers at runtime
      during perf record collection. The default option value is 1 (fastest
      compression).
      
      Compression overhead has been measured for serial and AIO streaming when
      profiling matrix multiplication workload:
      
        -----------------------------------------------------------------
        |   | SERIAL                      | AIO-1                       |
        |---------------------------------------------------------------|
        |-z | OVH(x) | ratio(x) size(MiB) | OVH(x) | ratio(x) size(MiB) |
        |---------------------------------------------------------------|
        | 0 | 1,00   | 1,000    179,424   | 1,00   | 1,000    187,527   |
        | 1 | 1,04   | 8,427    181,148   | 1,01   | 8,474    188,562   |
        | 2 | 1,07   | 8,055    186,953   | 1,03   | 7,912    191,773   |
        | 3 | 1,04   | 8,283    181,908   | 1,03   | 8,220    191,078   |
        | 5 | 1,09   | 8,101    187,705   | 1,05   | 7,780    190,065   |
        | 8 | 1,05   | 9,217    179,191   | 1,12   | 6,111    193,024   |
        -----------------------------------------------------------------
      
      OVH = (Execution time with -z N) / (Execution time with -z 0)
      
      ratio - compression ratio
      size  - number of bytes that was compressed
      
      	size ~= trace size x ratio
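As a worked example of the two relations above, the sketch below takes the SERIAL columns of the table (reading the decimal commas as decimal points, which is an assumption about the locale used) and derives the approximate on-disk trace size and the overhead as a percentage. The function names are illustrative.

```python
# SERIAL columns from the table: -z level -> (OVH, ratio, size in MiB).
serial = {
    0: (1.00, 1.000, 179.424),
    1: (1.04, 8.427, 181.148),
    2: (1.07, 8.055, 186.953),
    3: (1.04, 8.283, 181.908),
    5: (1.09, 8.101, 187.705),
    8: (1.05, 9.217, 179.191),
}


def trace_size_mib(size_mib, ratio):
    """Rearranged from 'size ~= trace size x ratio'."""
    return size_mib / ratio


def overhead_percent(ovh):
    """OVH is a ratio of execution times; 1.04 means about 4% slower."""
    return (ovh - 1.0) * 100.0


for level, (ovh, ratio, size) in serial.items():
    print(f"-z {level}: trace ~ {trace_size_mib(size, ratio):7.3f} MiB, "
          f"overhead ~ {overhead_percent(ovh):4.1f}%")
```

For `-z 1` in serial mode this gives a trace of roughly 21.5 MiB for about 4% extra execution time.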
      
      Committer notes:
      
      Testing it I noticed that it failed to disable build id processing when
      compression is enabled, and as we'd have to uncompress everything to
      look for the PERF_RECORD_{MMAP,SAMPLE,etc} to figure out which build ids
      to read from DSOs, we better disable build id processing when
      compression is enabled, logging with pr_debug() when doing so:
      
      Original patch:
      
        # perf record -z2
        ^C[ perf record: Woken up 1 times to write data ]
        0x1746e0 [0x76]: failed to process type: 81 [Invalid argument]
        [ perf record: Captured and wrote 1.568 MB perf.data, compressed (original 0.452 MB, ratio is 3.995) ]
        #
      
      After auto-disabling build id processing when compression is enabled:
      
        $ perf record -z2 sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.292) ]
        $ perf record -v -z2 sleep 1
        Compression enabled, disabling build id collection at the end of the session.
        <SNIP extra -v pr_debug() messages>
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.305) ]
        $
      
      Also, with parts of the patch originally after this one moved to just
      before this one we get:
      
        $ perf record -z2 sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.371) ]
        $ perf report -D | grep COMPRESS
        0 0x1b8 [0x155]: PERF_RECORD_COMPRESSED: unhandled!
        0 0x30d [0x80]: PERF_RECORD_COMPRESSED: unhandled!
              COMPRESSED events:          2
              COMPRESSED events:          0
        $
      
      I.e. when faced with PERF_RECORD_COMPRESSED that we still have no code
      to process, we just show it as not being handled, skip them and
      continue, while before we had:
      
        $ perf report -D | grep COMPRESS
        0x1b8 [0x169]: failed to process type: 81 [Invalid argument]
        Error:
        failed to process sample
        0 0x1b8 [0x169]: PERF_RECORD_COMPRESSED
        $
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/9ff06518-ae63-a908-e44d-5d9e56dd66d9@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  10. 02 Apr, 2019 1 commit
    • perf record: Implement --mmap-flush=<number> option · 470530bb
      Committed by Alexey Budankov
      Implement a --mmap-flush option that specifies the minimal number of
      bytes extracted from the mmaped kernel buffer to store into a trace. The
      default option value is 1 byte, which means that every time the trace
      writing thread finds some new data in the mmaped buffer, the data is
      extracted, possibly compressed, and written to a trace.
      
        $ tools/perf/perf record --mmap-flush 1024 -e cycles -- matrix.gcc
        $ tools/perf/perf record --aio --mmap-flush 1K -e cycles -- matrix.gcc
      
      The option is independent of the -z setting, doesn't vary with the
      compression level and can serve two purposes.
      
      The first purpose is to increase the compression ratio of the trace data.
      Larger data chunks are compressed more effectively, so the option allows
      specifying the data chunk size to compress. Also, in some cases executing
      more write syscalls with smaller data sizes can take longer than
      executing fewer write syscalls with bigger data sizes, due to syscall
      overhead, so extracting the bigger data chunks specified by the option
      value could additionally decrease runtime overhead.
      
      The second purpose is to avoid a self-monitoring live-lock issue in
      system wide (-a) profiling mode. Profiling in system wide mode with
      compression (-a -z) can additionally induce data into the kernel buffers
      along with the data from monitored processes. If the performance data
      rate and volume from the monitored processes are high, then trace
      streaming and compression activity in the tool is also high. High tool
      process activity can lead to a subtle live-lock effect, where compressing
      a single new byte from one mmaped kernel buffer generates the next single
      byte in some mmaped buffer. The perf tool process then ends up in
      endless self monitoring.
      
      The implemented synch parameter is the means to force data to be moved
      independently of the specified flush threshold value. Regardless of the
      provided flush value, the tool needs the capability to unconditionally
      drain memory buffers, at least at the end of the collection.
      
      Committer testing:
      
      Running with the default value, i.e. as soon as there is something to
      read go on consuming, we first write the synthesized events, small
      chunks of about 128 bytes:
      
        # perf trace -m 2048 --call-graph dwarf -e write -- perf record
        <SNIP>
           101.142 ( 0.004 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x210db60, count: 120) = 120
                                               __libc_write (/usr/lib64/libpthread-2.28.so)
                                               ion (/home/acme/bin/perf)
                                               record__write (inlined)
                                               process_synthesized_event (/home/acme/bin/perf)
                                               perf_tool__process_synth_event (inlined)
                                               perf_event__synthesize_mmap_events (/home/acme/bin/perf)
      
      Then we move to reading the mmap buffers consuming the events put there
      by the kernel perf infrastructure:
      
           107.561 ( 0.005 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc02000, count: 336) = 336
                                               __libc_write (/usr/lib64/libpthread-2.28.so)
                                               ion (/home/acme/bin/perf)
                                               record__write (inlined)
                                               record__pushfn (/home/acme/bin/perf)
                                               perf_mmap__push (/home/acme/bin/perf)
                                               record__mmap_read_evlist (inlined)
                                               record__mmap_read_all (inlined)
                                               __cmd_record (inlined)
                                               cmd_record (/home/acme/bin/perf)
           12919.953 ( 0.136 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc83150, count: 184984) = 184984
        <SNIP same backtrace as in the 107.561 timestamp>
           12920.094 ( 0.155 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc02150, count: 261816) = 261816
        <SNIP same backtrace as in the 107.561 timestamp>
           12920.253 ( 0.093 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befb81120, count: 170832) = 170832
        <SNIP same backtrace as in the 107.561 timestamp>
      
      If we ask it to write only when more than 16MB are available for
      reading, the threshold is capped at a quarter of the mmap size set via
      --mmap-pages for 'perf record', which by default is 528384 bytes, as
      found using 'record -v':
      
        mmap flush: 132096
        mmap size 528384B
      
      With that in place all the writes coming from
      record__mmap_read_evlist(), i.e. from the mmap buffers setup by the
      kernel perf infrastructure were at least 132096 bytes long.
      
      Trying with a bigger mmap size:
      
         perf trace -e write perf record -v -m 2048 --mmap-flush 16M
         74982.928 ( 2.471 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff94a6cc000, count: 3580888) = 3580888
         74985.406 ( 2.353 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff949ecb000, count: 3453256) = 3453256
         74987.764 ( 2.629 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9496ca000, count: 3859232) = 3859232
         74990.399 ( 2.341 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff948ec9000, count: 3769032) = 3769032
         74992.744 ( 2.064 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9486c8000, count: 3310520) = 3310520
         74994.814 ( 2.619 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff947ec7000, count: 4194688) = 4194688
         74997.439 ( 2.787 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9476c6000, count: 4029760) = 4029760
      
      Was again limited to a quarter of the mmap size:
      
        mmap flush: 2098176
        mmap size 8392704B
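The quarter-of-the-mmap-size cap observed in both runs can be reproduced with one line of arithmetic. `effective_mmap_flush` is a hypothetical helper written for this illustration; it encodes only the behavior visible in the `record -v` output above.

```python
def effective_mmap_flush(requested_bytes, mmap_size_bytes):
    """Cap the requested flush threshold at a quarter of the mmap size."""
    cap = mmap_size_bytes // 4
    return min(requested_bytes, cap)


SIXTEEN_MB = 16 << 20

# Default mmap size reported by 'record -v' in the first run.
print(effective_mmap_flush(SIXTEEN_MB, 528384))    # 132096
# Larger mmap size from the run with '-m 2048'.
print(effective_mmap_flush(SIXTEEN_MB, 8392704))   # 2098176
# A small request stays below the cap and is used unchanged.
print(effective_mmap_flush(1024, 528384))          # 1024
```

Both capped values match the `mmap flush:` lines printed by the tool, so raising the threshold further requires raising the mmap size as well.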
      
      A warning about that would be good to have but can be added later,
      something like:
      
        "max flush is a quarter of the mmap size, if wanting to bump the mmap
         flush further, bump the mmap size as well using -m/--mmap-pages"
      
      Also rename the 'sync' parameters to 'synch' to keep tools/perf building
      with older glibcs:
      
        cc1: warnings being treated as errors
        builtin-record.c: In function 'record__mmap_read_evlist':
        builtin-record.c:775: warning: declaration of 'sync' shadows a global declaration
        /usr/include/unistd.h:933: warning: shadowed declaration is here
        builtin-record.c: In function 'record__mmap_read_all':
        builtin-record.c:856: warning: declaration of 'sync' shadows a global declaration
        /usr/include/unistd.h:933: warning: shadowed declaration is here
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/f6600d72-ecfa-2eb7-7e51-f6954547d500@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  11. 19 Mar, 2019 1 commit
  12. 11 Feb, 2019 1 commit
  13. 06 Feb, 2019 1 commit
  14. 18 Dec, 2018 2 commits
  15. 16 Jul, 2018 1 commit
  16. 06 Jun, 2018 1 commit
    • perf record: Enable arbitrary event names thru name= modifier · f92da712
      Committed by Alexey Budankov
      Enable complex event names containing [.:=,] symbols to be encoded into
      the perf trace using the name= modifier, e.g.:
      
        perf record -e cpu/name=\'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM\',\
      		  period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex
      
      Below is how it looks in the report output. Note the explicit escaped
      quoting of the cmdline string in the header, so that the string can be
      directly reused for another collection in a shell:
      
      perf report --header
      
        # ========
        ...
        # cmdline : /root/abudanko/kernel/tip/tools/perf/perf record -v -e cpu/name=\'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM\',period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex
        # event : name = OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM, , type = 4, size = 112, config = 0x100003c, { sample_period, sample_freq } = 3500000, sample_type = IP|TID|TIME, disabled = 1, inh
        ...
        # ========
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 24K of event 'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM'
        # Event count (approx.): 86492000000
        #
        # Overhead  Command  Shared Object     Symbol
        # ........  .......  ................  ..............................................
        #
            14.75%  futex    [kernel.vmlinux]  [k] __entry_trampoline_start
      ...
      
        perf stat -e cpu/name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\',period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex
      
        10000000 process context switches in 16678890291ns (1667.9ns/ctxsw)
      
         Performance counter stats for './futex':
      
            88,095,770,571      CPU_CLK_UNHALTED.THREAD:cmask=0x1
      
              16.679542407 seconds time elapsed
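The escaped quoting used in the command lines above can be checked without running perf. The sketch below uses Python's `shlex.split`, which approximates POSIX shell word splitting (an approximation, not the shell itself), to show that `\'` reaches perf as a literal single quote around the event name. The final extraction step is a hypothetical illustration, not perf's event parser.

```python
import shlex

# The perf stat command line from the example, kept as a raw string so the
# backslashes are preserved exactly as typed in the shell.
cmdline = (
    r"perf stat -e cpu/name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\',"
    r"period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex"
)

argv = shlex.split(cmdline)  # POSIX-style splitting: \' becomes a literal '
event_spec = argv[3]

# Illustrative only: take the text between the literal quotes as the name.
event_name = event_spec.split("'")[1]

print(event_spec)
print(event_name)
```

The name survives with its `.`, `:` and `=` characters intact, which is what lets it appear verbatim in the `perf stat` and `perf report` output.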
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/c194b060-761d-0d50-3b21-bb4ed680002d@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  17. 05 Mar, 2018 2 commits
    • perf record: Throttle user defined frequencies to the maximum allowed · b09c2364
      Committed by Arnaldo Carvalho de Melo
        # perf record -F 200000 sleep 1
        warning: Maximum frequency rate (15,000 Hz) exceeded, throttling from 200,000 Hz to 15,000 Hz.
                 The limit can be raised via /proc/sys/kernel/perf_event_max_sample_rate.
                 The kernel will lower it when perf's interrupts take too long.
                 Use --strict-freq to disable this throttling, refusing to record.
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.019 MB perf.data (15 samples) ]
        # perf evlist -v
        cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
      
      For those who want it to fail when the desired frequency can't be used:
      
        # perf record --strict-freq -F 200000 sleep 1
        error: Maximum frequency rate (15,000 Hz) exceeded.
               Please use -F freq option with a lower value or consider
               tweaking /proc/sys/kernel/perf_event_max_sample_rate.
        #
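
      The behaviour shown above can be sketched as a small decision function.
      This is only an illustration and not the code from the patch: the name
      resolve_sample_freq() and its parameters are hypothetical, and the
      maximum rate is passed in rather than read from
      /proc/sys/kernel/perf_event_max_sample_rate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the actual perf implementation): pick the
 * sampling frequency to use, given the user's request and the kernel limit.
 * Returns 0 and stores the frequency in *out, or returns -1 when strict
 * mode is on and the request exceeds the limit.
 */
static int resolve_sample_freq(unsigned int requested, unsigned int max_rate,
			       bool strict, unsigned int *out)
{
	assert(out != NULL);

	if (requested <= max_rate) {
		*out = requested;	/* within the limit: use as requested */
		return 0;
	}

	if (strict)
		return -1;		/* --strict-freq: refuse to record */

	*out = max_rate;		/* default: throttle to the maximum */
	return 0;
}
```

      With a limit of 15,000 Hz, a request for 200,000 Hz would be throttled
      to 15,000 Hz in the default mode and rejected in strict mode.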
      Suggested-by: Ingo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-oyebruc44nlja499nqkr1nzn@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf record: Allow asking for the maximum allowed sample rate · 67230479
      Arnaldo Carvalho de Melo committed
      Add the handy '-F max' shortcut for reading the
      kernel.perf_event_max_sample_rate value and using it as the
      user-supplied sampling frequency:
      
        # perf record -F max sleep 1
        info: Using a maximum frequency rate of 15,000 Hz
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.019 MB perf.data (14 samples) ]
        # sysctl kernel.perf_event_max_sample_rate
        kernel.perf_event_max_sample_rate = 15000
        # perf evlist -v
        cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
      
        # perf record -F 10 sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.019 MB perf.data (4 samples) ]
        # perf evlist -v
        cycles:ppp: size: 112, { sample_period, sample_freq }: 10, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        #
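
      A minimal sketch of how a '-F' argument could be interpreted, assuming
      the maximum rate has already been read from the sysctl. The function
      name parse_freq_arg() and its signature are hypothetical and are not
      taken from the perf sources.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: turn the -F argument into a frequency.
 * "max" selects max_rate; anything else must be a positive decimal number.
 * Returns 0 on success, -1 on a malformed argument.
 */
static int parse_freq_arg(const char *arg, unsigned int max_rate,
			  unsigned int *freq)
{
	unsigned long val;
	char *end;

	assert(arg != NULL && freq != NULL);

	if (strcmp(arg, "max") == 0) {
		*freq = max_rate;	/* shortcut: use the kernel limit */
		return 0;
	}

	/* Reject signs, whitespace and empty strings before strtoul(). */
	if (arg[0] < '0' || arg[0] > '9')
		return -1;

	errno = 0;
	val = strtoul(arg, &end, 10);
	if (errno || *end != '\0' || val == 0 || val > UINT_MAX)
		return -1;

	*freq = (unsigned int)val;
	return 0;
}
```

      In this sketch "max" with a limit of 15000 yields 15000, "10" yields
      10, and a string such as "fast" is rejected.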
      Suggested-by: Ingo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-4y0tiuws62c64gp4cf0hme0m@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  18. 22 Feb, 2018 1 commit
  19. 08 Jan, 2018 1 commit
    • J
      perf record: Record the first and last sample time in the header · 68588baf
      Jin Yao committed
      In the default 'perf record' configuration, all samples are processed
      to create the HEADER_BUILD_ID table. So it is easy to pick up the
      first and last samples and save their times to the perf file header
      via the function write_sample_time().
      
      Later, at post-processing time, perf report/script will fetch these
      times from the perf file header.
      
      Committer testing:
      
        # perf record -a sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 2.099 MB perf.data (1101 samples) ]
        [root@jouet home]# perf report --header | grep "time of "
        # time of first sample : 22947.909226
        # time of last sample : 22948.910704
        #
        # perf report -D | grep PERF_RECORD_SAMPLE\(
        0 22947909226101 0x20bb68 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa21b1af3 period: 1 addr: 0
        0 22947909229928 0x20bb98 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa200d204 period: 1 addr: 0
        <SNIP>
        3 22948910397351 0x219360 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 28251/28251: 0xffffffffa22071d8 period: 169518 addr: 0
        0 22948910652380 0x20f120 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 198807 addr: 0
        2 22948910704034 0x2172d0 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 88111 addr: 0
        #
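
      The bookkeeping described above can be sketched as follows. This is an
      illustration under stated assumptions, not the patch itself: the
      struct and function names are hypothetical, timestamps are taken to be
      nanoseconds, and the sketch tracks the minimum and maximum timestamp
      so that it does not depend on the order in which samples arrive.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the boundaries, in nanoseconds (0 = unset). */
struct sample_time_range {
	uint64_t first;
	uint64_t last;
};

/* Call once per processed sample to widen the range as needed. */
static void sample_time_update(struct sample_time_range *r, uint64_t t)
{
	assert(r != NULL);

	if (r->first == 0 || t < r->first)
		r->first = t;
	if (t > r->last)
		r->last = t;
}

/* Format a nanosecond timestamp as seconds.microseconds. */
static int sample_time_format(uint64_t ns, char *buf, size_t len)
{
	return snprintf(buf, len, "%" PRIu64 ".%06" PRIu64,
			ns / 1000000000ULL, (ns % 1000000000ULL) / 1000);
}
```

      Feeding the timestamps from the 'perf report -D' output above into
      sample_time_update() leaves first = 22947909226101 and
      last = 22948910704034, which sample_time_format() renders as
      22947.909226 and 22948.910704, matching the header lines.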
      
      Changelog:
      
      v7: Just update the patch description according to Arnaldo's suggestion.
      
      v6: Currently '--buildid-all' is not enabled by default, so walking
          all samples is the default operation. There is no big overhead
          in calculating the timestamp boundary in the process_sample_event
          handler once we already go through all samples, so the timestamp
          boundary calculation is enabled by default when '--buildid-all' is
          not enabled.
      
          If '--buildid-all' is enabled, we create a new option,
          "--timestamp-boundary", for the user to decide whether to enable
          the timestamp boundary calculation.
      
      v5: There is an issue that the sample walking can only work when
          '--buildid-all' is not enabled. So we need to let the walking
          work even if '--buildid-all' is enabled and let the
          processing skip the dso hit marking in this case.
      
          At first, I wanted to provide a new option "--record-time-boundaries".
          After consideration, however, I think a new option is not really
          necessary.
      
      v3: Remove the definitions of first_sample_time and last_sample_time
          from struct record and directly save them in perf_evlist.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512738826-2628-3-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  20. 17 Oct, 2017 1 commit
  21. 13 Sep, 2017 1 commit
  22. 02 Sep, 2017 1 commit
  23. 19 Jul, 2017 1 commit
  24. 04 May, 2017 1 commit
  25. 14 Mar, 2017 1 commit
    • H
      perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info · f3b3614a
      Hari Bathini committed
      Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
      by the kernel when fork, clone, setns or unshare are invoked. And update
      perf-record documentation with the new option to record namespace
      events.
      
      Committer notes:
      
      Combined it with a later patch to allow printing it via 'perf report -D'
      and to be able to test the feature introduced in this patch. Also had to
      move perf_ns__name() here, which was introduced in another later patch.
      
      Also used PRIu64 and PRIx64 to fix the build in some environments wrt:
      
        util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
           ret  += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
                                               ^
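
      For illustration, the portable form of such a format string might look
      like the sketch below. The helper format_ns_entry() is hypothetical;
      only the PRIu64 and PRIx64 macros from <inttypes.h> are standard C.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format one namespace entry as "idx/name: dev/0xino".
 * PRIu64 and PRIx64 expand to the correct conversion for uint64_t whether
 * that type is 'unsigned long' or 'unsigned long long' on the target.
 */
static int format_ns_entry(char *buf, size_t len, unsigned int idx,
			   const char *name, uint64_t dev, uint64_t ino)
{
	assert(buf != NULL && name != NULL);

	return snprintf(buf, len, "%u/%s: %" PRIu64 "/0x%" PRIx64,
			idx, name, dev, ino);
}
```

      Called with idx 0, name "net", dev 3 and ino 0xf0000081, this produces
      "0/net: 3/0xf0000081", the same shape as the entries in the
      'perf report -D' output below.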
      Testing it:
      
        # perf record --namespaces -a
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
        #
        # perf report -D
        <SNIP>
        3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
                      [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                       4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
      
        0x1151e0 [0x30]: event: 9
        .
        . ... raw event: size 48 bytes
        .  0000:  09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00  ......0..q.h....
        .  0010:  a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00  .9...9...(.c....
        .  0020:  03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00  ................
        <SNIP>
              NAMESPACES events:          1
        <SNIP>
        #
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  26. 18 Feb, 2017 1 commit
  27. 12 Jan, 2017 2 commits
  28. 03 Jan, 2017 1 commit
  29. 24 Oct, 2016 1 commit
  30. 29 Sep, 2016 1 commit
  31. 28 Sep, 2016 1 commit
  32. 14 Sep, 2016 1 commit
  33. 03 Aug, 2016 1 commit