- 23 4月, 2020 1 次提交
-
-
由 Stephane Eranian 提交于
To control degree of parallelism of the synthesize_mmap() code which is scanning /proc/PID/task/PID/maps and can be time consuming. Mimic perf top way of handling the option. If not specified will default to 1 thread, i.e. default behavior before this option. On a desktop computer the processing of /proc/PID/task/PID/maps isn't slow enough to warrant parallel processing and the thread creation has some cost - hence the default of 1. On a loaded server with >100 cores it is possible to see synthesis times in the order of seconds and in this case having the option is desirable. As the processing is a synchronization point, it is legitimate to worry if Amdahl's law will apply to this patch. Profiling with this patch in place: https://lore.kernel.org/lkml/20200415054050.31645-4-irogers@google.com/ shows: ... - 32.59% __perf_event__synthesize_threads - 32.54% __event__synthesize_thread + 22.13% perf_event__synthesize_mmap_events + 6.68% perf_event__get_comm_ids.constprop.0 + 1.49% process_synthesized_event + 1.29% __GI___readdir64 + 0.60% __opendir ... That is the processing is 1.49% of execution time and there is plenty to make parallel. This is shown in the benchmark in this patch: https://lore.kernel.org/lkml/20200415054050.31645-2-irogers@google.com/ Computing performance of multi threaded perf event synthesis by synthesizing events on CPU 0: Number of synthesis threads: 1 Average synthesis took: 127729.000 usec (+- 3372.880 usec) Average num. events: 21548.600 (+- 0.306) Average time per event 5.927 usec Number of synthesis threads: 2 Average synthesis took: 88863.500 usec (+- 385.168 usec) Average num. events: 21552.800 (+- 0.327) Average time per event 4.123 usec Number of synthesis threads: 3 Average synthesis took: 83257.400 usec (+- 348.617 usec) Average num. events: 21553.200 (+- 0.327) Average time per event 3.863 usec Number of synthesis threads: 4 Average synthesis took: 75093.000 usec (+- 422.978 usec) Average num. events: 21554.200 (+- 0.200) Average time per event 3.484 usec Number of synthesis threads: 5 Average synthesis took: 64896.600 usec (+- 353.348 usec) Average num. events: 21558.000 (+- 0.000) Average time per event 3.010 usec Number of synthesis threads: 6 Average synthesis took: 59210.200 usec (+- 342.890 usec) Average num. events: 21560.000 (+- 0.000) Average time per event 2.746 usec Number of synthesis threads: 7 Average synthesis took: 54093.900 usec (+- 306.247 usec) Average num. events: 21562.000 (+- 0.000) Average time per event 2.509 usec Number of synthesis threads: 8 Average synthesis took: 48938.700 usec (+- 341.732 usec) Average num. events: 21564.000 (+- 0.000) Average time per event 2.269 usec Where average time per synthesized event goes from 5.927 usec with 1 thread to 2.269 usec with 8. This isn't a linear speed up as not all of synthesize code has been made parallel. If the synthesis time was about 10 seconds then using 8 threads may bring this down to less than 4. Signed-off-by: NStephane Eranian <eranian@google.com> Reviewed-by: NIan Rogers <irogers@google.com> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tony Jones <tonyj@suse.de> Cc: yuzhoujian <yuzhoujian@didichuxing.com> Link: http://lore.kernel.org/lkml/20200422155038.9380-1-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 03 4月, 2020 1 次提交
-
-
由 Namhyung Kim 提交于
The --all-cgroups option is to enable cgroup profiling support. It tells kernel to record CGROUP events in the ring buffer so that perf report can identify task/cgroup association later. [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.042 MB perf.data (558 samples) ] [root@seventh ~]# perf report --stdio -s cgroup_id,cgroup,pid # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 558 of event 'cycles' # Event count (approx.): 458017341 # # Overhead cgroup id (dev/inode) Cgroup Pid:Command # ........ ..................... .......... ............... # 33.15% 4/0xeffffffb /sub 9615:looper0 32.83% 4/0xf00002f5 /sub/cgrp2 9620:looper2 32.79% 4/0xf00002f4 /sub/cgrp1 9619:looper1 0.35% 4/0xf00002f5 /sub/cgrp2 9618:cgtest 0.34% 4/0xf00002f4 /sub/cgrp1 9617:cgtest 0.32% 4/0xeffffffb / 9615:looper0 0.11% 4/0xeffffffb /sub 9617:cgtest 0.10% 4/0xeffffffb /sub 9618:cgtest # # (Tip: Sample related events with: perf record -e '{cycles,instructions}:S') # [root@seventh ~]# Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-8-namhyung@kernel.org Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org [ Extracted the HAVE_FILE_HANDLE from the followup patch ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 26 3月, 2020 1 次提交
-
-
由 Tony Jones 提交于
The method of unwinding for kernel space is defined by the kernel config, not by the value of --call-graph. Improve the documentation to reflect this. Signed-off-by: NTony Jones <tonyj@suse.de> Link: http://lore.kernel.org/lkml/20200325164053.10177-1-tonyj@suse.deSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 11 3月, 2020 1 次提交
-
-
由 Adrian Hunter 提交于
Add references to Intel PT man page in man pages of associated tools. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20200311122034.3697-3-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 22 11月, 2019 2 次提交
-
-
由 Adrian Hunter 提交于
To allow individual events to be selected for AUX area sampling, add aux-sample-size config term. attr.aux_sample_size is updated by auxtrace_parse_sample_options() so that the existing validation will see the value. Any event that has a non-zero aux_sample_size will cause AUX area sampling to be configured, irrespective of the --aux-sample option. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-8-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Add a 'perf record' option '--aux-sample' to request AUX area sampling. AUX area sampling uses an overwriting buffer much like snapshot mode, so adjust the AUX buffer mmapping accordingly. To make it easy to queue samples for decoding, synthesize an ID index. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-7-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 07 11月, 2019 2 次提交
-
-
由 Jiwei Sun 提交于
The patch adds a new option to limit the output file size, then based on it, we can create a wrapper of the perf command that uses the option to avoid exhausting the disk space by the unconscious user. In order to make the perf.data parsable, we just limit the sample data size, since the perf.data consists of many headers and sample data and other data, the actual size of the recorded file will bigger than the setting value. Testing it: # ./perf record -a -g --max-size=10M Couldn't synthesize bpf events. [ perf record: perf size limit reached (10249 KB), stopping session ] [ perf record: Woken up 32 times to write data ] [ perf record: Captured and wrote 10.133 MB perf.data (71964 samples) ] # ls -lh perf.data -rw------- 1 root root 11M Oct 22 14:32 perf.data # ./perf record -a -g --max-size=10K [ perf record: perf size limit reached (10 KB), stopping session ] Couldn't synthesize bpf events. [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 1.546 MB perf.data (69 samples) ] # ls -l perf.data -rw------- 1 root root 1626952 Oct 22 14:36 perf.data Committer notes: Fixed the build in multiple distros by using PRIu64 to print u64 struct members, fixing this: builtin-record.c: In function 'record__write': builtin-record.c:150:5: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=] rec->bytes_written >> 10); ^ CC /tmp/build/pe Signed-off-by: NJiwei Sun <jiwei.sun@windriver.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Danter <richard.danter@windriver.com> Link: http://lore.kernel.org/lkml/20191022080901.3841-1-jiwei.sun@windriver.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Add a new 'perf record' option '--kcore' which will put a copy of /proc/kcore, kallsyms and modules into a perf.data directory. Note, that without the --kcore option, output goes to a file as previously. The tools' -o and -i options work with either a file name or directory name. Example: $ sudo perf record --kcore uname $ sudo tree perf.data perf.data ├── kcore_dir │ ├── kallsyms │ ├── kcore │ └── modules └── data $ sudo perf script -v build id event received for vmlinux: 1eaa285996affce2d74d8e66dcea09a80c9941de build id event received for [vdso]: 8bbaf5dc62a9b644b4d4e4539737e104e4a84541 Samples for 'cycles' event do not have CPU attribute set. Skipping 'cpu' field. Using CPUID GenuineIntel-6-8E-A Using perf.data/kcore_dir/kcore for kernel data Using perf.data/kcore_dir/kallsyms for symbols perf 19058 506778.423729: 1 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) perf 19058 506778.423733: 1 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) perf 19058 506778.423734: 7 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) perf 19058 506778.423736: 117 cycles: ffffffffa2caa54a native_write_msr+0xa (vmlinux) perf 19058 506778.423738: 2092 cycles: ffffffffa2c9b7b0 native_apic_msr_write+0x0 (vmlinux) perf 19058 506778.423740: 37380 cycles: ffffffffa2f121d0 perf_event_addr_filters_exec+0x0 (vmlinux) uname 19058 506778.423751: 582673 cycles: ffffffffa303a407 propagate_protected_usage+0x147 (vmlinux) uname 19058 506778.423892: 2241841 cycles: ffffffffa2cae0c9 unwind_next_frame.part.5+0x79 (vmlinux) uname 19058 506778.424430: 2457397 cycles: ffffffffa3019232 check_memory_region+0x52 (vmlinux) Committer testing: # rm -rf perf.data* # perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ] # ls -l perf.data -rw-------. 1 root root 34772 Oct 21 11:08 perf.data # perf record --kcore uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ] ls[root@quaco ~]# ls -lad perf.data* drwx------. 3 root root 4096 Oct 21 11:08 perf.data -rw-------. 1 root root 34772 Oct 21 11:08 perf.data.old # perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # perf evlist -v -i perf.data/data cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Link: http://lore.kernel.org/lkml/20191004083121.12182-6-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 14 8月, 2019 2 次提交
-
-
由 Adrian Hunter 提交于
Expose the aux_output attribute flag to the user to configure, by adding a config term 'aux-output'. For events that support it, selection of 'aux-output' causes the generation of AUX records instead of event records. This requires that an AUX area event is also provided. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190806084606.4021-7-alexander.shishkin@linux.intel.comSigned-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexander Shishkin 提交于
It is sometimes useful to generate a snapshot when perf record exits; I've been using a wrapper script around the workload that would do a killall -USR2 perf when the workload exits. This patch makes it easier and also works when perf record is attached to a pre-existing task. A new snapshot option 'e' can be specified in -S to enable this behavior: root@elsewhere:~# perf record -e intel_pt// -Se sleep 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.085 MB perf.data ] Co-developed-by: NAdrian Hunter <adrian.hunter@intel.com> Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190806144101.62892-1-alexander.shishkin@linux.intel.com [ Fixed up !HAVE_AUXTRACE_SUPPORT build in builtin-record.c, adding 2 missing __maybe_unused ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 11 6月, 2019 1 次提交
-
-
由 yuzhoujian 提交于
One can just record callchains in the kernel or user space with this new options. We can use it together with "--all-kernel" options. This two options is used just like print_stack(sys) or print_ustack(usr) for systemtap. Shown below is the usage of this new option combined with "--all-kernel" options: 1. Configure all used events to run in kernel space and just collect kernel callchains. $ perf record -a -g --all-kernel --kernel-callchains 2. Configure all used events to run in kernel space and just collect user callchains. $ perf record -a -g --all-kernel --user-callchains Committer notes: Improved documentation to state that asking for kernel callchains really is asking for excluding user callchains, and vice versa. Further mentioned that using both won't get both, but nothing, as both will be excluded. Signed-off-by: Nyuzhoujian <yuzhoujian@didichuxing.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1559222962-22891-1-git-send-email-ufo19890607@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 16 5月, 2019 2 次提交
-
-
由 Kan Liang 提交于
The available registers for --int-regs and --user-regs may be different, e.g. XMM registers. Split parse_regs into two dedicated functions for --int-regs and --user-regs respectively. Modify the warning message. "--user-regs=?" should be applied to show the available registers for --user-regs. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Tested-by: NRavi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1557865174-56264-1-git-send-email-kan.liang@linux.intel.com [ Changed docs as suggested by Ravi and agreed by Kan ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
Implemented -z,--compression_level[=<n>] option that enables compression of mmaped kernel data buffers content in runtime during perf record mode collection. Default option value is 1 (fastest compression). Compression overhead has been measured for serial and AIO streaming when profiling matrix multiplication workload: ------------------------------------------------------------- | SERIAL | AIO-1 | ----------------------------------------------------------------| |-z | OVH(x) | ratio(x) size(MiB) | OVH(x) | ratio(x) size(MiB) | |---------------------------------------------------------------| | 0 | 1,00 | 1,000 179,424 | 1,00 | 1,000 187,527 | | 1 | 1,04 | 8,427 181,148 | 1,01 | 8,474 188,562 | | 2 | 1,07 | 8,055 186,953 | 1,03 | 7,912 191,773 | | 3 | 1,04 | 8,283 181,908 | 1,03 | 8,220 191,078 | | 5 | 1,09 | 8,101 187,705 | 1,05 | 7,780 190,065 | | 8 | 1,05 | 9,217 179,191 | 1,12 | 6,111 193,024 | ----------------------------------------------------------------- OVH = (Execution time with -z N) / (Execution time with -z 0) ratio - compression ratio size - number of bytes that was compressed size ~= trace size x ratio Committer notes: Testing it I noticed that it failed to disable build id processing when compression is enabled, and as we'd have to uncompress everything to look for the PERF_RECORD_{MMAP,SAMPLE,etc} to figure out which build ids to read from DSOs, we better disable build id processing when compression is enabled, logging with pr_debug() when doing so: Original patch: # perf record -z2 ^C[ perf record: Woken up 1 times to write data ] 0x1746e0 [0x76]: failed to process type: 81 [Invalid argument] [ perf record: Captured and wrote 1.568 MB perf.data, compressed (original 0.452 MB, ratio is 3.995) ] # After auto-disabling build id processing when compression is enabled: $ perf record -z2 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.292) ] $ perf record -v -z2 sleep 1 Compression enabled, disabling build id collection at the end of the session. <SNIP extra -v pr_debug() messages> [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.305) ] $ Also, with parts of the patch originally after this one moved to just before this one we get: $ perf record -z2 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.371) ] $ perf report -D | grep COMPRESS 0 0x1b8 [0x155]: PERF_RECORD_COMPRESSED: unhandled! 0 0x30d [0x80]: PERF_RECORD_COMPRESSED: unhandled! COMPRESSED events: 2 COMPRESSED events: 0 $ I.e. when faced with PERF_RECORD_COMPRESSED that we still have no code to process, we just show it as not being handled, skip them and continue, while before we had: $ perf report -D | grep COMPRESS 0x1b8 [0x169]: failed to process type: 81 [Invalid argument] Error: failed to process sample 0 0x1b8 [0x169]: PERF_RECORD_COMPRESSED $ Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/9ff06518-ae63-a908-e44d-5d9e56dd66d9@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 02 4月, 2019 1 次提交
-
-
由 Alexey Budankov 提交于
Implement a --mmap-flush option that specifies minimal number of bytes that is extracted from mmaped kernel buffer to store into a trace. The default option value is 1 byte what means every time trace writing thread finds some new data in the mmaped buffer the data is extracted, possibly compressed and written to a trace. $ tools/perf/perf record --mmap-flush 1024 -e cycles -- matrix.gcc $ tools/perf/perf record --aio --mmap-flush 1K -e cycles -- matrix.gcc The option is independent from -z setting, doesn't vary with compression level and can serve two purposes. The first purpose is to increase the compression ratio of a trace data. Larger data chunks are compressed more effectively so the implemented option allows specifying data chunk size to compress. Also at some cases executing more write syscalls with smaller data size can take longer than executing less write syscalls with bigger data size due to syscall overhead so extracting bigger data chunks specified by the option value could additionally decrease runtime overhead. The second purpose is to avoid self monitoring live-lock issue in system wide (-a) profiling mode. Profiling in system wide mode with compression (-a -z) can additionally induce data into the kernel buffers along with the data from monitored processes. If performance data rate and volume from the monitored processes is high then trace streaming and compression activity in the tool is also high. High tool process activity can lead to subtle live-lock effect when compression of single new byte from some of mmaped kernel buffer leads to generation of the next single byte at some mmaped buffer. So perf tool process ends up in endless self monitoring. Implemented synch parameter is the mean to force data move independently from the specified flush threshold value. Despite the provided flush value the tool needs capability to unconditionally drain memory buffers, at least in the end of the collection. Committer testing: Running with the default value, i.e. as soon as there is something to read go on consuming, we first write the synthesized events, small chunks of about 128 bytes: # perf trace -m 2048 --call-graph dwarf -e write -- perf record <SNIP> 101.142 ( 0.004 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x210db60, count: 120) = 120 __libc_write (/usr/lib64/libpthread-2.28.so) ion (/home/acme/bin/perf) record__write (inlined) process_synthesized_event (/home/acme/bin/perf) perf_tool__process_synth_event (inlined) perf_event__synthesize_mmap_events (/home/acme/bin/perf) Then we move to reading the mmap buffers consuming the events put there by the kernel perf infrastructure: 107.561 ( 0.005 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc02000, count: 336) = 336 __libc_write (/usr/lib64/libpthread-2.28.so) ion (/home/acme/bin/perf) record__write (inlined) record__pushfn (/home/acme/bin/perf) perf_mmap__push (/home/acme/bin/perf) record__mmap_read_evlist (inlined) record__mmap_read_all (inlined) __cmd_record (inlined) cmd_record (/home/acme/bin/perf) 12919.953 ( 0.136 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc83150, count: 184984) = 184984 <SNIP same backtrace as in the 107.561 timestamp> 12920.094 ( 0.155 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc02150, count: 261816) = 261816 <SNIP same backtrace as in the 107.561 timestamp> 12920.253 ( 0.093 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befb81120, count: 170832) = 170832 <SNIP same backtrace as in the 107.561 timestamp> If we limit it to write only when more than 16MB are available for reading, it throttles that to a quarter of the --mmap-pages set for 'perf record', which by default get to 528384 bytes, found out using 'record -v': mmap flush: 132096 mmap size 528384B With that in place all the writes coming from record__mmap_read_evlist(), i.e. from the mmap buffers setup by the kernel perf infrastructure were at least 132096 bytes long. Trying with a bigger mmap size: perf trace -e write perf record -v -m 2048 --mmap-flush 16M 74982.928 ( 2.471 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff94a6cc000, count: 3580888) = 3580888 74985.406 ( 2.353 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff949ecb000, count: 3453256) = 3453256 74987.764 ( 2.629 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9496ca000, count: 3859232) = 3859232 74990.399 ( 2.341 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff948ec9000, count: 3769032) = 3769032 74992.744 ( 2.064 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9486c8000, count: 3310520) = 3310520 74994.814 ( 2.619 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff947ec7000, count: 4194688) = 4194688 74997.439 ( 2.787 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9476c6000, count: 4029760) = 4029760 Was again limited to a quarter of the mmap size: mmap flush: 2098176 mmap size 8392704B A warning about that would be good to have but can be added later, something like: "max flush is a quarter of the mmap size, if wanting to bump the mmap flush further, bump the mmap size as well using -m/--mmap-pages" Also rename the 'sync' parameters to 'synch' to keep tools/perf building with older glibcs: cc1: warnings being treated as errors builtin-record.c: In function 'record__mmap_read_evlist': builtin-record.c:775: warning: declaration of 'sync' shadows a global declaration /usr/include/unistd.h:933: warning: shadowed declaration is here builtin-record.c: In function 'record__mmap_read_all': builtin-record.c:856: warning: declaration of 'sync' shadows a global declaration /usr/include/unistd.h:933: warning: shadowed declaration is here Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/f6600d72-ecfa-2eb7-7e51-f6954547d500@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 19 3月, 2019 1 次提交
-
-
由 Andi Kleen 提交于
When doing long term recording and waiting for some event to snapshot on, we often only care about the last minute or so. The --switch-output command line option supports rotating the perf.data file when the size exceeds a threshold. But the disk would still be filled with unnecessary old files. Add a new option to only keep a number of rotated files, so that the disk space usage can be limited. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-3-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-y5u2lik0ragt4vlktz6qc9ks@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 11 2月, 2019 1 次提交
-
-
由 Alexey Budankov 提交于
Implement --affinity=node|cpu option for the record mode defaulting to system affinity mask bouncing. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/083f5422-ece9-10dd-8305-bf59c860f10f@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 06 2月, 2019 1 次提交
-
-
由 Changbin Du 提交于
Add documentation for how to pass a BPF program as a perf event. Signed-off-by: NChangbin Du <changbin.du@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190201134651.12373-1-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 18 12月, 2018 2 次提交
-
-
由 Alexey Budankov 提交于
Multi AIO trace writing allows caching more kernel data into userspace memory postponing trace writing for the sake of overall profiling data thruput increase. It could be seen as kernel data buffer extension into userspace memory. With an --aio option value different from 0 (default value is 1) the tool has capability to cache more and more data into user space along with delegating spill to AIO. That allows avoiding to suspend at record__aio_sync() between calls of record__mmap_read_evlist() and increases profiling data thruput at the cost of userspace memory. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@redhat.com> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/050bb053-e7f3-aa83-fde7-f27ff90be7f6@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
The trace file offset is read once before mmaps iterating loop and written back after all performance data is enqueued for aio writing. The trace file offset is incremented linearly after every successful aio write operation. record__aio_sync() blocks till completion of the started AIO operation and then proceeds. record__aio_mmap_read_sync() implements a barrier for all incomplete aio write requests. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@redhat.com> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ce2d45e9-d236-871c-7c8f-1bed2d37e8ac@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 16 7月, 2018 1 次提交
-
-
由 Tobias Tefke 提交于
Some of the comments in the perf events code use articles incorrectly, using 'a' for words beginning with a vowel sound, where 'an' should be used. Signed-off-by: NTobias Tefke <tobias.tefke@tutanota.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: alexander.shishkin@linux.intel.com Cc: jolsa@redhat.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/20180709105715.22938-1-tobias.tefke@tutanota.com [ Fix a few more perf related 'a event' typo fixes from all around the kernel and tooling tree. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 06 6月, 2018 1 次提交
-
-
由 Alexey Budankov 提交于
Enable complex event names containing [.:=,] symbols to be encoded into Perf trace using name= modifier e.g. like this: perf record -e cpu/name=\'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM\',\ period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex Below is how it looks like in the report output. Please note explicit escaped quoting at cmdline string in the header so that thestring can be directly reused for another collection in shell: perf report --header # ======== ... # cmdline : /root/abudanko/kernel/tip/tools/perf/perf record -v -e cpu/name=\'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM\',period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex # event : name = OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM, , type = 4, size = 112, config = 0x100003c, { sample_period, sample_freq } = 3500000, sample_type = IP|TID|TIME, disabled = 1, inh ... # ======== # # # Total Lost Samples: 0 # # Samples: 24K of event 'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM' # Event count (approx.): 86492000000 # # Overhead Command Shared Object Symbol # ........ ....... ................ .............................................. # 14.75% futex [kernel.vmlinux] [k] __entry_trampoline_start ... perf stat -e cpu/name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\',period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex 10000000 process context switches in 16678890291ns (1667.9ns/ctxsw) Performance counter stats for './futex': 88,095,770,571 CPU_CLK_UNHALTED.THREAD:cmask=0x1 16.679542407 seconds time elapsed Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Acked-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/c194b060-761d-0d50-3b21-bb4ed680002d@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 05 3月, 2018 2 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
# perf record -F 200000 sleep 1 warning: Maximum frequency rate (15,000 Hz) exceeded, throttling from 200,000 Hz to 15,000 Hz. The limit can be raised via /proc/sys/kernel/perf_event_max_sample_rate. The kernel will lower it when perf's interrupts take too long. Use --strict-freq to disable this throttling, refusing to record. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (15 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 For those wanting that it fails if the desired frequency can't be used: # perf record --strict-freq -F 200000 sleep 1 error: Maximum frequency rate (15,000 Hz) exceeded. Please use -F freq option with a lower value or consider tweaking /proc/sys/kernel/perf_event_max_sample_rate. # Suggested-by: NIngo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-oyebruc44nlja499nqkr1nzn@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Add the handy '-F max' shortcut to reading and using the kernel.perf_event_max_sample_rate value as the user supplied sampling frequency: # perf record -F max sleep 1 info: Using a maximum frequency rate of 15,000 Hz [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (14 samples) ] # sysctl kernel.perf_event_max_sample_rate kernel.perf_event_max_sample_rate = 15000 # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # perf record -F 10 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (4 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 10, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # Suggested-by: NIngo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-4y0tiuws62c64gp4cf0hme0m@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 22 2月, 2018 1 次提交
-
-
由 weiping zhang 提交于
When using -G with one cgroup and -e with multiple events, only the first event gets the correct cgroup setting, all events from the second onwards will track system-wide events. If the user wants to track multiple events for a specific cgroup, the user must give parameters like the following: $ perf stat -e e1 -e e2 -e e3 -G test,test,test This patch simplify this case, just type one cgroup: $ perf stat -e e1 -e e2 -e e3 -G test $ mkdir -p /sys/fs/cgroup/perf_event/empty_cgroup $ perf stat -e cycles -e cache-misses -a -I 1000 -G empty_cgroup Before: 1.001007226 <not counted> cycles empty_cgroup 1.001007226 7,506 cache-misses After: 1.000834097 <not counted> cycles empty_cgroup 1.000834097 <not counted> cache-misses empty_cgroup Signed-off-by: Nweiping zhang <zhangweiping@didichuxing.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180129154805.GA6284@localhost.didichuxing.com [ Improved the doc text a bit, providing an example for cgroup + system wide counting ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 08 1月, 2018 1 次提交
-
-
由 Jin Yao 提交于
In the default 'perf record' configuration, all samples are processed, to create the HEADER_BUILD_ID table. So it's very easy to get the first/last samples and save the time to perf file header via the function write_sample_time(). Later, at post processing time, perf report/script will fetch the time from perf file header. Committer testing: # perf record -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.099 MB perf.data (1101 samples) ] [root@jouet home]# perf report --header | grep "time of " # time of first sample : 22947.909226 # time of last sample : 22948.910704 # # perf report -D | grep PERF_RECORD_SAMPLE\( 0 22947909226101 0x20bb68 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa21b1af3 period: 1 addr: 0 0 22947909229928 0x20bb98 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa200d204 period: 1 addr: 0 <SNIP> 3 22948910397351 0x219360 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 28251/28251: 0xffffffffa22071d8 period: 169518 addr: 0 0 22948910652380 0x20f120 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 198807 addr: 0 2 22948910704034 0x2172d0 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 88111 addr: 0 # Changelog: v7: Just update the patch description according to Arnaldo's suggestion. v6: Currently '--buildid-all' is not enabled at default. So the walking on all samples is the default operation. There is no big overhead to calculate the timestamp boundary in process_sample_event handler once we already go through all samples. So the timestamp boundary calculation is enabled by default when '--buildid-all' is not enabled. While if '--buildid-all' is enabled, we creates a new option "--timestamp-boundary" for user to decide if it enables the timestamp boundary calculation. v5: There is an issue that the sample walking can only work when '--buildid-all' is not enabled. So we need to let the walking be able to work even if '--buildid-all' is enabled and let the processing skips the dso hit marking for this case. At first, I want to provide a new option "--record-time-boundaries". While after consideration, I think a new option is not very necessary. v3: Remove the definitions of first_sample_time and last_sample_time from struct record and directly save them in perf_evlist. Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512738826-2628-3-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 17 10月, 2017 1 次提交
-
-
由 Taeung Song 提交于
'perf record' had a '-l' option that meant "scale counter values" a very long time ago, but it currently belongs to 'perf stat' as '-c'. So remove it. I found this problem in the below case. $ perf record -e cycles -l sleep 3 Error: unknown switch `l Signed-off-by: NTaeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1507907412-19813-1-git-send-email-treeze.taeung@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 13 9月, 2017 1 次提交
-
-
由 Andi Kleen 提交于
USER_REGS can currently only collected implicitely with call graph recording. Sometimes it is useful to see them separately, and filter them. Add a new --user-regs option to record that is similar to --intr-regs, but acts on user regs. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170905170029.19722-1-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 02 9月, 2017 1 次提交
-
-
由 Kan Liang 提交于
Support new sample type PERF_SAMPLE_PHYS_ADDR for physical address. Add new option --phys-data to record sample physical address. Signed-off-by: NKan Liang <kan.liang@intel.com> Tested-by: NJiri Olsa <jolsa@redhat.com> Acked-by: NStephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1504026672-7304-2-git-send-email-kan.liang@intel.com [ Added missing printing in evsel.c patch sent by Jiri Olsa ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 19 7月, 2017 1 次提交
-
-
由 Jin Yao 提交于
The option indicates the kernel to save branch type during sampling. One example: perf record -g --branch-filter any,save_type <command> Signed-off-by: NYao Jin <yao.jin@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1500379995-6449-4-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 04 5月, 2017 1 次提交
-
-
由 Kim Phillips 提交于
Mostly in the documentation. Signed-off-by: NKim Phillips <kim.phillips@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170503131350.cebeecd8bd0f2968417626ab@arm.com [ Fix spelling of "parameter" in one of the spell-checked lines ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 14 3月, 2017 1 次提交
-
-
由 Hari Bathini 提交于
Introduce a new option to record PERF_RECORD_NAMESPACES events emitted by the kernel when fork, clone, setns or unshare are invoked. And update perf-record documentation with the new option to record namespace events. Committer notes: Combined it with a later patch to allow printing it via 'perf report -D' and be able to test the feature introduced in this patch. Had to move here also perf_ns__name(), that was introduced in another later patch. Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt: util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=] ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx ^ Testing it: # perf record --namespaces -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ] # # perf report -D <SNIP> 3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7 [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc, 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb] 0x1151e0 [0x30]: event: 9 . . ... raw event: size 48 bytes . 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h.... . 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c.... . 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................ <SNIP> NAMESPACES events: 1 <SNIP> # Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 18 2月, 2017 1 次提交
-
-
由 Jiri Olsa 提交于
Running 'perf record' with no target (-a, -p, -t, etc) will now collect system wide data. Commiter notes: Testing it: [root@jouet ~]# perf record ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.351 MB perf.data (366 samples) ] # is equivalent to: # perf record -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.411 MB perf.data (978 samples) ] # Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170217170018.GA15389@kravaSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 12 1月, 2017 2 次提交
-
-
由 Jiri Olsa 提交于
It's now possible to specify the threshold time for perf.data like: $ perf record --switch-output=30s ... Once it's reached, the current data are dumped in to the perf.data.<timestamp> file and session does on. $ perf record --switch-output=30s ... [ perf record: dump data: Woken up 44 times ] [ perf record: Dump perf.data.2017010213043746 ] ... The time is expected to be a number with appended unit character - s/m/h/d. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NWang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483955520-29063-7-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
It's now possible to specify the threshold size for perf.data like: $ perf record --switch-output=2G ... Once it's reached, the current data are dumped in to the perf.data.<timestamp> file and session does on. $ perf record --switch-output=2G ... [ perf record: dump data: Woken up 7244 times ] [ perf record: Dump perf.data.2017010214093746 ] ... The size is expected to be a number with appended unit character - B/K/M/G. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NWang Nan <wangnan0@huawei.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483955520-29063-5-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 03 1月, 2017 1 次提交
-
-
由 Jiri Olsa 提交于
There's no --signal-trigger option, also adding the code comment into record man page. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NWang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483431600-19887-4-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 10月, 2016 1 次提交
-
-
由 Andi Kleen 提交于
- Some editing (params -> parameters) - Point to the now more complete list of parameters in the perf list manpage. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1476381433-22959-1-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 29 9月, 2016 1 次提交
-
-
由 Adrian Hunter 提交于
Symbols come from either the DSO or /proc/kallsyms for the kernel. Details of the functionality can be found in Documentation/perf-record.txt. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Link: http://lkml.kernel.org/r/1474641528-18776-8-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 28 9月, 2016 1 次提交
-
-
由 Adrian Hunter 提交于
Change '/sys/bus/event_sources' to the correct path which is '/sys/bus/event_source'. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Link: http://lkml.kernel.org/r/1474641528-18776-2-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 14 9月, 2016 1 次提交
-
-
由 Mathieu Poirier 提交于
This patch adds PMU driver specific configuration to the parser infrastructure by preceding any term with the '@' letter. As such doing something like: perf record -e some_event/@cfg1,@cfg2=config/ ... will see 'cfg1' and 'cfg2=config' being added to the list of evsel config terms. Token 'cfg1' and 'cfg2=config' are not processed in user space and are meant to be interpreted by the PMU driver. First the lexer/parser are supplemented with the required definitions to recognise the driver specific configuration. From there they are simply added to the list of event terms. The bulk of the work is done in function "parse_events_add_pmu()" where driver config event terms are added to a new list of driver config terms, which in turn spliced with the event's new driver configuration list. Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1473179837-3293-4-git-send-email-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 03 8月, 2016 1 次提交
-
-
由 Jiri Olsa 提交于
Adding --sample-cpu option to be able to explicitly enable CPU sample type. Currently it's only enable implicitly in case the target is cpu related. It will be useful for following c2c record tool. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1470074555-24889-8-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-