- 25 1月, 2018 9 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
The bpf__setup_stdout() function uses that evlist argument, remove the misleading __maybe_unused attribute. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-7vbhhzbd33nvdm7l35gdfryt@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mathieu Poirier 提交于
Once decoded from trace packets information on trace range needs to be communicated to the perf synthesis infrastructure so that it is available to the perf tools built-in rendering tools and scripts. Co-authored-by: NTor Jeremiassen <tor@ti.com> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1516211539-5166-10-git-send-email-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mathieu Poirier 提交于
This patch adds support for complete packet decoding, allowing traces collected during a trace session to be decoder from the "report" infrastructure. Co-authored-by: NTor Jeremiassen <tor@ti.com> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1516211539-5166-9-git-send-email-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mathieu Poirier 提交于
Add functionatlity to setup trace queues so that traces associated with CoreSight auxtrace events found in the perf.data file can be classified properly. The decoder and memory callback associated with each queue are then used to decode the traces that have been assigned to that queue. Co-authored-by: NTor Jeremiassen <tor@ti.com> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1516211539-5166-8-git-send-email-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mathieu Poirier 提交于
This patch adds functions to communicate with the openCSD trace decoder, more specifically to access program memory, fetch trace packets and reset the decoder. Co-authored-by: NTor Jeremiassen <tor@ti.com> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1516211539-5166-7-git-send-email-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mathieu Poirier 提交于
Adding functionality to create a CoreSight trace decoder capable of decoding trace data pushed by a client application. Co-authored-by: NTor Jeremiassen <tor@ti.com> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1516211539-5166-6-git-send-email-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mathieu Poirier 提交于
This patch adds the required interface to the openCSD library to support dumping CoreSight trace packet using the "report --dump" command. The information conveyed is related to the type of packets gathered by a trace session rather than full decoding. Co-authored-by: NTor Jeremiassen <tor@ti.com> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1516211539-5166-5-git-send-email-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Tor Jeremiassen 提交于
The auxtrace_info section contains metadata that describes the number of trace capable CPUs, their ETM version and trace configuration, including trace id values. This information is required by the trace decoder in order to properly decode the compressed trace packets. This patch adds code to read and parse this metadata, and store it for use in configuring instances of the cs-etm trace decoder. Co-authored-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NTor Jeremiassen <tor@ti.com> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1516211539-5166-4-git-send-email-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mathieu Poirier 提交于
This patch adds the entry point for CoreSight trace decoding, serving as a jumping board for furhter expansions. Co-authored-by: NTor Jeremiassen <tor@ti.com> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1516211539-5166-3-git-send-email-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 23 1月, 2018 4 次提交
-
-
由 Hendrik Brueckner 提交于
Change the Makefile and build process to no longer require audit-libs interfaces when the architecture provides system call tables. Committer notes: Its not enough to hook into the NO_LIBAUDIT makefile block, we need to define a CONFIG_TRACE that gets selected by both architectures generating the syscall tables from the kernel headers and from detecting the availability of libaudit. With that in place we will not link against libaudit even if the necessary files are available for that, in fact we will not even try to detect its availability, speeding up a bit the feature detection phase. Signed-off-by: NHendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-by: NThomas Richter <tmricht@linux.vnet.ibm.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: linux-s390@vger.kernel.org LPU-Reference: 1516352177-11106-6-git-send-email-brueckner@linux.vnet.ibm.com Link: https://lkml.kernel.org/n/tip-j68lub6ipm8apvy52vd3l4cm@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mathieu Poirier 提交于
Commit (93d10af2 perf tools: Optimize sample parsing for ordered events) breaks intelPT trace decoding by invariably returning an error if the event type isn't a PERF_SAMPLE_TIME. With this patch the timestamp is initialised and processing is allowed to continue if the error returned by function perf_evlist__parse_sample_timestamp() is not a fault. Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Acked-by: NAdrian Hunter <adrian.hunter@intel.com> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 93d10af2 ("perf tools: Optimize sample parsing for ordered events") Link: http://lkml.kernel.org/r/1515616312-27645-1-git-send-email-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang YanQing 提交于
I've meet a strange behavior with these commands on my gentoo box: 1: perf kmem record 2: CTRL-C to stop 1 3: perf report 4: "Enter", "Enter", "Run scripts for all samples", "event_analyzing_sample". Then 'perf report' says: " No kallsyms or vmlinux with build-id xxxx was found /lib/modules/4.10.0+/build/vmlinux with build id xxxx not found, continuing without symbols ". It is strange because I am sure /lib/modules/4.10.0+/build/vmlinux is right for perf.data. After digging, I found out the reason is that "perf report" generates many open fds, then "script_browse" uses popen to run "perf script" which run out of open files. The gentoo box has a small default value for "max open files", 1024. Yes, "ulimit -n " with a bigger number could fix it, but I think that using O_CLOEXEC in do_open is a better way. Signed-off-by: NWang YanQing <udknight@gmail.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180115050448.GA20759@udknight [ Make sure O_CLOEXEC is available in old systems by adding a patch just before this one, to keep this bisectable in such systems ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
To be more generally available and get the build on centos:5 to work after we use O_CLOEXEC in the next patch, in the util/dso.c file. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Wang YanQing <udknight@gmail.com> Link: https://lkml.kernel.org/n/tip-vsjbiydh15pfqomxw1kx64ex@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 19 1月, 2018 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
When clang is not linked with 'perf' we should just add a debug message about that before doing the fallback to calling the external compiler. I.e. just the "-95" warning below gets turned into a debug message: # cat sys_enter_open.c #include "bpf.h" SEC("syscalls:sys_enter_open") int func(void *ctx) { struct { char *ptr; char path[256]; } filename = { .ptr = *((char **)(ctx + 16)), }; int len = bpf_probe_read_str(filename.path, sizeof(filename.path), filename.ptr); if (len > 0) { if (len == 1) perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, len + sizeof(filename.ptr)); else if (len < 256) perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, len + sizeof(filename.ptr)); } return 0; } # trace -e open,sys_enter_open.c bpf: builtin compilation failed: -95, try external compiler 0.000 ( ): __bpf_stdout__:@......./proc/self/task/11160/comm..) 0.014 ( 0.116 ms): qemu-system-x8/6721 open(filename: /proc/self/task/11160/comm, flags: RDWR) = 91 2335.411 ( ): __bpf_stdout__:FB..~.../etc/resolv.conf....) 2335.421 ( 0.030 ms): chronyd/883 open(filename: /etc/resolv.conf, flags: CLOEXEC) = 5 ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-z5aak9oay448ffj37giz94yr@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 18 1月, 2018 4 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
So that we can get it working for TUI, where using just pr_err() would end up making the message emitted to stderr to be erased by the TUI exit routine restoring the terminal to its previous state. Now we can see that trying to use a tracepoint field as one of the --field entries isn't working: # perf top --stdio --no-children -e syscalls:sys_enter_write --fields pid,sym,count Error: Unknown --fields key: `count' Usage: perf top [<options>] --fields <key[,keys...]> output field(s): overhead, period, sample plus all of sort keys # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-usy9hhy7umdd4bbblkn63t8w@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
There is never a need to synthesize a 'swapped' sample, so all callers to perf_event__synthesize_sample() pass 'false' as the value to 'swapped'. So get rid of the unused 'swapped' parameter. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Acked-by: NJiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1516108492-21401-4-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
PERF_SAMPLE_CPU contains the cpu number in the first 4 bytes and the second 4 bytes are reserved. Ensure the reserved bytes are zero in perf_event__synthesize_sample(). Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Acked-by: NJiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1516108492-21401-3-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Both 'perf inject' and internal tools consume cpu endian samples, so there is never a need to do any swapping when synthesizing samples. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Acked-by: NJiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1516108492-21401-2-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 17 1月, 2018 7 次提交
-
-
由 Jin Yao 提交于
Previously we use a magic number 10 to limit the number of time slices. It's not very good. This patch creates a new function perf_time__range_alloc() to allocate time slices buffer. The number of buffer entries is determined by the number of comma in string but at least it will allocate one entry even if no comma is found. Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Suggested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: NJiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1515596433-24653-7-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
Previously, the time percent slice needs an index to specify which one the user wants. It may be easier to use if the index can be omitted. So with this patch, for example, perf report --stdio --time 10%/1 should be equivalent to perf report --stdio --time 10% Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Suggested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: NJiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1515596433-24653-5-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
The command line like 'perf report --stdio --time 1abc%/1' could be accepted by perf. It looks not very good. This patch uses strtod() to replace original atof() and check the entire string. Now for the same command line, it would return error message "Invalid time string". root@skl:/tmp# perf report --stdio --time 1abc%/1 Invalid time string Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1515596433-24653-4-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
When we use a global DWARF setting as in: perf record --call-graph dwarf According to 5c0cf224 ("perf record: Store data mmaps for dwarf unwind") we need to set up some extra perf_event_attr bits. But when we instead do a per event dwarf setting: perf record -e cycles/call-graph=dwarf/ This was not being done, make them equivalent. This didn't produce any output changes in my tests while fixing up loose ends in the per-event settings, I found it just by comparing the perf_event_attr fields trying to find an explanation for those problems. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Noel Grandin <noelgrandin@gmail.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-6476r53h2o38skbs9qa4ust4@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
When setting up DWARF callchains on specific events, without using 'record' or 'trace' --call-graph, but instead doing it like: perf trace -e cycles/call-graph=dwarf/ The unwind__prepare_access() call in thread__insert_map() when we process PERF_RECORD_MMAP(2) metadata events were not being performed, precluding us from using per-event DWARF callchains, handling them just when we asked for all events to be DWARF, using "--call-graph dwarf". We do it in the PERF_RECORD_MMAP because we have to look at one of the executable maps to figure out the executable type (64-bit, 32-bit) of the DSO laid out in that mmap. Also to look at the architecture where the perf.data file was recorded. All this probably should be deferred to when we process a sample for some thread that has callchains, so that we do this processing only for the threads with samples, not for all of them. For now, fix using DWARF on specific events. Before: # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.048 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.048/0.048/0.048/0.000 ms 0.000 probe_libc:inet_pton:(7fe9597bb350)) Problem processing probe_libc:inet_pton callchain, skipping... # After: # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.060 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.060/0.060/0.060/0.000 ms 0.000 probe_libc:inet_pton:(7fd4aa930350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffaa804e51af3f] (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) [0xffffaa804e51b379] (/usr/bin/ping) # # perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.057 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.057/0.057/0.057/0.000 ms 0.000 probe_libc:inet_pton:(7f9363b9e350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffa9e8a14e0f3f] (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) [0xffffa9e8a14e1379] (/usr/bin/ping) # # perf trace --call-graph=fp --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.077 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.077/0.077/0.077/0.000 ms 0.000 probe_libc:inet_pton:(7f4947e1c350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffaa716d88ef3f] (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) [0xffffaa716d88f379] (/usr/bin/ping) # # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=fp/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.078 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.078/0.078/0.078/0.000 ms 0.000 probe_libc:inet_pton:(7fa157696350)) __GI___inet_pton (/usr/lib64/libc-2.26.so) getaddrinfo (/usr/lib64/libc-2.26.so) [0xffffa9ba39c74f40] (/usr/bin/ping) # Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/r/20180116182650.GE16107@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
When setting the "dwarf" unwinder for a specific event and not specifying the max-stack, the attr.sample_max_stack ended up using an uninitialized callchain_param.max_stack, fix it by using designated initializers for that callchain_param variable, zeroing all non explicitely initialized struct members. Here is what happened: # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 callchain: type DWARF callchain: stack dump size 8192 perf_event_attr: type 2 size 112 config 0x730 { sample_period, sample_freq } 1 sample_type IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC exclude_callchain_user 1 { wakeup_events, wakeup_watermark } 1 sample_regs_user 0xff0fff sample_stack_user 8192 sample_max_stack 50656 sys_perf_event_open failed, error -75 Value too large for defined data type # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 callchain: type DWARF callchain: stack dump size 8192 perf_event_attr: type 2 size 112 config 0x730 sample_type IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC exclude_callchain_user 1 sample_regs_user 0xff0fff sample_stack_user 8192 sample_max_stack 30448 sys_perf_event_open failed, error -75 Value too large for defined data type # Now the attr.sample_max_stack is set to zero and the above works as expected: # perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms 0.000 probe_libc:inet_pton:(7feb7a998350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffaa39b6108f3f] (/usr/bin/ping) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-is9tramondqa9jlxxsgcm9iz@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kim Phillips 提交于
'perf record' and 'perf report --dump-raw-trace' supported in this release. Example usage: # perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero of=/dev/null count=10000 # perf report --dump-raw-trace Note that the perf.data file is portable, so the report can be run on another architecture host if necessary. Output will contain raw SPE data and its textual representation, such as: 0x5c8 [0x30]: PERF_RECORD_AUXTRACE size: 0x200000 offset: 0 ref: 0x1891ad0e idx: 1 tid: 2227 cpu: 1 . . ... ARM SPE data: size 2097152 bytes . 00000000: 49 00 LD . 00000002: b2 c0 3b 29 0f 00 00 ff ff VA 0xffff00000f293bc0 . 0000000b: b3 c0 eb 24 fb 00 00 00 80 PA 0xfb24ebc0 ns=1 . 00000014: 9a 00 00 LAT 0 XLAT . 00000017: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS . 00000019: b0 00 c4 15 08 00 00 ff ff PC 0xff00000815c400 el3 ns=1 . 00000022: 98 00 00 LAT 0 TOT . 00000025: 71 36 6c 21 2c 09 00 00 00 TS 39395093558 . 0000002e: 49 00 LD . 00000030: b2 80 3c 29 0f 00 00 ff ff VA 0xffff00000f293c80 . 00000039: b3 80 ec 24 fb 00 00 00 80 PA 0xfb24ec80 ns=1 . 00000042: 9a 00 00 LAT 0 XLAT . 00000045: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS . 00000047: b0 f4 11 16 08 00 00 ff ff PC 0xff0000081611f4 el3 ns=1 . 00000050: 98 00 00 LAT 0 TOT . 00000053: 71 36 6c 21 2c 09 00 00 00 TS 39395093558 . 0000005c: 48 00 INSN-OTHER . 0000005e: 42 02 EV RETIRED . 00000060: b0 2c ef 7f 08 00 00 ff ff PC 0xff0000087fef2c el3 ns=1 . 00000069: 98 00 00 LAT 0 TOT . 0000006c: 71 d1 6f 21 2c 09 00 00 00 TS 39395094481 ... Other release notes: - applies to acme's perf/{core,urgent} branches, likely elsewhere - Report is self-contained within the tool. Record requires enabling the kernel SPE driver by setting CONFIG_ARM_SPE_PMU. - The intel-bts implementation was used as a starting point; its min/default/max buffer sizes and power of 2 pages granularity need to be revisited for ARM SPE - Recording across multiple SPE clusters/domains not supported - Snapshot support (record -S), and conversion to native perf events (e.g., via 'perf inject --itrace'), are also not supported - Technically both cs-etm and spe can be used simultaneously, however disabled for simplicity in this release Signed-off-by: NKim Phillips <kim.phillips@arm.com> Reviewed-by: NDongjiu Geng <gengdongjiu@huawei.com> Acked-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180114132850.0b127434b704a26bad13268f@arm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 13 1月, 2018 2 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
The construct: if (callchain_param) perf_evsel__config_callchain(evsel, opts, &callchain_param); happens in several places, so make perf_evsel__config_callchain() work just like free(NULL), do nothing if param->enabled is not set. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ykk0qzxnxwx3o611ctjnmxav@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
We need to increase output offset in each iteration, not decrease it as we currently do. I guess we were lucky to finish in most cases in first iteration, so the bug never showed. However it shows a lot when working with big (~4GB) size data. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 9c9f5a2f ("perf tools: Introduce copyfile_offset() function") Link: http://lkml.kernel.org/r/20180109133923.25406-1-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 12 1月, 2018 2 次提交
-
-
由 Kan Liang 提交于
There could be different types of memory in the system. E.g normal System Memory, Persistent Memory. To understand how the workload maps to those memories, it's important to know the I/O statistics of them. Perf can collect physical addresses, but those are raw data. It still needs extra work to resolve the physical addresses. Provide a script to facilitate the physical addresses resolving and I/O statistics. Profile with MEM_INST_RETIRED.ALL_LOADS or MEM_UOPS_RETIRED.ALL_LOADS event if any of them is available. Look up the /proc/iomem and resolve the physical address. Provide memory type summary. Here is an example output: # perf script report mem-phys-addr Event: mem_inst_retired.all_loads:P Memory type count percentage ---------------------------------------- ----------- ----------- System RAM 74 53.2% Persistent Memory 55 39.6% N/A --- Changes since V2: - Apply the new license rules. - Add comments for globals Changes since V1: - Do not mix DLA and Load Latency. Do not compare the loads and stores. Only profile the loads. - Use event name to replace the RAW event Signed-off-by: NKan Liang <Kan.liang@intel.com> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lkml.kernel.org/r/1515099595-34770-1-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Luis de Bethencourt 提交于
The trailing semicolon is an empty statement that does no operation. Removing it since it doesn't do anything. Signed-off-by: NLuis de Bethencourt <luisbg@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180111155020.9782-1-luisbg@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 11 1月, 2018 1 次提交
-
-
由 Mathieu Poirier 提交于
Commit ("d0565132 perf evsel: Enable type checking for perf_evsel_config_term types") assumes PERF_EVSEL__CONFIG_TERM_DRV_CFG isn't used and as such adds a BUG_ON(). Since the enumeration type is used in macro ADD_CONFIG_TERM() the change break CoreSight trace acquisition. This patch restores the original code. Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: d0565132 ("perf evsel: Enable type checking for perf_evsel_config_term types") Link: http://lkml.kernel.org/r/1515617211-32024-1-git-send-email-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 10 1月, 2018 2 次提交
-
-
由 Jiri Olsa 提交于
I want to display the pure events status coming in the next patch and the tool's warnings are superfluous in the output. Making it optional, enabled by default. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180107160356.28203-11-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Adding option to display lost events: $ perf script --show-lost-events ... mplayer 13810 [002] 468011.402396: 100 cycles:ppp: ff.. mplayer 13810 [002] 468011.402396: PERF_RECORD_LOST lost 3880 mplayer 13810 [002] 468011.402397: 100 cycles:ppp: ff.. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180107160356.28203-10-jolsa@kernel.org [ Use PRIu64 when printing u64 values, fixing the build in some arches ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 08 1月, 2018 6 次提交
-
-
由 Jiri Olsa 提交于
Adding support to display sample misc field in form of letter for each bit: # perf script -F +misc ... sched-messaging 1414 K 28690.636582: 4590 cycles ... sched-messaging 1407 U 28690.636600: 325620 cycles ... sched-messaging 1414 K 28690.636608: 19473 cycles ... misc field __________/ The misc bits are assigned to following letters: PERF_RECORD_MISC_KERNEL K PERF_RECORD_MISC_USER U PERF_RECORD_MISC_HYPERVISOR H PERF_RECORD_MISC_GUEST_KERNEL G PERF_RECORD_MISC_GUEST_USER g PERF_RECORD_MISC_MMAP_DATA* M PERF_RECORD_MISC_COMM_EXEC E PERF_RECORD_MISC_SWITCH_OUT S Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180107160356.28203-9-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Display namespaces bit in -vv debug display: $ perf record -vv --namespaces ... ... perf_event_attr: size 112 ... namespaces 1 Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180107160356.28203-3-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
Previous patch supports the multiple time range. For example, select the first and second 10% time slices. perf report --time 10%/1,10%/2 We need a function to check if a timestamp is in the ranges of [0, 10%) and [10%, 20%]. Note that it includes the last element in [10%, 20%] but it doesn't include the last element in [0, 10%). It's to avoid the overlap. This patch implments a new function perf_time__ranges_skip_sample for this checking. Change log: v4: Let perf_time__ranges_skip_sample be compatible with perf_time__skip_sample when only one time range. Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512738826-2628-5-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
Current perf report/script/... have a --time option to limit the time range of output. But right now it only supports absolute time, add support for time percentage. For example: 1. Select the second 10% time slice perf report --time 10%/2 2. Select from 0% to 10% time slice perf report --time 0%-10% It also support the multiple time ranges. 3. Select the first and second 10% time slices perf report --time 10%/1,10%/2 4. Select from 0% to 10% and 30% to 40% slices perf report --time 0%-10%,30%-40% Changelog: v4: An issue is found. Following passes. perf script --time 10%/10x12321xsdfdasfdsafdsafdsa Now it uses strtol to replace atoi. Committer notes: This just puts in place the infrastructure, so the examples in this cset comment will only work later, after more patches in this series are applied. Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512738826-2628-4-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
perf report/script/... have a --time option to limit the time range of output. That's very useful to slice large traces, e.g. when processing the output of perf script for some analysis. But right now --time only supports absolute time. Also there is no fast way to get the start/end times of a given trace except for looking at it. This makes it hard to e.g. only decode the first half of the trace, which is useful for parallelization of scripts Another problem is that perf records are variable size and there is no synchronization mechanism. So the only way to find the last sample reliably would be to walk all samples. But we want to avoid that in perf report/... because it is already quite expensive. That is why storing the first sample time and last sample time in perf record is better. This patch creates a new header feature type HEADER_SAMPLE_TIME and related ops. Save the first sample time and the last sample time to the feature section in perf file header. That will be done when, for instance, processing build-ids, where we already have to process all samples to create the build-id table, take advantage of that to further amortize that processing by storing HEADER_SAMPLE_TIME to make 'perf report/script' faster when using --time. Committer testing: After this patch is applied the header is written with zeroes, we need the next patch, for "perf record" to actually write the timestamps: # perf report -D | grep PERF_RECORD_SAMPLE\( 22501155244406 0x44f0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xffffffffa21be8c5 period: 1 addr: 0 <SNIP> 22501155793625 0x4a30 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xffffffffa21ffd50 period: 2828043 addr: 0 # perf report --header | grep "time of " # time of first sample : 0.000000 # time of last sample : 0.000000 # Changelog: v7: 1. Rebase to latest perf/core branch. 2. Add following clarification in patch description according to Arnaldo's suggestion. "That will be done when, for instance, processing build-ids, where we already have to process all samples to create the build-id table, take advantage of that to further amortize that processing by storing HEADER_SAMPLE_TIME to make 'perf report/script' faster when using --time." v4: Use perf script time style for timestamp printing. Also add with the printing of sample duration. v3: Remove the definitions of first_sample_time/last_sample_time from perf_session. Just define them in perf_evlist Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512738826-2628-2-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
When a valid vmlinux is not found, 'perf report' falls back to look at /proc/kcore. In this case, it will report the impossible large offset. For example: # perf record -b -e cycles:k find /etc/ > /dev/null # perf report --stdio --branch-history 22.77% _vm_normal_page+18446603336221188162 | ---page_remove_rmap +18446603336221188324 page_remove_rmap +18446603336221188487 (cycles:5) unlock_page_memcg +18446603336221188096 page_remove_rmap +18446603336221188327 (cycles:1) The issue is the value which is passed to parameter 'addr' in __get_srcline() is the objdump address. It's not correct if we calculate the offset by using 'addr - sym->start'. This patch creates a new parameter 'ip' in __get_srcline(). It is not converted to objdump address. With this patch, the perf report output is: 22.77% _vm_normal_page+66 | ---page_remove_rmap +228 page_remove_rmap +391 (cycles:5) unlock_page_memcg +0 page_remove_rmap +231 (cycles:1) page_remove_rmap +236 Committer testing: Make sure you get any valid vmlinux out of the way, using '-v' on the 'perf report' case and deleting it from places where perf searches them, like your kernel build dir and the build-id cache, in ~/.debug/. Reported-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1514564812-17344-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 27 12月, 2017 2 次提交
-
-
由 Mengting Zhang 提交于
While monitoring a multithread process with pid option, perf sometimes may return sys_perf_event_open failure with 3(No such process) if any of the process's threads die before we open the event. However, we want perf continue monitoring the remaining threads and do not exit with error. Here, the patch enables perf_evsel::ignore_missing_thread for -p option to ignore complete failure if any of threads die before we open the event. But it may still return sys_perf_event_open failure with 22(Invalid) if we monitors several event groups. sys_perf_event_open: pid 28960 cpu 40 group_fd 118202 flags 0x8 sys_perf_event_open: pid 28961 cpu 40 group_fd 118203 flags 0x8 WARNING: Ignored open failure for pid 28962 sys_perf_event_open: pid 28962 cpu 40 group_fd [118203] flags 0x8 sys_perf_event_open failed, error -22 That is because when we ignore a missing thread, we change the thread_idx without dealing with its fds, FD(evsel, cpu, thread). Then get_group_fd() may return a wrong group_fd for the next thread and sys_perf_event_open() return with 22. sys_perf_event_open(){ ... if (group_fd != -1) perf_fget_light()//to get corresponding group_leader by group_fd ... if (group_leader) if (group_leader->ctx->task != ctx->task)//should on the same task goto err_context ... } This patch also fixes this bug by introducing perf_evsel__remove_fd() and update_fds to allow removing fds for the missing thread. Changes since v1: - Change group_fd__remove() into a more genetic way without changing code logic - Remove redundant condition Changes since v2: - Use a proper function name and add some comment. - Multiline comment style fixes. Committer testing: Before this patch the recently added 'perf stat --per-thread' for system wide counting would race while enumerating all threads using /proc: [root@jouet ~]# perf stat --per-thread failed to parse CPUs map: No such file or directory Usage: perf stat [<options>] [<command>] -C, --cpu <cpu> list of cpus to monitor in system-wide -a, --all-cpus system-wide collection from all CPUs [root@jouet ~]# perf stat --per-thread failed to parse CPUs map: No such file or directory Usage: perf stat [<options>] [<command>] -C, --cpu <cpu> list of cpus to monitor in system-wide -a, --all-cpus system-wide collection from all CPUs [root@jouet ~]# When, say, the kernel was being built, so lots of shortlived threads, after this patch this doesn't happen. Signed-off-by: NMengting Zhang <zhangmengting@huawei.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Cheng Jian <cj.chengjian@huawei.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1513148513-6974-1-git-send-email-zhangmengting@huawei.com [ Remove one use 'evlist' alias variable ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
When we detect a different endianity we swap event before processing. It's tricky for samples because we have no idea what's inside. We treat it as an array of u64s, swap them and later on we swap back parts which are different. We mangle this way also the tracepoint raw data, which ends up in report showing wrong data: 1.95% comm=Q^B pid=29285 prio=16777216 target_cpu=000 1.67% comm=l^B pid=0 prio=16777216 target_cpu=000 Luckily the traceevent library handles the endianity by itself (thank you Steven!), so we can pass the RAW data directly in the other endianity. 2.51% comm=beah-rhts-task pid=1175 prio=120 target_cpu=002 2.23% comm=kworker/0:0 pid=11566 prio=120 target_cpu=000 The fix is basically to swap back the raw data if different endianity is detected. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20171129184346.3656-1-jolsa@kernel.org [ Add util/memswap.c to python-ext-sources to link missing mem_bswap_64() ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-