- 15 2月, 2019 16 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Since we need it to resolve the AIO symbols, otherwise we fail with: $ cat /tmp/build/perf/feature/test-all.make.output /usr/bin/ld: /tmp/ccEqrj36.o: undefined reference to symbol 'aio_return64@@GLIBC_2.2.5' /usr/bin/ld: //usr/lib64/librt.so.1: error adding symbols: DSO missing from command line collect2: error: ld returned 1 exit status $ When we added the aio support in 'perf record' only the test-libaio.bin target got the -lrt, i.e. the feature detection slow path. Fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 2a07d814 ("tools build feature: Check if libaio is available") Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
When introducing the possibility for selecting if the common prefix to options such as the waitid ones, i.e. all 'waitid' options start with 'W', so, to make it make it more compact if configured to suppress it, 'perf trace' will do so, other examples include mmap's PROT_ prefix for its 'prot' argument, etc, which, when showing the syscall argument name ends up producing duplicated info that clutters the screen, i.e.: # perf trace -e mmap --max-events 2 sleep 1 0.000 ( 0.014 ms): sleep/20886 mmap(len: 112595, prot: PROT_READ, flags: MAP_PRIVATE, fd: 3) = 0x7f3e986d2000 0.041 ( 0.005 ms): sleep/20886 mmap(len: 8192, prot: PROT_READ|PROT_WRITE, flags: MAP_PRIVATE|MAP_ANONYMOUS) = 0x7f3e986d0000 # So it is possible to suppress that and make it more compact by having this in your ~/.perfconfig: # cat ~/.perfconfig [trace] show_prefix = no # # perf trace -e mmap --max-events 2 sleep 1 0.000 ( 0.014 ms): sleep/8009 mmap(len: 112595, prot: READ, flags: PRIVATE, fd: 3) = 0x7ff2373de000 0.040 ( 0.005 ms): sleep/8009 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7ff2373dc000 # To have it look more like strace's output, we instead want to suppress the arg name and show the prefix, so use: # cat ~/.perfconfig [trace] show_prefix = yes show_arg_names = no # # perf trace -e mmap --max-events 2 sleep 1 0.000 ( 0.006 ms): sleep/15513 mmap(NULL, 112595, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7a9b6d3000 0.020 ( 0.002 ms): sleep/15513 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS) = 0x7f7a9b6d1000 # When this logic was introduced a bug came with it when processing the waitid 'option' arg that ended up expecting 3 strings when just two were being provided, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: c65c83ff ("perf trace: Allow asking for not suppressing common string prefixes") Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
We were crashing when processing a negative fd: Program received signal SIGSEGV, Segmentation fault. 0x0000000000609bbf in syscall_arg__scnprintf_ioctl_cmd (bf=0x1172eca "", size=2038, arg=0x7fffffff8360) at trace/beauty/ioctl.c:182 182 if (file->dev_maj == USB_DEVICE_MAJOR) Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.6-28.fc29.x86_64 elfutils-libelf-0.174-5.fc29.x86_64 elfutils-libs-0.174-5.fc29.x86_64 glib2-2.58.3-1.fc29.x86_64 libbabeltrace-1.5.6-1.fc29.x86_64 libunwind-1.2.1-6.fc29.x86_64 libuuid-2.32.1-1.fc29.x86_64 libxcrypt-4.4.3-2.fc29.x86_64 numactl-libs-2.0.12-1.fc29.x86_64 openssl-libs-1.1.1a-1.fc29.x86_64 pcre-8.42-6.fc29.x86_64 perl-libs-5.28.1-427.fc29.x86_64 popt-1.16-15.fc29.x86_64 python2-libs-2.7.15-11.fc29.x86_64 slang-2.3.2-4.fc29.x86_64 xz-libs-5.2.4-3.fc29.x86_64 (gdb) bt #0 0x0000000000609bbf in syscall_arg__scnprintf_ioctl_cmd (bf=0x1172eca "", size=2038, arg=0x7fffffff8360) at trace/beauty/ioctl.c:182 #1 0x000000000048e295 in syscall__scnprintf_val (sc=0x123b500, bf=0x1172eca "", size=2038, arg=0x7fffffff8360, val=21519) at builtin-trace.c:1594 #2 0x000000000048e60d in syscall__scnprintf_args (sc=0x123b500, bf=0x1172ec6 "-1, ", size=2042, args=0x7ffff6a7c034 "\377\377\377\377", augmented_args=0x7ffff6a7c064, augmented_args_size=4, trace=0x7fffffffa8d0, thread=0x1175cd0) at builtin-trace.c:1661 #3 0x000000000048f04e in trace__sys_enter (trace=0x7fffffffa8d0, evsel=0xb260b0, event=0x7ffff6a7bfe8, sample=0x7fffffff84f0) at builtin-trace.c:1880 #4 0x00000000004915a4 in trace__handle_event (trace=0x7fffffffa8d0, event=0x7ffff6a7bfe8, sample=0x7fffffff84f0) at builtin-trace.c:2590 #5 0x0000000000491eed in __trace__deliver_event (trace=0x7fffffffa8d0, event=0x7ffff6a7bfe8) at builtin-trace.c:2818 #6 0x0000000000492030 in trace__deliver_event (trace=0x7fffffffa8d0, event=0x7ffff6a7bfe8) at builtin-trace.c:2845 #7 0x0000000000492896 in trace__run (trace=0x7fffffffa8d0, argc=0, argv=0x7fffffffdb58) at builtin-trace.c:3040 #8 0x000000000049603a in cmd_trace (argc=0, argv=0x7fffffffdb58) at builtin-trace.c:3952 #9 0x00000000004d5103 in main (argc=1, argv=0x7fffffffdb58) at perf.c:474 (gdb) p fd $1 = -1 (gdb) p file $7 = (struct file *) 0xfffffffffffffff0 (gdb) p ((struct thread_trace *)arg->thread)->files.table + fd $8 = (struct file *) 0xfffffffffffffff0 (gdb) Check for that and return NULL instead. This problem was introduced recently, the other codepaths leading to thread_trace__files_entry() check for negative fds, like thread__fd_path(), but we need to do it at thread_trace__files_entry() as more users are now calling it directly. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 2d473389 ("perf trace beauty: Export function to get the files for a thread") Link: https://lkml.kernel.org/n/tip-oq7bvaaf07gsd4yqty3107u2@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
It is possible to pass a negative number as the fd and that has to be handled, so stop using 'unsigned int fd' in the ioctl syscall 'cmd' beautifier. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-b7qwa0l19dswa09h3s41akfu@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Since we get all the tests in a single .c file for a first test, tools/build/feature/test-all.c, if individual tests set that define and fail to undef it at its end, then it the test-all.c build will fail due to defining _GNU_SOURCE multiple times, getting us to the slow path, so undef it at the end in tests that define it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-w6s00jfo1xabgphzczadl59b@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Song Liu 提交于
Synthesizing BPF events is only supported for root. Silent warning msg when non-root user runs perf-record. Reported-by: NDavid Carrillo-Cisneros <davidca@fb.com> Signed-off-by: NSong Liu <songliubraving@fb.com> Tested-by: NDavid Carrillo-Cisneros <davidca@fb.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190204193140.719740-1-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Paul Clarke 提交于
Descriptions of metrics for POWER9 processors can be found in the "POWER9 Performance Monitor Unit User’s Guide", which is currently available on the "IBM Portal for OpenPOWER" (https://www-355.ibm.com/systems/power/openpower/welcome.xhtml) at https://www-355.ibm.com/systems/power/openpower/posting.xhtml?postingId=4948CDE1963C9BCA852582F800718190 This patch is for metric groups: - general and other metrics not in a metric group. Signed-off-by: NPaul Clarke <pc@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Carl Love <cel@us.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20190209181429.23950-5-pc@us.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Paul Clarke 提交于
perf vendor events power9: Branch_prediction, instruction_stats, latency, lsu_rejects, memory, prefetch & translation metrics Descriptions of metrics for POWER9 processors can be found in the "POWER9 Performance Monitor Unit User’s Guide", which is currently available on the "IBM Portal for OpenPOWER" (https://www-355.ibm.com/systems/power/openpower/welcome.xhtml) at https://www-355.ibm.com/systems/power/openpower/posting.xhtml?postingId=4948CDE1963C9BCA852582F800718190 This patch is for metric groups: - branch_prediction - instruction_stats_percent_per_ref - latency - lsu_rejects - memory - prefetch - translation Plus, some whitespace changes. Signed-off-by: NPaul Clarke <pc@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Carl Love <cel@us.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20190209181429.23950-4-pc@us.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Paul Clarke 提交于
Descriptions of metrics for POWER9 processors can be found in the "POWER9 Performance Monitor Unit User’s Guide", which is currently available on the "IBM Portal for OpenPOWER" (https://www-355.ibm.com/systems/power/openpower/welcome.xhtml) at https://www-355.ibm.com/systems/power/openpower/posting.xhtml?postingId=4948CDE1963C9BCA852582F800718190 This patch is for metric groups: - dl1_reloads_percent_per_inst - dl1_reloads_percent_per_ref - instruction_misses_percent_per_inst - l2_stats - l3_stats - pteg_reloads_percent_per_inst - pteg_reloads_percent_per_ref Signed-off-by: NPaul Clarke <pc@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Carl Love <cel@us.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20190209181429.23950-3-pc@us.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Paul Clarke 提交于
Descriptions of metrics for POWER9 processors can be found in the "POWER9 Performance Monitor Unit User’s Guide", which is currently available on the "IBM Portal for OpenPOWER" (https://www-355.ibm.com/systems/power/openpower/welcome.xhtml) at https://www-355.ibm.com/systems/power/openpower/posting.xhtml?postingId=4948CDE1963C9BCA852582F800718190 This patch is for metric groups: - cpi_breakdown - estimated_dcache_miss_cpi Signed-off-by: NPaul Clarke <pc@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Carl Love <cel@us.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20190209181429.23950-2-pc@us.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Paul Clarke 提交于
POWER8 metrics are not well publicized. Some are here: https://www.ibm.com/support/knowledgecenter/en/SSFK5S_2.2.0/com.ibm.cluster.pedev.v2r2.pedev100.doc/bl7ug_derivedmetricspower8.htm This patch is for metric groups: - translation - general and other metrics not in a metric group. Signed-off-by: NPaul Clarke <pc@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Carl Love <cel@us.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20190207175314.31813-5-pc@us.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Paul Clarke 提交于
perf vendor events power8: Branch_prediction, latency, bus_stats, instruction_mix & instruction_stats metrics POWER8 metrics are not well publicized. Some are here: https://www.ibm.com/support/knowledgecenter/en/SSFK5S_2.2.0/com.ibm.cluster.pedev.v2r2.pedev100.doc/bl7ug_derivedmetricspower8.htm This patch is for metric groups: - branch_prediction - latency - bus_stats - instruction_mix - instruction_stats_percent_per_ref Signed-off-by: NPaul Clarke <pc@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Carl Love <cel@us.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20190207175314.31813-4-pc@us.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Paul Clarke 提交于
perf vendor events power8: Dl1_reload, instruction_misses, l2_stats, lsu_rejects, memory & pteg_reloads metrics POWER8 metrics are not well publicized. Some are here: https://www.ibm.com/support/knowledgecenter/en/SSFK5S_2.2.0/com.ibm.cluster.pedev.v2r2.pedev100.doc/bl7ug_derivedmetricspower8.htm This patch is for metric groups: - dl1_reloads_percent_per_inst - dl1_reloads_percent_per_ref - instruction_misses_percent_per_inst - l2_stats - lsu_rejects - memory - pteg_reloads_percent_per_inst - pteg_reloads_percent_per_ref Signed-off-by: NPaul Clarke <pc@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Carl Love <cel@us.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20190207175314.31813-3-pc@us.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Paul Clarke 提交于
POWER8 metrics are not well publicized. Some are here: https://www.ibm.com/support/knowledgecenter/en/SSFK5S_2.2.0/com.ibm.cluster.pedev.v2r2.pedev100.doc/bl7ug_derivedmetricspower8.htm This patch is for metric groups: - cpi_breakdown - estimated_dcache_miss_cpi Signed-off-by: NPaul Clarke <pc@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Carl Love <cel@us.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20190207175314.31813-2-pc@us.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Thomas Richter 提交于
On IBM z13 machine types 2964 and 2965 the descriptor sizes for sampling and diagnostic sampling entries might be missing in the trailer entry and are set to zero. This leads to a perf report failure when processing diagnostic sampling entries. This patch adds missing descriptor sizes when the trailer entry contains zero for these fields. Output before: [root@s38lp82 perf]# ./perf report --stdio | fgrep Samples 0xabbf0 [0x8]: failed to process type: 68 Error: failed to process sample [root@s38lp82 perf]# Output after: [root@s38lp82 perf]# ./perf report --stdio | fgrep Samples # Total Lost Samples: 0 # Samples: 3K of event 'SF_CYCLES_BASIC_DIAG' # Samples: 162 of event 'CF_DIAG' [root@s38lp82 perf]# Fixes: 2b1444f2 ("perf report: Add raw report support for s390 auxiliary trace") Signed-off-by: NThomas Richter <tmricht@linux.ibm.com> Reviewed-by: NHendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20190211100627.85714-1-tmricht@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mathieu Poirier 提交于
After 'commit e22c1c75 ("perf thread: Don't include symbol.h, symbol_conf.h is enough")' Compilation of the perf tools is broken when using the functionality provided by the openCSD library: [...] ... timerfd: [ on ] ... sched_getcpu: [ on ] ... sdt: [ OFF ] ... setns: [ on ] ... libopencsd: [ on ] [...] CC util/arm-spe.o CC util/arm-spe-pkt-decoder.o CC util/s390-cpumsf.o CC util/cs-etm.o CC util/parse-branch-options.o util/cs-etm.c: In function ‘cs_etm__mem_access’: util/cs-etm.c:297:24: error: storage size of ‘al’ isn’t known struct addr_location al; And rightly so since file cs-etm.c doesn't include symbol.h, something that is rectified in this patch. Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190208223543.31836-1-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 11 2月, 2019 7 次提交
-
-
由 Alexey Budankov 提交于
Implement --affinity=node|cpu option for the record mode defaulting to system affinity mask bouncing. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/083f5422-ece9-10dd-8305-bf59c860f10f@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
A microcode patch is also needed for Goldmont while counter freezing feature is enabled. Otherwise, there will be some issues, e.g. PMI lost. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1549319013-4522-5-git-send-email-kan.liang@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Kan Liang 提交于
Clean up counter freezing quirk to use the new facility to check for min microcode revisions. Rename the counter freezing quirk related functions. Because other platforms, e.g. Goldmont, also needs to call the quirk. Only check the boot CPU, assuming models and features are consistent over all CPUs. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1549319013-4522-4-git-send-email-kan.liang@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Kan Liang 提交于
Clean up SNB PEBS quirk to use the new facility to check for min microcode revisions. Only check the boot CPU, assuming models and features are consistent over all CPUs. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1549319013-4522-3-git-send-email-kan.liang@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andi Kleen 提交于
KVM added a workaround for PEBS events leaking into guests with commit: 26a4f3c0 ("perf/x86: disable PEBS on a guest entry.") This uses the VT entry/exit list to add an extra disable of the PEBS_ENABLE MSR. Intel also added a fix for this issue to microcode updates on Haswell/Broadwell/Skylake. It turns out using the MSR entry/exit list makes VM exits significantly slower. The list is only needed for disabling PEBS, because the GLOBAL_CTRL change gets optimized by KVM into changing the VMCS. Check for the microcode updates that have the microcode fix for leaking PEBS, and disable the extra entry/exit list entry for PEBS_ENABLE. In addition we always clear the GLOBAL_CTRL for the PEBS counter while running in the guest, which is enough to make them never fire at the wrong side of the host/guest transition. The overhead for VM exits with the filtering active with the patch is reduced from 8% to 4%. The microcode patch has already been merged into future platforms. This patch is one-off thing. The quirks is used here. For other old platforms which doesn't have microcode patch and quirks, extra disable of the PEBS_ENABLE MSR is still required. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1549319013-4522-2-git-send-email-kan.liang@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ingo Molnar 提交于
Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Kan Liang 提交于
For bug workarounds or checks, it is useful to check for specific microcode revisions. Add a new generic function to match the CPU with stepping. Add the other function to check the min microcode revisions for the matched CPU. A new table format is introduced to facilitate the quirk to fill the related information. This does not change the existing x86_cpu_id because it's an ABI shared with modules, and also has quite different requirements, as in no wildcards, but everything has to be matched exactly. Originally-by: NAndi Kleen <ak@linux.intel.com> Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NBorislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: eranian@google.com Link: https://lkml.kernel.org/r/1549319013-4522-1-git-send-email-kan.liang@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 09 2月, 2019 3 次提交
-
-
由 Ingo Molnar 提交于
Merge tag 'perf-core-for-mingo-5.1-20190206' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: Hardware tracing: Adrian Hunter: - Handle calls optimized into jumps to a different symbol in the thread stack routines used to process hardware traces (Adrian Hunter) Intel PT: Adrian Hunter: - Fix overlap calculation for padding. - Fix CYC timestamp calculation after OVF. - Packet splitting can only happen in 32-bit. - Add timestamp to auxtrace errors. ARM CoreSight: Leo Yan: - Add last instruction information in packet - Set sample flags for instruction range, exception and return packets and for a trace discontinuity. - Add exception number in exception packet - Change tuple from traceID-CPU# to traceID-metadata - Add traceID in packet Mathieu Poirier: - Add "sinks" group to PMU directory - Use event attributes to send sink information to kernel - Remove set_drv_config() API, no longer used. perf annotate: Jiri Olsa: - Delay symbol annotation to the resort phase, speeding up 'perf report' startup. perf record: Alexey Budankov: - Allow binding userspace buffers to NUMA nodes. Symbols: Adrian Hunter: - Fix calculating of symbol sizes when splitting kallsyms into maps for kcore processing. Vendor events: William Cohen: - Intel: Fix Load_Miss_Real_Latency on CLX Misc: Arnaldo Carvalho de Melo: - Streamline headers, removing includes when all that is needed are just forward declarations, fixup the fallout for cases where headers should have been explicitely included but were instead obtained indirectly, by sheer luck. - Add fallback versions for CPU_{OR,EQUAL}(), so that code using it continue to build on older systems where those were not yet introduced or in systems using some other libc than the GNU one where those helpers aren't present. Documentation: Changbin Du: - Add documentation for BPF event selection. Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ingo Molnar 提交于
Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ingo Molnar 提交于
Merge tag 'perf-urgent-for-mingo-5.0-20190205' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: perf trace: Arnaldo Carvalho de Melo: Fix handling of probe:vfs_getname when the probed routine is inlined in multiple places, fixing the collection of the 'filename' parameter in open syscalls. perf test: Gustavo A. R. Silva: Fix bitwise operator usage in evsel-tp-sched test, which made tat test always detect fields as signed. Jiri Olsa: Filter out hidden symbols from labels, added in systems where the annobin plugin is used, such as RHEL8, which, if left in place make the DWARF unwind 'perf test' to fail on PPC. Tony Jones: Fix 'perf_event_attr' tests when building with python3. perf mem/c2c: Ravi Bangoria: Fix perf_mem_events on PowerPC. tools headers UAPI: Arnaldo Carvalho de Melo: Sync linux/in.h copy from the kernel sources, silencing a perf build warning. Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 06 2月, 2019 14 次提交
-
-
由 Adrian Hunter 提交于
The timestamp can use useful to find part of a trace that has an error without outputting all of the trace e.g. using the itrace 's' option to skip initial number of events. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190206103947.15750-6-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Data is copied when the trace is stopped, so packets are never split between buffers except when processing if the buffer cannot fit in the address space which can only happen on 32-bit systems. Change the logic to reflect that. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190206103947.15750-5-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
CYC packet timestamp calculation depends upon CBR which was being cleared upon overflow (OVF). That can cause errors due to failing to synchronize with sideband events. Even if a CBR change has been lost, the old CBR is still a better estimate than zero. So remove the clearing of CBR. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-4-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Auxtrace records might have up to 7 bytes of padding appended. Adjust the overlap accordingly. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-3-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Define auxtrace record alignment so that it can be referenced elsewhere. Note this is preparation for patch "perf intel-pt: Fix overlap calculation for padding" Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-2-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
The compiler might optimize a call/ret combination by making it a jmp. However the thread-stack does not presently cater for that, so that such control flow is not visible in the call graph. Make it visible by recording on the stack a branch to the start of a different symbol. Note, that means when a ret pops the stack, all jmps must be popped off first. Example: $ cat jmp-to-fn.c __attribute__((noinline)) int bar(void) { return -1; } __attribute__((noinline)) int foo(void) { return bar() + 1; } int main() { return foo(); } $ gcc -ggdb3 -Wall -Wextra -O2 -o jmp-to-fn jmp-to-fn.c $ objdump -d jmp-to-fn <SNIP> 0000000000001040 <main>: 1040: 31 c0 xor %eax,%eax 1042: e9 09 01 00 00 jmpq 1150 <foo> <SNIP> 0000000000001140 <bar>: 1140: b8 ff ff ff ff mov $0xffffffff,%eax 1145: c3 retq <SNIP> 0000000000001150 <foo>: 1150: 31 c0 xor %eax,%eax 1152: e8 e9 ff ff ff callq 1140 <bar> 1157: 83 c0 01 add $0x1,%eax 115a: c3 retq <SNIP> $ perf record -o jmp-to-fn.perf.data -e intel_pt/cyc/u ./jmp-to-fn [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0,017 MB jmp-to-fn.perf.data ] $ perf script -i jmp-to-fn.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py jmp-to-fn.db branches calls 2019-01-08 13:24:58.783069 Creating database... 2019-01-08 13:24:58.794650 Writing records... 2019-01-08 13:24:59.008050 Adding indexes 2019-01-08 13:24:59.015802 Done $ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py jmp-to-fn.db Before: main -> bar After: main -> foo -> bar Committer testing: Install the python2-pyside package, then select these menu options on the GUI: "Reports" "Context sensitive callgraphs" Then go on expanding the symbols, to get, full picture when doing this on a fedora:29 with gcc version 8.2.1 20181215 (Red Hat 8.2.1-6) (GCC): jmp-to-fn PID:TID _start (ld-2.28.so) __libc_start_main main foo bar To verify that indeed, this fixes the problem. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190109091835.5570-5-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Make thread_stack__no_call_return() more readable by adding more local variables. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190109091835.5570-4-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
If 'cp' is checked in thread_stack__push_cp() a number of error checks can be removed, reducing code size and improving readability. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190109091835.5570-3-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Kallsyms symbols do not have a size, so the size becomes the distance to the next symbol. Consequently the recently added trampoline symbols end up with large sizes because the trampolines are some distance from one another and the main kernel map. However, symbols that end outside their map can disrupt the symbol tree because, after mapping, it can appear incorrectly that they overlap other symbols. Add logic to truncate symbol size to the end of the corresponding map. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org Fixes: d83212d5 ("kallsyms, x86: Export addresses of PTI entry trampolines") Link: http://lkml.kernel.org/r/20190109091835.5570-2-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 William Cohen 提交于
Fix incorrect event names for the Load_Miss_Real_Latency metric for Cascadelake server in the same manner as commit 91b2b970 for SKL/SKX. Signed-off-by: NWilliam Cohen <wcohen@redhat.com> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190129170536.22510-1-wcohen@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Leo Yan 提交于
When return from exception, we need to distinguish if it's system call return or for other type exceptions for setting sample flags. Due to the exception return packet doesn't contain exception number, so we cannot decide sample flags based on exception number. On the other hand, the exception return packet is followed by an instruction range packet; this range packet deliveries the start address after exception handling, we can check if it is a SVC instruction just before the start address. If there has one SVC instruction is found ahead the return address, this means it's an exception return for system call; otherwise it is an normal return for other exceptions. This patch is to set sample flags for exception return packet, firstly it simply set sample flags as PERF_IP_FLAG_INTERRUPT for all exception returns since at this point it doesn't know what's exactly the exception type. We will defer to decide if it's an exception return for system call when the next instruction range packet comes, it checks if there has one SVC instruction prior to the start address and if so we will change sample flags to PERF_IP_FLAG_SYSCALLRET for system call return. Signed-off-by: NLeo Yan <leo.yan@linaro.org> Reviewed-by: NMathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Robert Walker <robert.walker@arm.com> Cc: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: coresight ml <coresight@lists.linaro.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190129122842.32041-9-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Leo Yan 提交于
The exception taken and returning are typical flow for instruction jump but it needs to be handled with exception packets. This patch is to set sample flags for exception packet. Since the exception packet contains the exception number, according to the exception number this patch makes decision for belonging to which exception types. The decoder have defined different exception number for ETMv3 and ETMv4 separately, hence this patch needs firstly decide the ETM version by using the metadata magic number, and this patch adds helper function cs_etm__get_magic() for easily getting magic number. Based on different ETM version, the exception packet contains the exception number, according to the exception number this patch makes decision for the exception belonging to which exception types. In this patch, it introduces helper function cs_etm__is_svc_instr(); for ETMv4 CS_ETMV4_EXC_CALL covers SVC, SMC and HVC cases in the single exception number, thus need to use cs_etm__is_svc_instr() to decide an exception taken for system call. Signed-off-by: NLeo Yan <leo.yan@linaro.org> Reviewed-by: NMathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: NRobert Walker <robert.walker@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: coresight ml <coresight@lists.linaro.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190129122842.32041-8-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Leo Yan 提交于
Add traceID in packet, thus we can use traceID to retrieve metadata pointer from traceID-metadata tuple. Signed-off-by: NLeo Yan <leo.yan@linaro.org> Reviewed-by: NMathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Robert Walker <robert.walker@arm.com> Cc: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: coresight ml <coresight@lists.linaro.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190129122842.32041-7-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Leo Yan 提交于
If packet processing wants to know the packet is bound with which ETM version, it needs to access metadata to decide that based on metadata magic number; but we cannot simply to use CPU logic ID number as index to access metadata sequential array, especially when system have hotplugged off CPUs, the metadata array are only allocated for online CPUs but not offline CPUs, so the CPU logic number doesn't match with its index in the array. This patch is to change tuple from traceID-CPU# to traceID-metadata, thus it can use the tuple to retrieve metadata pointer according to traceID. For safe accessing metadata fields, this patch provides helper function cs_etm__get_cpu() which is used to return CPU number according to traceID; cs_etm_decoder__buffer_packet() is the first consumer for this helper function. Signed-off-by: NLeo Yan <leo.yan@linaro.org> Reviewed-by: NMathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Robert Walker <robert.walker@arm.com> Cc: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: coresight ml <coresight@lists.linaro.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190129122842.32041-6-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-