- 27 3月, 2017 3 次提交
-
-
由 Jin Yao 提交于
It would be useful for perf to support a mode to query the inline stack for a given callgraph address. This would simplify finding the right code in code that does a lot of inlining. The srcline.c has contained the code which supports to translate the address to filename:line_nr. This patch just extends the function to let it support getting the inline stacks. It introduces the inline_list which will store the inline function result (filename:line_nr and funcname). If BFD lib is not supported, the result is only filename:line_nr. Signed-off-by: NYao Jin <yao.jin@linux.intel.com> Tested-by: NMilian Wolff <milian.wolff@kdab.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/1490474069-15823-3-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
Introduce dso__name() and filename_split() out of existing code because these codes will be used in several places in next patch. For filename_split(), it may also solve a potential memory leak in existing code. In existing addr2line(), sep = strchr(filename, ':'); if (sep) { *sep++ = '\0'; *file = filename; *line_nr = strtoul(sep, NULL, 0); ret = 1; } out: pclose(fp); return ret; If sep is NULL, filename is not freed or returned via file. Signed-off-by: NYao Jin <yao.jin@linux.intel.com> Tested-by: NMilian Wolff <milian.wolff@kdab.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/1490474069-15823-2-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Address filtering with kernel symbols incorrectly resulted in the error "Cannot determine size of symbol" because the no_size logic was the wrong way around. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Tested-by: NAndi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org # v4.9+ Link: http://lkml.kernel.org/r/1490357752-27942-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 23 3月, 2017 6 次提交
-
-
由 Andi Kleen 提交于
Move the printing of perf expressions and internal events to a new clearer --details flag, instead of lumping it together with other debug options in --debug. This makes it clearer to use. Before perf list --debug ... unc_m_power_critical_throttle_cycles [Cycles all ranks are in critical thermal throttle. Unit: uncore_imc] uncore_imc_2/event=0x86/ MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100. after perf list --details ... unc_m_power_critical_throttle_cycles [Cycles all ranks are in critical thermal throttle. Unit: uncore_imc] uncore_imc_2/event=0x86/ MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/20170320201711.14142-14-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Add support for a new JSON event attribute to name MetricExpr for better output in perf stat. If the event has no MetricName it uses the normal event name instead to describe the metric. Before % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only time unc_p_freq_max_os_cycles 1.000149775 15.7 2.000344807 19.3 3.000502544 16.7 4.000640656 6.6 5.000779955 9.9 After % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only time freq_max_os_cycles % 1.000149775 15.7 2.000344807 19.3 3.000502544 16.7 4.000640656 6.6 5.000779955 9.9 Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-13-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Output the metric expr in perf list when --debug is specified, so that the user can check the formula. Before: % perf list ... unc_m_power_channel_ppd [Cycles where DRAM ranks are in power down (CKE) mode. Derived from unc_m_power_channel_ppd. Unit: uncore_imc] uncore_imc_2/event=0x85/ After: % perf list --debug ... unc_m_power_channel_ppd [Cycles where DRAM ranks are in power down (CKE) mode. Derived from unc_m_power_channel_ppd. Unit: uncore_imc] Perf: uncore_imc_2/event=0x85/ MetricExpr: (unc_m_power_channel_ppd / unc_m_clockticks) * 100. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-12-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Add generic infrastructure to perf stat to output ratios for "MetricExpr" entries in the event lists. Many events are more useful as ratios than in raw form, typically some count in relation to total ticks. Transfer the MetricExpr information from the alias to the evsel. We mark the events that need to be collected for MetricExpr, and also link the events using them with a pointer. The code is careful to always prefer the right event in the same group to minimize multiplexing errors. At the moment only a single relation is supported. Then add a rblist to the stat shadow code that remembers stats based on the cpu and context. Then finally update and retrieve and print these values similarly to the existing hardcoded perf metrics. We use the simple expression parser added earlier to evaluate the expression. Normally we just output the result without further commentary, but for --metric-only this would lead to empty columns. So for this case use the original event as description. There is no attempt to automatically add the MetricExpr event, if it is missing, however we suggest it to the user, because the user tool doesn't have enough information to reliably construct a group that is guaranteed to schedule. So we leave that to the user. % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' 1.000147889 800,085,181 unc_p_clockticks 1.000147889 93,126,241 unc_p_freq_max_os_cycles # 11.6 2.000448381 800,218,217 unc_p_clockticks 2.000448381 142,516,095 unc_p_freq_max_os_cycles # 17.8 3.000639852 800,243,057 unc_p_clockticks 3.000639852 162,292,689 unc_p_freq_max_os_cycles # 20.3 % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only # time freq_max_os_cycles % 1.000127077 0.9 2.000301436 0.7 3.000456379 0.0 v2: Change from DivideBy to MetricExpr v3: Use expr__ prefix. Support more than one other event. v4: Update description v5: Only print warning message once for multiple PMUs. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-11-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Add support for parsing the MetricExpr header in the JSON event lists and storing them in the alias structure. Used in the next patch. v2: Change DividedBy to MetricExpr v3: Really catch all uses of DividedBy Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-10-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Add a simple expression parser good enough to parse JSON relation expressions. The parser is implemented using bison. This is just intended as an simple parser for internal usage in the event lists, not the beginning of a "perf scripting language" v2: Use expr__ prefix instead of expr_ Support multiple free variables for parser Committer note: The v2 patch had: %define api.pure full In expr.y, that is a feature introduced in bison 2.7, to have reentrant parsers, not using global variables, which would make tools/perf stop building with the bison version shipped in older distros, so Andi realised that the other parsers (e.g. parse-events.y) were using: %pure-parser Which is present in older versions of bison and fits the bill. I added: CFLAGS_expr-bison.o += -DYYENABLE_NLS=0 -DYYLTYPE_IS_TRIVIAL=0 -w To finally make it build, copying what was there for pmu-bison.o, another parser. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-8-andi@firstfloor.org [ stdlib.h is needed in tests/expr.c for free() fixing build in systems such as ubuntu:16.04-x-s390 ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 22 3月, 2017 6 次提交
-
-
由 Andi Kleen 提交于
Special case uncore_ prefix in PMU match, to allow for shorter event uncore specifications. Before: perf stat -a -e uncore_cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1 After perf stat -a -e cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1 Committer tests: # perf list uncore List of pre-defined events (to be used in -e): uncore_cbox_0/clockticks/ [Kernel PMU event] uncore_cbox_1/clockticks/ [Kernel PMU event] uncore_imc/data_reads/ [Kernel PMU event] uncore_imc/data_writes/ [Kernel PMU event] # perf stat -a -e cbox_0/clockticks/ sleep 1 Performance counter stats for 'system wide': 281,474,976,653,084 cbox_0/clockticks/ 1.000870129 seconds time elapsed # Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/20170320201711.14142-7-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
When the user specifies a pmu directly, expand it automatically with a prefix match for all available PMUs, similar as we do for the normal aliases now. This allows to specify attributes for duplicated boxes quickly. For example uncore_cbox_{0,6}/.../ can be now specified as uncore_cbox/.../ and it gets automatically expanded for all boxes. This generally makes it more concise to write uncore specifications, and also avoids the need to know the exact topology of the system. Before: % perf stat -a -e uncore_cbox_0/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_1/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_2/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_3/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_4/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_5/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1 After: % perf stat -a -e uncore_cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1 v2: Handle all bison rules. Move multi add code to separate function. Handle uncore_ prefix correctly. v3: Move parse_events_multi_pmu_add to separate patch. Move uncore prefix check to separate patch. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-6-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Factor out the PMU name matching in the event parser into a separate function, to use the same code for other grammar rules later. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-5-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
The uncore PMU has a lot of duplicated PMUs for different subsystems. When expanding an uncore alias we usually end up with a large number of identically named aliases, which makes perf stat output difficult to read. Automatically sum them up in perf stat, unless --no-merge is specified. This can be default because only the uncores generally have duplicated aliases. Other PMUs have unique names. Before: % perf stat --no-merge -a -e unc_c_llc_lookup.any sleep 1 Performance counter stats for 'system wide': 694,976 Bytes unc_c_llc_lookup.any 706,304 Bytes unc_c_llc_lookup.any 956,608 Bytes unc_c_llc_lookup.any 782,720 Bytes unc_c_llc_lookup.any 605,696 Bytes unc_c_llc_lookup.any 442,816 Bytes unc_c_llc_lookup.any 659,328 Bytes unc_c_llc_lookup.any 509,312 Bytes unc_c_llc_lookup.any 263,936 Bytes unc_c_llc_lookup.any 592,448 Bytes unc_c_llc_lookup.any 672,448 Bytes unc_c_llc_lookup.any 608,640 Bytes unc_c_llc_lookup.any 641,024 Bytes unc_c_llc_lookup.any 856,896 Bytes unc_c_llc_lookup.any 808,832 Bytes unc_c_llc_lookup.any 684,864 Bytes unc_c_llc_lookup.any 710,464 Bytes unc_c_llc_lookup.any 538,304 Bytes unc_c_llc_lookup.any 1.002577660 seconds time elapsed After: % perf stat -a -e unc_c_llc_lookup.any sleep 1 Performance counter stats for 'system wide': 2,685,120 Bytes unc_c_llc_lookup.any 1.002648032 seconds time elapsed v2: Split collect_aliases. Rename alias flag. v3: Make sure unsupported/not counted is always printed. v4: Factor out callback change into separate patch. v5: Move check for bad results here Move merged check into collect_data Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-3-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
The source code line number (lineno) needs to be kept in accross calls to symbol__parse_objdump_line() when parsing the output of 'objdump -l -dS', so that it can associate it with the instructions till the next line. See disasm_line__new() and struct disasm_line::line_nr. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-7hpx8f8ybdpiujceysaj229w@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Taeung Song 提交于
The 'grep -v "filename"' applied to the objdump command output cause a side effect eliminating filename:linenr of output of 'objdump -l' if the object file name and source file name are the same, fix it. E.g. the output of the following objdump command in symbol__disassemble(): $ objdump -l -d -S -C /home/taeung/hello --start-address=... /home/taeung/hello: file format elf64-x86-64 Disassembly of section .text: 0000000000400526 <main>: main(): /home/taeung/hello.c:4 void main() { 400526: 55 push %rbp 400527: 48 89 e5 mov %rsp,%rbp /home/taeung/hello.c:5 ... But it uses grep -v "filename" e.g. "/home/taeung/hello" in the objdump command to remove the first line containing file name and file format ("/home/taeung/hello: file format elf64-x86-64"): Before: $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello But this causes a side effect, removing filename:linenr too, because the object file and source file have the same name e.g. "/home/taueng/hello", "/home/taeung/hello.c" So more do a better match by using grep -v as below to correctly remove that first line: "/home/taeung/hello: file format elf64-x86-64" After: $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello: Signed-off-by: NTaeung Song <treeze.taeung@gmail.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1489978617-31396-5-git-send-email-treeze.taeung@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 21 3月, 2017 4 次提交
-
-
由 Alexis Berlemont 提交于
An sdt probe can be associated with arguments but they were not passed to the user probe tracing interface (uprobe_events); this patch adapts the sdt argument descriptors according to the uprobe input format. As the uprobe parser does not support scaled address mode, perf will skip arguments which cannot be adapted to the uprobe format. Here are the results: $ perf buildid-cache -v --add test_sdt $ perf probe -x test_sdt sdt_libfoo:table_frob $ perf probe -x test_sdt sdt_libfoo:table_diddle $ perf record -e sdt_libfoo:table_frob -e sdt_libfoo:table_diddle test_sdt $ perf script test_sdt ... 666.255678: sdt_libfoo:table_frob: (4004d7) arg0=0 arg1=0 test_sdt ... 666.255683: sdt_libfoo:table_diddle: (40051a) arg0=0 arg1=0 test_sdt ... 666.255686: sdt_libfoo:table_frob: (4004d7) arg0=1 arg1=2 test_sdt ... 666.255689: sdt_libfoo:table_diddle: (40051a) arg0=3 arg1=4 test_sdt ... 666.255692: sdt_libfoo:table_frob: (4004d7) arg0=2 arg1=4 test_sdt ... 666.255694: sdt_libfoo:table_diddle: (40051a) arg0=6 arg1=8 Signed-off-by: NAlexis Berlemont <alexis.berlemont@gmail.com> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20161214000732.1710-3-alexis.berlemont@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexis Berlemont 提交于
During a "perf buildid-cache --add" command, the section ".note.stapsdt" of the "added" binary is scanned in order to list the available SDT markers available in a binary. The parts containing the probes arguments were left unscanned. The whole section is now parsed; the probe arguments are extracted for later use. Signed-off-by: NAlexis Berlemont <alexis.berlemont@gmail.com> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20161214000732.1710-2-alexis.berlemont@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ravi Bangoria 提交于
There are many SDT markers in powerpc whose uprobe definition goes beyond current MAX_CMDLEN, especially when target filename is long and sdt marker has long list of arguments. For example, definition of sdt marker method__compile__end: 8@17 8@9 8@10 -4@8 8@7 -4@6 8@5 -4@4 1@37(28) from file /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-2.b14.fc22.ppc64/jre/lib/ppc64/server/libjvm.so is p:sdt_hotspot/method__compile__end /usr/lib/jvm/java-1.8.0-openjdk-\ 1.8.0.91-2.b14.fc22.ppc64/jre/lib/ppc64/server/libjvm.so:0x4c4e00\ arg1=%gpr17:u64 arg2=%gpr9:u64 arg3=%gpr10:u64 arg4=%gpr8:s32\ arg5=%gpr7:u64 arg6=%gpr6:s32 arg7=%gpr5:u64 arg8=%gpr4:s32\ arg9=+37(%gpr28):u8 'perf probe' fails with segfault for such markers. As the uprobe_events file accepts definitions up to 4094 characters(4096 - 2 (\n\0)), increase value of MAX_CMDLEN match that. Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170207054547.3690-1-ravi.bangoria@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ravi Bangoria 提交于
'*ntevs' contains number of elements present in 'tevs' array. If there are no elements in array, 'tevs2' can be directly assigned to 'tevs' without allocating more space. So the condition should be '*ntevs == 0' not 'ntevs == 0'. Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 42bba263 ("perf probe: Allow wildcard for cached events") Link: http://lkml.kernel.org/r/20170308065908.4128-1-ravi.bangoria@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 17 3月, 2017 1 次提交
-
-
由 Alexander Shishkin 提交于
This patch decodes the 'partial' flag in AUX records and prints a warning to the user, so that they don't have to guess why their PT traces contain gaps (or missing altogether): Warning: AUX data had gaps in it 8 times out of 8! Are you running a KVM guest in the background? Trying to be even more helpful, we will detect if the user's kvm driver sets up exclusive VMX root mode for the entire lifespan of the kvm process: Reloading kvm_intel module with vmm_exclusive=0 will reduce the gaps to only guest's timeslices. Note however, that you'll still have gaps in cpu-wide traces even with vmm_exclusive=0, but the number of gaps will be below 100% (as opposed to the above example). Currently this is the only reason for partial records. Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vince@deater.net> Link: http://lkml.kernel.org/r/8760j941ig.fsf@ashishki-desk.ger.corp.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 16 3月, 2017 3 次提交
-
-
由 Andi Kleen 提交于
Implement printing instruction sequences as hex dump for branch stacks. This relies on the x86 instruction decoder used by the PT decoder to find the lengths of instructions to dump them individually. This is good enough for pattern matching. This allows to study hot paths for individual samples, together with branch misprediction and cycle count / IPC information if available (on Skylake systems). % perf record -b ... % perf script -F brstackinsn ... read_hpet+67: ffffffff9905b843 insn: 74 ea # PRED ffffffff9905b82f insn: 85 c9 ffffffff9905b831 insn: 74 12 ffffffff9905b833 insn: f3 90 ffffffff9905b835 insn: 48 8b 0f ffffffff9905b838 insn: 48 89 ca ffffffff9905b83b insn: 48 c1 ea 20 ffffffff9905b83f insn: 39 f2 ffffffff9905b841 insn: 89 d0 ffffffff9905b843 insn: 74 ea # PRED Only works when no special branch filters are specified. Occasionally the path does not reach up to the sample IP, as the LBRs may be frozen before executing a final jump. In this case we print a special message. The instruction dumper piggy backs on the existing infrastructure from the IP PT decoder. An earlier iteration of this patch relied on a disassembler, but this version only uses the existing instruction decoder. Committer note: Added hint about how to get suitable perf.data files for use with '-F brstackinsm': $ perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ] $ $ perf script -F brstackinsn Display of branch stack assembler requested, but non all-branch filter set Hint: run 'perf record -b ...' $ Signed-off-by: NAndi Kleen <ak@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Stephane Eranian 提交于
This patch significantly improves the execution time of perf_event__synthesize_mmap_events() when running perf record on systems where processes have lots of threads. It just happens that cat /proc/pid/maps support uses a O(N^2) algorithm to generate each map line in the maps file. If you have 1000 threads, then you have necessarily 1000 stacks. For each vma, you need to check if it corresponds to a thread's stack. With a large number of threads, this can take a very long time. I have seen latencies >> 10mn. As of today, perf does not use the fact that a mapping is a stack, therefore we can work around the issue by using /proc/pid/tasks/pid/maps. This entry does not try to map a vma to stack and is thus much faster with no loss of functonality. The proc-map-timeout logic is kept in case users still want some upper limit. In V2, we fix the file path from /proc/pid/tasks/pid/maps to actual /proc/pid/task/pid/maps, tasks -> task. Thanks Arnaldo for catching this. Committer note: This problem seems to have been elliminated in the kernel since commit : b18cb64e ("fs/proc: Stop trying to report thread stacks"). Signed-off-by: NStephane Eranian <eranian@google.com> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170315135059.GC2177@redhat.com Link: http://lkml.kernel.org/r/1489598233-25586-1-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ravi Bangoria 提交于
Factor out the SDT event name checking routine as is_sdt_event(). Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170314150658.7065-2-ravi.bangoria@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 15 3月, 2017 4 次提交
-
-
由 Naveen N. Rao 提交于
We indicate support for accepting sym+offset with kretprobes through a line in ftrace README. Parse the same to identify support and choose the appropriate format for kprobe_events. As an example, without this perf patch, but with the ftrace changes: naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/tracing/README | grep kretprobe place (kretprobe): [<module>:]<symbol>[+<offset>]|<memaddr> naveen@ubuntu:~/linux/tools/perf$ naveen@ubuntu:~/linux/tools/perf$ sudo ./perf probe -v do_open%return probe-definition(0): do_open%return symbol:do_open file:(null) line:0 offset:0 return:1 lazy:(null) 0 arguments Looking at the vmlinux_path (8 entries long) Using /boot/vmlinux for symbols Open Debuginfo file: /boot/vmlinux Try to find probe point from debuginfo. Matched function: do_open [2d0c7d8] Probe point found: do_open+0 Matched function: do_open [35d76b5] found inline addr: 0xc0000000004ba984 Failed to find "do_open%return", because do_open is an inlined function and has no return point. An error occurred in debuginfo analysis (-22). Trying to use symbols. Opening /sys/kernel/debug/tracing//kprobe_events write=1 Writing event: r:probe/do_open do_open+0 Writing event: r:probe/do_open_1 do_open+0 Added new events: probe:do_open (on do_open%return) probe:do_open_1 (on do_open%return) You can now use it in all perf tools, such as: perf record -e probe:do_open_1 -aR sleep 1 naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/kprobes/list c000000000041370 k kretprobe_trampoline+0x0 [OPTIMIZED] c0000000004433d0 r do_open+0x0 [DISABLED] c0000000004433d0 r do_open+0x0 [DISABLED] And after this patch (and the subsequent powerpc patch): naveen@ubuntu:~/linux/tools/perf$ sudo ./perf probe -v do_open%return probe-definition(0): do_open%return symbol:do_open file:(null) line:0 offset:0 return:1 lazy:(null) 0 arguments Looking at the vmlinux_path (8 entries long) Using /boot/vmlinux for symbols Open Debuginfo file: /boot/vmlinux Try to find probe point from debuginfo. Matched function: do_open [2d0c7d8] Probe point found: do_open+0 Matched function: do_open [35d76b5] found inline addr: 0xc0000000004ba984 Failed to find "do_open%return", because do_open is an inlined function and has no return point. An error occurred in debuginfo analysis (-22). Trying to use symbols. Opening /sys/kernel/debug/tracing//README write=0 Opening /sys/kernel/debug/tracing//kprobe_events write=1 Writing event: r:probe/do_open _text+4469712 Writing event: r:probe/do_open_1 _text+4956248 Added new events: probe:do_open (on do_open%return) probe:do_open_1 (on do_open%return) You can now use it in all perf tools, such as: perf record -e probe:do_open_1 -aR sleep 1 naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/kprobes/list c000000000041370 k kretprobe_trampoline+0x0 [OPTIMIZED] c0000000004433d0 r do_open+0x0 [DISABLED] c0000000004ba058 r do_open+0x8 [DISABLED] Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/496ef9f33c1ab16286ece9dd62aa672807aef91c.1488961018.git.naveen.n.rao@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Naveen N. Rao 提交于
Simplify and separate out the ftrace README scanning logic into a separate helper. This is used subsequently to scan for all patterns of interest and to cache the result. Since we are only interested in availability of probe argument type x, we will only scan for that. Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/6dc30edc747ba82a236593be6cf3a046fa9453b5.1488961018.git.naveen.n.rao@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Hari Bathini 提交于
This patch introduces a cgroup identifier entry field in perf report to identify or distinguish data of different cgroups. It uses the device number and inode number of cgroup namespace, included in perf data with the new PERF_RECORD_NAMESPACES event, as cgroup identifier. With the assumption that each container is created with it's own cgroup namespace, this allows assessment/analysis of multiple containers at once. A simple test for this would be to clone a few processes passing SIGCHILD & CLONE_NEWCROUP flags to each of them, execute shell and run different workloads on each of those contexts, while running perf record command with --namespaces option. Shown below is the output of perf report, sorted with cgroup identifier, on perf.data generated with the above test scenario, clearly indicating one context's considerable use of kernel memory in comparison with others: $ perf report -s cgroup_id,sample --stdio # # Total Lost Samples: 0 # # Samples: 5K of event 'kmem:kmalloc' # Event count (approx.): 5965 # # Overhead cgroup id (dev/inode) Samples # ........ ..................... ............ # 81.27% 3/0xeffffffb 4848 16.24% 3/0xf00000d0 969 1.16% 3/0xf00000ce 69 0.82% 3/0xf00000cf 49 0.50% 0/0x0 30 While this is a start, there is further scope of improving this. For example, instead of cgroup namespace's device and inode numbers, dev and inode numbers of some or all namespaces may be used to distinguish which processes are running in a given container context. Also, scripts to map device and inode info to containers sounds plausible for better tracing of containers. Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891933338.25309.756882900782042645.stgit@hbathini.in.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Hari Bathini 提交于
Synthesize PERF_RECORD_NAMESPACES events for processes that were running prior to invocation of perf record. The data for this is taken from /proc/$PID/ns. These changes make way for analyzing events with regard to namespaces. Committer notes: Check if 'tool' is NULL in perf_event__synthesize_namespaces(), as in the test__mmap_thread_lookup case, i.e. 'perf test Lookup mmap thread". Testing it: # ps axH > /tmp/allthreads # perf record -a --namespaces usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.169 MB perf.data (8 samples) ] # perf report -D | grep PERF_RECORD_NAMESPACES | wc -l 602 # wc -l /tmp/allthreads 601 /tmp/allthreads # tail /tmp/allthreads 16951 pts/4 T 0:00 git rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^ 16952 pts/4 T 0:00 /bin/sh /usr/libexec/git-core/git-rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^ 17176 pts/4 T 0:00 git commit --amend --no-post-rewrite 17204 pts/4 T 0:00 vim /home/acme/git/linux/.git/COMMIT_EDITMSG 18939 ? S 0:00 [kworker/2:1] 18947 ? S 0:00 [kworker/3:0] 18974 ? S 0:00 [kworker/1:0] 19047 ? S 0:00 [kworker/0:1] 19152 pts/6 S+ 0:00 weechat 19153 pts/7 R+ 0:00 ps axH # perf report -D | grep PERF_RECORD_NAMESPACES | tail 0 0 0x125068 [0xa0]: PERF_RECORD_NAMESPACES 17176/17176 - nr_namespaces: 7 0 0 0x1255b8 [0xa0]: PERF_RECORD_NAMESPACES 17204/17204 - nr_namespaces: 7 0 0 0x125df0 [0xa0]: PERF_RECORD_NAMESPACES 18939/18939 - nr_namespaces: 7 0 0 0x125f00 [0xa0]: PERF_RECORD_NAMESPACES 18947/18947 - nr_namespaces: 7 0 0 0x126010 [0xa0]: PERF_RECORD_NAMESPACES 18974/18974 - nr_namespaces: 7 0 0 0x126120 [0xa0]: PERF_RECORD_NAMESPACES 19047/19047 - nr_namespaces: 7 0 0 0x126230 [0xa0]: PERF_RECORD_NAMESPACES 19152/19152 - nr_namespaces: 7 0 0 0x129330 [0xa0]: PERF_RECORD_NAMESPACES 19154/19154 - nr_namespaces: 7 0 0 0x12a1f8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 0 0 0x12b0b8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 # Humm, investigate why we got two record for the 19155 pid/tid... Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891931111.25309.11073854609798681633.stgit@hbathini.in.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 14 3月, 2017 1 次提交
-
-
由 Hari Bathini 提交于
Introduce a new option to record PERF_RECORD_NAMESPACES events emitted by the kernel when fork, clone, setns or unshare are invoked. And update perf-record documentation with the new option to record namespace events. Committer notes: Combined it with a later patch to allow printing it via 'perf report -D' and be able to test the feature introduced in this patch. Had to move here also perf_ns__name(), that was introduced in another later patch. Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt: util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=] ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx ^ Testing it: # perf record --namespaces -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ] # # perf report -D <SNIP> 3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7 [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc, 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb] 0x1151e0 [0x30]: event: 9 . . ... raw event: size 48 bytes . 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h.... . 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c.... . 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................ <SNIP> NAMESPACES events: 1 <SNIP> # Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 13 3月, 2017 1 次提交
-
-
由 Changbin Du 提交于
Skip the sample which doesn't have branch_info to avoid segmentation fault: The fault can be reproduced by: perf record -a perf report -F cycles Signed-off-by: NChangbin Du <changbin.du@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 0e332f03 ("perf tools: Add support for cycles, weight branch_info field") Link: http://lkml.kernel.org/r/20170313083148.23568-1-changbin.du@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 04 3月, 2017 11 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Instead of trying to go on adding more ifdef conditions, do a feature test and define HAVE_SCHED_GETCPU_SUPPORT instead, then use it to provide the prototype. No need to change the stub, as it is already a __weak symbol. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-yge89er9g90sc0v6k0a0r5tr@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Make system wide (-a) the default option if no target was specified and one of following conditions is met: - there's no workload specified (current behaviour) - there is workload specified but all requested events are system wide ones Mixed events core/uncore with workload: $ perf stat -e 'uncore_cbox_0/clockticks/,cycles' sleep 1 Performance counter stats for 'sleep 1': <not supported> uncore_cbox_0/clockticks/ 980,489 cycles 1.000897406 seconds time elapsed Uncore event with workload: $ perf stat -e 'uncore_cbox_0/clockticks/' sleep 1 Performance counter stats for 'system wide': 281,473,897,192,670 uncore_cbox_0/clockticks/ 1.000833784 seconds time elapsed Committer note: When testing I realized the default case for !root, i.e. no events passed via -e, was broke by v2 of this patch, reported and after a patch provided by Jiri it is back working: [acme@jouet linux]$ perf stat usleep 1 Performance counter stats for 'usleep 1': 0.401335 task-clock:u (msec) # 0.297 CPUs utilized 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 48 page-faults:u # 0.120 M/sec 458,146 cycles:u # 1.142 GHz 245,113 instructions:u # 0.54 insn per cycle 47,991 branches:u # 119.578 M/sec 4,022 branch-misses:u # 8.38% of all branches 0.001350029 seconds time elapsed [acme@jouet linux]$ Suggested-and-Tested-by: NBorislav Petkov <bp@alien8.de> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170227094818.GA12764@kravaSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
$ perf test decoder 57: x86 instruction decoder - new instructions : FAILED! $ Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 80 78 56 34 12 bndstx %bnd0,0x12345678(%rax) Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 85 78 56 34 12 bndstx %bnd0,0x12345678(%rbp) Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 84 01 78 56 34 12 bndstx %bnd0,0x12345678(%rcx,%rax,1) Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 84 05 78 56 34 12 bndstx %bnd0,0x12345678(%rbp,%rax,1) Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 84 08 78 56 34 12 bndstx %bnd0,0x12345678(%rax,%rcx,1) There is missing initialization. It only affects the test because it is checking 'rel' even in cases where there is no value. Fix it. Reported-and-Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/08c6ad07-7994-3e56-b20e-d75727ca7765@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Naveen N. Rao 提交于
Generalize probe event file open routine into a generic function for opening trace files. Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/b580465c7a4dcd5d3b40fdf8568e6be45d0a6333.1487849577.git.naveen.n.rao@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The cpu_map__snprint_mask() generates a string representation of a cpumask bitmap. For cpu 0 to 11, it'll return "fff". Committer notes: Fix compiler warning on some toolchains: 19 fedora:24-x-ARC-uClibc: FAIL CC /tmp/build/perf/util/cpumap.o util/cpumap.c: In function 'hex_char': util/cpumap.c:679:2: error: comparison is always true due to limited range of data type [-Werror=type-limits] if (0 <= val && val <= 9) ^ cc1: all warnings being treated as errors Applying patch from Namhyung that makes function receive an 'unsigned char', that is what the callers are passing to this function. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170224011251.14946-2-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Charles Baylis 提交于
Add new sort key 'symbol_size' to allow user to sort by symbol size, or (more usefully) display the symbol size using --fields=...,symbol_size. Committer note: Testing it together with the recently added -q, to remove the headers, and using the '+' sign with -s, to add the symbol_size sort order to the default, which is '-s/--sort comm,dso,symbol': # perf report -q -s +symbol_size | head -10 10.39% swapper [kernel.vmlinux] [k] intel_idle 270 3.45% swapper [kernel.vmlinux] [k] update_blocked_averages 1546 2.61% swapper [kernel.vmlinux] [k] update_load_avg 1292 2.36% swapper [kernel.vmlinux] [k] update_cfs_shares 240 1.83% swapper [kernel.vmlinux] [k] __hrtimer_run_queues 606 1.74% swapper [kernel.vmlinux] [k] update_cfs_rq_load_avg. 1187 1.66% swapper [kernel.vmlinux] [k] apic_timer_interrupt 152 1.60% CPU 0/KVM [kvm] [k] kvm_set_msr_common 3046 1.60% gnome-shell libglib-2.0.so.0 [.] g_slist_find 37 1.46% gnome-termina libglib-2.0.so.0 [.] g_hash_table_lookup 370 # Signed-off-by: NCharles Baylis <charles.baylis@linaro.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1487943176-13840-1-git-send-email-charles.baylis@linaro.org [ Use symbol__size(), remove needless %lld + (long long) casting ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
This is an odd refcount use case, so add some more comments to help understand that when it hits zero it really means that the mmap()ed area (on a perf_event_open() returned fd) has been munmap()ed. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170223162344.GD3595@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Elena Reshetova 提交于
The refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NDavid Windsor <dwindsor@gmail.com> Signed-off-by: NHans Liljestrand <ishkamiel@gmail.com> Signed-off-by: NKees Kook <keescook@chromium.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Windsor <dwindsor@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hans Liljestrand <ishkamiel@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kees Kook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: alsa-devel@alsa-project.org Link: http://lkml.kernel.org/r/1487691303-31858-10-git-send-email-elena.reshetova@intel.com [ Did missing tests/thread-map.c conversion ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Elena Reshetova 提交于
The refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NDavid Windsor <dwindsor@gmail.com> Signed-off-by: NHans Liljestrand <ishkamiel@gmail.com> Signed-off-by: NKees Kook <keescook@chromium.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Windsor <dwindsor@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hans Liljestrand <ishkamiel@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kees Kook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: alsa-devel@alsa-project.org Link: http://lkml.kernel.org/r/1487691303-31858-9-git-send-email-elena.reshetova@intel.com [ Did missing conversion in __machine__remove_thread() ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Elena Reshetova 提交于
The refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NDavid Windsor <dwindsor@gmail.com> Signed-off-by: NHans Liljestrand <ishkamiel@gmail.com> Signed-off-by: NKees Kook <keescook@chromium.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Windsor <dwindsor@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hans Liljestrand <ishkamiel@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kees Kook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: alsa-devel@alsa-project.org Link: http://lkml.kernel.org/r/1487691303-31858-8-git-send-email-elena.reshetova@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Elena Reshetova 提交于
The refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NDavid Windsor <dwindsor@gmail.com> Signed-off-by: NHans Liljestrand <ishkamiel@gmail.com> Signed-off-by: NKees Kook <keescook@chromium.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Windsor <dwindsor@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hans Liljestrand <ishkamiel@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kees Kook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: alsa-devel@alsa-project.org Link: http://lkml.kernel.org/r/1487691303-31858-7-git-send-email-elena.reshetova@intel.com [ Did the missing conversion of tests/thread-mg-share.c too ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-