- 01 10月, 2015 13 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
If the user doesn't specify any event, try the most precise "cycles" available, i.e. start by "cycles:ppp" and go on removing "p" till it works. E.g. $ perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (11 samples) ] $ perf evlist cycles:pp $ perf evlist -v cycles:pp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 2, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 $ grep 'model name' /proc/cpuinfo | head -1 model name : Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz $ When 'cycles' appears explicitely is specified this will not be tried, i.e. the user has full control of the level of precision to be used: $ perf record -e cycles usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data (9 samples) ] $ perf evlist cycles $ perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://www.youtube.com/watch?v=nXaxk27zwlk Link: http://lkml.kernel.org/n/tip-b1ywebmt22pi78vjxau01wth@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
So that one can, for instance, use it with wc -l: # perf list *:*write* | wc -l 60 Or to look for the "bio" tracepoints, without 'perf list' headers: # perf list *:*bio* | head block:block_bio_backmerge [Tracepoint event] block:block_bio_bounce [Tracepoint event] block:block_bio_complete [Tracepoint event] block:block_bio_frontmerge [Tracepoint event] block:block_bio_queue [Tracepoint event] block:block_bio_remap [Tracepoint event] # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ts7sc0x8u4io4cifzkup4j44@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Masami Hiramatsu 提交于
perf probe shows more precisely message when it finds given %return target function is inlined. Without this fix: ---- # ./perf probe -V getname_flags%return Return probe must be on the head of a real function. Debuginfo analysis failed. Error: Failed to show vars. ---- With this fix: ---- # ./perf probe -V getname_flags%return Failed to find "getname_flags%return", because getname_flags is an inlined function and has no return point. Debuginfo analysis failed. Error: Failed to show vars. ---- Suggested-by: NArnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20150930164137.3733.55055.stgit@localhost.localdomainSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Masami Hiramatsu 提交于
perf probe --list will get a segfault if the first kprobe event is on a module and the second or latter one is on the kernel. e.g. ---- # ./perf probe -q -m pcspkr pcspkr_event # ./perf probe -q vfs_read # ./perf probe -l Segmentation fault (core dumped) ---- This is because the debuginfo_cache fails to handle NULL module name, which causes segfault on strcmp. (Note that strcmp("something", NULL) always causes segfault) To fix this debuginfo_cache__open always translates the NULL module name to "kernel" (this is correct, because NULL module name means opening the debuginfo for the kernel) ---- # ./perf probe -l probe:pcspkr_event (on pcspkr_event@drivers/input/misc/pcspkr.c in pcspkr) probe:vfs_read (on vfs_read@ksrc/linux-3/fs/read_write.c) ---- Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20150930164135.3733.23993.stgit@localhost.localdomainSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Masami Hiramatsu 提交于
Perf probe always failed to find appropriate line numbers because of failing to find .text start address offset from debuginfo. e.g. ---- # ./perf probe -m pcspkr pcspkr_event:5 Added new events: probe:pcspkr_event (on pcspkr_event:5 in pcspkr) probe:pcspkr_event_1 (on pcspkr_event:5 in pcspkr) You can now use it in all perf tools, such as: perf record -e probe:pcspkr_event_1 -aR sleep 1 # ./perf probe -l Failed to find debug information for address ffffffffa031f006 Failed to find debug information for address ffffffffa031f016 probe:pcspkr_event (on pcspkr_event+6 in pcspkr) probe:pcspkr_event_1 (on pcspkr_event+22 in pcspkr) ---- This fixes the above issue as below. 1. Get the relative address of the symbol in .text by using map->start. 2. Adjust the address by adding the offset of .text section in the kernel module binary. With this fix, perf probe -l shows lines correctly. ---- # ./perf probe -l probe:pcspkr_event (on pcspkr_event:5@drivers/input/misc/pcspkr.c in pcspkr) probe:pcspkr_event_1 (on pcspkr_event:5@drivers/input/misc/pcspkr.c in pcspkr) ---- Reported-by: NArnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20150930164132.3733.24643.stgit@localhost.localdomainSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Masami Hiramatsu 提交于
Fix a trival bug about libdwfl usage of the report session, it should explicitly begin and end a report session around dwfl_report_offline(). Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20150930164128.3733.59876.stgit@localhost.localdomainSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Masami Hiramatsu 提交于
Fix to remove dot suffix (e.g. .const, .isra) from the second or latter events which has suffix numbers. Since the previous commit 35a23ff9 ("perf probe: Cut off the gcc optimization postfixes from function name") didn't care about the suffix numbered events, therefore we'll have an error when we add additional events on the same dot suffix functions. e.g. ---- # ./perf probe -f -a get_sigframe.isra.2.constprop.3 \ -a get_sigframe.isra.2.constprop.3 Failed to write event: Invalid argument Error: Failed to add events. ---- This fixes above issue as below: ---- # ./perf probe -f -a get_sigframe.isra.2.constprop.3 \ -a get_sigframe.isra.2.constprop.3 Added new events: probe:get_sigframe (on get_sigframe.isra.2.constprop.3) probe:get_sigframe_1 (on get_sigframe.isra.2.constprop.3) You can now use it in all perf tools, such as: perf record -e probe:get_sigframe_1 -aR sleep 1 ---- Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20150930164130.3733.26573.stgit@localhost.localdomainSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
It is about binding, not type, we have just a letter in kallsyms that should map both for the ELF type (STT_FUNC, etc) and to the ELF symbol binding (STB_WEAK, STB_GLOBAL, etc), so rename it now before introducing kallsyms2_elf_type() Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-uu5vj343ms1q2wm55690on6v@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
And it is also a step in the direction of killing the separation of data and text maps in map_groups. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-rrds86kb3wx5wk8v38v56gw8@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
In places where we were using its open coded equivalent. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-khkdugcdoqy3tkszm3jdxgbe@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Sukadev Bhattiprolu 提交于
The perf_regs.c file does not get built on Powerpc as CONFIG_PERF_REGS is false. So the weak definition for 'sample_regs_masks' doesn't get picked up. Adding perf_regs.o to util/Build unconditionally, exposes a redefinition error for 'perf_reg_value()' function (due to the static inline version in util/perf_regs.h). So use #ifdef HAVE_PERF_REGS_SUPPORT' around that function. Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20150930182836.GA27858@us.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Out of map_groups__find_symbol_by_name(), so that we can turn this later one first into a call to maps__find_symbol_by_name(MAP__FUNCTION) + MAP__VARIABLE, and then to just one call, we'll merge MAP__FUNCTION with MAP__VARIABLE maps, to simplify the code. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-pvkar0jacqn92g148u9sqttt@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
The error variable breaks build on CentOS 6.7, due to a collision with a global error symbol: CC util/parse-events.o cc1: warnings being treated as errors util/parse-events.c:419: error: declaration of ‘error’ shadows a global declaration util/util.h:135: error: shadowed declaration is here util/parse-events.c: In function ‘add_tracepoint_multi_event’: ... Using different argument names instead to fix it. Reported-by: NVinson Lee <vlee@twopensource.com> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: linux-tip-commits@vger.kernel.org Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Raphael Beamonte <raphael.beamonte@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150929150531.GI27383@krava.redhat.com [ Fix one more case, at line 770 ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 29 9月, 2015 18 次提交
-
-
由 He Kuang 提交于
This patch enables config terms for tracepoint perf events. Valid terms for tracepoint events are 'call-graph' and 'stack-size', so we can use different callgraph settings for each event and eliminate unnecessary overhead. Here is an example for using different call-graph config for each tracepoint. $ perf record -e syscalls:sys_enter_write/call-graph=fp/ -e syscalls:sys_exit_write/call-graph=no/ dd if=/dev/zero of=test bs=4k count=10 $ perf report --stdio # # Total Lost Samples: 0 # # Samples: 13 of event 'syscalls:sys_enter_write' # Event count (approx.): 13 # # Children Self Command Shared Object Symbol # ........ ........ ....... .................. ...................... # 76.92% 76.92% dd libpthread-2.20.so [.] __write_nocancel | ---__write_nocancel 23.08% 23.08% dd libc-2.20.so [.] write | ---write | |--33.33%-- 0x2031342820736574 | |--33.33%-- 0xa6e69207364726f | --33.33%-- 0x34202c7320393039 ... # Samples: 13 of event 'syscalls:sys_exit_write' # Event count (approx.): 13 # # Children Self Command Shared Object Symbol # ........ ........ ....... .................. ...................... # 76.92% 76.92% dd libpthread-2.20.so [.] __write_nocancel 23.08% 23.08% dd libc-2.20.so [.] write 7.69% 0.00% dd [unknown] [.] 0x0a6e69207364726f 7.69% 0.00% dd [unknown] [.] 0x2031342820736574 7.69% 0.00% dd [unknown] [.] 0x34202c7320393039 Signed-off-by: NHe Kuang <hekuang@huawei.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1443412336-120050-4-git-send-email-hekuang@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 He Kuang 提交于
Adds rules for parsing tracepoint names. Change rules of tracepoint which derives from PE_NAMEs into tracepoint names directly, so adding more rules based on tracepoint names will be easier. Changes v2-v3: - Change __event_legacy_tracepoint label in bison file to tracepoint_name - Fix formats error. Signed-off-by: NHe Kuang <hekuang@huawei.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1443412336-120050-3-git-send-email-hekuang@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 He Kuang 提交于
Show proper error message and show valid terms when wrong config terms is specified for hw/sw type perf events. This patch makes the original error format function formats_error_string() more generic, which only outputs the static config terms for hw/sw perf events, and prepends pmu formats for pmu events. Before this patch: $ perf record -e 'cpu-clock/freqx=200/' -a sleep 1 invalid or unsupported event: 'cpu-clock/freqx=200/' Run 'perf list' for a list of valid events usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events After this patch: $ perf record -e 'cpu-clock/freqx=200/' -a sleep 1 event syntax error: 'cpu-clock/freqx=200/' \___ unknown term valid terms: config,config1,config2,name,period,freq,branch_type,time,call-graph,stack-size Run 'perf list' for a list of valid events usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events Signed-off-by: NHe Kuang <hekuang@huawei.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1443412336-120050-2-git-send-email-hekuang@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 He Kuang 提交于
Currently, function config_term() is used for checking config terms of all types of events, while unknown terms is not reported as an error because pmu events have valid terms in sysfs. But this is wrong when unknown terms are specificed to hw/sw events. This patch Adds the config_term callback so we can use separate check routines for each type of events. Signed-off-by: NHe Kuang <hekuang@huawei.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1443412336-120050-1-git-send-email-hekuang@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
autofdo incorrectly expects branch flags to include either mispred or predicted. In fact mispred = predicted = 0 is valid and means the flags are not supported, which they aren't by Intel PT. To make autofdo work, add a config option which will cause Intel PT decoder to set the mispred flag on all branches. Below is an example of using Intel PT with autofdo. The example is also added to the Intel PT documentation. It requires autofdo (https://github.com/google/autofdo) and gcc version 5. The bubble sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tutorial) amended to take the number of elements as a parameter. $ gcc-5 -O3 sort.c -o sort_optimized $ ./sort_optimized 30000 Bubble sorting array of 30000 elements 2254 ms $ cat ~/.perfconfig [intel-pt] mispred-all $ perf record -e intel_pt//u ./sort 3000 Bubble sorting array of 3000 elements 58 ms [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 3.939 MB perf.data ] $ perf inject -i perf.data -o inj --itrace=i100usle --strip $ ./create_gcov --binary=./sort --profile=inj --gcov=sort.gcov -gcov_version=1 $ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo $ ./sort_autofdo 30000 Bubble sorting array of 30000 elements 2155 ms Note there is currently no advantage to using Intel PT instead of LBR, but that may change in the future if greater use is made of the data. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-26-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Add a counterpart to perf_evlist__add() that does the opposite and deletes the evsel. This will be used by perf inject to remove unwanted evsels. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-23-git-send-email-adrian.hunter@intel.com [ Renamed it from perf_evlist__del() to perf_evlist__remove() and removed the perf_evsel__delete() call ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
perf_evlist__id2evsel_strict() is the same as perf_evlist__id2evsel() except that it ensures that the id must match. This will be used by perf inject to find a specific evsel that is to be deleted, hence the need to match exactly. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-22-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Use the scripting_max_stack value to allow for values greater than PERF_MAX_STACK_DEPTH. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-20-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Add a setting for maximum stack depth in preparation for allowing for synthesized callchains. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-19-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Use the max_stack value instead of PERF_MAX_STACK_DEPTH so that arbitrary-sized callchains can be supported. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-17-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Add support for generating branch stack context for PT samples. The decoder reports a configurable number of branches as branch context for each sample. Internally it keeps track of them by using a simple sliding window. We also flush the last branch buffer on each sample to avoid overlapping intervals. This is useful for: - Reporting accurate basic block edge frequencies through the perf report branch view - Using with --branch-history to get the wider context of samples - Other users of LBRs Also the Documentation is updated. Examples: Record with Intel PT: perf record -e intel_pt//u ls Branch stacks are used by default if synthesized so: perf report --itrace=ile is the same as: perf report --itrace=ile -b Branch history can be requested also: perf report --itrace=igle --branch-history Based-on-patch-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-15-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
intel_pt_synth_branch_sample() skips synthesizing if the branch does not match the branch filter. That logic was sitting in the middle of the function but is more efficiently placed at the start of the function, so move it. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-14-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Add AUX area tracing option 'l' to synthesize branch stacks on samples just like sample type PERF_SAMPLE_BRANCH_STACK. This is taken into use by Intel PT in a subsequent patch. Based-on-patch-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-9-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
By default 'perf record' will postprocess the perf.data file to determine build-ids. When that happens, the number of lost perf events is displayed. Make that also happen for AUX events. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-7-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Logging is only used for debugging. Use macros to save calling into the functions only to return immediately when logging is not enabled. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-5-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
TSC packets contain only 7 bytes of TSC. The 8th byte is assumed to change so infrequently that its value can be inferred. However the logic must cater for a 7 byte wraparound, which it does by adding 1 to the top byte. The existing code was doing that with a while loop even though the addition should only need to be done once. That logic won't work (will loop forever) if TSC wraps around at the 8th byte. Theoretically that would take at least 10 years, unless something else went wrong. And what else could go wrong. Well, if the chunks of trace data are processed out of order, it will make it look like the 7-byte TSC has gone backwards (i.e. wrapped). If that happens 256 times then stuck in the while loop it will be. Fix that by getting rid of the unnecessary while loop. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-4-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Instruction tracing options (i.e. --itrace) include an option for sampling instructions at an arbitrary period. e.g. --itrace=i10us means make an 'instructions' sample for every 10us of trace. Currently the logic does not distinguish between a period of zero and no period being specified at all, so it gets treated as the default period which is 100000. That doesn't really make sense. Fix it so that zero period is accepted and treated as meaning "as often as possible". In the case of Intel PT that is the same as a period of 1 and a unit of 'instructions' (i.e. --itrace=i1i). Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-2-git-send-email-adrian.hunter@intel.com [ Add a few lines describing this in the Documentation/intel-pt.txt file ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Equivalent and removes one more case of using dso->kernel. # perf record -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.768 MB perf.data (30 samples) ] Before: [root@zoo ~]# perf script --show-task --show-mmap | head -3 swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffffa0000000(0xa000) @ 0]: x /lib/modules/4.3.0-rc1+/kernel/drivers/acpi/video.ko swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffffa000a000(0x5000) @ 0]: x /lib/modules/4.3.0-rc1+/kernel/drivers/i2c/algos/i2c-algo-bit.ko # # perf script --show-task --show-mmap | head -3 swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffffa0000000(0xa000) @ 0]: x /lib/modules/4.3.0-rc1+/kernel/drivers/acpi/video.ko swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffffa000a000(0x5000) @ 0]: x /lib/modules/4.3.0-rc1+/kernel/drivers/i2c/algos/i2c-algo-bit.ko # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-b65xe578dwq22mzmmj5y94wr@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 9月, 2015 2 次提交
-
-
由 Adrian Hunter 提交于
A copy of /proc/kcore containing the kernel text can be made to the buildid cache. e.g. perf buildid-cache -v -k /proc/kcore To workaround objdump limitations, a copy is also made when annotating against /proc/kcore. The copying process stops working from libelf about v1.62 onwards (the problem was found with v1.63). The cause is that a call to gelf_getphdr() in kcore__add_phdr() fails because additional validation has been added to gelf_getphdr(). The use of gelf_getphdr() is a misguided attempt to get default initialization of the Gelf_Phdr structure. That should not be necessary because every member of the Gelf_Phdr structure is subsequently assigned. So just remove the call to gelf_getphdr(). Similarly, a call to gelf_getehdr() in gelf_kcore__init() can be removed also. Committer notes: Note to stable@kernel.org, from Adrian in the cover letter for this patchkit: The "Fix copying of /proc/kcore" problem goes back to v3.13 if you think it is important enough for stable. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1443089122-19082-3-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
We have map_groups__find_by_name() to look at the list of modules that are in place for a given machine, so use it instead of traversing the machine dso list, which also includes DSOs for userspace. When merging the user and kernel DSO lists a bug was introduced where 'perf probe' stopped being able to add probes to modules using its short name: # perf probe -m usbnet --add usbnet_start_xmit usbnet_start_xmit is out of .text, skip it. Error: Failed to add events. # With this fix it works again: # perf probe -m usbnet --add usbnet_start_xmit Added new event: probe:usbnet_start_xmit (on usbnet_start_xmit in usbnet) You can now use it in all perf tools, such as: perf record -e probe:usbnet_start_xmit -aR sleep 1 # Reported-by: NWang Nan <wangnan0@huawei.com> Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Fixes: 3d39ac53 ("perf machine: No need to have two DSOs lists") Link: http://lkml.kernel.org/r/20150924015008.GE1897@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 23 9月, 2015 1 次提交
-
-
由 Namhyung Kim 提交于
When perf creates a new child to profile, the events are enabled on exec(). And in this case, it doesn't synthesize any event for the child since they'll be generated during exec(). But there's an window between the enabling and the event generation. It used to be overcome since samples are only in kernel (so we always have the map) and the comm is overridden by a later COMM event. However it won't work if events are processed and displayed before the COMM event overrides like in 'perf script'. This leads to those early samples (like native_write_msr_safe) not having a comm but pid (like ':15328'). So it needs to synthesize COMM event for the child explicitly before enabling so that it can have a correct comm. But at this time, the comm will be "perf" since it's not exec-ed yet. Committer note: Before this patch: # perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] # perf script --show-task-events :4429 4429 27909.079372: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079375: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079376: 10 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079377: 223 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079378: 6571 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. usleep 4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429 usleep 4429 27909.079381: 185403 cycles: ffffffff810a72d3 flush_signal_handlers (/lib/modules/4. usleep 4429 27909.079444: 2241110 cycles: 7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so) usleep 4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429) After: # perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] # perf script --show-task perf 0 0.000000: PERF_RECORD_COMM: perf:8446/8446 perf 8446 30154.038944: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038948: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038949: 9 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038950: 230 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038951: 6772 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. usleep 8446 30154.038952: PERF_RECORD_COMM exec: usleep:8446/8446 usleep 8446 30154.038954: 196923 cycles: ffffffff81766440 _raw_spin_lock (/lib/modules/4.3.0-rc1 usleep 8446 30154.039021: 2292130 cycles: 7f609a173dc4 memcpy (/usr/lib64/ld-2.20.so) usleep 8446 30154.039349: PERF_RECORD_EXIT(8446:8446):(8446:8446) # Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1442881495-2928-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 22 9月, 2015 1 次提交
-
-
由 Wang Nan 提交于
Don't blindly retrieve and use a last element in the lists returned by parse_events__scanner(), as it may have collected no entries, i.e. return an empty list. Signed-off-by: NWang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1441523623-152703-2-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 18 9月, 2015 3 次提交
-
-
由 Mark Rutland 提交于
If a session contains no events, we can get stuck in an infinite loop in __perf_session__process_events, with a non-zero file_size and data_offset, but a zero data_size. In this case, we can mmap the entirety of the file (consisting of the file and attribute headers), and fetch_mmaped_event will correctly refuse to read any (unmapped and non-existent) event headers. This causes __perf_session__process_events to unmap the file and retry with the exact same parameters, getting stuck in an infinite loop. This has been observed to result in an exit-time hang when counting rare/unschedulable events with perf record, and can be triggered artificially with the script below: ---- #!/bin/sh printf "REPRO: launching perf\n"; ./perf record -e software/config=9/ sleep 1 & PERF_PID=$!; sleep 0.002; kill -2 $PERF_PID; printf "REPRO: waiting for perf (%d) to exit...\n" "$PERF_PID"; wait $PERF_PID; printf "REPRO: perf exited\n"; ---- To avoid this, have __perf_session__process_events bail out early when the file has no data (i.e. it has no events). Commiter note: I only managed to reproduce this when setting /proc/sys/kernel/kptr_restrict to '1' and changing the code to purposefully not process any samples and no synthesized samples, i.e. kptr_restrict prevents 'record' from synthesizing the kernel mmaps for vmlinux + modules and since it is a workload started from perf, we don't synthesize mmap/comm records for existing threads. Adrian Hunter managed to reproduce it in his environment tho. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Tested-by: NArnaldo Carvalho de Melo <acme@kernel.org> Tested-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1442423929-12253-1-git-send-email-mark.rutland@arm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Peter Senna Tschudin 提交于
Returning a negative value for a boolean function seem to have the undesired effect of returning true. Replace -1 by false in a bool-returning function. The diff of the .s file before and after the change (for x86_64): 3907c3907 < movl $1, %ebx --- > xorl %ebx, %ebx while if -1 is replaced by true, the diff is empty. This issue was found by the following Coccinelle semantic patch: <smpl> @@ identifier f; constant C; typedef bool; @@ bool f (...){ <+... * return -C; ...+> } </smpl> Signed-off-by: NPeter Senna Tschudin <peter.senna@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Milos Vyletel <milos@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1442484533-19742-1-git-send-email-peter.senna@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
This reverts commit f785f235. We have a test to check if elf_getphdrnum() is present, so, if it fails, we'll get: [acme@rhel5 linux]$ cat /tmp/build/perf/feature/test-libelf-getphdrnum.make.output cc1: warnings being treated as errors test-libelf-getphdrnum.c: In function ‘main’: test-libelf-getphdrnum.c:7: warning: implicit declaration of function ‘elf_getphdrnum’ [acme@rhel5 linux]$ And this block will not be compiled: #ifndef HAVE_ELF_GETPHDRNUM_SUPPORT static int elf_getphdrnum(Elf *elf, size_t *dst) ... #endif So, if elf_getphdrnum() is being defined somewhere, there is a problem with the test that is not detecting that function, go fix it. Reported-by: NVinson Lee <vlee@twopensource.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Victor Kamensky <victor.kamensky@linaro.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-qn459fal6acvcvm50i8zxx9k@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 17 9月, 2015 1 次提交
-
-
由 Stephane Eranian 提交于
Per-pkg events need to be captured once per processor socket. The code in check_per_pkg() ensures only one value per processor package is used. However there is a problem with this function in case the first CPU of the package does not measure anything for the per-pkg event, but other CPUs do. Consider the following: $ create cgroup FOO; echo $$ >FOO/tasks; taskset -c 1 noploop & $ perf stat -a -I 1000 -e intel_cqm/llc_occupancy/ -G FOO sleep 100 1.00000 <not counted> Bytes intel_cqm/llc_occupancy/ FOO The reason for this is that CPU0 in the cgroup has nothing running on it. Yet check_per_plg() will mark socket0 as processed and no other event value will be considered for the socket. This patch fixes the problem by having check_per_pkg() only consider events which actually ran. Signed-off-by: NStephane Eranian <eranian@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1441286620-10117-1-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 15 9月, 2015 1 次提交
-
-
由 Adrian Hunter 提交于
Fix it by making it call perf_evlist__set_maps() instead of setting the maps itself. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/1441699142-18905-13-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-