- 27 6月, 2017 2 次提交
-
-
由 Adrian Hunter 提交于
Add a field to display the content the raw_data of a synthesized event. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-22-git-send-email-adrian.hunter@intel.com [ Resolved conflict with 106dacd8 ("perf script: Support -F brstackoff,dso") ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Instruction trace decoders such as Intel PT may have additional information recorded in the trace. For example, Intel PT has power information and a there is a new instruction 'ptwrite' that can write a value into a PTWRITE trace packet. Such information may be associated with an IP and so can be treated as a sample (PERF_RECORD_SAMPLE). Custom data can be incorporated in the sample as raw_data (PERF_SAMPLE_RAW). However a means of identifying the raw data format is needed. That will be done by synthesizing an attribute for it. So add an attribute type for custom synthesized events. Different synthesized events will be identified by the attribute 'config'. Committer notes: Start those PERF_TYPE_ after the PMU range, i.e. after (INT_MAX + 1U), i.e. after perf_pmu_register() -> idr_alloc(end=0). Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1498040239-32418-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 21 6月, 2017 1 次提交
-
-
由 Adrian Hunter 提交于
Fix message because field list option is -F not -f. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-20-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 20 6月, 2017 3 次提交
-
-
由 Mark Santaniello 提交于
The idea here is to make AutoFDO easier in cloud environment with ASLR. It's easiest to show how this is useful by example. I built a small test akin to "while(1) { do_nothing(); }" where the do_nothing function is loaded from a dso: $ cat burncpu.cpp #include <dlfcn.h> int main() { void* handle = dlopen("./dso.so", RTLD_LAZY); if (!handle) return -1; typedef void (*fp)(); fp do_nothing = (fp) dlsym(handle, "do_nothing"); while(1) { do_nothing(); } } $ cat dso.cpp extern "C" void do_nothing() {} $ cat build.sh #!/bin/bash g++ -shared dso.cpp -o dso.so g++ burncpu.cpp -o burncpu -ldl I sampled the execution of this program with perf record -b. Using the existing "brstack,dso", we get absolute addresses that are affected by ASLR, and could be different on different hosts. The address does not uniquely identify a branch/target in the binary: $ perf script -F brstack,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1 0x7f967139b6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0 Using the existing "brstacksym,dso" is a little better, because the symbol plus offset and dso name *does* uniquely identify a branch/target in the binary. Ultimately, however, AutoFDO wants a simple offset into the binary, so we'd have to undo all the work perf did to symbolize in the first place: $ perf script -F brstacksym,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1 do_nothing+0x5(/tmp/burncpu/dso.so)/main+0x44(/tmp/burncpu/exe)/P/-/-/0 With the new "brstackoff,dso" we get what we need: a simple offset into a specific dso/binary that uniquely identifies a branch/target: $ perf script -F brstackoff,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1 0x6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0 Signed-off-by: NMark Santaniello <marksan@fb.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170619163825.2012979-2-marksan@fb.com [ Updated documentation about 'brstackoff' using text from above ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mark Santaniello 提交于
Perf script can report the dso for "addr" and "ip" fields. This adds the same support for the "brstack" and "brstacksym" fields. This can be helpful for AutoFDO: we can ignore LBR entries unless the source and target address are both in the target module we are about to build. I built a small test akin to "while(1) { do_nothing(); }" where the do_nothing function is loaded from a dso: $ cat burncpu.cpp #include <dlfcn.h> int main() { void* handle = dlopen("./dso.so", RTLD_LAZY); if (!handle) return -1; typedef void (*fp)(); fp do_nothing = (fp) dlsym(handle, "do_nothing"); while(1) { do_nothing(); } } $ cat dso.cpp extern "C" void do_nothing() {} $ cat build.sh #!/bin/bash g++ -shared dso.cpp -o dso.so g++ burncpu.cpp -o burncpu -ldl I sampled the execution with perf record -b. Using the new perf script functionality I can easily find cases where there was a transition from one dso to another: $ perf record -a -b -- sleep 5 [ perf record: Woken up 55 times to write data ] [ perf record: Captured and wrote 18.815 MB perf.data (43593 samples) ] $ perf script -F brstack,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1 0x7f967139b6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0 $ perf script -F brstacksym,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1 do_nothing+0x5(/tmp/burncpu/dso.so)/main+0x44(/tmp/burncpu/exe)/P/-/-/0 Signed-off-by: NMark Santaniello <marksan@fb.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170619163825.2012979-1-marksan@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
With 'perf script' it is common that we just want to add or remove a field. Currently this requires figuring out the long list of default fields and specifying them first, and then adding/removing the new field. This patch adds a new + - syntax to merely add or remove fields, that allows more succint and clearer command lines For example to remove the comm field from PMU samples: Previously $ perf script -F tid,cpu,time,event,sym,ip,dso,period | head -1 swapper 0 [000] 504345.383126: 1 cycles: ffffffff90060c66 native_write_msr ([kernel.kallsyms]) with the new syntax perf script -F -comm | head -1 0 [000] 504345.383126: 1 cycles: ffffffff90060c66 native_write_msr ([kernel.kallsyms]) The new syntax cannot be mixed with normal overriding. v2: Fix example in description. Use tid vs pid. No functional changes. v3: Don't skip initialization when user specified explicit type. v4: Rebase. Remove empty line. Committer testing: # perf record -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.748 MB perf.data (14 samples) ] Without a explicit field list specified via -F, defaults to: # perf script | head -2 perf 6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) swapper 0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) # Which is equivalent to: # perf script -F comm,tid,cpu,time,period,event,ip,sym,dso | head -2 perf 6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) swapper 0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) # So if we want to remove the comm, as in your original example, we would have to figure out the default field list and remove ' comm' from it: # perf script -F tid,cpu,time,period,event,ip,sym,dso | head -2 6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) 0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) # With your patch this becomes simpler, one can remove fields by prefixing them with '-': # perf script -F -comm | head -2 6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) 0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) # Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Tested-by: NMilian Wolff <milian.wolff@kdab.com> Link: http://lkml.kernel.org/r/20170602154810.15875-1-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 5月, 2017 1 次提交
-
-
由 Namhyung Kim 提交于
The --inline option is to show inlined functions in callchains. For example: $ perf script a.out 5644 11611.467597: 309961 cycles:u: 790 main (/home/namhyung/tmp/perf/a.out) 20511 __libc_start_main (/usr/lib/libc-2.25.so) 8ba _start (/home/namhyung/tmp/perf/a.out) ... $ perf script --inline a.out 5644 11611.467597: 309961 cycles:u: 790 main (/home/namhyung/tmp/perf/a.out) std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > main 20511 __libc_start_main (/usr/lib/libc-2.25.so) 8ba _start (/home/namhyung/tmp/perf/a.out) ... Reviewed-and-tested-by: NMilian Wolff <milian.wolff@kdab.com> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170524062129.32529-5-namhyung@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 4月, 2017 2 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Not needed in this header, added to the places that need FILE, putchar(), access() and a few other prototypes. Link: http://lkml.kernel.org/n/tip-xxtdsl6nsna82j7puwbdjqhs@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
As it is going away from util.h, where it is not needed. This is mostly for things like MAXPATHLEN, MAX() and MIN(), these later two probably should go away in favor of its kernel sources replacements. Link: http://lkml.kernel.org/n/tip-z1666f3fl3fqobxvjr5o2r39@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 21 4月, 2017 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
And remove it from util.h, disentangling it a bit more. Link: http://lkml.kernel.org/n/tip-2zg9s5nx90yde64j3g4z2uhk@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 20 4月, 2017 7 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
The files using the dirent.h routines should instead include it, reducing the includes hell that lead to longer build times. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-42g2f4z6nfg7mdb2ae97n7tj@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Removing it from util.h, part of an effort to disentangle the includes hell, that makes changes to util.h or something included by it to cause a complete rebuild of the tools. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ztrjy52q1rqcchuy3rubfgt2@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Moving them from util.h, where they don't belong. Since libc already have string.h, name it slightly differently, as string2.h. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-eh3vz5sqxsrdd8lodoro4jrw@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Continuing the split of util.[ch] into more manageable bits. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-5eu367rwcwnvvn7fz09l7xpb@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
More stuff that came from git, out of the hodge-podge that is util.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-e3lana4gctz3ub4hn4y29hkw@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Needed to use the PRI[xu](32,64) formatting macros. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wkbho8kaw24q67dd11q0j39f@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
To pave the way for further cleanups where linux/kernel.h may stop being included in some header. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-qqxan6tfsl6qx3l0v3nwgjvk@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 11 4月, 2017 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Just avoiding non-reentrant functions. Cc: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/n/tip-eqytykipd74epzl9aexvppcg@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 27 3月, 2017 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
We got it from the git sources but never used it for anything, with the place where this would be somehow used remaining: static int run_builtin(struct cmd_struct *p, int argc, const char **argv) { prefix = NULL; if (p->option & RUN_SETUP) prefix = NULL; /* setup_perf_directory(); */ Ditch it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-uw5swz05vol0qpr32c5lpvus@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 16 3月, 2017 1 次提交
-
-
由 Andi Kleen 提交于
Implement printing instruction sequences as hex dump for branch stacks. This relies on the x86 instruction decoder used by the PT decoder to find the lengths of instructions to dump them individually. This is good enough for pattern matching. This allows to study hot paths for individual samples, together with branch misprediction and cycle count / IPC information if available (on Skylake systems). % perf record -b ... % perf script -F brstackinsn ... read_hpet+67: ffffffff9905b843 insn: 74 ea # PRED ffffffff9905b82f insn: 85 c9 ffffffff9905b831 insn: 74 12 ffffffff9905b833 insn: f3 90 ffffffff9905b835 insn: 48 8b 0f ffffffff9905b838 insn: 48 89 ca ffffffff9905b83b insn: 48 c1 ea 20 ffffffff9905b83f insn: 39 f2 ffffffff9905b841 insn: 89 d0 ffffffff9905b843 insn: 74 ea # PRED Only works when no special branch filters are specified. Occasionally the path does not reach up to the sample IP, as the LBRs may be frozen before executing a final jump. In this case we print a special message. The instruction dumper piggy backs on the existing infrastructure from the IP PT decoder. An earlier iteration of this patch relied on a disassembler, but this version only uses the existing instruction decoder. Committer note: Added hint about how to get suitable perf.data files for use with '-F brstackinsm': $ perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ] $ $ perf script -F brstackinsn Display of branch stack assembler requested, but non all-branch filter set Hint: run 'perf record -b ...' $ Signed-off-by: NAndi Kleen <ak@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 15 3月, 2017 1 次提交
-
-
由 Hari Bathini 提交于
Introduce a new option to display events of type PERF_RECORD_NAMESPACES and update perf-script documentation accordingly. Shown below is output (trimmed) of perf script command with the newly introduced option, on perf.data generated with perf record command using --namespaces option. $ perf script --show-namespace-events swapper 0 [000] 0.000000: PERF_RECORD_NAMESPACES 1/1 - nr_namespaces: 7 [0/net: 3/0xf000001c, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc, 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb] swapper 0 [000] 0.000000: PERF_RECORD_NAMESPACES 2/2 - nr_namespaces: 7 [0/net: 3/0xf000001c, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc, 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb] Commiter notes: Testing it: Investigating that double PERF_RECORD_NAMESPACES for the 19155 pid/tid... Its more than that, there are two PERF_RECORD_COMM as well, and with zeroed timestamps, so probably a synthesizing artifact... # perf script --show-task --show-namespace <SNIP> perf 0 [000] 0.000000: PERF_RECORD_COMM: perf:19154/19154 perf 0 [000] 0.000000: PERF_RECORD_FORK(19155:19155):(19154:19154) perf 0 [000] 0.000000: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc, 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb] perf 0 [000] 0.000000: PERF_RECORD_COMM: perf:19155/19155 perf 0 [000] 0.000000: PERF_RECORD_COMM: perf:19155/19155 perf 0 [000] 0.000000: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc, 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb] swapper 0 [000] 3110.881834: 1 cycles: ffffffffa7060bf6 native_write_msr (/lib/modules/4.11.0-rc1+/build/vmlinux) <SNIP> Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891932627.25309.1941587059154176221.stgit@hbathini.in.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 14 3月, 2017 1 次提交
-
-
由 Hari Bathini 提交于
Introduce a new option to record PERF_RECORD_NAMESPACES events emitted by the kernel when fork, clone, setns or unshare are invoked. And update perf-record documentation with the new option to record namespace events. Committer notes: Combined it with a later patch to allow printing it via 'perf report -D' and be able to test the feature introduced in this patch. Had to move here also perf_ns__name(), that was introduced in another later patch. Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt: util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=] ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx ^ Testing it: # perf record --namespaces -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ] # # perf report -D <SNIP> 3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7 [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc, 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb] 0x1151e0 [0x30]: event: 9 . . ... raw event: size 48 bytes . 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h.... . 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c.... . 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................ <SNIP> NAMESPACES events: 1 <SNIP> # Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 17 1月, 2017 1 次提交
-
-
由 Yannick Brosseau 提交于
In 2059fc7a ("perf symbols: Allow forcing reading of non-root owned files by root") 'perf report' was added the option of forcing reading of non-root owned symbol file. This add the same behavior for perf script. Reported-by: NMark Drayton <mbd@fb.com> Signed-off-by: NYannick Brosseau <scientist@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20170113182527.18625-1-scientist@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 02 12月, 2016 1 次提交
-
-
由 David Ahern 提交于
Add option to allow user to control analysis window. e.g., collect data for some amount of time and analyze a segment of interest within that window. Committer notes: Testing it: # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # # perf script --hide-call-graph | head -15 swapper 0 [0] 9693.370039: 1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [0] 9693.370044: 1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [0] 9693.370046: 7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [0] 9693.370048: 126 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [0] 9693.370049: 2701 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [0] 9693.370051: 58823 cycles:ppp: ffffffffb90cd2e0 idle_cpu (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370059: 1 cycles:ppp: ffffffffb91a713a ctx_resched (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370062: 1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370064: 13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370065: 250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370067: 5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux) perf 5124 [2] 9693.370076: 1 cycles:ppp: ffffffffb91a76c1 __perf_event_enable (.../4.8.8-300.fc25.x86_64/vmlinux) perf 5124 [2] 9693.370091: 1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux) perf 5124 [2] 9693.370095: 3 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) # # perf script --hide-call-graph --time ,9693.370048 swapper 0 [0] 9693.370039: 1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [0] 9693.370044: 1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [0] 9693.370046: 7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) # perf script --hide-call-graph --time 9693.370064,9693.370076 swapper 0 [1] 9693.370064: 13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370065: 250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370067: 5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux) # Signed-off-by: NDavid Ahern <dsahern@gmail.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1480439746-42695-4-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 30 11月, 2016 1 次提交
-
-
由 David Ahern 提交于
Allow user to specify list of symbols which cause the dump of callchains to stop at that symbol. Committer notes: Testing it: # perf record -ag usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.177 MB perf.data (33 samples) ] # # # Without it: # # perf script swapper 0 [000] 9693.370039: 1 cycles:ppp: 2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 137ffeb start_kernel ([kernel.vmlinux].init.text) 137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text) 137f419 x86_64_start_kernel ([kernel.vmlinux].init.text) swapper 0 [000] 9693.370044: 1 cycles:ppp: 20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 137ffeb start_kernel ([kernel.vmlinux].init.text) 137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text) # # # Using it to see just what are the calls from the 'remote_function' function: # # perf script --stop-bt remote_function swapper 0 [000] 9693.370039: 1 cycles:ppp: 2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [000] 9693.370044: 1 cycles:ppp: 20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1480104021-36275-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 28 10月, 2016 1 次提交
-
-
由 Namhyung Kim 提交于
Joonwoo reported that there's a mismatch between timestamps in script and sched commands. This was because of difference in printing the timestamp. Factor out the code and share it so that they can be in sync. Also I found that sched map has similar problem, fix it too. Committer notes: Fixed the max_lat_at bug introduced by Namhyung's original patch, as pointed out by Joonwoo, and made it a function following the scnprintf() model, i.e. returning the number of bytes formatted, and receiving as the first parameter the object from where the data to the formatting is obtained, renaming it from: char *timestamp_in_usec(char *bf, size_t size, u64 timestamp) to int timestamp__scnprintf_usec(u64 timestamp, char *bf, size_t size) Reported-by: NJoonwoo Park <joonwoop@codeaurora.org> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20161024020246.14928-3-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 10月, 2016 1 次提交
-
-
由 Andi Kleen 提交于
When looking at Intel PT traces with perf script it is useful to have some indication of the instruction. Dump the instruction bytes and instruction length, which can be used for simple pattern analysis in scripts. % perf record -e intel_pt// foo % perf script --itrace=i0ns -F ip,insn,insnlen ffffffff8101232f ilen: 5 insn: 0f 1f 44 00 00 ffffffff81012334 ilen: 1 insn: 5b ffffffff81012335 ilen: 1 insn: 5d ffffffff81012336 ilen: 1 insn: c3 ffffffff810123e3 ilen: 1 insn: 5b ffffffff810123e4 ilen: 2 insn: 41 5c ffffffff810123e6 ilen: 1 insn: 5d ffffffff810123e7 ilen: 1 insn: c3 ffffffff810124a6 ilen: 2 insn: 31 c0 ffffffff810124a8 ilen: 9 insn: 41 83 bc 24 a8 01 00 00 01 ffffffff810124b1 ilen: 2 insn: 75 87 ... Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NAdrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/1475847747-30994-4-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 05 9月, 2016 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
We're not using it anymore, few users were, but we really could do without it, simplify lots of functions by removing it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-1zng8wdznn00iiz08bb7q3vn@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 8月, 2016 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
And remove it from tools/perf/{perf,util}.h, making code that needs these macros to include linux/time64.h instead, to match how this is used in the kernel sources. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-e69fc1pvkgt57yvxqt6eunyg@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 16 8月, 2016 2 次提交
-
-
由 He Kuang 提交于
Because perf data from pipe do not have a header with evsel attr, we should not check that and disable symbol_conf.use_callchain. Otherwise, perf script won't show callchains even if the data stream contains callchain. Before: $ perf record -g -o - uname |perf script Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] uname 1828 182630.186578: 250000 cpu-clock: ..b9499 setup_arg_pages uname 1828 182630.186850: 250000 cpu-clock: ..83b20 ___might_sleep uname 1828 182630.187153: 250000 cpu-clock: ..4b6be file_map_prot_ch ... After: $ perf record -g -o - uname |perf script Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] uname 1833 182675.927099: 250000 cpu-clock: ba5520 _raw_spin_lock+0xfe200040 ([kernel.kallsyms]) 389dd4 expand_downwards+0xfe200154 ([kernel.kallsyms]) 389f34 expand_stack+0xfe200024 ([kernel.kallsyms]) 3b957e setup_arg_pages+0xfe20019e ([kernel.kallsyms]) 40c80f load_elf_binary+0xfe20042f ([kernel.kallsyms]) ... Signed-off-by: NHe Kuang <hekuang@huawei.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1470309943-153909-2-git-send-email-hekuang@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 He Kuang 提交于
Perf shows the usage message when perf scripts folder failed to open, which misleads users to let them think the command is being mistyped. This patch shows a proper message and guides users to check the PERF_EXEC_PATH environment variable in that case. Before: $ perf script --list Usage: perf script [<options>] or: perf script [<options>] record <script> [<record-options>] <command> or: perf script [<options>] report <script> [script-args] or: perf script [<options>] <script> [<record-options>] <command> or: perf script [<options>] <top-script> [script-args] -l, --list list available scripts After: $ perf script --list open(/home/user/perf-core/scripts) failed. Check for "PERF_EXEC_PATH" env to set scripts dir. Signed-off-by: NHe Kuang <hekuang@huawei.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1470309943-153909-1-git-send-email-hekuang@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 09 8月, 2016 1 次提交
-
-
由 Brendan Gregg 提交于
This adds the 'bpf-output' field to the perf script usage message, and docs. Signed-off-by: NBrendan Gregg <bgregg@netflix.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1470192469-11910-4-git-send-email-bgregg@netflix.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 6月, 2016 2 次提交
-
-
由 Adrian Hunter 提交于
Based on patches from Andi Kleen. When printing PT instruction traces with perf script it is rather useful to see some indentation for the call tree. This patch adds a new callindent field to perf script that prints spaces for the function call stack depth. We already have code to track the function call stack for PT, that we can reuse with minor modifications. The resulting output is not quite as nice as ftrace yet, but a lot better than what was there before. Note there are some corner cases when the thread stack gets code confused and prints incorrect indentation. Even with that it is fairly useful. When displaying kernel code traces it is recommended to run as root, as otherwise perf doesn't understand the kernel addresses properly, and may not reset the call stack correctly on kernel boundaries. Example output: sudo perf-with-kcore record eg2 -a -e intel_pt// -- sleep 1 sudo perf-with-kcore script eg2 --ns -F callindent,time,comm,pid,sym,ip,addr,flags,cpu --itrace=cre | less ... swapper 0 [000] 5830.389116586: call irq_exit ffffffff8104d620 smp_call_function_single_interrupt+0x30 => ffffffff8107e720 irq_exit swapper 0 [000] 5830.389116586: call idle_cpu ffffffff8107e769 irq_exit+0x49 => ffffffff810a3970 idle_cpu swapper 0 [000] 5830.389116586: return idle_cpu ffffffff810a39b7 idle_cpu+0x47 => ffffffff8107e76e irq_exit swapper 0 [000] 5830.389116586: call tick_nohz_irq_exit ffffffff8107e7bd irq_exit+0x9d => ffffffff810f2fc0 tick_nohz_irq_exit swapper 0 [000] 5830.389116919: call __tick_nohz_idle_enter ffffffff810f2fe0 tick_nohz_irq_exit+0x20 => ffffffff810f28d0 __tick_nohz_idle_enter swapper 0 [000] 5830.389116919: call ktime_get ffffffff810f28f1 __tick_nohz_idle_enter+0x21 => ffffffff810e9ec0 ktime_get swapper 0 [000] 5830.389116919: call read_tsc ffffffff810e9ef6 ktime_get+0x36 => ffffffff81035070 read_tsc swapper 0 [000] 5830.389116919: return read_tsc ffffffff81035084 read_tsc+0x14 => ffffffff810e9efc ktime_get swapper 0 [000] 5830.389116919: return ktime_get ffffffff810e9f46 ktime_get+0x86 => ffffffff810f28f6 __tick_nohz_idle_enter swapper 0 [000] 5830.389116919: call sched_clock_idle_sleep_event ffffffff810f290b __tick_nohz_idle_enter+0x3b => ffffffff810a7380 sched_clock_idle_sleep_event swapper 0 [000] 5830.389116919: call sched_clock_cpu ffffffff810a738b sched_clock_idle_sleep_event+0xb => ffffffff810a72e0 sched_clock_cpu swapper 0 [000] 5830.389116919: call sched_clock ffffffff810a734d sched_clock_cpu+0x6d => ffffffff81035750 sched_clock swapper 0 [000] 5830.389116919: call native_sched_clock ffffffff81035754 sched_clock+0x4 => ffffffff81035640 native_sched_clock swapper 0 [000] 5830.389116919: return native_sched_clock ffffffff8103568c native_sched_clock+0x4c => ffffffff81035759 sched_clock swapper 0 [000] 5830.389116919: return sched_clock ffffffff8103575c sched_clock+0xc => ffffffff810a7352 sched_clock_cpu swapper 0 [000] 5830.389116919: return sched_clock_cpu ffffffff810a7356 sched_clock_cpu+0x76 => ffffffff810a7390 sched_clock_idle_sleep_event swapper 0 [000] 5830.389116919: return sched_clock_idle_sleep_event ffffffff810a7391 sched_clock_idle_sleep_event+0x11 => ffffffff810f2910 __tick_nohz_idle_enter ... Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Acked-by: NAndi Kleen <ak@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1466689258-28493-4-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
The flags field is synthesized and may have a value when Instruction Trace decoding. The flags are "bcrosyiABEx" which stand for branch, call, return, conditional, system, asynchronous, interrupt, transaction abort, trace begin, trace end, and in transaction, respectively. Change the display so that known combinations of flags are printed more nicely e.g.: "call" for "bc", "return" for "br", "jcc" for "bo", "jmp" for "b", "int" for "bci", "iret" for "bri", "syscall" for "bcs", "sysret" for "brs", "async" for "by", "hw int" for "bcyi", "tx abrt" for "bA", "tr strt" for "bB", "tr end" for "bE". However the "x" flag will be displayed separately in those cases e.g. "jcc (x)" for a condition branch within a transaction. Example: perf record -e intel_pt//u ls perf script --ns -F comm,cpu,pid,tid,time,ip,addr,sym,dso,symoff,flags ... ls 3689/3689 [001] 2062.020965237: jcc 7f06a958847a _dl_sysdep_start+0xfa (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a9588450 _dl_sysdep_start+0xd0 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020965237: jmp 7f06a9588461 _dl_sysdep_start+0xe1 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a95885a0 _dl_sysdep_start+0x220 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020965237: jmp 7f06a95885a4 _dl_sysdep_start+0x224 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a9588470 _dl_sysdep_start+0xf0 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020965904: call 7f06a95884c3 _dl_sysdep_start+0x143 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a9589140 brk+0x0 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020965904: syscall 7f06a958914a brk+0xa (/lib/x86_64-linux-gnu/ld-2.19.so) => 0 [unknown] ([unknown]) ls 3689/3689 [001] 2062.020966237: tr strt 0 [unknown] ([unknown]) => 7f06a958914c brk+0xc (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020966237: return 7f06a9589165 brk+0x25 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a95884c8 _dl_sysdep_start+0x148 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020966237: jcc 7f06a95884d7 _dl_sysdep_start+0x157 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a95885f0 _dl_sysdep_start+0x270 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020966237: call 7f06a95885f0 _dl_sysdep_start+0x270 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a958ac50 strlen+0x0 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020966237: jcc 7f06a958ac6e strlen+0x1e (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a958ac60 strlen+0x10 (/lib/x86_64-linux-gnu/ld-2.19.so) ... Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Acked-by: NAndi Kleen <ak@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1466689258-28493-2-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 23 6月, 2016 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To match the semantics for list.h in the kernel, that are used to implement those macros. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-qbcjlgj0ffxquxscahbpddi3@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 04 6月, 2016 1 次提交
-
-
由 He Kuang 提交于
There's a display inconsistency when there are multiple tracepoint events, some of which have the 'call-graph' config option set but the first one hasn't, i.e. the whole logic for call graph processing is enabled only if the first tracepoint event has call-graph set. For instance, if we record signal_deliver with call-graph and signal_generate without: $ perf record -g -a -e signal:signal_deliver -e signal:signal_generate/call-graph=no/ [ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ] $ perf script kworker/u2:1 13 [000] 6563.875949: signal:signal_generate: sig=2 errno=0 code=128 comm=perf pid=1313 grp=1 res=0 ff61cc __send_signal+0x3ec ([kernel.kallsyms]) perf 1313 [000] 6563.877584: signal:signal_deliver: sig=2 errno=0 code=128 sa_handler=43115e sa_flags=14000000 7ffff314 get_signal+0x80007f0023a4 ([kernel.kallsyms]) 7fffe358 do_signal+0x80007f002028 ([kernel.kallsyms]) 7fffa5e8 exit_to_usermode_loop+0x80007f002053 ([kernel.kallsyms]) ... Then we exchange the order of these two events in commandline, and keep signal_generate without call-graph. $ perf record -g -a -e signal:signal_generate/call-graph=no/ -e signal:signal_deliver [ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ] $ perf script kworker/u2:2 1314 [000] 6933.353060: signal:signal_generate: sig=2 errno=0 code=128 comm=perf pid=1321 grp=1 res=0 perf 1321 [000] 6933.353872: signal:signal_deliver: sig=2 errno=0 code=128 sa_handler=43115e sa_flags=14000000 This time, the callchain of the event signal_deliver disappeared. The problem is caused by that perf only checks for the first evsel in evlist and decides if callchain should be printed. This patch traverses all evsels in evlist to see if any of them have callchains, and shows the right result: $ perf script kworker/u2:2 1314 [000] 6933.353060: signal:signal_generate: sig=2 errno=0 code=128 comm=perf pid=1321 grp=1 res=0 ff61cc __send_signal+0x3ec ([kernel.kallsyms]) perf 1321 [000] 6933.353872: signal:signal_deliver: sig=2 errno=0 code=128 sa_handler=43115e sa_flags=14000000 7ffff314 get_signal+0x80007f0023a4 ([kernel.kallsyms]) 7fffe358 do_signal+0x80007f002028 ([kernel.kallsyms]) 7fffa5e8 exit_to_usermode_loop+0x80007f002053 ([kernel.kallsyms]) ... Signed-off-by: NHe Kuang <hekuang@huawei.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1463374279-97209-1-git-send-email-hekuang@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 20 5月, 2016 2 次提交
-
-
由 He Kuang 提交于
This patch moves the reference of buildid dir to 'symfs/.debug' and skips the local buildid dir when '--symfs' is given, so that every single file opened by perf is relative to symfs directory now. Signed-off-by: NHe Kuang <hekuang@huawei.com> Acked-by: NDavid Ahern <dsahern@gmail.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1463658462-85131-2-git-send-email-hekuang@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
We cannot limit processing stacks from the current value of the sysctl, as we may be processing perf.data files, possibly from other machines. Instead use the old PERF_MAX_STACK_DEPTH, the sysctl default, that can be overriden using --max-stack or equivalent. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Fixes: 4cb93446 ("perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack") Link: http://lkml.kernel.org/n/tip-eqeutsr7n7wy0c36z24ytvii@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 12 5月, 2016 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
The readdir() function is thread safe as long as just one thread uses a DIR, which is the case in 'perf script', so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-mt3xz7n2hl49ni2vx7kuq74g@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 27 4月, 2016 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
There is an upper limit to what tooling considers a valid callchain, and it was tied to the hardcoded value in the kernel, PERF_MAX_STACK_DEPTH (127), now that this can be tuned via a sysctl, make it read it and use that as the upper limit, falling back to PERF_MAX_STACK_DEPTH for kernels where this sysctl isn't present. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-yjqsd30nnkogvj5oyx9ghir9@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-