- 06 Dec, 2016 14 commits
-
-
Committed by Jiri Olsa
Put an extra blank line between the dependencies and the cmd_* definition to make the file more readable.

Before:

  $ cat .builtin-top.o.cmd
  ...
   /home/jolsa/kernel/linux-perf/tools/include/linux/stringify.h \
   /home/jolsa/kernel/linux-perf/tools/include/linux/time64.h
  cmd_builtin-top.o := gcc -Wp,-MD,./.builtin-top.o.d -Wp,-MT,builtin-... ...

After:

  $ cat .builtin-top.o.cmd
  ...
   /home/jolsa/kernel/linux-perf/tools/include/linux/stringify.h \
   /home/jolsa/kernel/linux-perf/tools/include/linux/time64.h

  cmd_builtin-top.o := gcc -Wp,-MD,./.builtin-top.o.d -Wp,-MT,builtin-... ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
After this patch, perf uses builtin clang support to build BPF scripts and no longer depends on an external clang, although it falls back to the external clang if the builtin compiling framework fails for some reason.

Test:

  $ type clang
  -bash: type: clang: not found
  $ cat ~/.perfconfig
  $ echo '#define LINUX_VERSION_CODE 0x040700' > ./test.c
  $ cat ./tools/perf/tests/bpf-script-example.c >> ./test.c
  $ ./perf record -v --dry-run -e ./test.c 2>&1 | grep builtin
  bpf: successfull builtin compilation
  $

cflags can't be passed yet, so kernel headers can't be included for now. This will be fixed by the following commits.

Committer notes:

Make sure '-v' comes before '-e ./test.c' on the command line. Otherwise the 'verbose' variable will not be set when the bpf event is parsed, and the pr_debug indicating a 'successfull builtin compilation' will not be output, because the debug level (1) will be greater than the value 'verbose' has at that point (0).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-16-wangnan0@huawei.com
[ Spell check/reflow successfull pr_debug string ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Introduce getBPFObjectFromModule() to compile LLVM IR (a Module) into a BPF object, and add a new test case for it.

Test result:

  $ ./buildperf/perf test -v clang
  51: builtin clang support :
  51.1: builtin clang compile C source to IR :
  --- start ---
  test child forked, pid 21822
  test child finished with 0
  ---- end ----
  builtin clang support subtest 0: Ok
  51.2: builtin clang compile C source to ELF object :
  --- start ---
  test child forked, pid 21823
  test child finished with 0
  ---- end ----
  builtin clang support subtest 1: Ok

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-15-wangnan0@huawei.com
[ Remove redundant "Test" from entry descriptions ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Allow C++ code to use util.h and tests/llvm.h. Let 'perf test' compile a real BPF script.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-14-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Extend the getModuleFromSource() API to accept a list of cflags. This will be used to pass LINUX_VERSION_CODE and -I flags.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-13-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Use clang's OverlayFileSystem facility to let CompilerInstance access the real file system. With this patch the '#include' directive can be used. Add a new getModuleFromSource() variant that takes a real file.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-12-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Add basic clang support in clang.cpp and the test__clang() test case. The first test case checks whether builtin clang is able to generate LLVM IR. tests/clang.c is a proxy: the real test case resides in utils/c++/clang-test.cpp, is written in C++, and exports a C interface to the perf test subsystem.

Test result:

  $ perf test -v clang
  51: builtin clang support :
  51.1: Test builtin clang compile C source to IR :
  --- start ---
  test child forked, pid 13215
  test child finished with 0
  ---- end ----
  Test builtin clang support subtest 0: Ok

Committer note:

Make sure you've enabled CLANG and LLVM builtin support by setting the LIBCLANGLLVM variable on the make command line, e.g.:

  make LIBCLANGLLVM=1 O=/tmp/build/perf -C tools/perf install-bin

Otherwise you'll get this when trying to do the 'perf test' call above:

  # perf test clang
  51: builtin clang support : Skip (not compiled in)
  #

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-11-wangnan0@huawei.com
[ Removed "Test" from descriptions, redundant and already removed from all the other entries ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Add the necessary C++ flags and link libraries to support builtin clang and LLVM. Link against all llvm and clang libraries, so that changes to clang's library layout don't need to be tracked. The downside is that linking perf takes much longer than usual.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-10-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Check whether a basic clang compiling environment is ready.

Unlike 'llvm-config --libs', which returns the llvm libraries in the right order and duplicates some of them if necessary, there is no counterpart for the clang libraries (-lclangxxx). To avoid extra complexity, and to avoid a new clang breaking the library ordering, use --start-group and --end-group.

In this test case the required clang libs are identified manually, in the hope that the list stays stable. Putting all clang libraries here is possible (using make's wildcard), but then feature checking becomes very slow.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-9-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Check whether a basic LLVM compiling environment is ready.

Use llvm-config to detect the include and library directories. Avoid 'llvm-config --cxxflags' because its result contains some unwanted flags such as --sysroot (if LLVM is built by yocto).

Use '?=' to set LLVM_CONFIG, so that explicitly passing LLVM_CONFIG to make overrides it.

Use 'llvm-config --libs BPF' to check whether the BPF backend is compiled in. Since BPF bytecode is currently the only required backend, there is no need to waste time linking llvm and clang if the BPF backend is missing. This also introduces an implicit requirement that LLVM be new enough: old LLVM doesn't support the BPF backend.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-8-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
The following commits will use builtin clang to compile BPF scripts. llvm__get_kbuild_opts() and llvm__get_nr_cpus() are extracted to help build the '-DKERNEL_VERSION_CODE' and '-D__NR_CPUS__' macros.

Do the object dumping in the bpf loader, so that the upcoming builtin clang compilation doesn't need to handle it.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-7-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Pass a pointer to perf hook functions so they receive context information during setup.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-6-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
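The idea of handing a context pointer to a hook can be sketched as follows. This is a minimal illustration, not perf's actual hook code: the names `hook_func_t`, `set_hook`, `invoke_hook`, `count_calls` and the hook table entries are all assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hook signature: every hook receives an opaque context pointer. */
typedef void (*hook_func_t)(void *ctx);

struct hook_desc {
	const char  *name; /* hook name used for lookup */
	hook_func_t  func; /* callback, NULL while unset */
	void        *ctx;  /* handed back to func unchanged */
};

/* Illustrative table; the entries exist only for this sketch. */
static struct hook_desc hooks[] = {
	{ "record_start", NULL, NULL },
	{ "record_end",   NULL, NULL },
};

#define NR_HOOKS (sizeof(hooks) / sizeof(hooks[0]))

/* Register func and its context under name; returns 0, or -1 if name is unknown. */
static int set_hook(const char *name, hook_func_t func, void *ctx)
{
	for (size_t i = 0; i < NR_HOOKS; i++) {
		if (strcmp(hooks[i].name, name) == 0) {
			hooks[i].func = func;
			hooks[i].ctx = ctx;
			return 0;
		}
	}
	return -1;
}

/* Call the hook registered under name, if any, passing it its own context. */
static void invoke_hook(const char *name)
{
	for (size_t i = 0; i < NR_HOOKS; i++) {
		if (strcmp(hooks[i].name, name) == 0 && hooks[i].func)
			hooks[i].func(hooks[i].ctx);
	}
}

/* Example callback: treats the context as an int counter and increments it. */
static void count_calls(void *ctx)
{
	int *counter = ctx;

	(*counter)++;
}
```

Registering `count_calls` with the address of an `int` and then invoking the hook increments that `int`; the callback needs no global state because everything it touches arrives through the pointer.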
-
Committed by Peter Foley
Clang doesn't support multiple arguments being passed to -Wp, so split them. Fixes this error:

  HOSTCC tools/objtool/fixdep.o
  cat: tools/objtool/.fixdep.o.d: No such file or directory

Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20161128024346.17371-1-pefoley2@pefoley.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Jiri Olsa
The fixdep tool, among other things, replaces the target of the object in the gcc-generated dependency output file. The parsing code assumes there is only a single target in the rule, but this is not always the case, as described here:

  https://gcc.gnu.org/ml/gcc-help/2016-11/msg00099.html

Make the fixdep code smart enough to skip all the possible targets.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Foley <pefoley2@pefoley.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20161201130025.GA16430@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
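Skipping every target in a rule, rather than just the first, could look roughly like the sketch below. It is not the fixdep source: `skip_targets` and `retarget_rule` are hypothetical helpers, and the sketch assumes the target list ends at the first colon that is followed by whitespace or the end of the string.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Return a pointer just past the colon that closes the target list of a
 * make-style rule such as "a.o b.o: a.c a.h", or NULL if there is none.
 * Assumption: the closing colon is followed by whitespace or end of string.
 */
static const char *skip_targets(const char *rule)
{
	for (const char *p = rule; *p; p++) {
		if (*p == ':' && (p[1] == '\0' || isspace((unsigned char)p[1])))
			return p + 1;
	}
	return NULL;
}

/*
 * Write the rule into out with all of its original targets replaced by the
 * single name in target. Returns 0 on success, -1 if the rule has no target
 * list or out is too small.
 */
static int retarget_rule(const char *rule, const char *target,
			 char *out, size_t outsz)
{
	const char *deps = skip_targets(rule);
	int n;

	if (!deps)
		return -1;
	n = snprintf(out, outsz, "%s:%s", target, deps);
	return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

With this approach a rule with one target and a rule with several targets go through the same path, since the scan never assumes where the first target ends.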
-
- 02 Dec, 2016 8 commits
-
-
Committed by Kim Phillips
This is a regex-converted version of the original:

  https://lkml.org/lkml/2016/5/19/461

Add basic support to recognise AArch64 assembly. This allows perf to identify AArch64 instructions that branch to other parts within the same function, thereby properly annotating them.

Rebased onto the new cross-arch annotation bits:

  https://lkml.org/lkml/2016/11/25/546

Sample output:

  security_file_permission vmlinux
   5.80 │    ← ret                            ▒
        │70:   ldr w0, [x21,#68]              ▒
   4.44 │    ↓ tbnz d0                        ▒
        │      mov w0, #0x24 // #36           ▒
   1.37 │      ands w0, w22, w0               ▒
        │    ↑ b.eq 60                        ▒
   1.37 │    ↓ tbnz e4                        ▒
        │      mov w19, #0x20000 // #131072   ▒
   1.02 │    ↓ tbz ec                         ▒
        │90:┌─→ldr x3, [x21,#24]              ▒
   1.37 │   │  add x21, x21, #0x10            ▒
        │   │  mov w2, w19                    ▒
   1.02 │   │  mov x0, x21                    ▒
        │   │  mov x1, x3                     ▒
   1.71 │   │  ldr x20, [x3,#48]              ▒
        │   │→ bl __fsnotify_parent           ▒
   0.68 │   │↑ cbnz 60                        ▒
        │   │  mov x2, x21                    ▒
   1.37 │   │  mov w1, w19                    ▒
        │   │  mov x0, x20                    ▒
   0.68 │   │  mov w5, #0x0 // #0             ▒
        │   │  mov x4, #0x0 // #0             ▒
   1.71 │   │  mov w3, #0x1 // #1             ▒
        │   │→ bl fsnotify                    ▒
   1.37 │   │↑ b 60                           ▒
        │d0:│  mov w0, #0x0 // #0             ▒
        │   │  ldp x19, x20, [sp,#16]         ▒
        │   │  ldp x21, x22, [sp,#32]         ▒
        │   │  ldp x29, x30, [sp],#48         ▒
        │   │← ret                            ▒
        │e4:│  mov w19, #0x10000 // #65536    ▒
        │   └──b 90                           ◆
        │ec:   brk #0x800                     ▒
  Press 'h' for help on key bindings

Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Chris Ryder <chris.ryder@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20161130092344.012e18e3e623bea395162f95@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
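Recognising branch mnemonics with a regular expression can be sketched with POSIX `regcomp()`/`regexec()`. The pattern below is an assumption chosen to cover the branches that appear in the sample output (`b`, `b.eq`, `tbz`, `tbnz`, `cbnz`) while rejecting calls such as `bl`; the expression actually used by the patch may differ, and `is_aarch64_jump` is a hypothetical helper name.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumed pattern for AArch64 jump mnemonics: an optional c/t prefix,
 * "b" or "br", an optional ".<cond>" suffix, and optional "n"/"z".
 * It is deliberately loose; it is an illustration, not perf's definition.
 */
static const char jump_pattern[] =
	"^[ct]?br?\\.?(cc|cs|eq|ge|gt|hi|le|ls|lt|mi|ne|pl)?n?z?$";

/*
 * Return true if mnemonic matches the jump pattern. Returns false on a
 * non-match and also if the pattern fails to compile.
 */
static bool is_aarch64_jump(const char *mnemonic)
{
	regex_t re;
	bool match;

	if (regcomp(&re, jump_pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	match = regexec(&re, mnemonic, 0, NULL, 0) == 0;
	regfree(&re);
	return match;
}
```

Compiling the expression on every call keeps the sketch short; a real implementation would more likely compile it once and reuse the `regex_t`.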
-
Committed by Kim Phillips
Presumably overlooked in commit 786c1b51 "perf annotate: Start supporting cross arch annotation". This doesn't fix a bug, since none of the affected arches support parsing dec/inc instructions yet.

Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Ryder <chris.ryder@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20161130092333.1cca5dd2c77e1790d61c1e9c@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by David Ahern
Add option to allow user to control analysis window. e.g., collect data for time window and analyze a segment of interest within that window. Committer notes: Testing it: Using the perf.data file captured via 'perf kmem record': # perf report --header-only # ======== # captured on: Tue Nov 29 16:01:53 2016 # hostname : jouet # os release : 4.8.8-300.fc25.x86_64 # perf version : 4.9.rc6.g5a6aca # arch : x86_64 # nrcpus online : 4 # nrcpus avail : 4 # cpudesc : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz # cpuid : GenuineIntel,6,61,4 # total memory : 20254660 kB # cmdline : /home/acme/bin/perf kmem record usleep 1 # event : name = kmem:kmalloc, , id = { 931980, 931981, 931982, 931983 }, type = 2, size = 112, config = 0x1b9, { sample_period, sample_freq } = 1, sample_typ # event : name = kmem:kmalloc_node, , id = { 931984, 931985, 931986, 931987 }, type = 2, size = 112, config = 0x1b7, { sample_period, sample_freq } = 1, sampl # event : name = kmem:kfree, , id = { 931988, 931989, 931990, 931991 }, type = 2, size = 112, config = 0x1b5, { sample_period, sample_freq } = 1, sample_type # event : name = kmem:kmem_cache_alloc, , id = { 931992, 931993, 931994, 931995 }, type = 2, size = 112, config = 0x1b8, { sample_period, sample_freq } = 1, s # event : name = kmem:kmem_cache_alloc_node, , id = { 931996, 931997, 931998, 931999 }, type = 2, size = 112, config = 0x1b6, { sample_period, sample_freq } = # event : name = kmem:kmem_cache_free, , id = { 932000, 932001, 932002, 932003 }, type = 2, size = 112, config = 0x1b4, { sample_period, sample_freq } = 1, sa # HEADER_CPU_TOPOLOGY info available, use -I to display # HEADER_NUMA_TOPOLOGY info available, use -I to display # pmu mappings: cpu = 4, intel_pt = 7, intel_bts = 6, uncore_arb = 13, cstate_pkg = 15, breakpoint = 5, uncore_cbox_1 = 12, power = 9, software = 1, uncore_im # HEADER_CACHE info available, use -I to display # missing features: HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT # ======== # # # 
Looking at just the histogram entries for the first event: # # perf report | head -33 # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 40 of event 'kmem:kmalloc' # Event count (approx.): 40 # # Overhead Trace output # ........ ............................................................................................................... # 37.50% call_site=ffffffffb91ad3c7 ptr=0xffff88895fc05000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL 10.00% call_site=ffffffffb9258416 ptr=0xffff888a1dc61f00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO 7.50% call_site=ffffffffb9258416 ptr=0xffff888a2640ac00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO 2.50% call_site=ffffffffb92759ba ptr=0xffff888a26776000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb9276864 ptr=0xffff8886f6b82600 bytes_req=136 bytes_alloc=192 gfp_flags=GFP_KERNEL|__GFP_ZERO 2.50% call_site=ffffffffb9276903 ptr=0xffff888aefcf0460 bytes_req=32 bytes_alloc=32 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb92ad0ce ptr=0xffff888756c98a00 bytes_req=392 bytes_alloc=512 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb92ad0ce ptr=0xffff888756c9ba00 bytes_req=504 bytes_alloc=512 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb92ad301 ptr=0xffff888a31747600 bytes_req=128 bytes_alloc=128 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb92ad511 ptr=0xffff888a9d26a2a0 bytes_req=28 bytes_alloc=32 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c11a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c12c0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c1540 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c15a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c15e0 
bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c16e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c1c20 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb936a7fb ptr=0xffff888a9d26a2a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL 2.50% call_site=ffffffffb9373e66 ptr=0xffff8889f1931240 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO 2.50% call_site=ffffffffb9373e66 ptr=0xffff8889f1931980 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO 2.50% call_site=ffffffffb9373e66 ptr=0xffff8889f1931a00 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO # # # And then limiting using the example for 'perf kmem stat --time' used # # in the previous changeset committer note we see that there were no # # kmem:kmalloc in that last part of the file, but there were some # # kmem:kmem_cache_alloc ones: # # perf report --time 20119.782088, --stdio # # Total Lost Samples: 0 # # Samples: 0 of event 'kmem:kmalloc' # Event count (approx.): 0 # # Overhead Trace output # ........ ............ # # Samples: 0 of event 'kmem:kmalloc_node' # Event count (approx.): 0 # # Overhead Trace output # ........ ............ # # Samples: 0 of event 'kmem:kfree' # Event count (approx.): 0 # # Overhead Trace output # ........ ............ # # Samples: 8 of event 'kmem:kmem_cache_alloc' # Event count (approx.): 8 # # Overhead Trace output # ........ .................................................................................................................. 
#
    75.00%  call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
    12.50%  call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK
    12.50%  call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO
#

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-7-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by David Ahern
Add option to allow user to control analysis window. e.g., collect data for time window and analyze a segment of interest within that window. Committer notes: Testing it: # perf kmem record usleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 1.540 MB perf.data (2049 samples) ] # perf evlist kmem:kmalloc kmem:kmalloc_node kmem:kfree kmem:kmem_cache_alloc kmem:kmem_cache_alloc_node kmem:kmem_cache_free # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events # # # Use 'perf script' to get a first approach, select a chunk for then using # # with 'perf kmem stat --time' # # perf script | tail -15 usleep 9889 [0] 20119.782088: kmem:kmem_cache_free: (selinux_file_free_security+0x27) call_site=ffffffffb936aa07 ptr=0xffff888a1df49fc0 perf 9888 [3] 20119.782088: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0 perf 9888 [3] 20119.782089: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO perf 9888 [3] 20119.782090: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0 perf 9888 [3] 20119.782090: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO usleep 9889 [0] 20119.782091: kmem:kmem_cache_alloc: (__sigqueue_alloc+0x4a) call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK perf 9888 [3] 20119.782091: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0 perf 9888 [3] 20119.782093: kmem:kmem_cache_free: (__sigqueue_free.part.17+0x33) call_site=ffffffffb90ad3f3 ptr=0xffff8889f071f6e0 perf 9888 [3] 20119.782098: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 
bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO perf 9888 [3] 20119.782098: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0 perf 9888 [3] 20119.782099: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO perf 9888 [3] 20119.782100: kmem:kmem_cache_alloc: (alloc_buffer_head+0x21) call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO perf 9888 [3] 20119.782101: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0 perf 9888 [3] 20119.782102: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO perf 9888 [3] 20119.782103: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0 # # # stats for the whole perf.data file, i.e. 
no interval specified
#
# perf kmem stat

SUMMARY (SLAB allocator)
========================
Total bytes requested: 172,628
Total bytes allocated: 173,088
Total bytes freed: 161,280
Net total bytes allocated: 11,808
Total bytes wasted on internal fragmentation: 460
Internal fragmentation: 0.265761%
Cross CPU allocations: 0/851
#
# # stats for an end open interval, after a certain time:
#
# perf kmem stat --time 20119.782088,

SUMMARY (SLAB allocator)
========================
Total bytes requested: 552
Total bytes allocated: 552
Total bytes freed: 448
Net total bytes allocated: 104
Total bytes wasted on internal fragmentation: 0
Internal fragmentation: 0.000000%
Cross CPU allocations: 0/8
#

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-6-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by David Ahern
Add option to allow user to control analysis window. e.g., collect data for time window and analyze a segment of interest within that window. Committer notes: Testing it: # perf sched record -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.593 MB perf.data (25 samples) ] # # perf sched timehist | head -18 Samples do not have callchains. time cpu task name wait time sch delay run time [tid/pid] (msec) (msec) (msec) ------------- ------ --------------- --------- --------- -------- 19818.635579 [0002] <idle> 0.000 0.000 0.000 19818.635613 [0000] perf[9116] 0.000 0.000 0.000 19818.635676 [0000] <idle> 0.000 0.000 0.063 19818.635678 [0000] rcuos/2[29] 0.000 0.002 0.001 19818.635696 [0002] perf[9117] 0.000 0.004 0.116 19818.635702 [0000] <idle> 0.001 0.000 0.024 19818.635709 [0002] migration/2[25] 0.000 0.003 0.012 19818.636263 [0000] usleep[9117] 0.005 0.000 0.560 19818.636316 [0000] <idle> 0.560 0.000 0.053 19818.636358 [0002] <idle> 0.129 0.000 0.649 19818.636358 [0000] usleep[9117] 0.053 0.002 0.042 # # perf sched timehist --time 19818.635696, Samples do not have callchains. time cpu task name wait time sch delay run time [tid/pid] (msec) (msec) (msec) ------------- ------ --------------- -------- --------- --------- 19818.635696 [0002] perf[9117] 0.000 0.120 0.000 19818.635702 [0000] <idle> 0.019 0.000 0.006 19818.635709 [0002] migration/2[25] 0.000 0.003 0.012 19818.636263 [0000] usleep[9117] 0.005 0.000 0.560 19818.636316 [0000] <idle> 0.560 0.000 0.053 19818.636358 [0002] <idle> 0.129 0.000 0.649 19818.636358 [0000] usleep[9117] 0.053 0.002 0.042 # # perf sched timehist --time 19818.635696,19818.635709 Samples do not have callchains. 
         time    cpu  task name        wait time  sch delay   run time
                      [tid/pid]           (msec)     (msec)     (msec)
------------- ------  ---------------  ---------  ---------  ---------
 19818.635696 [0002]  perf[9117]           0.000      0.120      0.000
 19818.635702 [0000]  <idle>               0.019      0.000      0.006
 19818.635709 [0002]  migration/2[25]      0.000      0.003      0.012
 19818.635709 [0000]  usleep[9117]         0.005      0.000      0.006
#

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by David Ahern
Add option to allow user to control analysis window. e.g., collect data for some amount of time and analyze a segment of interest within that window. Committer notes: Testing it: # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # # perf script --hide-call-graph | head -15 swapper 0 [0] 9693.370039: 1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [0] 9693.370044: 1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [0] 9693.370046: 7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [0] 9693.370048: 126 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [0] 9693.370049: 2701 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [0] 9693.370051: 58823 cycles:ppp: ffffffffb90cd2e0 idle_cpu (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370059: 1 cycles:ppp: ffffffffb91a713a ctx_resched (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370062: 1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370064: 13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370065: 250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370067: 5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux) perf 5124 [2] 9693.370076: 1 cycles:ppp: ffffffffb91a76c1 __perf_event_enable (.../4.8.8-300.fc25.x86_64/vmlinux) perf 5124 [2] 9693.370091: 1 cycles:ppp: 
ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
perf 5124 [2] 9693.370095: 3 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
#
# perf script --hide-call-graph --time ,9693.370048
swapper 0 [0] 9693.370039: 1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [0] 9693.370044: 1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [0] 9693.370046: 7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
# perf script --hide-call-graph --time 9693.370064,9693.370076
swapper 0 [1] 9693.370064: 13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [1] 9693.370065: 250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [1] 9693.370067: 5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux)
#

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by David Ahern
Code move only; no functional change intended.

Committer notes:

Fix the build on Ubuntu 16.04 x86-64 cross-compiling to S/390, with this set of auto-detected features:

  ... dwarf: [ on ]
  ... dwarf_getlocations: [ on ]
  ... glibc: [ on ]
  ... gtk2: [ OFF ]
  ... libaudit: [ OFF ]
  ... libbfd: [ OFF ]
  ... libelf: [ on ]
  ... libnuma: [ OFF ]
  ... numa_num_possible_cpus: [ OFF ]
  ... libperl: [ OFF ]
  ... libpython: [ OFF ]
  ... libslang: [ OFF ]
  ... libcrypto: [ OFF ]
  ... libunwind: [ OFF ]
  ... libdw-dwarf-unwind: [ on ]
  ... zlib: [ on ]
  ... lzma: [ OFF ]
  ... get_cpuid: [ OFF ]
  ... bpf: [ on ]

Where it was failing with:

  CC /tmp/build/perf/util/time-utils.o
  util/time-utils.c: In function 'parse_nsec_time':
  util/time-utils.c:17:13: error: implicit declaration of function 'strtoul' [-Werror=implicit-function-declaration]
    time_sec = strtoul(str, &end, 10);
               ^
  util/time-utils.c:17:2: error: nested extern declaration of 'strtoul' [-Werror=nested-externs]
    time_sec = strtoul(str, &end, 10);
    ^
  util/time-utils.c: In function 'perf_time__parse_str':
  util/time-utils.c:93:2: error: implicit declaration of function 'free' [-Werror=implicit-function-declaration]
    free(str);
    ^
  util/time-utils.c:93:2: error: incompatible implicit declaration of built-in function 'free' [-Werror]
  util/time-utils.c:93:2: note: include '<stdlib.h>' or provide a declaration of 'free'

Do as suggested and add a '#include <stdlib.h>' to get the free() and strtoul() declarations and fix the build.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by David Ahern
Add a function to parse a user time string of the form <start>,<stop>, where start and stop are times in sec.nsec format. Both the start and stop times are optional.

Add a function to determine whether a sample time is within a given time window of interest.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
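A parser for the <start>,<stop> form and the matching window check might be sketched as below. This is not the code added by the commit: `struct time_window`, `parse_sec_nsec`, `parse_time_window` and `time_in_window` are hypothetical names, and treating 0 as "unbounded" and both bounds as inclusive are assumptions of this sketch.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NSEC_PER_SEC 1000000000ULL

/* Window in nanoseconds; 0 means "no bound" (assumption of this sketch). */
struct time_window {
	uint64_t start;
	uint64_t end;
};

/* Parse "sec" or "sec.nsec" (at most 9 fractional digits) into nanoseconds. */
static int parse_sec_nsec(const char *s, uint64_t *out)
{
	char *end;
	uint64_t sec, nsec = 0;

	if (!isdigit((unsigned char)*s))
		return -1;
	sec = strtoull(s, &end, 10);
	if (*end == '.') {
		int digits = 0;

		for (end++; isdigit((unsigned char)*end); end++) {
			if (++digits > 9)
				return -1;
			nsec = nsec * 10 + (uint64_t)(*end - '0');
		}
		/* Scale short fractions: "1.5" means 500000000 ns. */
		for (; digits < 9; digits++)
			nsec *= 10;
	}
	if (*end != '\0')
		return -1;
	*out = sec * NSEC_PER_SEC + nsec;
	return 0;
}

/* Parse "<start>,<stop>"; either side may be empty. Returns 0 or -1. */
static int parse_time_window(const char *str, struct time_window *w)
{
	char buf[64];
	char *comma;

	w->start = w->end = 0;
	if (strlen(str) >= sizeof(buf))
		return -1;
	strcpy(buf, str);

	comma = strchr(buf, ',');
	if (comma)
		*comma++ = '\0';

	if (buf[0] != '\0' && parse_sec_nsec(buf, &w->start) != 0)
		return -1;
	if (comma && *comma != '\0' && parse_sec_nsec(comma, &w->end) != 0)
		return -1;
	if (w->end != 0 && w->end < w->start)
		return -1;
	return 0;
}

/* True if timestamp t (in ns) lies inside the window, bounds inclusive. */
static bool time_in_window(const struct time_window *w, uint64_t t)
{
	if (w->start != 0 && t < w->start)
		return false;
	if (w->end != 0 && t > w->end)
		return false;
	return true;
}
```

Converting both bounds to integer nanoseconds up front keeps the per-sample check to two comparisons and avoids floating-point rounding on timestamps.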
-
- 30 Nov, 2016 1 commit
-
-
Committed by David Ahern
Allow user to specify list of symbols which cause the dump of callchains to stop at that symbol. Committer notes: Testing it: # perf record -ag usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.177 MB perf.data (33 samples) ] # # # Without it: # # perf script swapper 0 [000] 9693.370039: 1 cycles:ppp: 2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 137ffeb start_kernel ([kernel.vmlinux].init.text) 137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text) 137f419 x86_64_start_kernel ([kernel.vmlinux].init.text) swapper 0 [000] 9693.370044: 1 cycles:ppp: 20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 22a14a nmi_handle 
(/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 137ffeb start_kernel ([kernel.vmlinux].init.text) 137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text) # # # Using it to see just what are the calls from the 'remote_function' function: # # perf script --stop-bt remote_function swapper 0 [000] 9693.370039: 1 cycles:ppp: 2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a76c1 __perf_event_enable 
(/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) swapper 0 [000] 9693.370044: 1 cycles:ppp: 20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) 3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux) Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1480104021-36275-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
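The effect of --stop-bt on one sample can be modelled as a simple truncation. This is a Python sketch only, not perf's implementation: truncate_callchain is a hypothetical name, and including the stop symbol itself in the result is inferred from the output shown above, where each printed chain ends with remote_function.

```python
from typing import Iterable, List, Set, Tuple

Frame = Tuple[int, str]  # (address, symbol name)


def truncate_callchain(frames: Iterable[Frame], stop_symbols: Set[str]) -> List[Frame]:
    """Keep frames up to and including the first one naming a stop symbol."""
    kept: List[Frame] = []
    for addr, sym in frames:
        kept.append((addr, sym))
        if sym in stop_symbols:
            break  # everything further up the stack is dropped
    return kept


# A few frames taken from the first sample in the committer notes above.
sample = [
    (0x2072AD, "x86_pmu_enable"),
    (0x3A0390, "event_function"),
    (0x3A1CFF, "remote_function"),
    (0x326978, "flush_smp_call_function_queue"),
    (0x889427, "cpuidle_enter"),
]
trimmed = truncate_callchain(sample, {"remote_function"})
```

If no frame matches a stop symbol, the chain is returned unchanged.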
-
- 29 Nov, 2016 6 commits
-
-
Committed by David Ahern
Track freed memory as well as allocations and show the net in the summary.

Committer notes:

Testing it:

  # perf kmem record usleep 1
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 1.626 MB perf.data (4208 samples) ]
  [root@jouet ~]# perf kmem stat --slab

  SUMMARY (SLAB allocator)
  ========================
  Total bytes requested: 234,011
  Total bytes allocated: 234,504
  Total bytes freed: 213,328 <------
  Net total bytes allocated: 21,176
  Total bytes wasted on internal fragmentation: 493
  Internal fragmentation: 0.210231%
  Cross CPU allocations: 4/1,963
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480110133-37039-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
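The derived figures in that summary follow from the three byte totals. The short Python check below reproduces them; kmem_summary is a hypothetical helper written for illustration, and computing the fragmentation percentage relative to the allocated total is an assumption that happens to match the numbers printed above.

```python
def kmem_summary(requested: int, allocated: int, freed: int) -> dict:
    """Derive the net and fragmentation figures from the raw byte totals."""
    wasted = allocated - requested  # bytes lost to internal fragmentation
    return {
        "net": allocated - freed,   # bytes still allocated at the end
        "wasted": wasted,
        # Assumption: percentage is taken relative to bytes allocated.
        "frag_pct": 100.0 * wasted / allocated if allocated else 0.0,
    }


# Totals copied from the 'perf kmem stat --slab' output above.
summary = kmem_summary(requested=234_011, allocated=234_504, freed=213_328)
```

With the totals from the test run, the net is 234,504 - 213,328 = 21,176 bytes and the waste is 234,504 - 234,011 = 493 bytes.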
-
Committed by Arnaldo Carvalho de Melo
Having "test" in almost all test descriptions is redundant. Simplify them by removing the word and rewriting the descriptions that contained it.

End result:

  # perf test
   1: vmlinux symtab matches kallsyms : Ok
   2: Detect openat syscall event : Ok
   3: Detect openat syscall event on all cpus : Ok
   4: Read samples using the mmap interface : Ok
   5: Parse event definition strings : Ok
   6: PERF_RECORD_* events & perf_sample fields : Ok
   7: Parse perf pmu format : Ok
   8: DSO data read : Ok
   9: DSO data cache : Ok
  10: DSO data reopen : Ok
  11: Roundtrip evsel->name : Ok
  12: Parse sched tracepoints fields : Ok
  13: syscalls:sys_enter_openat event fields : Ok
  14: Setup struct perf_event_attr : Ok
  15: Match and link multiple hists : Ok
  16: 'import perf' in python : Ok
  17: Breakpoint overflow signal handler : Ok
  18: Breakpoint overflow sampling : Ok
  19: Number of exit events of a simple workload : Ok
  20: Software clock events period values : Ok
  21: Object code reading : Ok
  22: Sample parsing : Ok
  23: Use a dummy software event to keep tracking: Ok
  24: Parse with no sample_id_all bit set : Ok
  25: Filter hist entries : Ok
  26: Lookup mmap thread : Ok
  27: Share thread mg : Ok
  28: Sort output of hist entries : Ok
  29: Cumulate child hist entries : Ok
  30: Track with sched_switch : Ok
  31: Filter fds with revents mask in a fdarray : Ok
  32: Add fd to a fdarray, making it autogrow : Ok
  33: kmod_path__parse : Ok
  34: Thread map : Ok
  35: LLVM search and compile :
  35.1: Basic BPF llvm compile : Ok
  35.2: kbuild searching : Ok
  35.3: Compile source for BPF prologue generation: Ok
  35.4: Compile source for BPF relocation : Ok
  36: Session topology : Ok
  37: BPF filter :
  37.1: Basic BPF filtering : Ok
  37.2: BPF prologue generation : Ok
  37.3: BPF relocation checker : Ok
  38: Synthesize thread map : Ok
  39: Synthesize cpu map : Ok
  40: Synthesize stat config : Ok
  41: Synthesize stat : Ok
  42: Synthesize stat round : Ok
  43: Synthesize attr update : Ok
  44: Event times : Ok
  45: Read backward ring buffer : Ok
  46: Print cpu map : Ok
  47: Probe SDT events : Ok
  48: is_printable_array : Ok
  49: Print bitmap : Ok
  50: perf hooks : Ok
  51: x86 rdpmc : Ok
  52: Convert perf time to TSC : Ok
  53: DWARF unwind : Ok
  54: x86 instruction decoder - new instructions : Ok
  55: Intel cqm nmi context read : Skip
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-rx2lbfcrrio2yx1fxcljqy0e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Perf hooks allow hooking user code at perf events. They can be used for manipulating BPF maps, taking snapshots and reporting results. In this patch two perf hook points are introduced: record_start and record_end.

To guard against buggy user actions, a SIGSEGV signal handler is introduced into 'perf record'. It turns off a perf hook if it causes a segfault and reports an error to help debugging.

A test case for perf hooks is introduced. Test result:

  $ ./buildperf/perf test -v hook
  50: Test perf hooks :
  --- start ---
  test child forked, pid 10311
  SIGSEGV is observed as expected, try to recover.
  Fatal error (SEGFAULT) in perf hook 'test'
  test child finished with 0
  ---- end ----
  Test perf hooks: Ok

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-5-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Add a new API to libbpf that lets the caller get a bpf_map through the offset of its bpf_map_def in the 'maps' section. The API will be used to help jitted perf hook code find the fd of a map.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-4-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Similar to the other classes defined in libbpf.h (map and program), allow the 'object' class to have its own private data.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-3-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Add more BPF map operations to libbpf. Also add bpf_obj_{pin,get}(). They can be used not only on BPF maps but also on BPF programs.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-2-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 26 Nov, 2016 4 commits
-
-
Committed by David Ahern
Leverage pid/tid filtering done by symbol_conf hooks.

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1480091392-35645-1-git-send-email-dsa@cumulusnetworks.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by David Ahern
Add handlers for the sched:sched_migrate_task event. The total number of migrations is added to the summary display, and -M/--migrations can be used to show migration events.

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1480091321-35591-1-git-send-email-dsa@cumulusnetworks.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Arnaldo Carvalho de Melo
To help in debugging when the wrong offset is being used, like in:

  │13d98: ↓ jne 13dd1 <lzma_lzma_preset@@XZ_5.0+0x28e1>

That is the full line from objdump, and it seems what should be used is 13dd1, not 28e1.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4nc0marsgst1ft6inmvqber7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Arnaldo Carvalho de Melo
To print some values, like in the annotation code with invalid jump offsets.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-1vk0g5twas2ioswn1mmvnvwq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 Nov, 2016 7 commits
-
-
Committed by Eric Leblond
It is not correct to treat the ELF data of the maps section as an array of map definitions; in fact the sizes differ. The offset provided in the symbol section has to be used instead.

This patch fixes a bug causing an ELF with two maps not to load correctly.

Wang Nan added:

This patch requires a name for each BPF map, so an array of BPF maps is not allowed. This restriction is reasonable, because the kernel verifier forbids indexing a BPF map from such an array unless the index is a fixed value, and if the index is fixed, why not merge it into the name?

For example:

Program like this:

  ...
  unsigned long cpu = get_smp_processor_id();
  int *pval = map_lookup_elem(&map_array[cpu], &key);
  ...

Generates bytecode like this:

  0: (b7) r1 = 0
  1: (63) *(u32 *)(r10 -4) = r1
  2: (b7) r1 = 680997
  3: (63) *(u32 *)(r10 -8) = r1
  4: (85) call 8
  5: (67) r0 <<= 4
  6: (18) r1 = 0x112dd000
  8: (0f) r0 += r1
  9: (bf) r2 = r10
  10: (07) r2 += -4
  11: (bf) r1 = r0
  12: (85) call 1

Where instruction 8 is the computation, 8 and 11 render r1 to an invalid value for function map_lookup_elem, causing the verifier to report an error.

Signed-off-by: Eric Leblond <eric@regit.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Wang Nan <wangnan0@huawei.com>
[ Merge bpf_object__init_maps_name into bpf_object__init_maps.
  Fix segfault for buggy BPF script
  Validate obj->maps ]
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-5-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Commit 0b3c2264 ("perf symbols: Fix kallsyms perf test on ppc64le") refers to struct symbol in probe_event.h, but forgets to include its definition. Gcc will complain about it when that definition is not added, by sheer luck, by some other header included before probe_event.h.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-4-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Wang Nan
Before this patch perf panics if kptr_restrict is set to 1 and perf is owned by root with suid set:

  $ whoami
  wangnan
  $ ls -l ./perf
  -rwsr-xr-x 1 root root 19781908 Sep 21 19:29 /home/wangnan/perf
  $ cat /proc/sys/kernel/kptr_restrict
  1
  $ cat /proc/sys/kernel/perf_event_paranoid
  -1
  $ ./perf record -a
  Segmentation fault (core dumped)
  $

The reason is that perf assumes it is allowed to read kptr from /proc/kallsyms when euid is root, but in fact the kernel doesn't allow reading kptr when euid and uid do not match each other:

  $ cp /bin/cat .
  $ sudo chown root:root ./cat
  $ sudo chmod u+s ./cat
  $ cat /proc/kallsyms | grep do_fork
  0000000000000000 T _do_fork <--- kptr is hidden even though euid is root
  $ sudo cat /proc/kallsyms | grep do_fork
  ffffffff81080230 T _do_fork

See lib/vsprintf.c for the kernel side code.

This patch fixes this problem by checking both uid and euid.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-3-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
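The decision this fix is about can be written as a small predicate. The Python sketch below is a simplified model and not the code in perf or in lib/vsprintf.c: kptr_hidden is a hypothetical name, the real kernel check also involves capabilities and group IDs, and treating any kptr_restrict value of 2 or more as "always hidden" is an assumption based on the documented sysctl semantics.

```python
def kptr_hidden(kptr_restrict: int, uid: int, euid: int) -> bool:
    """Return True if kernel pointers should be assumed hidden (zeroed).

    Simplified model: with kptr_restrict == 1, pointers are only trusted
    when BOTH the real and the effective user ID are root.
    """
    if kptr_restrict == 0:
        return False  # no restriction at all
    if kptr_restrict >= 2:
        return True   # assumption: hidden regardless of privileges
    # The old logic looked at euid only, which is wrong for a suid-root
    # binary started by an ordinary user (uid != 0, euid == 0).
    return not (uid == 0 and euid == 0)
```

The suid case from the report corresponds to kptr_hidden(1, uid, 0) with a non-zero uid, which this model reports as hidden, whereas an euid-only check would have trusted the zeroed addresses.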
-
Committed by Wang Nan
On Ubuntu the internal kernel version code is different from what can be retrieved from uname:

  $ uname -r
  4.4.0-47-generic
  $ cat /lib/modules/`uname -r`/build/include/generated/uapi/linux/version.h
  #define LINUX_VERSION_CODE 263192
  #define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))
  $ cat /lib/modules/`uname -r`/build/include/generated/utsrelease.h
  #define UTS_RELEASE "4.4.0-47-generic"
  #define UTS_UBUNTU_RELEASE_ABI 47
  $ cat /proc/version_signature
  Ubuntu 4.4.0-47.68-generic 4.4.24

The macro LINUX_VERSION_CODE is set to 4.4.24 (263192 == 0x40418), but `uname -r` reports 4.4.0. This mismatch causes the LINUX_VERSION_CODE macro passed to the BPF script to have an incorrect value, which results in an unexplained failure in BPF loading:

  $ sudo ./buildperf/perf record -e ./tools/perf/tests/bpf-script-example.c ls
  event syntax error: './tools/perf/tests/bpf-script-example.c'
                       \___ Failed to load program for unknown reason

According to the Ubuntu documentation (https://wiki.ubuntu.com/Kernel/FAQ), the correct kernel version can be retrieved through /proc/version_signature, which is Ubuntu specific.

This patch checks for the existence of /proc/version_signature, and returns the version number by parsing this file instead of using uname. The version string is untouched (the value returned from uname) because `uname -r` is required to be consistent with the path of the kbuild directory in /lib/modules.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-2-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
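The arithmetic behind the mismatch can be checked with a few lines of Python. This is a sketch rather than the patch itself: kernel_version mirrors the KERNEL_VERSION macro quoted above, while version_code_from_signature is a hypothetical helper that assumes the upstream version is the last whitespace-separated field of /proc/version_signature, as in the sample shown.

```python
import re


def kernel_version(a: int, b: int, c: int) -> int:
    """Python equivalent of the KERNEL_VERSION(a, b, c) macro."""
    return (a << 16) + (b << 8) + c


def version_code_from_signature(signature: str) -> int:
    """Compute the version code from an Ubuntu version_signature string."""
    fields = signature.split()
    # Assumption: the final field is the upstream version, e.g. '4.4.24'.
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", fields[-1]) if fields else None
    if match is None:
        raise ValueError(f"unrecognised signature: {signature!r}")
    major, minor, patch = (int(g) for g in match.groups())
    return kernel_version(major, minor, patch)


# Sample content of /proc/version_signature from the commit message.
code = version_code_from_signature("Ubuntu 4.4.0-47.68-generic 4.4.24\n")
```

Here 4.4.24 encodes to 263192 (0x40418), matching the value in version.h, whereas the 4.4.0 derived from `uname -r` would encode to 0x40400.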
-
Committed by Namhyung Kim
When it records callchains, they will always have 2 scheduler functions (__schedule + schedule or __schedule + preempt_schedule) and get ignored. So it should collect 2 more functions to show the expected number of callchains to user. Committer Notes: Example of final result, using the same perf.data file as in the previous cset comment, but this time redirecting the output of 'perf sched timehist' to a file instead of copy'n'pasting from xterm: [root@jouet experimental]# perf sched timehist > /tmp/bla [root@jouet experimental]# cat /tmp/bla time cpu task name wait time sch delay run time [tid/pid] (msec) (msec) (msec) -------- ---- -------------------- ------ ------ ----- 6.494998 [01] <idle> 0.000 0.000 0.000 6.495027 [02] perf[519] 0.000 0.000 0.000 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- sys_poll 6.495096 [03] <idle> 0.000 0.000 0.000 6.495100 [03] rcuos/0[9] 0.000 0.005 0.003 rcu_nocb_kthread <- kthread <- ret_from_fork 6.495113 [01] perf[520] 0.000 0.008 0.114 preempt_schedule_common <- _cond_resched <- wait_for_completion <- stop_one_cpu <- sched_exec <- do_execveat_common.isra.35 6.495121 [00] <idle> 0.000 0.000 0.000 6.495129 [01] migration/1[17] 0.000 0.003 0.016 smpboot_thread_fn <- kthread <- ret_from_fork 6.496085 [02] <idle> 0.000 0.000 1.057 6.496096 [02] kworker/u16:1[31169] 0.000 0.004 0.011 worker_thread <- kthread <- ret_from_fork 6.496096 [03] <idle> 0.003 0.000 0.996 6.496169 [02] <idle> 0.011 0.000 0.072 6.496171 [00] ls[520] 0.008 0.000 1.049 do_exit <- do_group_exit <- [unknown] <- entry_SYSCALL_64_fastpath 6.496172 [03] gnome-terminal-[4391] 0.000 0.003 0.076 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- sys_poll Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: 
Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20161124011114.7102-3-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
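The reasoning for collecting two extra entries can be illustrated with a small model. This Python sketch is not the timehist code: visible_callchain is a hypothetical name, and the ignored set is taken from the function names mentioned in the commit message above.

```python
from typing import List, Set

# Scheduler entry points named in the commit message; assumed to be the ones skipped.
IGNORED_SYMBOLS: Set[str] = {"__schedule", "schedule", "preempt_schedule"}


def visible_callchain(frames: List[str], max_stack: int,
                      ignored: Set[str] = IGNORED_SYMBOLS) -> List[str]:
    """Collect max_stack + 2 frames, drop ignored ones, show at most max_stack."""
    collected = frames[: max_stack + 2]  # two extra to make up for skipped frames
    shown = [sym for sym in collected if sym not in ignored]
    return shown[:max_stack]


chain = ["__schedule", "schedule", "rcu_nocb_kthread", "kthread", "ret_from_fork"]
result = visible_callchain(chain, max_stack=3)
```

Without the extra two entries, a request for three frames on this chain would leave only rcu_nocb_kthread after the scheduler functions are dropped.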
-
Committed by Namhyung Kim
The sched_switch event always captured from the scheduler function. So it'd be great omit them from the callchain. This patch marks the functions to be omitted by later patch. Committer notes: Testing it: Before: [root@jouet experimental]# perf sched record -g ls Dockerfile perf.data x-mips64 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.355 MB perf.data (29 samples) ] [root@jouet experimental]# perf sched timehist time cpu task name wait time sch delay run time [tid/pid] (msec) (msec) (msec) ----------- ----- ----------------- ------ ------ ------ 6.494998 [001] <idle> 0.000 0.000 0.000 6.495027 [002] perf[519] 0.000 0.000 0.000 __schedule <- schedule <- schedule_hrtimeout_range_clock <- schedule_hrtimeou 6.495096 [003] <idle> 0.000 0.000 0.000 6.495100 [003] rcuos/0[9] 0.000 0.005 0.003 __schedule <- schedule <- rcu_nocb_kthread <- kthread <- ret_from_fork 6.495113 [001] perf[520] 0.000 0.008 0.114 __schedule <- preempt_schedule_common <- _cond_resched <- wait_for_completion 6.495121 [000] <idle> 0.000 0.000 0.000 6.495129 [001] migration/1[17] 0.000 0.003 0.016 __schedule <- schedule <- smpboot_thread_fn <- kthread <- ret_from_fork 6.496085 [002] <idle> 0.000 0.000 1.057 6.496096 [002] kworker/u16:1[31169] 0.000 0.004 0.011 __schedule <- schedule <- worker_thread <- kthread <- ret_from_fork 6.496096 [003] <idle> 0.003 0.000 0.996 6.496169 [002] <idle> 0.011 0.000 0.072 6.496171 [000] ls[520] 0.008 0.000 1.049 __schedule <- schedule <- do_exit <- do_group_exit <- [unknown] 6.496172 [003] gnome-terminal-[4391] 0.000 0.003 0.076 __schedule <- schedule <- schedule_hrtimeout_range_clock <- schedule_hrtimeo After: [root@jouet experimental]# perf sched timehist time cpu task name wait time sch delay run time [tid/pid] (msec) (msec) (msec) ----------- ----- ----------------- ----- ----- ------ 6.494998 [001] <idle> 0.000 0.000 0.000 6.495027 [002] perf[519] 0.000 0.000 0.000 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range 
<- poll_schedule_t 6.495096 [003] <idle> 0.000 0.000 0.000 6.495100 [003] rcuos/0[9] 0.000 0.005 0.003 rcu_nocb_kthread <- kthread <- ret_from_fork 6.495113 [001] perf[520] 0.000 0.008 0.114 preempt_schedule_common <- _cond_resched <- wait_for_completion <- stop_one_c 6.495121 [000] <idle> 0.000 0.000 0.000 6.495129 [001] migration/1[17] 0.000 0.003 0.016 smpboot_thread_fn <- kthread <- ret_from_fork 6.496085 [002] <idle> 0.000 0.000 1.057 6.496096 [002] kworker/u16:1[31169] 0.000 0.004 0.011 worker_thread <- kthread <- ret_from_fork 6.496096 [003] <idle> 0.003 0.000 0.996 6.496169 [002] <idle> 0.011 0.000 0.072 6.496171 [000] ls[520] 0.008 0.000 1.049 do_exit <- do_group_exit <- [unknown] 6.496172 [003] gnome-terminal-[4391] 0.000 0.003 0.076 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_ [root@jouet experimental]# Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20161124011114.7102-1-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Namhyung Kim
For tracepoint events, callchains always contain certain functions. Sometimes it'd be better to skip those functions as they have no value.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161124011114.7102-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-