- 15 4月, 2016 3 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To disentangle symbol printing from all the code related to symbol tables, resolution of addresses to symbols, etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-eik9g3hbtdc7ddv57f1d4v3p@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
# perf test -v python 16: Try 'import perf' in python, checking link problems : --- start --- test child forked, pid 672 Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /tmp/build/perf/python/perf.so: undefined symbol: symbol_conf test child finished with -1 ---- end ---- Try 'import perf' in python, checking link problems: FAILED! # To fix it just pass a parameter to perf_evsel__fprintf_sym telling if callchains should be printed. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-comrsr20bsnr8bg0n6rfwv12@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
The recent perf_evsel__fprintf_callchain() move to evsel.c added several new symbol requirements to the python binding, for instance: # perf test -v python 16: Try 'import perf' in python, checking link problems : --- start --- test child forked, pid 18030 Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /tmp/build/perf/python/perf.so: undefined symbol: callchain_cursor test child finished with -1 ---- end ---- Try 'import perf' in python, checking link problems: FAILED! # This would require linking against callchain.c to access to the global callchain_cursor variables. Since lots of functions already receive as a parameter a callchain_cursor struct pointer, make that be the case for some more function so that we can start phasing out usage of yet another global variable. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-djko3097eyg2rn66v2qcqfvn@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 14 4月, 2016 4 次提交
-
-
由 Taeung Song 提交于
This infrastructure code was designed for upcoming features of 'perf config'. That collect config key-value pairs from user and system config files (i.e. user wide ~/.perfconfig and system wide $(sysconfdir)/perfconfig) to manage perf's configs. Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NTaeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1460620401-23430-2-git-send-email-treeze.taeung@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
perf_data_file__switch() closes current output file, renames it, then open a new one to continue recording. It will be used by 'perf record' to split output into multiple perf.data files. Signed-off-by: NWang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460535673-159866-3-git-send-email-wangnan0@huawei.comSigned-off-by: NHe Kuang <hekuang@huawei.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
ordered_events__free() leaves linked lists and timestamps not cleared, so unable to be reused after ordered_events__free(). Which is inconvenient after 'perf record' supports generating multiple perf.data output and process build-ids for each of them. Use ordered_events__reinit() for this. Signed-off-by: NWang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460535673-159866-2-git-send-email-wangnan0@huawei.comSigned-off-by: NHe Kuang <hekuang@huawei.com> [ Split from larger patch ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
'perf record' will use this when outputting multiple perf.data files. Signed-off-by: NWang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460535673-159866-2-git-send-email-wangnan0@huawei.comSigned-off-by: NHe Kuang <hekuang@huawei.com> [ Split from larger patch ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 13 4月, 2016 4 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Those were converted to be evsel methods long ago, move the source to where it belongs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vja8rjmkw3gd5ungaeyb5s2j@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
It will be used in following patch. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-6-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Adding cpu_map__has() to return bool of cpu presence in cpus map. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-3-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Adding thread_map__has() to return bool of pid presence in threads map. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-2-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 12 4月, 2016 10 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
The fprintf_sym() and fprintf_callchain() methods now allow users to change the existing behaviour of showing "[unknown]" as the name of unresolved symbols to instead show "[0x123456]", i.e. its address. The current patch doesn't change tools to use this facility, the results from 'perf trace' and 'perf script' cotinue like: 70.109 ( 0.001 ms): qemu-system-x8/10153 poll(ufds: 0x7f2d93ffe870, nfds: 1) = 0 Timeout [unknown] (/usr/lib64/libc-2.22.so) [unknown] (/usr/lib64/libspice-server.so.1.10.0) [unknown] (/usr/lib64/libspice-server.so.1.10.0) [unknown] (/usr/lib64/libspice-server.so.1.10.0) start_thread+0xca (/usr/lib64/libpthread-2.22.so) __clone+0x6d (/usr/lib64/libc-2.22.so) The next patch will make 'perf trace' use the new formatting. Suggested-by: NMilian Wolff <milian.wolff@kdab.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
The rename is for consistency with the parameter name. Make it public for fine grained control of which evsels should have callchains enabled, like, for instance, will be done in the next changesets in 'perf trace', to enable callchains just on the "raw_syscalls:sys_exit" tracepoint. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-og8vup111rn357g4yagus3ao@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
For fiddling with sample_type fields in all evsels in an evlist. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-dg6yavctt0hzl2tsgfb43qsr@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Instead receive a callchain_param pointer to configure callchain aspects, not doing so if NULL is passed. This will allow fine grained control over which evsels in an evlist gets callchains enabled. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-2mupip6khc92mh5x4nw9to82@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
In 'perf trace' we're just interested in printing callchains, and we don't want to use the symbol_conf.use_callchain, so move the callchain part to a new method. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-kcn3romzivcpxb3u75s9nz33@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
As it receives a FILE, and its more than just the IP, which can even be requested not to be printed. For consistency with other similar methods in tools/perf/, name it as perf_evsel__fprintf_sym() and make it return the number of bytes printed, just like 'fprintf(3)' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-84gawlqa3lhk63nf0t9vnqnn@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
For callchains, etc where we want it to align just below the syscall name, for instance, in 'perf trace' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-uk9ekchd67651c625ltaur5y@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Milian Wolff 提交于
As this function will be used in 'perf trace'. Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/n/tip-8x297v9utnxq77onikevvlse@git.kernel.org [ Split from a larger patch ] Signed-off-by: NMilian Wolff <milian.wolff@kdab.com>
-
由 Wang Nan 提交于
This patch removes the need to set a bpf-output event in cmdline. By referencing a map named '__bpf_stdout__', perf automatically creates an event for it. For example: # perf record -e ./test_bpf_trace.c usleep 100000 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ] # perf script usleep 4639 [000] 261895.307826: 0 __bpf_stdout__: ffffffff810eb9a1 ... BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" usleep 4639 [000] 261895.407883: 0 __bpf_stdout__: ffffffff8105d609 ... BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" perf record -e ./test_bpf_trace.c usleep 100000 equals to: perf record -e bpf-output/no-inherit=1,name=__bpf_stdout__/ \ -e ./test_bpf_trace.c/map:__bpf_stdout__.event=__bpf_stdout__/ \ usleep 100000 Where test_bpf_trace.c is: /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; #define SEC(NAME) __attribute__((section(NAME), used)) static u64 (*ktime_get_ns)(void) = (void *)BPF_FUNC_ktime_get_ns; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; static int (*get_smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id; static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) = (void *)BPF_FUNC_perf_event_output; struct bpf_map_def SEC("maps") __bpf_stdout__ = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, .key_size = sizeof(int), .value_size = sizeof(u32), .max_entries = __NR_CPUS__, }; static inline int __attribute__((always_inline)) func(void *ctx, int type) { char output_str[] = "Raise a BPF event!"; char err_str[] = "BAD %d\n"; int err; err = perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), &output_str, sizeof(output_str)); if (err) trace_printk(err_str, sizeof(err_str), err); return 1; } SEC("func_begin=sys_nanosleep") int func_begin(void *ctx) {return func(ctx, 1);} SEC("func_end=sys_nanosleep%return") int func_end(void *ctx) { return func(ctx, 2);} char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ Committer note: Testing with 'perf trace': # trace -e nanosleep --ev test_bpf_stdout.c usleep 1 0.007 ( 0.007 ms): usleep/729 nanosleep(rqtp: 0x7ffc5bbc5fe0) ... 0.007 ( ): __bpf_stdout__:Raise a BPF event!..) 0.008 ( ): perf_bpf_probe:func_begin:(ffffffff81112460)) 0.069 ( ): __bpf_stdout__:Raise a BPF event!..) 0.070 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92)) 0.072 ( 0.072 ms): usleep/729 ... [continued]: nanosleep()) = 0 # Suggested-and-Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NWang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460128045-97310-5-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
This patch allows cloning bpf-output event configuration among multiple bpf scripts. If there exist a map named '__bpf_output__' and not configured using 'map:__bpf_output__.event=', this patch clones the configuration of another '__bpf_stdout__' map. For example, following command: # perf trace --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \ --ev ./test_bpf_trace2.c usleep 100000 equals to: # perf trace --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \ --ev ./test_bpf_trace2.c/map:__bpf_stdout__.event=evt/ \ usleep 100000 Signed-off-by: NWang Nan <wangnan0@huawei.com> Suggested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460128045-97310-4-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 08 4月, 2016 9 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
The readdir() function is thread safe as long as just one thread uses a DIR, which is the case when parsing tracepoint event definitions, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wddn49r6bz6wq4ee3dxbl7lo@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
The readdir() function is thread safe as long as just one thread uses a DIR, which is the case when synthesizing events for pre-existing threads by traversing /proc, so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. CC /tmp/build/perf/util/event.o util/event.c: In function '__event__synthesize_thread': util/event.c:466:2: error: 'readdir_r' is deprecated [-Werror=deprecated-declarations] while (!readdir_r(tasks, &dirent, &next) && next) { ^~~~~ In file included from /usr/include/features.h:368:0, from /usr/include/stdint.h:25, from /usr/lib/gcc/x86_64-redhat-linux/6.0.0/include/stdint.h:9, from /git/linux/tools/include/linux/types.h:6, from util/event.c:1: /usr/include/dirent.h:189:12: note: declared here Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i1vj7nyjp2p750rirxgrfd3c@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
The readdir() function is thread safe as long as just one thread uses a DIR, which is the case in thread_map, so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-del8h2a0f40z75j4r42l96l0@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
He Kuang reported a problem that perf fails to get correct symbol on Android platform in [1]. The problem can be reproduced on normal x86_64 platform. I will describe the reproducing steps in detail at the end of commit message. The reason of this problem is the missing of symbol adjustment for normal shared objects. In most of the cases skipping adjustment is okay. However, when '.text' section have different 'address' and 'offset' the result is wrong. I checked all shared objects in my working platform, only wine dll objects and debug objects (in .debug) have this problem. However, it is common on Android. For example: $ readelf -S ./libsurfaceflinger.so | grep \.text [10] .text PROGBITS 0000000000029030 00012030 This patch enables symbol adjustment for dynamic objects so the symbol address got from elfutils would be adjusted correctly. Now nearly all types of ELF files should adjust symbols. Makes ss->adjust_symbols default to true. Steps to reproduce the problem: $ cat ./Makefile PWD := $(shell pwd) LDFLAGS += "-Wl,-rpath=$(PWD)" CFLAGS += -g main: main.c libbuggy.so libbuggy.so: buggy.c gcc -g -shared -fPIC -Wl,-Ttext-segment=0x200000 $< -o $@ clean: rm -rf main libbuggy.so *.o $ cat ./buggy.c int fib(int x) { return (x == 0) ? 1 : (x == 1) ? 1 : fib(x - 1) + fib(x - 2); } $ cat ./main.c #include <stdio.h> extern int fib(int x); int main() { int i; for (i = 0; i < 40; i++) printf("%d\n", fib(i)); return 0; } $ make $ perf record ./main ... $ perf report --stdio # Overhead Command Shared Object Symbol # ........ ....... ................. ............................... # 14.97% main libbuggy.so [.] 0x000000000000066c 8.68% main libbuggy.so [.] 0x00000000000006aa 8.52% main libbuggy.so [.] fib@plt 7.95% main libbuggy.so [.] 0x0000000000000664 5.94% main libbuggy.so [.] 0x00000000000006a9 5.35% main libbuggy.so [.] 0x0000000000000678 ... The correct result should be (after this patch): # Overhead Command Shared Object Symbol # ........ ....... ................. ............................... # 91.47% main libbuggy.so [.] fib 8.52% main libbuggy.so [.] fib@plt 0.00% main [kernel.kallsyms] [k] kmem_cache_free [1] http://lkml.kernel.org/g/1452567507-54013-1-git-send-email-hekuang@huawei.comSigned-off-by: NWang Nan <wangnan0@huawei.com> Acked-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460024671-64774-3-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
In this patch, the offset of '.text' section is stored into dso and used here to re-calculate address to objdump. In most of the cases, executable code is in '.text' section, so the adjustment made to a symbol in dso__load_sym (using sym.st_value -= shdr.sh_addr - shdr.sh_offset) should equal to 'sym.st_value -= dso->text_offset'. Therefore, adding text_offset back get objdump address from symbol address (rip). However, it is not true for kernel and kernel module since there could be multiple executable sections with different offset. Exclude kernel for this reason. After this patch, even dso->adjust_symbols is set to true for shared objects, map__rip_2objdump() and map__objdump_2mem() would return correct result, so perf behavior of annotate won't be changed. Signed-off-by: NWang Nan <wangnan0@huawei.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460024671-64774-2-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
We used libaudit to map ids to syscall names and vice-versa, but that imposes a delay in supporting new syscalls, having to wait for libaudit to get those new syscalls on its tables. To remove that delay, for x86_64 initially, grab a copy of arch/x86/entry/syscalls/syscall_64.tbl and use it to generate those tables. Syscalls currently not available in audit-libs: # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd Error: Invalid syscall copy_file_range, membarrier, mlock2, pread64, pwrite64, timerfd_create, userfaultfd Hint: try 'perf list syscalls:sys_enter_*' Hint: and: 'man syscalls' # With this patch: # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd 8505.733 ( 0.010 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 36 8506.688 ( 0.005 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 40 30023.097 ( 0.025 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63ae382000, count: 4096, pos: 529592320) = 4096 31268.712 ( 0.028 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afd8b000, count: 4096, pos: 2314133504) = 4096 31268.854 ( 0.016 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afda2000, count: 4096, pos: 2314137600) = 4096 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-51xfjbxevdsucmnbc4ka5r88@git.kernel.org [ Added make dep for 'prepare' in 'LIBPERF_IN', fix by Wang Nan to fix parallell build ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Tools should use a mechanism similar to arch/x86/entry/syscalls/ to generate a header file with the definitions for two variables: static const char *syscalltbl_x86_64[] = { [0] = "read", [1] = "write", <SNIP> [324] = "membarrier", [325] = "mlock2", [326] = "copy_file_range", }; static const int syscalltbl_x86_64_max_id = 326; In a per arch file that should then be included in tools/perf/util/syscalltbl.c. First one will be for x86_64. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-02uuamkxgccczdth8komspgp@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
We're using libaudit for doing name to id and id to syscall name translations, but that makes 'perf trace' to have to wait for newer libaudit versions supporting recently added syscalls, such as "userfaultfd" at the time of this changeset. We have all the information right there, in the kernel sources, so move this code to a separate place, wrapped behind functions that will progressively use the kernel source files to extract the syscall table for use in 'perf trace'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i38opd09ow25mmyrvfwnbvkj@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Milian reported issue with thread::priv, which was double booked by perf trace and DWARF unwind code. So using those together is impossible at the moment. Moving DWARF unwind private data into separate variable so perf trace can keep using thread::priv. Reported-and-Tested-by: NMilian Wolff <milian.wolff@kdab.com> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Andreas Hollmann <hollmann@in.tum.de> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460013073-18444-2-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 07 4月, 2016 1 次提交
-
-
由 Jiri Olsa 提交于
To be used in cases for both sides trim. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Andreas Hollmann <hollmann@in.tum.de> Cc: David Ahern <dsahern@gmail.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460013073-18444-1-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 06 4月, 2016 3 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
This ended up triggering these warnings when building on Ubuntu 12.04.5: util/scripting-engines/trace-event-perl.c: In function 'perl_process_callchain': util/scripting-engines/trace-event-perl.c:293:4: error: value computed is not used [-Werror=unused-value] util/scripting-engines/trace-event-perl.c:294:4: error: value computed is not used [-Werror=unused-value] util/scripting-engines/trace-event-perl.c:295:4: error: value computed is not used [-Werror=unused-value] util/scripting-engines/trace-event-perl.c:297:4: error: value computed is not used [-Werror=unused-value] util/scripting-engines/trace-event-perl.c:309:4: error: value computed is not used [-Werror=unused-value] cc1: all warnings being treated as errors mv: cannot stat `/tmp/build/perf/util/scripting-engines/.trace-event-perl.o.tmp': No such file or directory make[4]: *** [/tmp/build/perf/util/scripting-engines/trace-event-perl.o] Error 1 Fix it by doing error checking when building the perl data structures related to callchains. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@gmail.com> Fixes: f7380c12 ("perf script perl: Perl scripts now get a backtrace, like the python ones") Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
If not, tell the user that: config/Makefile:273: Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157 And return -ENOTSUPP in die_get_var_range(), failing features that need it, like the one pointed out above. This fixes the build on older systems, such as Ubuntu 12.04.5. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vinson Lee <vlee@freedesktop.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-9l7luqkq4gfnx7vrklkq4obs@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Vinson Lee 提交于
Fix build error on Ubuntu 12.04.5 with GCC 4.6.3. CC util/config.o util/config.c: In function ‘perf_buildid_config’: util/config.c:384:15: error: declaration of ‘dirname’ shadows a global declaration [-Werror=shadow] Signed-off-by: NVinson Lee <vlee@freedesktop.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 9cb5987c ("perf config: Rework buildid_dir_command_config to perf_buildid_config") Link: http://lkml.kernel.org/r/1459807659-9020-1-git-send-email-vlee@freedesktop.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 02 4月, 2016 3 次提交
-
-
由 Wang Nan 提交于
Before this patch we can see very large time in the events before the 'bpf-output' event. For example: # perf trace -vv -T --ev sched:sched_switch \ --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:channel.event=evt/ \ usleep 10 ... 18446744073709.551 (18446564645918.480 ms): usleep/4157 nanosleep(rqtp: 0x7ffd3f0dc4e0) ... 18446744073709.551 ( ): evt:Raise a BPF event!..) 179427791.076 ( ): perf_bpf_probe:func_begin:(ffffffff810eb9a0)) 179427791.081 ( ): sched:sched_switch:usleep:4157 [120] S ==> swapper/2:0 [120]) ... We can also see the differences between bpf-output events and breakpoint events: For bpf output event: sample_type IP|TID|RAW|IDENTIFIER For tracepoint events: sample_type IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER This patch fix this differences by adding more sample type for bpf-output events. After this patch: # perf trace -vv -T --ev sched:sched_switch \ --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:channel.event=evt/ \ usleep 10 ... 179877370.878 ( 0.003 ms): usleep/5336 nanosleep(rqtp: 0x7ffff866c450) ... 179877370.878 ( ): evt:Raise a BPF event!..) 179877370.878 ( ): perf_bpf_probe:func_begin:(ffffffff810eb9a0)) 179877370.882 ( ): sched:sched_switch:usleep:5336 [120] S ==> swapper/4:0 [120]) 179877370.945 ( ): evt:Raise a BPF event!..) ... # ./perf trace -vv -T --ev sched:sched_switch \ --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:channel.event=evt/ \ usleep 10 2>&1 | grep sample_type sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW The 'IDENTIFIER' info is not required because all events have the same sample_type. Committer notes: Further testing, on top of the changes making 'perf trace' avoid samples from events without PERF_SAMPLE_TIME: Before: # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10 <SNIP> 0.560 ( 0.001 ms): brk( ) = 0x55e5a1df8000 18446640227439.430 (18446640227438.859 ms): nanosleep(rqtp: 0x7ffc96643370) ... 18446640227439.430 ( ): evt:Raise a BPF event!..) 0.576 ( ): perf_bpf_probe:func_begin:(ffffffff81112460)) 18446640227439.430 ( ): evt:Raise a BPF event!..) 0.645 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92)) 0.646 ( 0.076 ms): ... [continued]: nanosleep()) = 0 # After: # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10 <SNIP> 0.292 ( 0.001 ms): brk( ) = 0x55c7cd6e1000 0.302 ( 0.004 ms): nanosleep(rqtp: 0x7ffedd8bc0f0) ... 0.302 ( ): evt:Raise a BPF event!..) 0.303 ( ): perf_bpf_probe:func_begin:(ffffffff81112460)) 0.397 ( ): evt:Raise a BPF event!..) 0.397 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92)) 0.398 ( 0.100 ms): ... [continued]: nanosleep()) = 0 Signed-off-by: NWang Nan <wangnan0@huawei.com> Reported-and-Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1459517202-42320-1-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
Currently the max value of format is calculated by the bits number. It relies on the continuity of the format. However, uncore event format is not continuous. E.g. uncore qpi event format can be 0-7,21. If bit 21 is set, there is parsing issues as below. $ perf stat -a -e uncore_qpi_0/event=0x200002,umask=0x8/ event syntax error: '..pi_0/event=0x200002,umask=0x8/' \___ value too big for format, maximum is 511 This patch return the real max value by setting all possible bits to 1. Signed-off-by: NKan Liang <kan.liang@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1459365375-14285-1-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Intel PT uses TSC as a timestamp, so add support for using TSC instead of the monotonic clock. Use of TSC is selected by an environment variable "JITDUMP_USE_ARCH_TIMESTAMP" and flagged in the jitdump file with flag JITDUMP_FLAGS_ARCH_TIMESTAMP. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1457426330-30226-1-git-send-email-adrian.hunter@intel.com [ Added the fixup from He Kuang to make it build on other arches, ] [ such as aarch64, to avoid inserting this bisectiong breakage upstream ] Link: http://lkml.kernel.org/r/1459482572-129494-1-git-send-email-hekuang@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 31 3月, 2016 2 次提交
-
-
由 Adrian Hunter 提交于
Intel PT uses the time members from the perf_event_mmap_page to convert between TSC and perf time. Due to a lack of foresight when Intel PT was implemented, those time members were recorded in the (implementation dependent) AUXTRACE_INFO event, the structure of which is generally inaccessible outside of the Intel PT decoder. However now the conversion between TSC and perf time is needed when processing a jitdump file when Intel PT has been used for tracing. So add a user event to record the time members. 'perf record' will synthesize the event if the information is available. And session processing will put a copy of the event on the session so that tools like 'perf inject' can easily access it. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1457426324-30158-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Anton Blanchard 提交于
Commit 9b07e27f ("perf inject: Add jitdump mmap injection support") incorrectly assumed that PowerPC is big endian only. Simplify things by consolidating the define of GEN_ELF_ENDIAN and checking for __BYTE_ORDER == __BIG_ENDIAN. The PowerPC checks were also incorrect, they do not match what gcc emits. We should first look for __powerpc64__, then __powerpc__. Signed-off-by: NAnton Blanchard <anton@samba.org> Acked-by: NMichael Ellerman <mpe@ellerman.id.au> Cc: Carl Love <cel@us.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Fixes: 9b07e27f ("perf inject: Add jitdump mmap injection support") Link: http://lkml.kernel.org/r/20160329175944.33a211cc@krytenSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 30 3月, 2016 1 次提交
-
-
由 Andi Kleen 提交于
When using 'perf script' to look at PT traces it is often useful to ignore the initialization code at the beginning. On larger traces which may have many millions of instructions in initialization code doing that in a pipeline can be very slow, with perf script spending a lot of CPU time calling printf and writing data. This patch adds an extension to the --itrace argument that skips 'n' events (instructions, branches or transactions) at the beginning. This is much more efficient. v2: Add support for BTS (Adrian Hunter) Document in itrace.txt Fix branch check Check transactions and instructions too Committer note: To test intel_pt one needs to make sure VT-x isn't active, i.e. stopping KVM guests on the test machine, as described by Andi Kleen at http://lkml.kernel.org/r/20160301234953.GD23621@tassilo.jf.intel.comSigned-off-by: NAndi Kleen <ak@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1459187142-20035-1-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-