- 03 3月, 2016 11 次提交
-
-
由 Colin Ian King 提交于
The 'command_line' variable is free'd twice if db_export__branch_types() fails. To avoid this, defer the free'ing of 'command_line' to after this call so that the error return path will just free 'command_line' once. Signed-off-by: NColin Ian King <colin.king@canonical.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Javi Merino <javi.merino@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1456875980-25606-1-git-send-email-colin.king@canonical.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Enable metrics printing in --per-core / --per-socket mode. We need to save the shadow metrics in a unique place. Always use the first CPU in the aggregation. Then use the same CPU to retrieve the shadow value later. Example output: % perf stat --per-core -a ./BC1s Performance counter stats for 'system wide': S0-C0 2 2966.020381 task-clock (msec) # 2.004 CPUs utilized (100.00%) S0-C0 2 49 context-switches # 0.017 K/sec (100.00%) S0-C0 2 4 cpu-migrations # 0.001 K/sec (100.00%) S0-C0 2 467 page-faults # 0.157 K/sec S0-C0 2 4,599,061,773 cycles # 1.551 GHz (100.00%) S0-C0 2 9,755,886,883 instructions # 2.12 insn per cycle (100.00%) S0-C0 2 1,906,272,125 branches # 642.704 M/sec (100.00%) S0-C0 2 81,180,867 branch-misses # 4.26% of all branches S0-C1 2 2965.995373 task-clock (msec) # 2.003 CPUs utilized (100.00%) S0-C1 2 62 context-switches # 0.021 K/sec (100.00%) S0-C1 2 8 cpu-migrations # 0.003 K/sec (100.00%) S0-C1 2 281 page-faults # 0.095 K/sec S0-C1 2 6,347,290 cycles # 0.002 GHz (100.00%) S0-C1 2 4,654,156 instructions # 0.73 insn per cycle (100.00%) S0-C1 2 947,121 branches # 0.319 M/sec (100.00%) S0-C1 2 37,322 branch-misses # 3.94% of all branches 1.480409747 seconds time elapsed v2: Rebase to older patches v3: Document shadow cpus. Fix aggr_get_id argument. Fix -A shadows (Jiri) Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1456785386-19481-4-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Now support CSV output for metrics. With the new output callbacks this is relatively straight forward by creating new callbacks. This allows to easily plot metrics from CSV files. The new line callback needs to know the number of fields to skip them correctly Example output before: % perf stat -x, true 0.200687,,task-clock,200687,100.00 0,,context-switches,200687,100.00 0,,cpu-migrations,200687,100.00 40,,page-faults,200687,100.00 730871,,cycles,203601,100.00 551056,,stalled-cycles-frontend,203601,100.00 <not supported>,,stalled-cycles-backend,0,100.00 385523,,instructions,203601,100.00 78028,,branches,203601,100.00 3946,,branch-misses,203601,100.00 After: % perf stat -x, true .502457,,task-clock,502457,100.00,0.485,CPUs utilized 0,,context-switches,502457,100.00,0.000,K/sec 0,,cpu-migrations,502457,100.00,0.000,K/sec 45,,page-faults,502457,100.00,0.090,M/sec 644692,,cycles,509102,100.00,1.283,GHz 423470,,stalled-cycles-frontend,509102,100.00,65.69,frontend cycles idle <not supported>,,stalled-cycles-backend,0,100.00,,,, 492701,,instructions,509102,100.00,0.76,insn per cycle ,,,,,0.86,stalled cycles per insn 97767,,branches,509102,100.00,194.578,M/sec 4788,,branch-misses,509102,100.00,4.90,of all branches or easier readable $ perf stat -x, -o x.csv true $ column -s, -t x.csv 0.490635 task-clock 490635 100.00 0.489 CPUs utilized 0 context-switches 490635 100.00 0.000 K/sec 0 cpu-migrations 490635 100.00 0.000 K/sec 45 page-faults 490635 100.00 0.092 M/sec 629080 cycles 497698 100.00 1.282 GHz 409498 stalled-cycles-frontend 497698 100.00 65.09 frontend cycles idle <not supported> stalled-cycles-backend 0 100.00 491424 instructions 497698 100.00 0.78 insn per cycle 0.83 stalled cycles per insn 97278 branches 497698 100.00 198.270 M/sec 4569 branch-misses 497698 100.00 4.70 of all branches Two new fields are added: metric value and metric name. v2: Split out function argument changes v3: Reenable metrics for real. v4: Fix wrong hunk from refactoring. v5: Remove extra "noise" printing (Jiri), but add it to the not counted case. Print empty metrics for not counted. v6: Avoid outputting metric on empty format. v7: Print metric at the end v8: Remove extra run, ena fields v9: Avoid extra new line for unsupported counters Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1456785386-19481-3-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
perf_evlist__mmap_ex() can fail without setting errno (for example, fail in condition checking. In this case all syscall is success). If this happen, record__open() incorrectly returns 0. Force setting rc is a quick way to avoid this problem, or we have to follow all possible code path in perf_evlist__mmap_ex() to make sure there's at least one system call before returning an error. Signed-off-by: NWang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-30-git-send-email-wangnan0@huawei.comSigned-off-by: NHe Kuang <hekuang@huawei.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
Move code for finalizing 'perf.data' to record__finish_output(). It will be used by following commits to split output to multiple files. Signed-off-by: NHe Kuang <hekuang@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-23-git-send-email-wangnan0@huawei.comSigned-off-by: NWang Nan <wangnan0@huawei.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
Create record__synthesize(). It can be used to create tracking events for each perf.data after perf supporting splitting into multiple outputs. Signed-off-by: NHe Kuang <hekuang@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-20-git-send-email-wangnan0@huawei.comSigned-off-by: NWang Nan <wangnan0@huawei.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
Commits in a BPF patchkit will extract kernel and module synthesizing code into a separated function and call it multiple times. This patch replace 'if (err < 0)' using WARN_ONCE, makes sure the error message show one time. Signed-off-by: NWang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-19-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
After babeltrace commit 5cec03e402aa ("ir: copy variants and sequences when setting a field path"), 'perf data convert' gets incorrect result if there's bpf output data. For example: # perf data convert --to-ctf ./out.ctf # babeltrace ./out.ctf [10:44:31.186045346] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E7DD1, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0xC028E32F, [1] = 0x815D0100, [2] = 0x1000000 ] } [10:44:31.286101003] (+0.100055657) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF8105B609, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0x35D9F1EB, [1] = 0x15D81, [2] = 0x2 ] } The expected result of the first sample should be: raw_data = [ [0] = 0x2FE328C0, [1] = 0x15D81, [2] = 0x1 ] } however, 'perf data convert' output big endian value to resuling CTF file. The reason is a internal change (or a bug?) of babeltrace. Before this patch, at the first add_bpf_output_values(), byte order of all integer type is uncertain (is 0, neither 1234 (le) nor 4321 (be)). It would be fixed by: perf_evlist__deliver_sample -> process_sample_event -> ctf_stream ... ->bt_ctf_trace_add_stream_class ->bt_ctf_field_type_structure_set_byte_order ->bt_ctf_field_type_integer_set_byte_order during creating the stream. However, the babeltrace commit mentioned above duplicates types in sequence to prevent potential conflict in following call stack and link the newly allocated type into the 'raw_data' sequence: perf_evlist__deliver_sample -> process_sample_event -> ctf_stream ... -> bt_ctf_trace_add_stream_class -> bt_ctf_stream_class_resolve_types ... -> bt_ctf_field_type_sequence_copy ->bt_ctf_field_type_integer_copy This happens before byte order setting, so only the newly allocated type is initialized, the byte order of original type perf choose to create the first raw_data is still uncertain. Byte order in CTF output is not related to byte order in perf.data. Setting it to anything other than BT_CTF_BYTE_ORDER_NATIVE solves this problem (only BT_CTF_BYTE_ORDER_NATIVE needs to be fixed). To reduce behavior changing, set byte order according to compiling options. Signed-off-by: NWang Nan <wangnan0@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jérémie Galarneau <jeremie.galarneau@efficios.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-10-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
bpf_perf_event_output() outputs data through sample->raw_data. This patch adds support to convert those data into CTF. A python script then can be used to process output data from BPF programs. Test result: # cat ./test_bpf_output_2.c /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; #define SEC(NAME) __attribute__((section(NAME), used)) static u64 (*ktime_get_ns)(void) = (void *)BPF_FUNC_ktime_get_ns; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; static int (*get_smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id; static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) = (void *)BPF_FUNC_perf_event_output; struct bpf_map_def SEC("maps") channel = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, .key_size = sizeof(int), .value_size = sizeof(u32), .max_entries = __NR_CPUS__, }; static inline int __attribute__((always_inline)) func(void *ctx, int type) { struct { u64 ktime; int type; } __attribute__((packed)) output_data; char error_data[] = "Error: failed to output\n"; int err; output_data.type = type; output_data.ktime = ktime_get_ns(); err = perf_event_output(ctx, &channel, get_smp_processor_id(), &output_data, sizeof(output_data)); if (err) trace_printk(error_data, sizeof(error_data)); return 0; } SEC("func_begin=sys_nanosleep") int func_begin(void *ctx) {return func(ctx, 1);} SEC("func_end=sys_nanosleep%return") int func_end(void *ctx) { return func(ctx, 2);} char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ # ./perf record -e bpf-output/no-inherit,name=evt/ \ -e ./test_bpf_output_2.c/map:channel.event=evt/ \ usleep 100000 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ] # ./perf script usleep 14942 92503.198504: evt: ffffffff810e0ba1 sys_nanosleep (/lib/modules/4.3.0.... usleep 14942 92503.298562: evt: ffffffff810585e9 kretprobe_trampoline_holder (/lib.... # ./perf data convert --to-ctf ./out.ctf [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ] [ perf data convert: Converted and wrote 0.000 MB (2 samples) ] # babeltrace ./out.ctf [01:41:43.198504134] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E0BA1, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x32C0C07B, [1] = 0x5421, [2] = 0x1 ] } [01:41:43.298562257] (+0.100058123) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810585E9, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x38B77FAA, [1] = 0x5421, [2] = 0x2 ] } # cat ./test_bpf_output_2.py from babeltrace import TraceCollection tc = TraceCollection() tc.add_trace('./out.ctf', 'ctf') d = {1:[], 2:[]} for event in tc.events: if not event.name.startswith('evt'): continue raw_data = event['raw_data'] (time, type) = ((raw_data[0] + (raw_data[1] << 32)), raw_data[2]) d[type].append(time) print(list(map(lambda i: d[2][i] - d[1][i], range(len(d[1]))))); # python3 ./test_bpf_output_2.py [100056879] Committer note: Make sure you have python3-devel installed, not python-devel, which may be for python2, which will lead to some "PyInstance_Type" errors. Also make sure that you use the right libbabeltrace, because it is shipped in Fedora, for instance, but an older version. To build libbabeltrace's python binding one also needs to use: ./configure --enable-python-bindings And then set PYTHONPATH=/usr/local/lib64/python3.4/site-packages/. Signed-off-by: NWang Nan <wangnan0@huawei.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-9-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Only put the frontend/backend stalled cycles into the default perf stat events when the CPU actually supports them. This avoids empty columns with --metric-only on newer Intel CPUs. Committer note: Before: $ perf stat ls Performance counter stats for 'ls': 1.080893 task-clock (msec) # 0.619 CPUs utilized 0 context-switches # 0.000 K/sec 0 cpu-migrations # 0.000 K/sec 97 page-faults # 0.090 M/sec 3,327,741 cycles # 3.079 GHz <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend 1,609,544 instructions # 0.48 insn per cycle 319,117 branches # 295.235 M/sec 12,246 branch-misses # 3.84% of all branches 0.001746508 seconds time elapsed $ After: $ perf stat ls Performance counter stats for 'ls': 0.693948 task-clock (msec) # 0.662 CPUs utilized 0 context-switches # 0.000 K/sec 0 cpu-migrations # 0.000 K/sec 95 page-faults # 0.137 M/sec 1,792,509 cycles # 2.583 GHz 1,599,047 instructions # 0.89 insn per cycle 316,328 branches # 455.838 M/sec 12,453 branch-misses # 3.94% of all branches 0.001048987 seconds time elapsed $ Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456532881-26621-2-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Ingo reported regression on display format of big numbers, which is missing separators (in default perf stat output). triton:~/tip> perf stat -a sleep 1 ... 127008602 cycles # 0.011 GHz 279538533 stalled-cycles-frontend # 220.09% frontend cycles idle 119213269 instructions # 0.94 insn per cycle This is caused by recent change: perf stat: Check existence of frontend/backed stalled cycles that added call to pmu_have_event, that subsequently calls perf_pmu__parse_scale, which has a bug in locale handling. The lc string returned from setlocale, that we use to store old locale value, may be allocated in static storage. Getting a dynamic copy to make it survive another setlocale call. $ perf stat ls ... 2,360,602 cycles # 3.080 GHz 2,703,090 instructions # 1.15 insn per cycle 546,031 branches # 712.511 M/sec Committer note: Since the patch introducing the regression didn't made to perf/core, move it to just before where the regression was introduced, so that we don't break bisection for this feature. Reported-by: NIngo Molnar <mingo@redhat.com> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20160303095348.GA24511@krava.redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 29 2月, 2016 2 次提交
-
-
由 Taeung Song 提交于
Format fields of a syscall have the first variable '__syscall_nr' or 'nr' that mean the syscall number. But it isn't relevant here so drop it. 'nr' among fields of syscall was renamed '__syscall_nr'. So add exception handling to drop '__syscall_nr' and modify the comment for this excpetion handling. Reported-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NTaeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1456492465-5946-1-git-send-email-treeze.taeung@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
The util/python-ext-sources file contains source files required to build the python extension relative to $(srctree)/tools/perf, Such a file path $(FILE).c is handed over to the python extension build system, which builds the final object in the $(PYTHON_EXTBUILD)/tmp/$(FILE).o path. After the build is done all files from $(PYTHON_EXTBUILD)lib/ are carried as the result binaries. Above system fails when we add source file relative to ../lib, which we do for: ../lib/bitmap.c ../lib/find_bit.c ../lib/hweight.c ../lib/rbtree.c All above objects will be built like: $(PYTHON_EXTBUILD)/tmp/../lib/bitmap.c $(PYTHON_EXTBUILD)/tmp/../lib/find_bit.c $(PYTHON_EXTBUILD)/tmp/../lib/hweight.c $(PYTHON_EXTBUILD)/tmp/../lib/rbtree.c which accidentally happens to be final library path: $(PYTHON_EXTBUILD)/lib/ Changing setup.py to pass full paths of source files to Extension build class and thus keep all built objects under $(PYTHON_EXTBUILD)tmp directory. Reported-by: NJeff Bastian <jbastian@redhat.com> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NJosh Boyer <jwboyer@fedoraproject.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@vger.kernel.org # v4.2+ Link: http://lkml.kernel.org/r/20160227201350.GB28494@krava.redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 27 2月, 2016 9 次提交
-
-
由 Wang Nan 提交于
With this patch the contend of BPF output event is printed by 'perf trace'. For example: # ./perf trace -a --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:channel.event=evt/ \ usleep 100000 ... 1.787 ( 0.004 ms): usleep/3832 nanosleep(rqtp: 0x7ffc78b18980 ) ... 1.787 ( ): evt:Raise a BPF event!..) 1.788 ( ): perf_bpf_probe:func_begin:(ffffffff810e97d0)) ... 101.866 (87.038 ms): gmain/1654 poll(ufds: 0x7f57a80008c0, nfds: 2, timeout_msecs: 1000 ) ... 101.866 ( ): evt:Raise a BPF event!..) 101.867 ( ): perf_bpf_probe:func_end:(ffffffff810e97d0 <- ffffffff81796173)) 101.869 (100.087 ms): usleep/3832 ... [continued]: nanosleep()) = 0 ... (There is an extra ')' at the end of several lines. However, it is another problem, unrelated to this commit.) Where test_bpf_trace.c is: /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; #define SEC(NAME) __attribute__((section(NAME), used)) static u64 (*ktime_get_ns)(void) = (void *)BPF_FUNC_ktime_get_ns; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; static int (*get_smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id; static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) = (void *)BPF_FUNC_perf_event_output; struct bpf_map_def SEC("maps") channel = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, .key_size = sizeof(int), .value_size = sizeof(u32), .max_entries = __NR_CPUS__, }; static inline int __attribute__((always_inline)) func(void *ctx, int type) { char output_str[] = "Raise a BPF event!"; char err_str[] = "BAD %d\n"; int err; err = perf_event_output(ctx, &channel, get_smp_processor_id(), &output_str, sizeof(output_str)); if (err) trace_printk(err_str, sizeof(err_str), err); return 1; } SEC("func_begin=sys_nanosleep") int func_begin(void *ctx) {return func(ctx, 1);} SEC("func_end=sys_nanosleep%return") int func_end(void *ctx) { return func(ctx, 2);} char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ Signed-off-by: NWang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-8-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
Without this patch BPF map configuration is not applied. Command like this: # ./perf trace --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:channel.event=evt/ \ usleep 100000 Load BPF files without error, but since map:channel.event=evt is not applied, bpf-output event not work. This patch allows 'perf trace' load and run BPF scripts. Signed-off-by: NWang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-7-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
perf_evlist__set_filter() tries to set filter to every evsel linked in the evlist. However, since filters can only be applied to tracepoints, checking type of evsel before calling perf_evsel__set_filter() would be better. Signed-off-by: NWang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-6-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wang Nan 提交于
Before this patch each subcommand calls perf_config() by themself, reading the default configuration together with subcommand specific options. If a subcommand doesn't have it own options, it needs to call 'perf_config(perf_default_config, NULL)' to ensure .perfconfig is loaded. This patch brings perf_config(perf_default_config, NULL) to the very start of main(), so subcommands don't need to do it. After this patch, 'llvm.clang-path' works for 'perf trace'. Signed-off-by: NWang Nan <wangnan0@huawei.com> Suggested-and-Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-4-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The column width of dynamic entries is updated when comparing hist entries. However some unique entries can miss the chance to update. So move the update to output resort stage to make sure every entry will get called before display. To do that, abuse ->sort callback to update the width when the third argument is NULL. When resorting entries in normal path, it never be NULL so it should be fine IMHO. Before: # Overhead ptr / bytes_req / gfp_flags # .............. .......................................... # 37.50% 0xffff8803f7669400 37.50% 448 37.50% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC 10.42% 0xffff8803f766be00 8.33% 96 8.33% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC 2.08% 512 2.08% GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP <-- here After: # Overhead ptr / bytes_req / gfp_flags # .............. ..................................................... # 37.50% 0xffff8803f7669400 37.50% 448 37.50% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC 10.42% 0xffff8803f766be00 8.33% 96 8.33% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC 2.08% 512 2.08% GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP_NOMEMALLOC Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456512767-1164-5-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
When dynamic sort key is used it might not show pretty printed output. This is because the trace output was not set only for the first dynamic sort key. During hierarchy_insert_entry() it missed to pass the trace_output to dynamic entries. Also even if it did, only first entry will have it. Subsequent entries might set it during collapsing stage but it's not guaranteed. Before: $ perf report --hierarchy --stdio -s ptr,bytes_req,gfp_flags -g none # # Overhead ptr / bytes_req / gfp_flags # .............. .......................................... # 37.50% 0xffff8803f7669400 37.50% 448 37.50% 66080 10.42% 0xffff8803f766be00 8.33% 96 8.33% 66080 2.08% 512 2.08% 67280 After: # # Overhead ptr / bytes_req / gfp_flags # .............. .......................................... # 37.50% 0xffff8803f7669400 37.50% 448 37.50% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC 10.42% 0xffff8803f766be00 8.33% 96 8.33% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC 2.08% 512 2.08% GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456512767-1164-4-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The dynamic entries are right-aligned unlike other entries since it usually has numeric value. But for the hierarchy mode, left alignment is more appropriate IMHO. Also trim spaces on the left so that we can easily identify the hierarchy. Before: $ perf report --hierarchy -i perf.data.kmem -s gfp_flags,ptr,bytes_req --stdio -g none ... # # Overhead gfp_flags / ptr / bytes_req # .............. ................................................................................................. # 91.67% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC 37.50% 0xffff8803f7669400 37.50% 448 8.33% 0xffff8803f766be00 8.33% 96 4.17% 0xffff8800d156dc00 4.17% 704 After: # Overhead gfp_flags / ptr / bytes_req # .............. .................................... # 91.67% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC 37.50% 0xffff8803f7669400 37.50% 448 8.33% 0xffff8803f766be00 8.33% 96 4.17% 0xffff8800d156dc00 4.17% 704 Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456512767-1164-3-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
When dynamic entries are used in the hierarchy mode with multiple events, the output might not be aligned properly. In the hierarchy mode, the each sort column is indented using total number of sort keys. So it keeps track of number of sort keys when adding them. However a dynamic sort key can be added more than once when multiple events have same field names. This results in unnecessarily long indentation in the output. For example perf kmem records following events: $ perf evlist --trace-fields -i perf.data.kmem kmem:kmalloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags kmem:kmalloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node kmem:kfree: trace_fields: call_site,ptr kmem:kmem_cache_alloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags kmem:kmem_cache_alloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node kmem:kmem_cache_free: trace_fields: call_site,ptr kmem:mm_page_alloc: trace_fields: page,order,gfp_flags,migratetype kmem:mm_page_free: trace_fields: page,order As you can see, many field names shared between kmem events. So adding 'ptr' dynamic sort key alone will set nr_sort_keys to 6. And this adds many unnecessary spaces between columns. Before: $ perf report -i perf.data.kmem --hierarchy -s ptr -g none --stdio ... # Overhead ptr # ....................... ................................... # 99.89% 0xffff8803ffb79720 0.06% 0xffff8803d228a000 0.03% 0xffff8803f7678f00 0.00% 0xffff880401dc5280 0.00% 0xffff880406172380 0.00% 0xffff8803ffac3a00 0.00% 0xffff8803ffac1600 After: # Overhead ptr # ........ .................... # 99.89% 0xffff8803ffb79720 0.06% 0xffff8803d228a000 0.03% 0xffff8803f7678f00 0.00% 0xffff880401dc5280 0.00% 0xffff880406172380 0.00% 0xffff8803ffac3a00 0.00% 0xffff8803ffac1600 Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456512767-1164-2-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
When hist_entry__cmp() and hist_entry__collapse() are called, they should check if the dynamic entry is comparing matching hists only. Otherwise it might access different hists resulting in incorrect output. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456512767-1164-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 26 2月, 2016 6 次提交
-
-
由 Namhyung Kim 提交于
Like the stdio, it should show messages about omitted hierarchy entries. Please refer the previous commit for more details. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456488800-28124-5-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Like the stdio, it should show messages about omitted hierarchy entries. Please refer the previous commit for more details. As it needs to check an entry is omitted or not multiple times, add the has_no_entry field in the hist entry. Suggested-and-Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456488800-28124-4-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The previous patch introduced __rb_hierarchy_next() function with various move direction like HMD_FORCE_CHILD but missed to change using it some place. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456488800-28124-3-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
When the hierarchy mode is used, some entries might be omiited due to a percent limit or filter. In this case the output hierarchy is different than other entries. Add an informative message to users about this. For example, when 4% of percent limit is applied: Before: # Overhead Command / Shared Object / Symbol # .............. .......................................... # 49.09% swapper 48.67% [kernel.vmlinux] 34.42% [k] intel_idle 11.51% firefox 8.87% libpthread-2.22.so 6.60% [.] __GI___libc_recvmsg 10.49% gnome-shell 4.74% libc-2.22.so 10.08% Xorg 6.11% libc-2.22.so 5.27% [.] __memcpy_sse2_unaligned 6.15% perf Note that, gnome-shell/libc has no symbols and perf has no dso/symbols. With that patch the output will look like below: After: # Overhead Command / Shared Object / Symbol # .............. .......................................... # 49.09% swapper 48.67% [kernel.vmlinux] 34.42% [k] intel_idle 11.51% firefox 8.87% libpthread-2.22.so 6.60% [.] __GI___libc_recvmsg 10.49% gnome-shell 4.74% libc-2.22.so no entry >= 4.00% 10.08% Xorg 6.11% libc-2.22.so 5.27% [.] __memcpy_sse2_unaligned 6.15% perf no entry >= 4.00% Suggested-and-Tested-by: NArnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456488800-28124-2-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The hists__overhead_width() is to calculate width occupied by the overhead (and others) columns before the sort columns. The hist_entry__has_hiearchy_children() is to check whether an entry has lower entries (children) in the hierarchy to be shown in the output. This means the children should not be filtered out and above the percent limit. These two functions will be used to show information when all children of an entry is omitted by the percent limit (or filter). Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456488800-28124-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Taeung Song 提交于
script_spec_register() called two functions: script_spec__find() and script_spec__findnew(). But this way script_spec__find() gets called two times, directly and via script_spec__findnew(). So remove script_spec__findnew() and make script_spec_register() only call once script_spec__find(). Signed-off-by: NTaeung Song <treeze.taeung@gmail.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1456413190-12378-1-git-send-email-treeze.taeung@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 2月, 2016 12 次提交
-
-
由 Taeung Song 提交于
After collecting samples for events 'syscalls:', perf-script with python script doesn't occasionally work generating a segmentation fault. The reason is that the print fmt is empty and a value of event->print_fmt.args is NULL, so dereferencing the null pointer results in a segmentation fault i.e.: # perf record -e syscalls:* # perf script -g python # perf script -s perf-script.py in trace_begin syscalls__sys_enter_brk 3 79841.832099154 3777 test.sh syscall_nr=12, brk=0 ... (omitted) ... Segmentation fault (core dumped) For example, a format of sys_enter_getuid() hasn't print fmt as below. # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_getuid/format name: sys_enter_getuid ID: 188 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:int syscall_nr; offset:8; size:4; signed:1; print fmt: "" So add exception handling to avoid this problem. Signed-off-by: NTaeung Song <treeze.taeung@gmail.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456413179-12331-1-git-send-email-treeze.taeung@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
In 1d55e8ef ("perf tools: Introduce opt_event_config nonterminal") I removed the unconditional "'/' '/'" for pmu events such as "intel_pt//" but forgot to use opt_event_config where it expected some event_config, oops. Fix it. Noticed when trying to use: # perf record -e intel_pt// -a sleep 1 event syntax error: 'intel_pt//' \___ parser error Run 'perf list' for a list of valid events Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 1d55e8ef ("perf tools: Introduce opt_event_config nonterminal") Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Stephane Eranian 提交于
This patch improves the error message given by jvmti Makefile when the alternatives command cannot be found. It now suggests the user locates the root of their Java installation and pass it with JDIR= Signed-off-by: NStephane Eranian <eranian@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1456378056-18812-1-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
No need to use strbuf there, its just a simple alloc+formatting, which asprintf does just fine. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-6q6cxfhk8c8ypg3tfpo0i2iy@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Support hierarchy output for perf-top using --hierarchy option. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NPekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-19-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
In the hierarchy mode, hist entries should decay their children too. Also update hists__delete_entry() to be able to free child entries. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NPekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-18-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The --hierarchy option is to show output in hierarchy mode. It extends folding/unfolding in the TUI and GTK browsers to support sort items as well as callchains. Users can toggle the items to see the performance result at wanted level. $ perf report --hierarchy --tui Overhead Command / Shared Object / Symbol -------------------------------------------------- + 32.96% gnome-shell - 15.11% swapper - 14.97% [kernel.vmlinux] 6.82% [k] intel_idle 0.66% [k] menu_select 0.43% [k] __hrtimer_start_range_ns ... Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NPekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-17-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The hierarchy output mode is to group entries for each level so that user can see higher level picture more easily. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NPekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-16-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Like in stdio, fit column header to hierarchy output. Merge column headers with "/" as a separator. Overhead Command / Shared Object / Symbol ... + 0.09% dwm + 0.06% emacs - 0.05% perf - 0.05% [kernel.vmlinux] + 0.03% [k] memcpy_orig + 0.01% [k] unmap_single_vma + 0.01% [k] smp_call_function_single + 0.00% [k] native_irq_return_iret + 0.00% [k] arch_trigger_all_cpu_backtrace_handler + 0.00% [k] native_write_msr_safe Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NPekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-15-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Implement hierarchy mode in TUI. The output is look like stdio but it also supports to fold/unfold children dynamically. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NPekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-14-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The 'C' and 'E' keys are to collapse/expand all hist entries. Update nr_hierarchy_entries properly in this case. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NPekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-13-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Add nr_hierarchy_entries field to keep current number of (unfolded) hist entries. And the hist_entry->nr_rows carries number of direct children. But in the hierarchy mode, entry can have grand children and callchains. So update the number properly using hierarchy_count_rows() when toggling the folded state (by pressing ENTER key). Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NPekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-12-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-