- 18 12月, 2018 36 次提交
-
-
由 Florian Fainelli 提交于
The breakpoint tests on the ARM 32-bit kernel are broken in several ways. The breakpoint length requested does not necessarily match whether the function address has the Thumb bit (bit 0) set or not, and this does matter to the ARM kernel hw_breakpoint infrastructure. See [1] for background. [1]: https://lkml.org/lkml/2018/11/15/205 As Will indicated, the overflow handling would require single-stepping which is not supported at the moment. Just disable those tests for the ARM 32-bit platforms and update the comment above to explain these limitations. Co-developed-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20181203191138.2419-1-f.fainelli@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Robert Walker 提交于
This patch adds support for generating instruction samples from trace of AArch32 programs using the A32 and T32 instruction sets. T32 has variable 2 or 4 byte instruction size, so the conversion between addresses and instruction counts requires extra information from the trace decoder, requiring version 0.10.0 of OpenCSD. A check for the OpenCSD library version has been added to the feature check for OpenCSD. Signed-off-by: NRobert Walker <robert.walker@arm.com> Reviewed-by: NMathieu Poirier <mathieu.poirier@linaro.org> Tested-by: NLeo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1543839526-30348-1-git-send-email-robert.walker@arm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Sihyeon Jang 提交于
Committer testing: Before: # tools/perf/trace/beauty/mmap_flags.sh static const char *mmap_flags[] = { [ilog2(0x40) + 1] = "32BIT", tools/perf/trace/beauty/mmap_flags.sh: line 23: [: missing `]' [ilog2(0x01) + 1] = "SHARED", [ilog2(0x02) + 1] = "PRIVATE", [ilog2(0x10) + 1] = "FIXED", [ilog2(0x20) + 1] = "ANONYMOUS", [ilog2(0x100000) + 1] = "FIXED_NOREPLACE", tools/perf/trace/beauty/mmap_flags.sh: line 28: [: missing `]' [ilog2(0x0100) + 1] = "GROWSDOWN", [ilog2(0x0800) + 1] = "DENYWRITE", [ilog2(0x1000) + 1] = "EXECUTABLE", [ilog2(0x2000) + 1] = "LOCKED", [ilog2(0x4000) + 1] = "NORESERVE", [ilog2(0x8000) + 1] = "POPULATE", [ilog2(0x10000) + 1] = "NONBLOCK", [ilog2(0x20000) + 1] = "STACK", [ilog2(0x40000) + 1] = "HUGETLB", [ilog2(0x80000) + 1] = "SYNC", }; # After: $ tools/perf/trace/beauty/mmap_flags.sh static const char *mmap_flags[] = { [ilog2(0x40) + 1] = "32BIT", [ilog2(0x01) + 1] = "SHARED", [ilog2(0x02) + 1] = "PRIVATE", [ilog2(0x10) + 1] = "FIXED", [ilog2(0x20) + 1] = "ANONYMOUS", [ilog2(0x100000) + 1] = "FIXED_NOREPLACE", [ilog2(0x0100) + 1] = "GROWSDOWN", [ilog2(0x0800) + 1] = "DENYWRITE", [ilog2(0x1000) + 1] = "EXECUTABLE", [ilog2(0x2000) + 1] = "LOCKED", [ilog2(0x4000) + 1] = "NORESERVE", [ilog2(0x8000) + 1] = "POPULATE", [ilog2(0x10000) + 1] = "NONBLOCK", [ilog2(0x20000) + 1] = "STACK", [ilog2(0x40000) + 1] = "HUGETLB", [ilog2(0x80000) + 1] = "SYNC", }; $ Signed-off-by: NSihyeon Jang <uneedsihyeon@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: aca70cc40a0b ("perf beauty mmap_flags: Check if the arch has a mmap.h file") Link: http://lkml.kernel.org/r/20181202080651.4685-1-uneedsihyeon@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Tzvetomir Stoyanov 提交于
In order to make libtraceevent into a proper library, its API should be straightforward. This patch hides few API functions, intended for internal usage only: tep_free_event(), tep_free_format_field(), __tep_data2host2(), __tep_data2host4() and __tep_data2host8(). The patch also alignes the libtraceevent summary man page with these API changes. Signed-off-by: NTzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20181130154647.891651290@goodmis.orgSigned-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Tzvetomir Stoyanov 提交于
In order to make libtraceevent into a proper library, its API should be straightforward. The __tep_data2host*() functions are going to no longer be available as a libtraceevent API, tep_read_number() should be used instead. This patch replaces __tep_data2host*() usage with tep_read_number() in perf. Signed-off-by: NTzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20181130154647.743979275@goodmis.orgSigned-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Tzvetomir Stoyanov 提交于
In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. This renames tep_free_format() to tep_free_event(), which describes more closely the purpose of the function. Signed-off-by: NTzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20181130154647.591673556@goodmis.orgSigned-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Tzvetomir Stoyanov 提交于
In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. This renames 'struct tep_event_format' to 'struct tep_event', which describes more closely the purpose of the struct. Signed-off-by: NTzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20181130154647.436403995@goodmis.orgSigned-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> [ Fixup conflict with 6e33c250a88f ("tools lib traceevent: Fix compile warnings in tools/lib/traceevent/event-parse.c") ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Tzvetomir Stoyanov 提交于
This patch installs trace-seq.h header file on "make install". Signed-off-by: NTzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20181130154647.176265533@goodmis.orgSigned-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Tzvetomir Stoyanov 提交于
This patch implements integration with pkg-config framework. pkg-config can be used by the library users to determine required CFLAGS and LDFLAGS in order to use the library Signed-off-by: NTzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20181130154647.022471992@goodmis.orgSigned-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Tzvetomir Stoyanov 提交于
This patch implements a new API of the tracevent library: int tep_get_ref(struct tep_handle *tep); The API returns the reference counter "ref_count" of the tep handler. As "struct tep_handle" is internal only, its members cannot be accessed by the library users, the API is used to get the reference counter. Signed-off-by: NTzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20181130154646.890615385@goodmis.orgSigned-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
Add explanations for new columns "IPC" and "IPC coverage" in perf documentation. v5: --- Update the description according to Ingo's comments. Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Reviewed-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1543586097-27632-5-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
Support displaying the average IPC and IPC coverage per symbol in 'perf report' --tui and --stdio modes. For example, $ perf record -b ... $ perf report -s symbol Overhead Symbol IPC [IPC Coverage] 39.60% [.] __random 2.30 [ 54.8%] 18.02% [.] main 0.43 [ 54.3%] 14.21% [.] compute_flag 2.29 [100.0%] 14.16% [.] rand 0.36 [100.0%] 7.06% [.] __random_r 2.57 [ 70.5%] 6.85% [.] rand@plt 0.00 [ 0.0%] Jiri Olsa <jolsa@redhat.com> provided the patch to support the --stdio mode. I merged Jiri's code in this patch. $ perf report -s symbol --stdio # Overhead Symbol IPC [IPC Coverage] # ........ ........................... .................... # 39.60% [.] __random 2.30 [ 54.8%] 18.02% [.] main 0.43 [ 54.3%] 14.21% [.] compute_flag 2.29 [100.0%] 14.16% [.] rand 0.36 [100.0%] 7.06% [.] __random_r 2.57 [ 70.5%] 6.85% [.] rand@plt 0.00 [ 0.0%] 0.02% [k] run_timer_softirq 1.60 [ 57.2%] The columns "IPC" and "[IPC Coverage]" are automatically enabled when the sort-key "symbol" is specified. If the perf.data file doesn't contain timed LBR information, columns are filled with "-". For example, # Overhead Symbol IPC [IPC Coverage] # ........ ........................... .................... # 46.57% [.] main - - 17.60% [.] rand - - 15.84% [.] __random_r - - 11.90% [.] __random - - 6.50% [.] compute_flag - - 1.59% [.] rand@plt - - 0.00% [.] _dl_relocate_object - - 0.00% [k] tlb_flush_mmu - - 0.00% [k] perf_event_mmap - - 0.00% [k] native_sched_clock - - 0.00% [k] intel_pmu_handle_irq_v4 - - 0.00% [k] native_write_msr - - v3: --- Removed the sortkey 'ipc' from command-line. The columns "IPC" and "[IPC Coverage]" are automatically enabled when "symbol" is specified. v2: --- Merge in Jiri's patch to support stdio mode Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Reviewed-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1543586097-27632-4-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
We often use the symbol__annotate2() to annotate a specified symbol. While annotating may take some time, so in order to avoid annotating the same symbol repeatedly, the patch creates a new flag to indicate the symbol has been annotated. Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Reviewed-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1543586097-27632-3-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
Add support to 'perf report' annotate view or 'perf annotate --stdio2' to aggregate the IPC derived from timed LBRs per symbol. We compute the average IPC and the IPC coverage percentage. For example: $ perf annotate --stdio2 Percent IPC Cycle (Average IPC: 2.30, IPC Coverage: 54.8%) Disassembly of section .text: 000000000003aac0 <random@@GLIBC_2.2.5>: 8.32 3.28 sub $0x18,%rsp 3.28 mov $0x1,%esi 3.28 xor %eax,%eax 3.28 cmpl $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0 11.57 3.28 1 ↓ je 20 lock cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0 ↓ jne 29 ↓ jmp 43 11.57 1.10 20: cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0 0.00 1.10 1 ↓ je 43 29: lea __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi sub $0x80,%rsp → callq __lll_lock_wait_private add $0x80,%rsp 0.00 3.00 43: lea __ctype_b@GLIBC_2.2.5+0x38,%rdi 3.00 lea 0xc(%rsp),%rsi 8.49 3.00 1 → callq __random_r 7.91 1.94 cmpl $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0 0.00 1.94 1 ↓ je 68 lock decl __abort_msg@@GLIBC_PRIVATE+0x8a0 ↓ jne 70 ↓ jmp 8a 0.00 2.00 68: decl __abort_msg@@GLIBC_PRIVATE+0x8a0 21.56 2.00 1 ↓ je 8a 70: lea __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi sub $0x80,%rsp → callq __lll_unlock_wake_private add $0x80,%rsp 21.56 2.90 8a: movslq 0xc(%rsp),%rax 2.90 add $0x18,%rsp 9.03 2.90 1 ← retq It shows for this symbol the average IPC is 2.30 and the IPC coverage is 54.8%. Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Reviewed-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1543586097-27632-2-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Tzvetomir Stoyanov 提交于
This patch adds a sanity check to is_timestamp_in_us() input parameter trace_clock. It avoids a potential segfault in this function for the case trace_clock is NULL. Reported-by: NSlavomir Kaslev <kaslevs@vmware.com> Signed-off-by: NTzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20181128145552.68c4f87b@gandalf.local.homeSigned-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
If not, then just use what is in asm-generic. This fixes the build for my sh4, m68k and riscv64 perf test build containers that were failing due to 80ee5668 ("perf beauty: Add a generator for MAP_ mmap's flag constants"), that were not covered in the cset introducing those tools/arch/*/include/uapi/asm/mman.h files. f3539c12 ("tools include: Add uapi mman.h for each architecture") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 80ee5668 ("perf beauty: Add a generator for MAP_ mmap's flag constants") Link: https://lkml.kernel.org/n/tip-rpy9t2e0wxpnum1yvxhreafe@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
Multi AIO trace writing allows caching more kernel data into userspace memory postponing trace writing for the sake of overall profiling data thruput increase. It could be seen as kernel data buffer extension into userspace memory. With an --aio option value different from 0 (default value is 1) the tool has capability to cache more and more data into user space along with delegating spill to AIO. That allows avoiding to suspend at record__aio_sync() between calls of record__mmap_read_evlist() and increases profiling data thruput at the cost of userspace memory. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@redhat.com> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/050bb053-e7f3-aa83-fde7-f27ff90be7f6@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
The trace file offset is read once before mmaps iterating loop and written back after all performance data is enqueued for aio writing. The trace file offset is incremented linearly after every successful aio write operation. record__aio_sync() blocks till completion of the started AIO operation and then proceeds. record__aio_mmap_read_sync() implements a barrier for all incomplete aio write requests. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@redhat.com> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ce2d45e9-d236-871c-7c8f-1bed2d37e8ac@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
The map->data buffer is used to preserve map->base profiling data for writing to disk. AIO map->cblock is used to queue corresponding map->data buffer for asynchronous writing. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@redhat.com> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/5fcda10c-6c63-68df-383a-c6d9e5d1f918@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
This will be used by 'perf record' to speed up reading the perf ring buffer. Committer testing: $ make -C tools/perf O=/tmp/build/perf make: Entering directory '/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j8' parallel build Auto-detecting system features: ... dwarf: [ on ] ... dwarf_getlocations: [ on ] ... glibc: [ on ] ... gtk2: [ OFF ] ... libaudit: [ OFF ] ... libbfd: [ OFF ] ... libelf: [ on ] ... libnuma: [ OFF ] ... numa_num_possible_cpus: [ OFF ] ... libperl: [ OFF ] ... libpython: [ OFF ] ... libslang: [ on ] ... libcrypto: [ on ] ... libunwind: [ on ] ... libdw-dwarf-unwind: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ on ] ... bpf: [ on ] ... libaio: [ on ] $ ls -la /tmp/build/perf/feature/test-libaio.* -rwxrwxr-x. 1 acme acme 18296 Nov 26 08:49 /tmp/build/perf/feature/test-libaio.bin -rw-rw-r--. 1 acme acme 1165 Nov 26 08:49 /tmp/build/perf/feature/test-libaio.d -rw-rw-r--. 1 acme acme 0 Nov 26 08:49 /tmp/build/perf/feature/test-libaio.make.output $ $ grep -i aio /tmp/build/perf/FEATURE-DUMP feature-libaio=1 $ Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: NJiri Olsa <jolsa@redhat.com> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/5fcda10c-6c63-68df-383a-c6d9e5d1f918@linux.intel.com [ split from a larger patch ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Users should never use 'pt=0', but if they do it may give a meaningless error: $ perf record -e intel_pt/pt=0/u uname Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u). Fix that by forcing 'pt=1'. Committer testing: # perf record -e intel_pt/pt=0/u uname Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u). /bin/dmesg | grep -i perf may provide additional information. # perf record -e intel_pt/pt=0/u uname pt=0 doesn't make sense, forcing pt=1 Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.020 MB perf.data ] # Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/b7c5b4e5-9497-10e5-fd43-5f3e4a0fe51d@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
This basically replicates what was done for 'perf report' in: b226a5a7 ("perf report: Allow user to specify path to kallsyms file") This should help with resolving eBPF symbols, that are in kallsyms but, of course, not in vmlinux. Reported-by: NIvan Babrou <ibobrik@gmail.com> Tested-by: NIvan Babrou <ibobrik@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-x52mx1ybq8128rtg9hjrj5qk@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Wen Yang 提交于
Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)). This makes it more readable and also fix this warning detected by err_cast.cocci: tools/perf/util/bpf-loader.c:1606:11-18: WARNING: ERR_CAST can be used with op Signed-off-by: NWen Yang <wen.yang99@zte.com.cn> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wen Yang <yellowriver2010@hotmail.com> Cc: zhong.weidong@zte.com.cn Link: http://lkml.kernel.org/r/20181127090610.28488-1-wen.yang99@zte.com.cnSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Add ERR_CAST(), so that tools can use it, just like the kernel. This addresses coccinelle checks that are being performed to tools/ in addition to kernel sources, so lets add this to cover that and to get tools code closer to kernel coding standards. This originally was introduced in the kernel headers in this cset: d1bc8e95 ("Add an ERR_CAST() function to complement ERR_PTR and co.") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Wen Yang <yellowriver2010@hotmail.com> Cc: zhong.weidong@zte.com.cn Link: https://lkml.kernel.org/n/tip-tlt97p066zyhzqhl5jt86og7@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Fix inconsistent use of tabs and spaces error: # perf test 16 -v 16: Setup struct perf_event_attr : --- start --- test child forked, pid 20224 File "/usr/libexec/perf-core/tests/attr.py", line 119 log.warning("expected %s=%s, got %s" % (t, self[t], other[t])) ^ TabError: inconsistent use of tabs and spaces in indentation test child finished with -1 ---- end ---- Setup struct perf_event_attr: FAILED! Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20181122140456.16817-1-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
If the 'sleep' command is provided by coreutils, then the "PERF_RECORD_* events & perf_sample fields" test will fail because the MMAP name is 'coreutils' not 'sleep', and there is an extra COMM event. Fix the test to detect that case. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20181122135545.16295-1-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Fix following warnings: event-parse.c: In function ‘tep_find_event_by_name’: event-parse.c:3521:21: warning: ‘event’ may be used uninitialized in this function [-Wmaybe-uninitialized] pevent->last_event = event; ~~~~~~~~~~~~~~~~~~~^~~~~~~ CC ui/gtk/hists.o LINK plugin_mac80211.so CC nlattr.o event-parse.c: In function ‘tep_data_lat_fmt’: event-parse.c:5200:4: warning: ‘migrate_disable’ may be used uninitialized in this function [-Wmaybe-uninitialized] trace_seq_printf(s, "%d", migrate_disable); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ event-parse.c:5207:4: warning: ‘lock_depth’ may be used uninitialized in this function [-Wmaybe-uninitialized] trace_seq_printf(s, "%d", lock_depth); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ LINK plugin_sched_switch.so LINK plugin_function.so LINK plugin_xen.so event-parse.c: In function ‘tep_event_info’: event-parse.c:5047:7: warning: ‘len_arg’ may be used uninitialized in this function [-Wmaybe-uninitialized] trace_seq_printf(s, format, len_arg, (char)val); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ event-parse.c:4884:6: note: ‘len_arg’ was declared here int len_arg; ^~~~~~~ event-parse.c:4338:11: warning: ‘vsize’ may be used uninitialized in this function [-Wmaybe-uninitialized] val = tep_read_number(pevent, bptr, vsize); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ event-parse.c:4224:6: note: ‘vsize’ was declared here int vsize; ^~~~~ $ gcc --version gcc (Clear Linux OS for Intel Architecture) 8.2.1 20180502 Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Link: http://lkml.kernel.org/r/20181122112937.10582-1-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Branch stacks do not necessarily have the same cpumode as the 'ip'. Use the fallback functions in those cases. This patch depends on patch "perf tools: Add fallback functions for cases where cpumode is insufficient". Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: stable@vger.kernel.org # 4.19 Link: http://lkml.kernel.org/r/20181106210712.12098-4-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
thread__resolve() is used in the sample_addr_correlates_sym() cases where 'addr' is a destination of a branch which does not necessarily have the same cpumode as the 'ip'. Use the fallback function in that case. This patch depends on patch "perf tools: Add fallback functions for cases where cpumode is insufficient". Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: stable@vger.kernel.org # 4.19 Link: http://lkml.kernel.org/r/20181106210712.12098-3-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
For branch stacks or branch samples, the sample cpumode might not be correct because it applies only to the sample 'ip' and not necessary to 'addr' or branch stack addresses. Add fallback functions that can be used to deal with those cases Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: stable@vger.kernel.org # 4.19 Link: http://lkml.kernel.org/r/20181106210712.12098-2-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Some architectures have a single address space for kernel and user addresses, which makes it possible to determine if an address is in kernel space or user space. Some don't, e.g.: sparc. Cache that info in perf_env so that, for instance, code needing to fallback failed symbol lookups at the kernel space in single address space arches can lookup at userspace. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: stable@vger.kernel.org # 4.19 Link: http://lkml.kernel.org/r/20181106210712.12098-2-adrian.hunter@intel.com [ split from a larger patch ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
We'll set a new machine field based on env->arch, which for live mode, like with 'perf top' means we need to use uname() to figure the name of the arch, fix perf_env__arch() to consider both (env == NULL) and (env->arch == NULL) as local operation. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: stable@vger.kernel.org # 4.19 Link: https://lkml.kernel.org/n/tip-vcz4ufzdon7cwy8dm2ua53xk@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Eric Saint-Etienne 提交于
A double pointer is used in map__find() where a single pointer is enough because the function doesn't affect the rbtree and the rbtree is locked. Signed-off-by: NEric Saint-Etienne <eric.saint.etienne@oracle.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Eric Saint-Etienne <eric.saintetienne@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1542969759-24346-1-git-send-email-eric.saint.etienne@oracle.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Stephane Eranian 提交于
When using the -x option, perf stat prints CSV-style output with one event per line. For each event, it prints the count, the unit, the event name, the cgroup, and a bunch of other event specific fields (such as insn per cycles). When you use CSV-style mode, you expect a normalized output where each event is printed with the same number of fields regardless of what it is so it can easily be imported into a spreadsheet or parsed. For instance, if an event does not have a unit, then print an empty field for it. Although this approach was implemented for the unit, it was not for the cgroup. When mixing cgroup and non-cgroup events, then non-cgroup events would not show an empty field, instead the next field was printed, make columns not line up correctly. This patch fixes the cgroup output issues by forcing an empty field for non-cgroup events as soon as one event has cgroup. Before: <not counted> @ @cycles @foo @ 0 @100.00@@ 2531614 @ @cycles @6420922@100.00@ @ foo cgroup lines up with time_running! After: <not counted> @ @cycles @foo @0 @100.00@@ 2594834 @ @cycles @ @5287372 @100.00@@ Fields line up. Signed-off-by: NStephane Eranian <eranian@google.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1541587845-9150-1-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ravi Bangoria 提交于
Commit 0aa802a7 ("perf stat: Get rid of extra clock display function") introduced scale and unit for clock events. Thus, perf_stat__update_shadow_stats() now saves scaled values of clock events in msecs, instead of original nsecs. But while calculating values of shadow stats we still consider clock event values in nsecs. This results in a wrong shadow stat values. Ex, # ./perf stat -e task-clock,cycles ls <SNIP> 2.60 msec task-clock:u # 0.877 CPUs utilized 2,430,564 cycles:u # 1215282.000 GHz Fix this by saving original nsec values for clock events in perf_stat__update_shadow_stats(). After patch: # ./perf stat -e task-clock,cycles ls <SNIP> 3.14 msec task-clock:u # 0.839 CPUs utilized 3,094,528 cycles:u # 0.985 GHz Suggested-by: NJiri Olsa <jolsa@redhat.com> Reported-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.ibm.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: yuzhoujian@didichuxing.com Fixes: 0aa802a7 ("perf stat: Get rid of extra clock display function") Link: http://lkml.kernel.org/r/20181116042843.24067-1-ravi.bangoria@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
In debian/ubuntu its libssl-dev, but for fedora/RHEL/Centos/etc its openssl-devel, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 8ee46460 ("perf build: Add libcrypto feature detection") Link: https://lkml.kernel.org/n/tip-lnxqszts6aq2c9jy4b7mlnym@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 22 11月, 2018 4 次提交
-
-
由 Ingo Molnar 提交于
Merge tag 'perf-core-for-mingo-4.21-20181122' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: - Start using BPF maps in 'perf trace' for filters in the augmented syscalls code, keeping the existing code for tracepoint filters so that we can switch back and forth while getting everything BPFied (Arnaldo Carvalho de Melo) - Suppress potential format-truncation warning in the PMU code (Ben Hutchings) - Introduce 'perf bench epoll', with "wait" and "ctl" benchmarks (Davidlohr Bueso) - Fix slowness due to -ffunction-section, do it by sorting the maps by name, so avoiding the using rb_first/next to traverse all entries looking for a map name, that with --ffunction-section gets to thousands of maps (Eric Saint-Etienne) - Separate jvmti cmlr check (Jiri Olsa) - Allow using the stepping when figuring out which JSON files to use for a x86 processor, so that Cascadelake server can be support, which has the same cpuid as some other processor, being different only in the stepping (Kan Liang) - Share code and output format for uregs and iregs 'perf script' output (Milian Wolff) - Use perf_evsel__is_clocki() for clock events in 'perf stat' (Ravi Bangoria) Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Kan Liang 提交于
The weak functions, strcmp_cpuid_str() and get_cpuid_str(), are defined in pmu.c. Most of the cpuid related functions, including *_cpuid_str()'s declaration and platform specific definition, are in header.c/h. To make the declaration and definition of all cpuid related functions in a consistent place, move the weak functions to header.c. There is no functional change. Suggested-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Link: http://lkml.kernel.org/r/20181121164939.13482-1-kan.liang@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Eric Saint-Etienne 提交于
Perf can take minutes to parse an image when -ffunction-section is used. This is especially true with the kernel image when it is compiled this way, which is the arm64 default since the patcheset "Enable deadcode elimination at link time". Perf organize maps using a rbtree. Whenever perf finds a new symbols, it first searches this rbtree for the map it belongs to, by strcmp()'aring section names. When it finds the map with the right name, it uses it to add the symbol. With a usual image there aren't so many maps but when using -ffunction-section there's basically one map per function. With the kernel image that's north of 40,000 maps. For most symbols perf has to parses the entire rbtree to eventually create a new map and add it. Consequently perf spends most of the time browsing a rbtree that keeps getting larger. This performance fix introduces a secondary rbtree that indexes maps based on the section name. Signed-off-by: NEric Saint-Etienne <eric.saint.etienne@oracle.com> Reviewed-by: NDave Kleikamp <dave.kleikamp@oracle.com> Reviewed-by: NDavid Aldridge <david.aldridge@oracle.com> Reviewed-by: NRob Gardner <rob.gardner@oracle.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1542822679-25591-1-git-send-email-eric.saint.etienne@oracle.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
The Compiled Method Load Record (cmlr) is JDK specific interface to access JVM stack info. This makes the jvmti agent code not compile under another jdk, which does not support that. Separating jvmti cmlr check into special feature check, and adding HAVE_JVMTI_CMLR macro to indicate that. Mark cmlr code in jvmti/libjvmti.c with HAVE_JVMTI_CMLR, so we can compile it on system without cmlr support. This change makes the jvmti compile with java-1.8.0-ibm package. It's without the line numbers support, but the rest works. Adding NO_JVMTI_CMLR compile variable for testing. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Gustavo Luiz Duarte <gduarte@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20181121154341.21521-1-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-