- 16 5月, 2019 32 次提交
-
-
由 Florian Fainelli 提交于
The Cortex-A57 and Cortex-A72 both support all ARMv8 recommended events up to the RC_ST_SPEC (0x91) event with the exception of: - L1D_CACHE_REFILL_INNER (0x44) - L1D_CACHE_REFILL_OUTER (0x45) - L1D_TLB_RD (0x4E) - L1D_TLB_WR (0x4F) - L2D_TLB_REFILL_RD (0x5C) - L2D_TLB_REFILL_WR (0x5D) - L2D_TLB_RD (0x5E) - L2D_TLB_WR (0x5F) - STREX_SPEC (0x6F) Create an appropriate JSON file for mapping those events and update the mapfile.csv for matching the Cortex-A57 and Cortex-A72 MIDR to that file. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Reviewed-by: NJohn Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean V Kelley <seanvk.dev@oregontracks.org> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org (moderated list:arm pmu profiling and debugging) Link: http://lkml.kernel.org/r/20190513202522.9050-4-f.fainelli@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Florian Fainelli 提交于
Broadcom's Brahma-B53 CPUs support the same type of events that the Cortex-A53 supports, recognize its CPUID and map it to the cortex-a53 events. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Acked-by: NWill Deacon <will.deacon@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean V Kelley <seanvk.dev@oregontracks.org> Cc: bcm-kernel-feedback-list@broadcom.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org (moderated list Link: http://lkml.kernel.org/r/20190513202522.9050-3-f.fainelli@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Florian Fainelli 提交于
ARM64's implementation of get_cpuidr_str() masks out the revision bits [3:0] while reading the CPU identifier, there is no need for the [[:xdigit:]] wildcard. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean V Kelley <seanvk.dev@oregontracks.org> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org (moderated list:arm pmu profiling and debugging) Link: http://lkml.kernel.org/r/20190513202522.9050-2-f.fainelli@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Zenghui Yu 提交于
Address gcc warning: pmu-events/jevents.c: In function ‘save_arch_std_events’: pmu-events/jevents.c:417:15: warning: unused variable ‘sb’ [-Wunused-variable] struct stat *sb = data; ^~ Signed-off-by: NZenghui Yu <yuzenghui@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: wanghaibin.wang@huawei.com Link: http://lkml.kernel.org/r/1557919169-23972-1-git-send-email-yuzenghui@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
The shell tests should not redirect useful output to /dev/null, as that is done automatically by 'perf test' in non verbose mode, so remove that from the zstd comp/decomp test, fixing up verbose mode. Before: $ perf test zstd 68: Zstd perf.data compression/decompression : Ok $ perf test -v zstd 68: Zstd perf.data compression/decompression : --- start --- test child forked, pid 11956 -z, --compression-level[=<n>] Collecting compressed record file: Checking compressed events stats: test child finished with 0 ---- end ---- Zstd perf.data compression/decompression: Ok $ Now: $ perf test zstd 68: Zstd perf.data compression/decompression : Ok $ perf test -v zstd 68: Zstd perf.data compression/decompression : --- start --- test child forked, pid 12695 Collecting compressed record file: 0+500 records in 72+1 records out 37361 bytes (37 kB, 36 KiB) copied, 9.83796 s, 3.8 kB/s [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB /tmp/perf.data.rzq, compressed (original 0.004 MB, ratio is 3.679) ] Checking compressed events stats: # compressed : Zstd, level = 1, ratio = 4 COMPRESSED events: 3 test child finished with 0 ---- end ---- Zstd perf.data compression/decompression: Ok $ Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/n/tip-tp96618ds42zic94nlh0msz3@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
Introduce a basic integration test for Zstd based record compression/decompression using 'perf record' and 'perf report'. Committer notes: Reduce a bit the freq (from 25 kHz to 5 kHz) and the number of /dev/null records read (from 1000 to 500), reducing the time it takes to something more in line with the time existing 'perf test' entries take to run. With that in place: $ time perf test zstd 68: Zstd perf.data compression/decompression : Ok real 0m10.376s user 0m0.105s sys 0m0.440s $ grep "model name" /proc/cpuinfo | head -1 model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz $ Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/dc007ae4-104a-2b7c-316e-275929025f0d@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
Initialized decompression part of Zstd based API so COMPRESSED records would be decompressed into the resulting output data file. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/c27d7500-ecdd-3569-cab5-8f70bbed5ea4@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
zstd_init(, comp_level = 0) initializes decompression part of API only hat now consists of zstd_decompress_stream() function. The perf.data PERF_RECORD_COMPRESSED records are decompressed using zstd_decompress_stream() function into a linked list of mmaped memory regions of mmap_comp_len size (struct decomp). After decompression of one COMPRESSED record its content is iterated and fetched for usual processing. The mmaped memory regions with decompressed events are kept in the linked list till the tool process termination. When dumping raw records (e.g., perf report -D --header) file offsets of events from compressed records are printed as zero. Committer notes: Since now we have support for processing PERF_RECORD_COMPRESSED, we see none, in raw form, like we saw in the previous patch commiter notes, they were decompressed into the usual PERF_RECORD_{FORK,MMAP,COMM,etc} records, we only see the stats for those PERF_RECORD_COMPRESSED events, and since I used the file generated in the commiter notes for the previous patch, there they are, 2 compressed records: $ perf report --header-only | grep cmdline # cmdline : /home/acme/bin/perf record -z2 sleep 1 $ perf report -D | grep COMPRESS COMPRESSED events: 2 COMPRESSED events: 0 $ perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 15 of event 'cycles:u' # Event count (approx.): 962227 # # Overhead Command Shared Object Symbol # ........ ....... ................ ........................... # 46.99% sleep libc-2.28.so [.] _dl_addr 29.24% sleep [unknown] [k] 0xffffffffaea00a67 16.45% sleep libc-2.28.so [.] __GI__IO_un_link.part.1 5.92% sleep ld-2.28.so [.] _dl_setup_hash 1.40% sleep libc-2.28.so [.] __nanosleep 0.00% sleep [unknown] [k] 0xffffffffaea00163 # # (Tip: To see callchains in a more compact form: perf report -g folded) # $ Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/304b0a59-942c-3fe1-da02-aa749f87108b@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
Implemented -z,--compression_level[=<n>] option that enables compression of mmaped kernel data buffers content in runtime during perf record mode collection. Default option value is 1 (fastest compression). Compression overhead has been measured for serial and AIO streaming when profiling matrix multiplication workload: ------------------------------------------------------------- | SERIAL | AIO-1 | ----------------------------------------------------------------| |-z | OVH(x) | ratio(x) size(MiB) | OVH(x) | ratio(x) size(MiB) | |---------------------------------------------------------------| | 0 | 1,00 | 1,000 179,424 | 1,00 | 1,000 187,527 | | 1 | 1,04 | 8,427 181,148 | 1,01 | 8,474 188,562 | | 2 | 1,07 | 8,055 186,953 | 1,03 | 7,912 191,773 | | 3 | 1,04 | 8,283 181,908 | 1,03 | 8,220 191,078 | | 5 | 1,09 | 8,101 187,705 | 1,05 | 7,780 190,065 | | 8 | 1,05 | 9,217 179,191 | 1,12 | 6,111 193,024 | ----------------------------------------------------------------- OVH = (Execution time with -z N) / (Execution time with -z 0) ratio - compression ratio size - number of bytes that was compressed size ~= trace size x ratio Committer notes: Testing it I noticed that it failed to disable build id processing when compression is enabled, and as we'd have to uncompress everything to look for the PERF_RECORD_{MMAP,SAMPLE,etc} to figure out which build ids to read from DSOs, we better disable build id processing when compression is enabled, logging with pr_debug() when doing so: Original patch: # perf record -z2 ^C[ perf record: Woken up 1 times to write data ] 0x1746e0 [0x76]: failed to process type: 81 [Invalid argument] [ perf record: Captured and wrote 1.568 MB perf.data, compressed (original 0.452 MB, ratio is 3.995) ] # After auto-disabling build id processing when compression is enabled: $ perf record -z2 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.292) ] $ perf record -v -z2 sleep 1 Compression enabled, disabling build id collection at the end of the session. <SNIP extra -v pr_debug() messages> [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.305) ] $ Also, with parts of the patch originally after this one moved to just before this one we get: $ perf record -z2 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.371) ] $ perf report -D | grep COMPRESS 0 0x1b8 [0x155]: PERF_RECORD_COMPRESSED: unhandled! 0 0x30d [0x80]: PERF_RECORD_COMPRESSED: unhandled! COMPRESSED events: 2 COMPRESSED events: 0 $ I.e. when faced with PERF_RECORD_COMPRESSED that we still have no code to process, we just show it as not being handled, skip them and continue, while before we had: $ perf report -D | grep COMPRESS 0x1b8 [0x169]: failed to process type: 81 [Invalid argument] Error: failed to process sample 0 0x1b8 [0x169]: PERF_RECORD_COMPRESSED $ Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/9ff06518-ae63-a908-e44d-5d9e56dd66d9@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
Committer note: Split from a larger patch, this only dumps PERF_RECORD_COMPRESSED as unhandled, so that when we introduce the record part in the next patch, we don't see unhandled events when using 'perf record -D'. Changed it so that we dump the event if the handler is just a stub, i.e. for the case where we don't have ZSTD linked but we're processing a perf.data file generated by a tool with that linked. Also when failing to decompress we can't just dump the uncompressed event and return 0, we have to propagate the error. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/304b0a59-942c-3fe1-da02-aa749f87108b@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
Compression is implemented using the functions from zstd.c. As the memory to operate on the compression uses mmap->aio.data[] buffers. If Zstd streaming compression API fails for some reason the data to be compressed are just copied into the memory buffers using plain memcpy(). Compressed trace frame consists of an array of PERF_RECORD_COMPRESSED records. Each element of the array is not longer that PERF_SAMPLE_MAX_SIZE and consists of perf_event_header followed by the compressed chunk that is decompressed on the loading stage. perf_mmap__aio_push() is replaced by perf_mmap__push() which is now used in the both serial and AIO streaming cases. perf_mmap__push() is extended with positive return values to signify absence of data ready for processing. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/77db2b2c-5d03-dbb0-aeac-c4dd92129ab9@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
Compression is implemented using the functions from zstd.c. As the memory to operate on the compression uses mmap->data buffer. If Zstd streaming compression API fails for some reason the data to be compressed are just copied into the memory buffers using plain memcpy(). Compressed trace frame consists of an array of PERF_RECORD_COMPRESSED records. Each element of the array is not longer that PERF_SAMPLE_MAX_SIZE and consists of perf_event_header followed by the compressed chunk that is decompressed on the loading stage. Comitter notes: Undo some unnecessary line breaks, remove some unnecessary () around zstd_data to then just get its address, and fix conflicts with BPF_PROG_INFO/BPF_BTF patchkits. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/744df43f-3932-2594-ddef-1e99a3cad03a@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
Implemented functions are based on Zstd streaming compression API. The functions are used in runtime to compress data that come from mmaped kernel buffer. zstd_init(), zstd_fini() are used for initialization and finalization to allocate and deallocate internal zstd objects. zstd_compress_stream_to_records() is used to convert parts of mmaped kernel buffer into an array of PERF_RECORD_COMPRESSED records. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/18bf36f3-b85a-1fe2-dd83-10e0c6069568@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
Implemented mmap data buffer that is used as the memory to operate on when compressing data in case of serial trace streaming. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/49b31321-0f70-392b-9a4f-649d3affe090@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
Implemented PERF_RECORD_COMPRESSED event, related data types, header feature and functions to write, read and print feature attributes from the trace header section. comp_mmap_len preserves the size of mmaped kernel buffer that was used during collection. comp_mmap_len size is used on loading stage as the size of decomp buffer for decompression of COMPRESSED events content. Committer notes: Fixed up conflict with BPF_PROG_INFO and BTF_BTF header features. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ebbaf031-8dda-3864-ebc6-7922d43ee515@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Alexey Budankov 提交于
Define 'bytes_transferred' and 'bytes_compressed' metrics to calculate ratio in the end of the data collection: compression ratio = bytes_transferred / bytes_compressed The 'bytes_transferred' metric accumulates the amount of bytes that was extracted from the mmaped kernel buffers for compression, while 'bytes_compressed' accumulates the amount of bytes that was received after applying compression. Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1d4bf499-cb03-26dc-6fc6-f14fec7622ce@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
So that we can test the ifdef parts for this feature. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/n/tip-7o65mfl10wlvm8v3f0ombxd1@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Donald Yandt 提交于
If fgets() fails due to any other error besides end-of-file, the version char array may not even be null-terminated. Signed-off-by: NDonald Yandt <donald.yandt@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Avi Kivity <avi@scylladb.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Fixes: a1645ce1 ("perf: 'perf kvm' tool for monitoring guest performance from host") Link: http://lkml.kernel.org/r/20190514110100.22019-1-donald.yandt@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
Perf cannot parse UPI (Intel's "Ultra Path Interconnect" [1]) events. # perf stat -e UPI_DATA_BANDWIDTH_TX event syntax error: 'UPI_DATA_BANDWIDTH_TX' \___ parser error Run 'perf list' for a list of valid events The JSON lists call the box UPI LL, while perf calls it upi. Add conversion support to JSON to convert the unit properly. Committer notes: [1] https://en.wikipedia.org/wiki/Intel_Ultra_Path_Interconnect "The Intel Ultra Path Interconnect (UPI) is a point-to-point processor interconnect developed by Intel which replaced the Intel QuickPath Interconnect (QPI) in Xeon Skylake-SP platforms starting in 2017. UPI is a low-latency coherent interconnect for scalable multiprocessor systems with a shared address space. It uses a directory-based home snoop coherency protocol with a transfer speed of up to 10.4 GT/s. Supporting processors typically have two or three UPI links." Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1557234991-130456-1-git-send-email-kan.liang@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
With support for Python 2 or 3 and PySide 1 or 2 (Qt 4 or 5), it is useful to see what versions are in use. Add an 'About' dialog box that displays Python, PySide, Qt and database server (SQLite or PostgreSQL) version numbers. Committer testing: $ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db Then go to 'Help', then 'About', select all the lines with the mouse press 'Control+C', then, on the same terminal press control+shift+V which shows my current environment: Python version: 2.7.16 PySide version: 1 Qt version: 4.8.7 SQLite version: 3.26.0 Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190503120828.25326-7-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Add a context menu (right-click) that provides options for copying to clipboard, including, for trees, the ability to copy only the cell under the mouse pointer. Committer testing: $ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db Simply right click and pick "Copy selection", that at this point has just the first line, not expanded, then see what was copied by pressing shift+control+v on a terminal: Call Path,Object,Count,Time (ns),Time (%),Branch Count,Branch Count (%)
▶ simple-retpolin,,,,,, Ditto after expanding, i.e. the selection continues to be just one line: Call Path Object Count Time (ns) Time (%) Branch Count Branch Count (%) ▼ simple-retpolin Now select all the lines with the mouse and control+shift+v again: Call Path Object Count Time (ns) Time (%) Branch Count Branch Count (%) ▼ 14503:14503 ▼ _start ld-2.28.so 1 156267 100.0 10602 100.0▶ unknown unknown 1 2276 1.5 1 0.0▶ _dl_start ld-2.28.so 1 137047 87.7 10088 95.2▶ _dl_init ld-2.28.so 1 9142 5.9 326 3.1 ▼ _start simple-retpoline 1 7457 4.8 182 1.7▶ unknown unknown 1 805 10.8 1 0.5▶ __libc_start_main libc-2.28.so 1 6347 85.1 179 98.4 Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190503120828.25326-6-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> -
由 Adrian Hunter 提交于
Add support for copying to clipboard. Two menu options are added to copy the selected rows / columns with normal spacing, or as comma-separated-values. In the case of trees, only entire rows can be copied. Comitter testing: $ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db Select the lines, press control+C and on the same terminal, press control+shift+V and voilà: Call Path Object Count Time (ns) Time (%) Branch Count Branch Count (%) ▼ 14503:14503 ▼ _start ld-2.28.so 1 156267 100.0 10602 100.0 unknown unknown 1 2276 1.5 1 0.0 ▼ _dl_start ld-2.28.so 1 137047 87.7 10088 95.2
▶ unknown unknown 4 4127 3.0 4 0.0 _dl_setup_hash ld-2.28.so 1 0 0.0 1 0.0▶ _dl_sysdep_start ld-2.28.so 1 131342 95.8 9981 98.9 ▼ _dl_init ld-2.28.so 1 9142 5.9 326 3.1 ▼ call_init.part.0 ld-2.28.so 3 9133 99.9 319 97.9▶ _init libc-2.28.so 1 6877 75.3 110 34.5▶ check_stdfiles_vtables libc-2.28.so 1 76 0.8 2 0.6▶ init_cacheinfo libc-2.28.so 1 1991 21.8 197 61.8▶ _start simple-retpoline 1 7457 4.8 182 1.7 Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190503120828.25326-5-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> -
由 Adrian Hunter 提交于
As preparation for adding support for copying to clipboard, keep track of what level each item is in tree items. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190503120828.25326-4-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Fix the following error if shrink / enlarge font is used with the help window. Traceback (most recent call last): File "tools/perf/scripts/python/exported-sql-viewer.py", line 2791, in ShrinkFont ShrinkFont(win.view) AttributeError: 'HelpWindow' object has no attribute 'view' Committer testing: Before, matches above output: $ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db Traceback (most recent call last): File "/home/acme/libexec/perf-core/scripts/python/exported-sql-viewer.py", line 2780, in EnlargeFont EnlargeFont(win.view) AttributeError: 'HelpWindow' object has no attribute 'view' $ After: No more tracebacks, but the fonts don't get enlarged, which is kinda frustrating... Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190503120828.25326-2-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
As preparation for adding support for copying to clipboard, create view in TreeWindowBase instead of derived classes. Committer testing: Tested using an old .db used to test some older patches: $ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db Nothing breaks. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190503120828.25326-3-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Icelake and later platforms support collecting XMM registers with PEBS event. Add support for 'perf script' to dump them, and support for the register parser in 'perf record -I=' ... to configure them. For now they are just printed in hex, we could potentially later add other formats too. Committer testing: Before: # perf record -IXMM0 Warning: unknown register XMM0, check man page or run 'perf record -I?' Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] # # perf record -I? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] # After: # perf record -IXMM0 Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles). /bin/dmesg | grep -i perf may provide additional information. # # perf record -I? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9 XMM10 XMM11 XMM12 XMM13 XMM14 XMM15 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use -I ? to list register names # More work is needed to, when faced with such error, warn the user that that register is not available on the running platform. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190506141926.13659-1-kan.liang@linux.intel.comSigned-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Add quotes around the register name and suggest using 'perf record -I?' to get the list of available registers. Before: # perf record -Idi,xmm20,xmm1 Warning: unknown register xmm20, check man page Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use -I ? to list register names # # perf record -Idi,xmm20,xmm1 Warning: unknown register "xmm20", check man page or run "perf record -I?" Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use -I ? to list register names # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lkml.kernel.org/n/tip-9a9hyuum8c0oggg86xd3sxc5@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
$ perf record -h -I Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use -I ? to list register names $ m $ perf record -I ? Workload failed: No such file or directory $ After: $ perf record -h -I Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use '-I?' to list register names $ $ perf record -I? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use '-I?' to list register names $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Fixes: bcc84ec6 ("perf record: Add ability to name registers to record") Link: https://lkml.kernel.org/n/tip-r0xhfhy5radmkhhcbcfs5izf@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
When compiled with libunwind, perf does some preparatory work when processing side-band events. This is not needed when report actually don't unwind dwarf callchains, so it's disabled with dwarf_callchain_users bool. However we could move that check to higher level and shield more unwanted code for normal report processing, giving us following speed up on kernel build profile: Before: $ perf record make -j40 ... $ ll ../../perf.data -rw-------. 1 jolsa jolsa 461783932 Apr 26 09:11 perf.data $ perf stat -e cycles:u,instructions:u perf report -i perf.data > out Performance counter stats for 'perf report -i perf.data': 78,669,920,155 cycles:u 99,076,431,951 instructions:u # 1.26 insn per cycle 55.382823668 seconds time elapsed 27.512341000 seconds user 27.712871000 seconds sys After: $ perf stat -e cycles:u,instructions:u perf report -i perf.data > out Performance counter stats for 'perf report -i perf.data': 59,626,798,904 cycles:u 88,583,575,849 instructions:u # 1.49 insn per cycle 21.296935559 seconds time elapsed 20.010191000 seconds user 1.202935000 seconds sys The speed is higher with profile having many side-band events, because these trigger libunwind preparatory code. This does not apply for perf compiled with libdw for dwarf unwind, only for build with libunwind. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190426073804.17238-1-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mao Han 提交于
This patch add support for DWARF register mappings and libdw registers initialization, which is used by perf callchain analyzing when --call-graph=dwarf is given. Here is the elfutils csky backend patch set: https://sourceware.org/ml/elfutils-devel/2019-q2/msg00007.htmlSigned-off-by: NMao Han <han_mao@c-sky.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: linux-arch@vger.kernel.org Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1555860794-10572-1-git-send-email-guoren@kernel.orgSigned-off-by: NGuo Ren <ren_guo@c-sky.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Colin Ian King 提交于
There are a couple of spelling mistakes in test assert messages. Fix them. Signed-off-by: NColin King <colin.king@canonical.com> Reviewed-by: NMukesh Ojha <mojha@codeaurora.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: http://lkml.kernel.org/r/20190417105539.5902-1-colin.king@canonical.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
The hist__account_cycles() function is executed when the hist_iter__branch_callback() is called. But it looks it's not necessary. In hist__account_cycles, it already walks on all branch entries. This patch moves the hist__account_cycles out of callback, now the data processing is much faster than before. Previous code has an issue that the ch[offset].num++ (in __symbol__account_cycles) is executed repeatedly since hist__account_cycles is called in each hist_iter__branch_callback, so the counting of ch[offset].num is not correct (too big). With this patch, the issue is fixed. And we don't need the code of "ch->reset >= ch->num / 2" to check if there are too many overlaps (in annotation__count_and_fill), otherwise some data would be hidden. Now, we can try, for example: perf record -b ... perf annotate or perf report -s symbol The before/after output should be no change. v3: --- Fix the crash in stdio mode. Like previous code, it needs the checking of ui__has_annotation() before hist__account_cycles() v2: --- 1. Cover the similar perf report 2. Remove the checking code "ch->reset >= ch->num / 2" Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1552684577-29041-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 03 5月, 2019 8 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
We were including sys/syscall.h and asm/unistd.h, since sys/syscall.h includes asm/unistd.h, sometimes this leads to the redefinition of defines, breaking the build. Noticed on ARC with uCLibc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Rich Felker <dalias@libc.org> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Link: https://lkml.kernel.org/n/tip-xjpf80o64i2ko74aj2jih0qg@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Thomas Backlund reported that the perf build was failing on the Mageia 7 distro, that is because it uses: cat /tmp/build/perf/feature/test-disassembler-four-args.make.output /usr/bin/ld: /usr/lib64/libbfd.a(plugin.o): in function `try_load_plugin': /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:243: undefined reference to `dlopen' /usr/bin/ld: /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:271: undefined reference to `dlsym' /usr/bin/ld: /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:256: undefined reference to `dlclose' /usr/bin/ld: /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:246: undefined reference to `dlerror' as we allow dynamic linking and loading Mageia 7 uses these linker flags: $ rpm --eval %ldflags -Wl,--as-needed -Wl,--no-undefined -Wl,-z,relro -Wl,-O1 -Wl,--build-id -Wl,--enable-new-dtags So add -ldl to this feature LDFLAGS. Reported-by: NThomas Backlund <tmb@mageia.org> Tested-by: NThomas Backlund <tmb@mageia.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lkml.kernel.org/r/20190501173158.GC21436@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Leo Yan 提交于
Robert Walker reported a segmentation fault is observed when process CoreSight trace data; this issue can be easily reproduced by the command 'perf report --itrace=i1000i' for decoding tracing data. If neither the 'b' flag (synthesize branches events) nor 'l' flag (synthesize last branch entries) are specified to option '--itrace', cs_etm_queue::prev_packet will not been initialised. After merging the code to support exception packets and sample flags, there introduced a number of uses of cs_etm_queue::prev_packet without checking whether it is valid, for these cases any accessing to uninitialised prev_packet will cause crash. As cs_etm_queue::prev_packet is used more widely now and it's already hard to follow which functions have been called in a context where the validity of cs_etm_queue::prev_packet has been checked, this patch always allocates memory for cs_etm_queue::prev_packet. Reported-by: NRobert Walker <robert.walker@arm.com> Suggested-by: NRobert Walker <robert.walker@arm.com> Signed-off-by: NLeo Yan <leo.yan@linaro.org> Tested-by: NRobert Walker <robert.walker@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Fixes: 7100b12c ("perf cs-etm: Generate branch sample for exception packet") Fixes: 24fff5eb ("perf cs-etm: Avoid stale branch samples when flush packet") Link: http://lkml.kernel.org/r/20190428083228.20246-1-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Leo Yan 提交于
Since cs_etm_queue::prev_packet is allocated for all cases, it will never be NULL pointer; now validity checking prev_packet is pointless, remove all of them. Signed-off-by: NLeo Yan <leo.yan@linaro.org> Tested-by: NRobert Walker <robert.walker@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190428083228.20246-2-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Thomas Richter 提交于
An -ENOMEM error is not reported in the GTK GUI. Instead this error message pops up on the screen: [root@m35lp76 perf]# ./perf report -i perf.data.error68-1 Processing events... [974K/3M] Error:failed to process sample 0xf4198 [0x8]: failed to process type: 68 However when I use the same perf.data file with --stdio it works: [root@m35lp76 perf]# ./perf report -i perf.data.error68-1 --stdio \ | head -12 # Total Lost Samples: 0 # # Samples: 76K of event 'cycles' # Event count (approx.): 99056160000 # # Overhead Command Shared Object Symbol # ........ ............... ................. ......... # 8.81% find [kernel.kallsyms] [k] ftrace_likely_update 8.74% swapper [kernel.kallsyms] [k] ftrace_likely_update 8.34% sshd [kernel.kallsyms] [k] ftrace_likely_update 2.19% kworker/u512:1- [kernel.kallsyms] [k] ftrace_likely_update The sample precentage is a bit low..... The GUI always fails in the FINISHED_ROUND event (68) and does not indicate the reason why. When happened is the following. Perf report calls a lot of functions and down deep when a FINISHED_ROUND event is processed, these functions are called: perf_session__process_event() + perf_session__process_user_event() + process_finished_round() + ordered_events__flush() + __ordered_events__flush() + do_flush() + ordered_events__deliver_event() + perf_session__deliver_event() + machine__deliver_event() + perf_evlist__deliver_event() + process_sample_event() + hist_entry_iter_add() --> only called in GUI case!!! + hist_iter__report__callback() + symbol__inc_addr_sample() Now this functions runs out of memory and returns -ENOMEM. This is reported all the way up until function perf_session__process_event() returns to its caller, where -ENOMEM is changed to -EINVAL and processing stops: if ((skip = perf_session__process_event(session, event, head)) < 0) { pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n", head, event->header.size, event->header.type); err = -EINVAL; goto out_err; } This occurred in the FINISHED_ROUND event when it has to process some 10000 entries and ran out of memory. This patch indicates the root cause and displays it in the status line of ther perf report GUI. Output before (on GUI status line): 0xf4198 [0x8]: failed to process type: 68 Output after: 0xf4198 [0x8]: failed to process type: 68 [not enough memory] Committer notes: the 'skip' variable needs to be initialized to -EINVAL, so that when the size is less than sizeof(struct perf_event_attr) we avoid this valid compiler warning: util/session.c: In function ‘perf_session__process_events’: util/session.c:1936:7: error: ‘skip’ may be used uninitialized in this function [-Werror=maybe-uninitialized] err = skip; ~~~~^~~~~~ util/session.c:1874:6: note: ‘skip’ was declared here s64 skip; ^~~~ cc1: all warnings being treated as errors Signed-off-by: NThomas Richter <tmricht@linux.ibm.com> Reviewed-by: NHendrik Brueckner <brueckner@linux.ibm.com> Reviewed-by: NJiri Olsa <jolsa@redhat.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20190423105303.61683-1-tmricht@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
While cross building perf to the ARC architecture on a fedora 30 host, we were failing with: CC /tmp/build/perf/bench/numa.o bench/numa.c: In function ‘worker_thread’: bench/numa.c:1261:12: error: ‘RUSAGE_THREAD’ undeclared (first use in this function); did you mean ‘SIGEV_THREAD’? getrusage(RUSAGE_THREAD, &rusage); ^~~~~~~~~~~~~ SIGEV_THREAD bench/numa.c:1261:12: note: each undeclared identifier is reported only once for each function it appears in [perfbuilder@60d5802468f6 perf]$ /arc_gnu_2019.03-rc1_prebuilt_uclibc_le_archs_linux_install/bin/arc-linux-gcc --version | head -1 arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225 [perfbuilder@60d5802468f6 perf]$ Trying to reproduce a report by Vineet, I noticed that, with just cross-built zlib and numactl libraries, I ended up with the above failure. So, since RUSAGE_THREAD is available as a define, check for that and numactl libraries, I ended up with the above failure. So, since RUSAGE_THREAD is available as a define in the system headers, check if it is defined in the 'perf bench numa' sources and define it if not. Now it builds and I have to figure out if the problem reported by Vineet only takes place if we have libelf or some other library available. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jiri Olsa <jolsa@kernel.org> Cc: linux-snps-arc@lists.infradead.org Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Link: https://lkml.kernel.org/n/tip-2wb4r1gir9xrevbpq7qp0amk@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
Commit 6987561c ("perf annotate: Enable annotation of BPF programs") adds support for BPF programs annotations but the new code does not build on 32-bit. Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: NSong Liu <songliubraving@fb.com> Fixes: 6987561c ("perf annotate: Enable annotation of BPF programs") Link: http://lkml.kernel.org/r/20190403194452.10845-1-cascardo@canonical.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Bo YU 提交于
In perf_env__find_btf(), we're returning without unlocking "env->bpf_progs.lock". There may be cause lockdep issue. Detected by CoversityScan, CID# 1444762:(program hangs(LOCK)) Signed-off-by: NBo YU <tsu.yubo@gmail.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Fixes: 2db7b1e0: (perf bpf: Return NULL when RB tree lookup fails in perf_env__find_btf()) Link: http://lkml.kernel.org/r/20190422080138.10088-1-tsu.yubo@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-