提交 · 99620a5d0cc8e2dd9aedb629a6e81825f0db020e · openanolis / cloud-kernel

28 10月, 2016 1 次提交

perf tools: Introduce timestamp__scnprintf_usec() · 99620a5d

由 Namhyung Kim 提交于 10月 24, 2016

Joonwoo reported that there's a mismatch between timestamps in script
and sched commands.  This was because of difference in printing the
timestamp.  Factor out the code and share it so that they can be in
sync.  Also I found that sched map has similar problem, fix it too.

Committer notes:

Fixed the max_lat_at bug introduced by Namhyung's original patch, as
pointed out by Joonwoo, and made it a function following the scnprintf()
model, i.e. returning the number of bytes formatted, and receiving as
the first parameter the object from where the data to the formatting is
obtained, renaming it from:

   char *timestamp_in_usec(char *bf, size_t size, u64 timestamp)

to

   int timestamp__scnprintf_usec(u64 timestamp, char *bf, size_t size)
Reported-by: NJoonwoo Park <joonwoop@codeaurora.org>
Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20161024020246.14928-3-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

99620a5d

24 10月, 2016 20 次提交

perf list: Make vendor event matching case insensitive · 38d14f0c

由 Andi Kleen 提交于 10月 19, 2016

Make the 'perf list' glob matching for vendor events case insensitive.
This allows to use the upper case vendor events with perf list too.

Now the following works:

  % perf list LONGEST_LAT

  ...

  cache:
    longest_lat_cache.miss
         [Core-originated cacheable demand requests missed LLC]
    longest_lat_cache.reference
         [Core-originated cacheable demand requests that refer to LLC]
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Suggested-by: NIngo Molnar <mingo@kernel.org>
Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1476899402-31460-1-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

38d14f0c

perf tools: Use normal error reporting when processing PERF_RECORD_READ events · 89973506

由 Arnaldo Carvalho de Melo 提交于 10月 14, 2016

We already have handling for errors when processing PERF_RECORD_ events,
so instead of calling die() when not being able to alloc, propagate the
error, so that the normal UI exit sequence can take place, the user be
warned and possibly the terminal be properly reset to a sane mode.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-r90je3c009a125dvs3525yge@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

89973506

perf tools: Normalize sq_quote_argv() error reporting · e7b32d12

由 Arnaldo Carvalho de Melo 提交于 10月 14, 2016

It already returns whatever strbuf_(grow|addch)() returns in case of
failure, so just return -ENOSPC in the only case where it was die()ing.
When it returns, its only caller will call die() anyway, so no need to
be so eager, die later.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-as05b7mbogprlwi8iarwns8e@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

e7b32d12

perf hists browser: Dynamically change verbosity level · 21e8c810

由 Alexis Berlemont 提交于 10月 12, 2016

Here is a small patch which tries to fulfill a point in the perf todo
list:

* Make pressing 'V' multiple times to go on cycling thru various
  verbosity levels in 'perf top', so that info that is present in
  'perf top -v' can be obtained without having to restart the tool
  (acme).

After a small grep in the code, the max verbosity level seems 3; so,
we cycle at 4; I did not dare define a MAX_VERBOSE_LEVEL constant.
Signed-off-by: NAlexis Berlemont <alexis.berlemont@gmail.com>
Suggested-and-Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161012214823.14324-2-alexis.berlemont@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

21e8c810

perf tools: Fix typo "No enough" to "Not enough" · 042cfb5f

由 Alexander Alemayhu 提交于 10月 13, 2016

The latter version occurs much more when running git grep.
Signed-off-by: NAlexander Alemayhu <alexander@alemayhu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20161013161811.4939-1-alexander@alemayhu.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

042cfb5f

perf pmu: Only print Using CPUID message once · fb967063

由 Andi Kleen 提交于 10月 13, 2016

With uncore event aliases which are duplicated over multiple PMUs the
"Using CPUID" message with -v could be printed many times.  Only print
it once.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1476393332-20732-3-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

fb967063

perf jit: Check JITHEADER_VERSION · 6760d77b

由 Stefano Sanfilippo 提交于 10月 13, 2016

Check the version number when opening a jitdump file.  Accept older
versions, but not newer ones.
Signed-off-by: NStefano Sanfilippo <ssanfilippo@chromium.org>
Signed-off-by: NRoss McIlroy <rmcilroy@chromium.org>
Reviewed-by: NStephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-9-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

6760d77b

perf jit: Generate .eh_frame/.eh_frame_hdr in DSO · 086f9f3d

由 Stefano Sanfilippo 提交于 10月 13, 2016

When the jit_buf_desc contains unwinding information, it is emitted as
eh_frame unwinding sections in the DSOs generated by perf inject.

The unwinding information is required to unwind of JITed code which do
not maintain the frame pointer register during function calls.  It can
be emitted by V8 / Chromium when the --perf_prof_unwinding_info is
passed to V8.

The eh_frame and eh_frame_hdr sections are emitted immediately after the
.text.

The .eh_frame is aligned at a 8-byte boundary, and .eh_frame_hdr at a
4-byte one. Since size of the .eh_frame is required to be a multiple of
the word size, which means there will never be additional padding
between it and the .eh_frame_hdr on machines where the word size is 4 or
8 bytes.

However, additional padding might be inserted between .text and
.eh_frame to reach the correct alignment, which will always be 8 bytes,
also on 32bit machines. The reasoning behind this choice is that 4 extra
bytes of padding worst case are not a large cost for the advantage of
removing word-size dependent offset calculations when emitting the
jitdump.
Signed-off-by: NStefano Sanfilippo <ssanfilippo@chromium.org>
Signed-off-by: NRoss McIlroy <rmcilroy@chromium.org>
Reviewed-by: NStephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-8-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

086f9f3d

perf jit: Add unwinding support · 0284fecd

由 Stefano Sanfilippo 提交于 10月 13, 2016

This record is intended to provide unwinding information in the
eh_frame format. This is required to unwind JITed code which
does not maintain the frame pointer register during function calls.

The eh_frame unwinding information can be emitted by V8 / Chromium
when the --perf_prof_unwinding_info is passed.

A record of type jr_code_unwinding_info comes before the jr_code_load
it referred to and contains both the .eh_frame and .eh_frame_hdr.

The fields in the header have the following meaning:

  * unwinding_size: size of the eh_frame and eh_frame_hdr, necessary
    for distinguishing the content from the padding.

  * eh_frame_hdr_size: as the name says.

  * mapped_size: size of the payload that was in memory at runtime.
    typically unwinding_size if the .eh_frame_hdr and .eh_frame were
    mapped, or 0 if they weren't. It should always be the former case,
    since the .eh_frame is guaranteed to be mapped in memory. However,
    certain JITs might want to inject an .eh_frame_hdr with an empty LUT
    to trigger fp-based unwinding fallback in libunwind. The only part
    of the .eh_frame_hdr that libunwind reads from remote memory is the
    LUT, and since there is none, mapping the unwinding info in memory
    is not necessary, and 0 in this field signifies that it wasn't.
    This practical hack allows to save bytes in code memory for those
    JIT compilers that might or might not maintain a valid frame pointer.

The payload that follows is assumed to contain first the .eh_frame and
then the .eh_header_hdr, with no padding between the two.
Signed-off-by: NStefano Sanfilippo <ssanfilippo@chromium.org>
Signed-off-by: NRoss McIlroy <rmcilroy@chromium.org>
Reviewed-by: NStephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-7-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

0284fecd

perf jit: Do not assume pgoff is zero · eac05af2

由 Stefano Sanfilippo 提交于 10月 13, 2016

When calculating .eh_frame_hdr base and LUT offsets do not always assume
that pgoff is zero.

The assumption is false for DSOs built from the jitdump by perf inject,
because the ELF header did not exist in memory at sampling time.
Signed-off-by: NStefano Sanfilippo <ssanfilippo@chromium.org>
Signed-off-by: NRoss McIlroy <rmcilroy@chromium.org>
Reviewed-by: NStephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-6-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

eac05af2

perf jit: Make perf skip unknown records · 7354ec7a

由 Stefano Sanfilippo 提交于 10月 13, 2016

The behavior before this commit was to skip the remaining portion of the
jitdump in case an unknown record was found, including those records
that perf could handle.

With this change, parsing a record with an unknown id will cause a
warning to be emitted, the record will be skipped and parsing will
resume from the next (valid) one.

The patch aims at making perf more future proof, by extracting as much
information as possible from jitdumps.
Signed-off-by: NStefano Sanfilippo <ssanfilippo@chromium.org>
Signed-off-by: NRoss McIlroy <rmcilroy@chromium.org>
Reviewed-by: NStephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-5-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

7354ec7a

perf jit: Enable jitdump support without dwarf · 621cb4e7

由 Maciej Debski 提交于 10月 13, 2016

This patch modifies the build dependencies on the jitdump support in
perf. As it stands jitdump was wrongfully made dependent 100% on using
DWARF. However, the dwarf dependency, only exist if generating the
source line table in genelf_debug.c. The rest of the support does not
need DWARF.

This patch removes the dependency on DWARF for the entire jitdump
support. It keeps it only for the genelf_debug.c support.
Signed-off-by: NMaciej Debski <maciejd@google.com>
Reviewed-by: NStephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-3-git-send-email-eranian@google.com
Fixes: e12b202f ("perf jitdump: Build only on supported archs")
[ Make it build only if NO_LIBELF isn't defined, as jitdump.o will only be built in that case ]
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

621cb4e7

perf jit: Add NT_GNU_BUILD_ID definition for older distros · 5fef5f3f

由 Arnaldo Carvalho de Melo 提交于 10月 13, 2016

Such as CentOS5, where such define is not present in elf.h.

This file, genelf.c, wasn't being built for several systems, because
it mistakenly was conditional on some DWARF features, now that it
is just needing libelf, after "perf jit: Enable jitdump support without
dwarf" it fails.

So, as preparation for "perf jit: Enable jitdump support without dwarf",
conditionally define it, if not available.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Maciej Debski <maciejd@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-k09qay1cmr0l3fzprmztzy3o@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

5fef5f3f

perf jit: Avoid returning garbage for a ret variable · ef2c3e76

由 Arnaldo Carvalho de Melo 提交于 10月 13, 2016

When the loop body isn't executed at all, then the 'ret' local variable,
that is uninitialized will be used as the return value.

This triggers this error on Alpine Linux:

    CC	   /tmp/build/perf/util/demangle-java.o
    CC	   /tmp/build/perf/util/demangle-rust.o
    CC       /tmp/build/perf/util/jitdump.o
    CC       /tmp/build/perf/util/genelf.o
  util/jitdump.c: In function 'jit_process':
  util/jitdump.c:622:3: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]
     fprintf(stderr, "injected: %s (%d)\n", path, ret);
     ^
  util/jitdump.c:584:6: note: 'ret' was declared here
    int ret;
        ^
    FLEX     /tmp/build/perf/util/parse-events-flex.c

  / $ gcc -v
  Using built-in specs.
  COLLECT_GCC=gcc
  COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-alpine-linux-musl/5.3.0/lto-wrapper
  Target: x86_64-alpine-linux-musl
  Configured with: /home/buildozer/aports/main/gcc/src/gcc-5.3.0/configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info
  +--build=x86_64-alpine-linux-musl --host=x86_64-alpine-linux-musl --target=x86_64-alpine-linux-musl --with-pkgversion='Alpine 5.3.0' --enable-checking=release
  +--disable-fixed-point --disable-libstdcxx-pch --disable-multilib --disable-nls --disable-werror --disable-symvers --enable-__cxa_atexit --enable-esp
  +--enable-cloog-backend --enable-languages=c,c++,objc,java,fortran,ada --disable-libssp --disable-libmudflap --disable-libsanitizer --enable-shared
  +--enable-threads --enable-tls --with-system-zlib
  Thread model: posix
  gcc version 5.3.0 (Alpine 5.3.0)

But this so far got under the radar, not causing any build problem, till the
"perf jit: enable jitdump support without dwarf" gets applied, when the above
problem takes place, some combination of inlining or whatever, the problem
is real, so fix it by initializing the variable to zero.

Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Maciej Debski <maciejd@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lkml.kernel.org/r/20161013200437.GA12815@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

ef2c3e76

perf tools: Implement branch_type event parameter · ac12f676

由 Andi Kleen 提交于 10月 12, 2016

It can be useful to specify branch type state per event, for example if
we want to collect both software trace points and last branch PMU events
in a single collection. Currently this doesn't work because the software
trace point errors out with -b.

There was already a branch-type parameter to configure branch sample
types per event in the parser, but it was stubbed out. This patch
implements the necessary plumbing to actually enable it.

Now:

  $ perf record -e sched:sched_switch,cpu/cpu-cycles,branch_type=any/ ...

works.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NJiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1476306127-19721-1-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

ac12f676

perf header: Display feature name on write failure · 0c2aff4c

由 Jiri Olsa 提交于 10月 10, 2016

Display name of feature instead of just the number
during recording data.

Before:
  failed to write feature 13

Now:
  failed to write feature HEADER_CPU_TOPOLOGY
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-k9d9trozi5kkx737cy8n5xh5@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

0c2aff4c

perf header: Display missing features · aabae165

由 Jiri Olsa 提交于 10月 10, 2016

Display missing features in header info, like:

  $ perf report --header-only
  # ========
  # captured on: Mon Oct 10 09:39:47 2016
  ...
  # missing features: HEADER_TRACING_DATA HEADER_CPU_TOPOLOGY ...

To help in diagnosing problems.
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-bh5gp84gobdmyl345dcp64se@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

aabae165

perf report: Move captured info to generic header info · f45f5615

由 Jiri Olsa 提交于 10月 10, 2016

It's not displayed in TUI now, putting it into generic part.
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5fk88kejqgi50ye7xdkhiloz@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

f45f5615

perf intel-pt/bts: Report instruction bytes and length in sample · faaa8768

由 Andi Kleen 提交于 10月 07, 2016

Change Intel PT and BTS to pass up the length and the instruction
bytes of the decoded or sampled instruction in the perf sample.

The decoder already knows this information, we just need to pass it
up. Since it is only a couple of movs it is not very expensive.

Handle instruction cache too. Make sure ilen is always initialized.

Used in the next patch.

[Adrian: re-base on top (and adjust for) instruction buffer size tidy-up]
[Adrian: add BTS support and adjust commit message accordingly]
Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/1475847747-30994-3-git-send-email-adrian.hunter@intel.comSigned-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

faaa8768

perf intel-pt/bts: Tidy instruction buffer size usage · 32f98aab

由 Adrian Hunter 提交于 10月 07, 2016

Tidy instruction buffer size usage in preparation for copying the
instruction bytes onto samples.

The instruction buffer is presently used for debugging, so rename its
size macro from INTEL_PT_INSN_DBG_BUF_SZ to INTEL_PT_INSN_BUF_SZ, and
use it everywhere.

Note that the maximum instruction size is 15 which is a less efficient size
to copy than 16, which is why a separate buffer size is used.
Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1475847747-30994-2-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

32f98aab

21 10月, 2016 1 次提交

perf c2c report: Limit the cachelines table entries · 9857b717

由 Jiri Olsa 提交于 8月 17, 2016

Add a limit for entries number of the cachelines table entries. By
default now it's the 0.0005% minimum of remote HITMs.

Also display only cachelines with remote hitm or store data.
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-inykbom2f19difvsu1e18avr@git.kernel.org
[ Disabled for now ]
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

9857b717

20 10月, 2016 3 次提交

perf c2c report: Add src line sort key · 89d9ba8f

由 Jiri Olsa 提交于 7月 10, 2016

It is to be displayed in the single cacheline output:

  cl_srcline

It displays source line related to the code address that accessed
cacheline. It's a wrapper to global srcline sort entry.
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-cmnzgm37mjz56ozsg4mnbgxq@git.kernel.org
[ Remove __maybe_unused from now used 'he' parameter in filter_cb() ]
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

89d9ba8f

perf c2c: Introduce c2c_add_stats function · 0a9a24cc

由 Jiri Olsa 提交于 9月 22, 2016

Introducing c2c_add_stats function helper to cumulate c2c_stats.
Original-patch-by: NDick Fowles <rfowles@redhat.com>
Original-patch-by: NDon Zickus <dzickus@redhat.com>
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1474558645-19956-4-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

0a9a24cc

perf c2c: Introduce c2c_decode_stats function · aadddd68

由 Jiri Olsa 提交于 9月 22, 2016

Introducing c2c_decode_stats function, which decodes
data_src data into new struct c2c_stats.
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Original-patch-by: NDick Fowles <rfowles@redhat.com>
Original-patch-by: NDon Zickus <dzickus@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1474558645-19956-3-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

aadddd68

17 10月, 2016 1 次提交

perf jevents: Handle events including .c and .o · 2d470b62

由 Wang Nan 提交于 10月 08, 2016

This patch helps with Sukadev's vendor event tree where such events can happen.

>From Andi Kleen:
 Any event including a .c/.o/.bpf currently triggers BPF compilation or loading
 and then an error. This can happen for some Intel vendor events, which cannot
 be used.

This patch fixes this problem by forbidding BPF file patch containing '{', '}'
and ',', make sure flex consumes the leading '{', instead of matching it using
a BPF file path.

Tested result:

  $ perf stat -e '{unc_p_clockticks,unc_p_power_state_occupancy.cores_c0}' -a -I 1000
  invalid or unsupported event: '{unc_p_clockticks,unc_p_power_state_occupancy.cores_c0}'
  Run 'perf list' for a list of valid events
  (as expected, interperted as event)

  $ perf stat -e 'aaa.c' -a -I 1000
  ERROR: problems with path aaa.c: No such file or directory
  (as expected, interpreted as BPF source)

  $ perf stat -e 'aaa.ccc' -a -I 1000
  invalid or unsupported event: 'aaa.ccc'
  (as expected, interpreted as event)

  $ perf stat -e '{aaa.c}' -a -I 1000
  ERROR: problems with path aaa.c: No such file or directory
  event syntax error: '{aaa.c}'
  <SKIP>
  (as expected, interpreted as BPF source)

  $ perf stat -e '{cycles,aaa.c}' -a -I 1000
  ERROR: problems with path aaa.c: No such file or directory
  event syntax error: '{cycles,aaa.c}'
  (as expected, interpreted as BPF source)
Signed-off-by: NWang Nan <wangnan0@huawei.com>
Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Reported-by: NAndi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1475900185-37967-1-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

2d470b62

13 10月, 2016 1 次提交

perf header: Set nr_numa_nodes only when we parsed all the data · f957a530

由 Jiri Olsa 提交于 10月 10, 2016

Sukadev reported segfault on releasing perf env's numa data.  It's due
to nr_numa_nodes being set no matter if the numa data gets parsed
properly. The perf_env__exit crash the on releasing non existed data.

Setting nr_numa_nodes only when data are parsed out properly.
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Reported-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-dt9c0zgkt4hybn2cr4xiawta@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

f957a530

05 10月, 2016 2 次提交

perf intel-pt: Fix MTC timestamp calculation for large MTC periods · 3bccbe20

由 Adrian Hunter 提交于 9月 28, 2016

The MTC packet provides a 8-bit slice of CTC which is related to TSC by
the TMA packet, however the TMA packet only provides the lower 16 bits
of CTC. If mtc_shift > 8 then some of the MTC bits are not in the CTC
provided by the TMA packet. Fix-up the last_mtc calculated from the TMA
packet by copying the missing bits from the current MTC assuming the
least difference between the two, and that the current MTC comes after
last_mtc.
Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.3+
Link: http://lkml.kernel.org/r/1475062896-22274-2-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

3bccbe20

perf intel-pt: Fix estimated timestamps for cycle-accurate mode · 51ee6481

由 Adrian Hunter 提交于 9月 28, 2016

In cycle-accurate mode, timestamps can be calculated from CYC packets.
The decoder also estimates timestamps based on the number of
instructions since the last timestamp. For that to work in
cycle-accurate mode, the instruction count needs to be reset to zero
when a timestamp is calculated from a CYC packet, but that wasn't
happening, so fix it.
Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.3+
Link: http://lkml.kernel.org/r/1475062896-22274-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

51ee6481

04 10月, 2016 9 次提交

perf tools: Make alias matching case-insensitive · e312bcf1

由 Andi Kleen 提交于 9月 15, 2016

Make alias matching the events parser case-insensitive. This is useful
with the JSON events. perf uses lower case events, but the CPU manuals
generally use upper case event names. The JSON files use lower case by
default too. But if we search case insensitively then users can
cut-n-paste the upper case event names.

So the following works:

% perf stat -e BR_INST_EXEC.TAKEN_INDIRECT_NEAR_CALL true

 Performance counter stats for 'true':

               305      BR_INST_EXEC.TAKEN_INDIRECT_NEAR_CALL

       0.000492799 seconds time elapsed
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-17-git-send-email-sukadev@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

e312bcf1

perf tools: Allow period= in perf stat CPU event descriptions. · 06835545

由 Sukadev Bhattiprolu 提交于 9月 15, 2016

This avoids the JSON PMU events parser having to know whether its
aliases are for perf stat or perf record.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-20-git-send-email-sukadev@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

06835545

perf list jevents: Add support for event list topics · dd5f1036

由 Andi Kleen 提交于 9月 15, 2016

Add support to group the output of perf list by the Topic field in the
JSON file.

Example output:

% perf list
...
Cache:
  l1d.replacement
       [L1D data line replacements]
  l1d_pend_miss.pending
       [L1D miss oustandings duration in cycles]
  l1d_pend_miss.pending_cycles
       [Cycles with L1D load Misses outstanding]
  l2_l1d_wb_rqsts.all
       [Not rejected writebacks from L1D to L2 cache lines in any state]
  l2_l1d_wb_rqsts.hit_e
       [Not rejected writebacks from L1D to L2 cache lines in E state]
  l2_l1d_wb_rqsts.hit_m
       [Not rejected writebacks from L1D to L2 cache lines in M state]

...
Pipeline:
  arith.fpu_div
       [Divide operations executed]
  arith.fpu_div_active
       [Cycles when divider is busy executing divide operations]
  baclears.any
       [Counts the total number when the front end is resteered, mainly
       when the BPU cannot provide a correct prediction and this is
       corrected by other branch handling mechanisms at the front end]
  br_inst_exec.all_branches
       [Speculative and retired branches]
  br_inst_exec.all_conditional
       [Speculative and retired macro-conditional branches]
  br_inst_exec.all_direct_jmp
       [Speculative and retired macro-unconditional branches excluding
       calls and indirects]
  br_inst_exec.all_direct_near_call
       [Speculative and retired direct near calls]
  br_inst_exec.all_indirect_jump_non_call_ret
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NJiri Olsa <jolsa@redhat.com>
Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-14-git-send-email-sukadev@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

dd5f1036

perf list: Support long jevents descriptions · c8d6828a

由 Sukadev Bhattiprolu 提交于 9月 15, 2016

Previously we were dropping the useful longer descriptions that some
events have in the event list completely. This patch makes them appear with
perf list.

Old perf list:

baclears:
  baclears.all
       [Counts the number of baclears]

vs new:

perf list -v:
...
baclears:
  baclears.all
       [The BACLEARS event counts the number of times the front end is
        resteered, mainly when the Branch Prediction Unit cannot provide
        a correct prediction and this is corrected by the Branch Address
        Calculator at the front end. The BACLEARS.ANY event counts the
        number of baclears for any type of branch]
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NJiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-13-git-send-email-sukadev@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

c8d6828a

perf pmu: Add override support for event list CPUID · fc06e2a5

由 Andi Kleen 提交于 9月 15, 2016

Add a PERF_CPUID variable to override the CPUID of the current CPU
(within the current architecture). This is useful for testing, so that
all event lists can be tested on a single system.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NJiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-10-git-send-email-sukadev@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

fc06e2a5

perf list: Add a --no-desc flag · 1c5f01fe

由 Andi Kleen 提交于 9月 15, 2016

Add a --no-desc flag to 'perf list' to not print the event descriptions
that were earlier added for JSON events. This may be useful to get a
less crowded listing.

It's still default to print descriptions as that is the more useful
default for most users.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NJiri Olsa <jolsa@redhat.com>
Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1473978296-20712-9-git-send-email-sukadev@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

1c5f01fe

perf tools: Query terminal width and use in perf list · 61eb2eb4

由 Andi Kleen 提交于 9月 15, 2016

Automatically adapt the now wider and word wrapped perf list output to
wider terminals. This requires querying the terminal before the auto
pager takes over, and exporting this information from the pager
subsystem.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NJiri Olsa <jolsa@redhat.com>
Acked-by: NNamhyung Kim <namhyung@kernel.org>
Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-8-git-send-email-sukadev@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

61eb2eb4

perf pmu: Support alias descriptions · 08e60ed1

由 Andi Kleen 提交于 9月 15, 2016

Add support to print alias descriptions in perf list, which are taken
from the generated event files.

The sorting code is changed to put the events with descriptions at the
end. The descriptions are printed as possibly multiple word wrapped
lines.

Example output:

% perf list
...
  arith.fpu_div
       [Divide operations executed]
  arith.fpu_div_active
       [Cycles when divider is busy executing divide operations]

Committer notes:

Further testing on a Broadwell machine (ThinkPad t450s), using these
files:

  $ find tools/perf/pmu-events/arch/x86/
  tools/perf/pmu-events/arch/x86/
  tools/perf/pmu-events/arch/x86/Broadwell
  tools/perf/pmu-events/arch/x86/Broadwell/Cache.json
  tools/perf/pmu-events/arch/x86/Broadwell/Other.json
  tools/perf/pmu-events/arch/x86/Broadwell/Frontend.json
  tools/perf/pmu-events/arch/x86/Broadwell/Virtual-Memory.json
  tools/perf/pmu-events/arch/x86/Broadwell/Pipeline.json
  tools/perf/pmu-events/arch/x86/Broadwell/Floating-point.json
  tools/perf/pmu-events/arch/x86/Broadwell/Memory.json
  tools/perf/pmu-events/arch/x86/mapfile.csv
  $

Taken from:

https://github.com/sukadev/linux/tree/json-code+data-v21/tools/perf/pmu-events/arch/x86/

to get this machinery to actually parse JSON files, generate
$(OUTPUT)pmu-events/pmu-events.c, compile it and link it with perf, that
will then use the table it contains, these files will be submitted right
after this patchkit.

  [acme@jouet linux]$ perf list page_walker

  List of pre-defined events (to be used in -e):

    page_walker_loads.dtlb_l1
         [Number of DTLB page walker hits in the L1+FB]
    page_walker_loads.dtlb_l2
         [Number of DTLB page walker hits in the L2]
    page_walker_loads.dtlb_l3
         [Number of DTLB page walker hits in the L3 + XSNP]
    page_walker_loads.dtlb_memory
         [Number of DTLB page walker hits in Memory]
    page_walker_loads.itlb_l1
         [Number of ITLB page walker hits in the L1+FB]
    page_walker_loads.itlb_l2
         [Number of ITLB page walker hits in the L2]
    page_walker_loads.itlb_l3
         [Number of ITLB page walker hits in the L3 + XSNP]

[acme@jouet linux]$
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NJiri Olsa <jolsa@redhat.com>
Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-7-git-send-email-sukadev@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

08e60ed1

perf pmu: Use pmu_events table to create aliases · 933f82ff

由 Sukadev Bhattiprolu 提交于 9月 15, 2016

At run time (when 'perf' is starting up), locate the specific table of
PMU events that corresponds to the current CPU. Using that table, create
aliases for the each of the PMU events in the CPU. The use these aliases
to parse the user specified perf event.

In short this would allow the user to specify events using their aliases
rather than raw event codes.

Based on input and some earlier patches from Andi Kleen, Jiri Olsa.
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NJiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-4-git-send-email-sukadev@linux.vnet.ibm.com
[ Make pmu_add_cpu_aliases() return void, since it was returning just '0' and
  furthermore, even that was being discarded via an explicit (void) cast ]
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

933f82ff

03 10月, 2016 2 次提交

perf tools: Experiment with cppcheck · 18ef15c6

由 Arnaldo Carvalho de Melo 提交于 10月 03, 2016

Experimenting a bit using cppcheck[1], a static checker brought to my
attention by Colin, reducing the scope of some variables, reducing the
line of source code lines in the process:

  $ cppcheck --enable=style tools/perf/util/thread.c
  Checking tools/perf/util/thread.c...
  [tools/perf/util/thread.c:17]: (style) The scope of the variable 'leader' can be reduced.
  [tools/perf/util/thread.c:133]: (style) The scope of the variable 'err' can be reduced.
  [tools/perf/util/thread.c:273]: (style) The scope of the variable 'err' can be reduced.

Will continue later, but these are already useful, keep them.

1: https://sourceforge.net/p/cppcheck/wiki/Home/

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ixws7lbycihhpmq9cc949ti6@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

18ef15c6

perf probe: Check if *ptr2 is zero and not ptr2 · ead1a574

由 Colin Ian King 提交于 10月 03, 2016

Static anaylsis with cppcheck[1] detected an incorrect comparison:
[tools/perf/util/probe-event.c:216]: (warning) Char literal compared
with pointer 'ptr2'. Did you intend to dereference it?

Dereference ptr2 for the comparison to fix this.

1: https://sourceforge.net/p/cppcheck/wiki/Home/Signed-off-by: NColin King <colin.king@canonical.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 35726d3a ("perf probe: Fix to cut off incompatible chars from group name")
Link: http://lkml.kernel.org/r/20161003103431.18534-1-colin.king@canonical.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

ead1a574

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功