Commits · 2d4f27999b8877409f326682fd8cc40c52f47cea · openeuler / Kernel

23 Feb, 2019 16 commits

perf data: Add global path holder · 2d4f2799

Authored by Jiri Olsa on Feb 21, 2019

Add a 'path' member to 'struct perf_data'. It will keep the configured
path for the data (const char *). The path in struct perf_data_file is
now dynamically allocated (duped) from it.

This scheme is used in the following patches, where struct
perf_data::path holds the configured directory path and struct
perf_data_file::path holds the allocated path for specific files.

It also makes the code a little simpler.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
[ Fixup data-convert-bt.c missing conversion ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf data: Move size to struct perf_data_file · 45112e89

Authored by Jiri Olsa on Feb 21, 2019

We are about to add support for multiple files, so we need each file to
keep its size.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190221094145.9151-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf, pt, coresight: Fix address filters for vmas with non-zero offset · c60f83b8

Authored by Alexander Shishkin on Feb 15, 2019

Currently, the address range calculation for file-based filters works as
long as the vma that maps the matching part of the object file starts
from offset zero into the file (vm_pgoff==0). Otherwise, the resulting
filter range would be off by vm_pgoff pages. Another related problem is
that in case of a partially matching vma, that is, a vma that matches
part of a filter region, the filter range size wouldn't be adjusted.

Fix the arithmetics around address filter range calculations, taking
into account vma offset, so that the entire calculation is done before
the filter configuration is passed to the PMU drivers instead of having
those drivers do the final bit of arithmetics.
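
As a rough illustration of the corrected arithmetic, the following Python sketch models how a file-based filter region could be mapped onto a vma that starts at a non-zero file offset. The function name, the parameter names, and the 4 KiB page size are assumptions made for this example; it is a model of the calculation described above, not the kernel code.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def filter_range_in_vma(vm_start, vm_end, vm_pgoff, filt_offset, filt_size):
    """Return (start, size) of the filter range inside the vma, or None.

    filt_offset and filt_size describe the filter region as byte offsets
    into the object file; vm_pgoff is the vma's file offset in pages.
    """
    vma_size = vm_end - vm_start
    vma_off = vm_pgoff << PAGE_SHIFT  # file offset mapped at vm_start

    # No overlap between the filter region and the part of the file
    # that this vma maps.
    if filt_offset >= vma_off + vma_size or filt_offset + filt_size <= vma_off:
        return None

    if filt_offset < vma_off:
        # Partial match: the filter starts before the vma, so clip its head
        # and shrink the size accordingly.
        start = vm_start
        size = min(vma_size, filt_size - (vma_off - filt_offset))
    else:
        # The filter starts inside the vma: translate file offset to address.
        start = vm_start + (filt_offset - vma_off)
        size = min(vm_end - start, filt_size)
    return start, size
```

With vm_pgoff == 1, a filter at file offset 0x1800 lands at vm_start + 0x800 rather than vm_start + 0x1800, which is the off-by-vm_pgoff-pages error the message describes.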

Based on the patch by Adrian Hunter <adrian.hunter@intel.com>.

Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 375637bc ("perf/core: Introduce address range filtering")
Link: http://lkml.kernel.org/r/20190215115655.63469-3-alexander.shishkin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf: Copy parent's address filter offsets on clone · 18736eef

Authored by Alexander Shishkin on Feb 15, 2019

When a child event is allocated in the inherit_event() path, the VMA
based filter offsets are not copied from the parent, even though the
address space mapping of the new task remains the same, which leads to
no trace for the new task until exec.

Reported-by: Mansour Alharthi <malharthi9@gatech.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 375637bc ("perf/core: Introduce address range filtering")
Link: http://lkml.kernel.org/r/20190215115655.63469-2-alexander.shishkin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf scripts python: exported-sql-viewer.py: Add top calls report · cd358012

Authored by Adrian Hunter on Feb 22, 2019

Add a new report to display top calls by elapsed time. It displays calls
in descending order of time elapsed between when the function was called
and when it returned.
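
The ordering described above can be sketched as a single SQL query. The example below is a minimal illustration against an in-memory SQLite database; the reduced 'calls' table with only id, call_time and return_time columns, the helper name top_calls(), and the sample timestamps are assumptions for this sketch, not the viewer's actual query or schema.

```python
import sqlite3


def top_calls(conn, limit=10):
    """Return (call_id, elapsed) rows, longest elapsed time first."""
    query = (
        "SELECT id, return_time - call_time AS elapsed "
        "FROM calls ORDER BY elapsed DESC, id LIMIT ?"
    )
    return conn.execute(query, (limit,)).fetchall()


# Hypothetical miniature 'calls' table, populated with made-up timestamps.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calls (id INTEGER PRIMARY KEY, call_time INTEGER, return_time INTEGER)"
)
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?)",
    [(1, 100, 150), (2, 200, 900), (3, 300, 310)],
)
```

Here top_calls(conn) yields call 2 first (700 time units), then call 1 (50), then call 3 (10).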

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf scripts python: exported-sql-viewer.py: Remove no selection error · fc2c77aa

Authored by Adrian Hunter on Feb 22, 2019

If no selection is made on the 'Selected branches' dialog, then the
output is the same as the 'All branches' report. That is not really an
error, and is not desirable for future reports, so remove it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf scripts python: exported-sql-viewer.py: Remove SQLTableDialogDataItem · 0d5f8f23

Authored by Adrian Hunter on Feb 22, 2019

Remove SQLTableDialogDataItem as it is no longer used.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf scripts python: exported-sql-viewer.py: Create new dialog data item classes · 1c3ca1b3

Authored by Adrian Hunter on Feb 22, 2019

Create new dialog data item classes to replace SQLTableDialogDataItem.
This separates out different dialog data items and makes it easier to
add new ones. SQLTableDialogDataItem is removed in a separate patch
because it makes the diff more readable.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf scripts python: exported-sql-viewer.py: Move report name into ReportVars · 947cc38d

Authored by Adrian Hunter on Feb 22, 2019

The report name is a report variable, so move it into ReportVars.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf scripts python: exported-sql-viewer.py: Factor out ReportVars · 0bf0947a

Authored by Adrian Hunter on Feb 22, 2019

Factor out ReportVars to provide a single container for information from
report dialogs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf scripts python: exported-sql-viewer.py: Factor out ReportDialogBase · 0924cd68

Authored by Adrian Hunter on Feb 22, 2019

Factor out ReportDialogBase so it can be re-used.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf scripts python: exported-sql-viewer.py: Move column headers · 8c90fef9

Authored by Adrian Hunter on Feb 22, 2019

Move column headers from SQLAutoTableModel into SQLTableModel so that
they can be used for other models based on SQLTableModel.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf scripts python: exported-sql-viewer.py: Hide Call Graph option if no calls table · 655cb952

Authored by Adrian Hunter on Feb 22, 2019

The Call Graph depends on the calls table which is optional when exporting
data, so hide the Call Graph option if there is no calls table.
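
One way to picture the check is a lookup for the optional table before the menu entry is offered. The sketch below covers SQLite only and uses hypothetical helper names (has_table, report_menu_entries); how the viewer itself detects the table is not shown in this message.

```python
import sqlite3


def has_table(conn, name):
    """Return True if the SQLite database contains a table called name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def report_menu_entries(conn):
    """Build a list of report names, omitting the call graph without a calls table."""
    entries = []
    if has_table(conn, "calls"):
        entries.append("Context-Sensitive Call Graph")
    entries.append("All branches")
    return entries
```

A database exported without the calls table then simply offers fewer reports instead of failing when the Call Graph is opened.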

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf scripts python: exported-sql-viewer.py: Remove leftover debugging prints · df8794fe

Authored by Adrian Hunter on Feb 22, 2019

Remove leftover debugging prints.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf scripts python: exported-sql-viewer.py: Fix missing shebang · b3a67546

Authored by Adrian Hunter on Feb 22, 2019

exported-sql-viewer.py is a standalone python script and requires a
shebang. Also only python2 is supported at present. Restore the shebang
but use the more flexible 'env' form.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: a38352de ("perf script python: Remove explicit shebang from Python script")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf thread-stack: Hide x86 retpolines · 3c0cd952

Authored by Adrian Hunter on Jan 09, 2019

x86 retpoline functions pollute the call graph by showing up everywhere
there is an indirect branch, but they do not really mean anything. Make
changes so that the default retpoline functions will no longer appear in
the call graph. Note this only affects the call graph, since all the
original branches are left unchanged.

This does not handle function return thunks, nor is there any
improvement for the handling of inline thunks or extern thunks.

Example:

  $ cat simple-retpoline.c
  __attribute__((noinline)) int bar(void)
  {
          return -1;
  }

  int foo(void)
  {
          return bar() + 1;
  }

  __attribute__((indirect_branch("thunk"))) int main()
  {
          int (*volatile fn)(void) = foo;

          fn();
          return fn();
  }
  $ gcc -ggdb3 -Wall -Wextra -O2 -o simple-retpoline simple-retpoline.c
  $ objdump -d simple-retpoline
  <SNIP>
  0000000000001040 <main>:
      1040:       48 83 ec 18             sub    $0x18,%rsp
      1044:       48 8d 05 25 01 00 00    lea    0x125(%rip),%rax        # 1170 <foo>
      104b:       48 89 44 24 08          mov    %rax,0x8(%rsp)
      1050:       48 8b 44 24 08          mov    0x8(%rsp),%rax
      1055:       e8 1f 01 00 00          callq  1179 <__x86_indirect_thunk_rax>
      105a:       48 8b 44 24 08          mov    0x8(%rsp),%rax
      105f:       48 83 c4 18             add    $0x18,%rsp
      1063:       e9 11 01 00 00          jmpq   1179 <__x86_indirect_thunk_rax>
  <SNIP>
  0000000000001160 <bar>:
      1160:       b8 ff ff ff ff          mov    $0xffffffff,%eax
      1165:       c3                      retq
  <SNIP>
  0000000000001170 <foo>:
      1170:       e8 eb ff ff ff          callq  1160 <bar>
      1175:       83 c0 01                add    $0x1,%eax
      1178:       c3                      retq
  0000000000001179 <__x86_indirect_thunk_rax>:
      1179:       e8 07 00 00 00          callq  1185 <__x86_indirect_thunk_rax+0xc>
      117e:       f3 90                   pause
      1180:       0f ae e8                lfence
      1183:       eb f9                   jmp    117e <__x86_indirect_thunk_rax+0x5>
      1185:       48 89 04 24             mov    %rax,(%rsp)
      1189:       c3                      retq
  <SNIP>
  $ perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0,017 MB simple-retpoline.perf.data ]
  $ perf script -i simple-retpoline.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
  2019-01-08 14:03:37.851655 Creating database...
  2019-01-08 14:03:37.863256 Writing records...
  2019-01-08 14:03:38.069750 Adding indexes
  2019-01-08 14:03:38.078799 Done
  $ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py simple-retpoline.db

Before:

    main
        -> __x86_indirect_thunk_rax
            -> __x86_indirect_thunk_rax
                -> foo
                    -> bar

After:

    main
        -> foo
            -> bar
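
The visible effect on a call path can be modeled in a few lines of Python. This sketch only reproduces the Before/After trees above by dropping frames whose symbol starts with the default thunk prefix seen in the example; the helper name is hypothetical, and perf's thread-stack code works on branch samples rather than on finished call paths, so this is not how the change is implemented.

```python
RETPOLINE_PREFIX = "__x86_indirect_thunk_"  # default thunk naming seen in the example


def hide_retpolines(call_path):
    """Drop default retpoline thunk frames from a root-to-leaf call path."""
    return [sym for sym in call_path if not sym.startswith(RETPOLINE_PREFIX)]
```

Applied to the "Before" path (main, two thunk frames, foo, bar), it leaves main, foo, bar, matching the "After" tree.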

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190109091835.5570-7-adrian.hunter@intel.com
[ Remove (sym->name != NULL) test, this is not a pointer and breaks the build with clang version 7.0.1 (Fedora 7.0.1-2.fc30) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

22 Feb, 2019 2 commits

perf thread-stack: Improve thread_stack__no_call_return() · 1f35cd65

Authored by Adrian Hunter on Jan 09, 2019

Improve thread_stack__no_call_return() to better handle 'returns' that
do not match the stack i.e. 'no call'. See code comments for details.
The example below shows how retpolines are affected:

Example:

  $ cat simple-retpoline.c
  __attribute__((noinline)) int bar(void)
  {
          return -1;
  }

  int foo(void)
  {
          return bar() + 1;
  }

  __attribute__((indirect_branch("thunk"))) int main()
  {
          int (*volatile fn)(void) = foo;

          fn();
          return fn();
  }
  $ gcc -ggdb3 -Wall -Wextra -O2 -o simple-retpoline simple-retpoline.c
  $ objdump -d simple-retpoline
  <SNIP>
  0000000000001040 <main>:
      1040:       48 83 ec 18             sub    $0x18,%rsp
      1044:       48 8d 05 25 01 00 00    lea    0x125(%rip),%rax        # 1170 <foo>
      104b:       48 89 44 24 08          mov    %rax,0x8(%rsp)
      1050:       48 8b 44 24 08          mov    0x8(%rsp),%rax
      1055:       e8 1f 01 00 00          callq  1179 <__x86_indirect_thunk_rax>
      105a:       48 8b 44 24 08          mov    0x8(%rsp),%rax
      105f:       48 83 c4 18             add    $0x18,%rsp
      1063:       e9 11 01 00 00          jmpq   1179 <__x86_indirect_thunk_rax>
  <SNIP>
  0000000000001160 <bar>:
      1160:       b8 ff ff ff ff          mov    $0xffffffff,%eax
      1165:       c3                      retq
  <SNIP>
  0000000000001170 <foo>:
      1170:       e8 eb ff ff ff          callq  1160 <bar>
      1175:       83 c0 01                add    $0x1,%eax
      1178:       c3                      retq
  0000000000001179 <__x86_indirect_thunk_rax>:
      1179:       e8 07 00 00 00          callq  1185 <__x86_indirect_thunk_rax+0xc>
      117e:       f3 90                   pause
      1180:       0f ae e8                lfence
      1183:       eb f9                   jmp    117e <__x86_indirect_thunk_rax+0x5>
      1185:       48 89 04 24             mov    %rax,(%rsp)
      1189:       c3                      retq
  <SNIP>
  $ perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0,017 MB simple-retpoline.perf.data ]
  $ perf script -i simple-retpoline.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
  2019-01-08 14:03:37.851655 Creating database...
  2019-01-08 14:03:37.863256 Writing records...
  2019-01-08 14:03:38.069750 Adding indexes
  2019-01-08 14:03:38.078799 Done
  $ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py simple-retpoline.db

Before:

    main
        -> __x86_indirect_thunk_rax
            -> __x86_indirect_thunk_rax
                -> __x86_indirect_thunk_rax
                    -> bar

After:

    main
        -> __x86_indirect_thunk_rax
            -> __x86_indirect_thunk_rax
                -> foo
                    -> bar

Committer testing:

Choose "Reports", then "Context-Sensitive Call Graph", and keep
expanding:

Before:

simple-retpolin
   PID:PID
      _start
         _start
            __libc_start_main
               main
                   __x86_indirect_thunk_rax
                      __x86_indirect_thunk_rax
                      bar

After:

Remove the "simple-retpoline.db" file, rerun the 'perf script' line to
regenerate the .db file, and run exported-sql-viewer.py again. The output
is the same all the way down to 'main'; from there, including 'main':

               main
                   __x86_indirect_thunk_rax
                       __x86_indirect_thunk_rax
                           foo
                               bar

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190109091835.5570-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Fix getting source line failure · 11db1ad4

Authored by Wei Li on Feb 21, 2019

The output of "perf annotate -l --stdio xxx" changed when commit 425859ff
("perf annotate: No need to calculate notes->start twice") removed the
notes->start assignment in symbol__calc_lines(). Since then,
find_address_in_section(), called from symbol__tty_annotate(), fails because
a2l->addr is wrong, so the annotate summary does not report source line
numbers correctly.

Before fix:

  liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ cat common_while_1.c
  void hotspot_1(void)
  {
	volatile int i;

	for (i = 0; i < 0x10000000; i++);
	for (i = 0; i < 0x10000000; i++);
	for (i = 0; i < 0x10000000; i++);
  }

  int main(void)
  {
	hotspot_1();

	return 0;
  }
  liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ gcc common_while_1.c -g -o common_while_1

  liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.488 MB perf.data (12498 samples) ]
  liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio

  Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
  ----------------------------------------------

   19.30 common_while_1[32]
   19.03 common_while_1[4e]
   19.01 common_while_1[16]
    5.04 common_while_1[13]
    4.99 common_while_1[4b]
    4.78 common_while_1[2c]
    4.77 common_while_1[10]
    4.66 common_while_1[2f]
    4.59 common_while_1[51]
    4.59 common_while_1[35]
    4.52 common_while_1[19]
    4.20 common_while_1[56]
    0.51 common_while_1[48]
   Percent |      Source code & Disassembly of common_while_1 for cycles:ppp (12480 samples, percent: local period)
  -----------------------------------------------------------------------------------------------------------------
         :
         :
         :
         :         Disassembly of section .text:
         :
         :         00000000000005fa <hotspot_1>:
         :         hotspot_1():
         :         void hotspot_1(void)
         :         {
    0.00 :   5fa:   push   %rbp
    0.00 :   5fb:   mov    %rsp,%rbp
         :                 volatile int i;
         :
         :                 for (i = 0; i < 0x10000000; i++);
    0.00 :   5fe:   movl   $0x0,-0x4(%rbp)
    0.00 :   605:   jmp    610 <hotspot_1+0x16>
    0.00 :   607:   mov    -0x4(%rbp),%eax
   common_while_1[10]    4.77 :   60a:   add    $0x1,%eax
   common_while_1[13]    5.04 :   60d:   mov    %eax,-0x4(%rbp)
   common_while_1[16]   19.01 :   610:   mov    -0x4(%rbp),%eax
   common_while_1[19]    4.52 :   613:   cmp    $0xfffffff,%eax
      0.00 :   618:   jle    607 <hotspot_1+0xd>
           :                 for (i = 0; i < 0x10000000; i++);
  ...

After fix:

  liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.488 MB perf.data (12500 samples) ]
  liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio

  Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
  ----------------------------------------------

   33.34 common_while_1.c:5
   33.34 common_while_1.c:6
   33.32 common_while_1.c:7
   Percent |      Source code & Disassembly of common_while_1 for cycles:ppp (12482 samples, percent: local period)
  -----------------------------------------------------------------------------------------------------------------
         :
         :
         :
         :         Disassembly of section .text:
         :
         :         00000000000005fa <hotspot_1>:
         :         hotspot_1():
         :         void hotspot_1(void)
         :         {
    0.00 :   5fa:   push   %rbp
    0.00 :   5fb:   mov    %rsp,%rbp
         :                 volatile int i;
         :
         :                 for (i = 0; i < 0x10000000; i++);
    0.00 :   5fe:   movl   $0x0,-0x4(%rbp)
    0.00 :   605:   jmp    610 <hotspot_1+0x16>
    0.00 :   607:   mov    -0x4(%rbp),%eax
   common_while_1.c:5    4.70 :   60a:   add    $0x1,%eax
    4.89 :   60d:   mov    %eax,-0x4(%rbp)
   common_while_1.c:5   19.03 :   610:   mov    -0x4(%rbp),%eax
   common_while_1.c:5    4.72 :   613:   cmp    $0xfffffff,%eax
    0.00 :   618:   jle    607 <hotspot_1+0xd>
         :                 for (i = 0; i < 0x10000000; i++);
    0.00 :   61a:   movl   $0x0,-0x4(%rbp)
    0.00 :   621:   jmp    62c <hotspot_1+0x32>
    0.00 :   623:   mov    -0x4(%rbp),%eax
   common_while_1.c:6    4.54 :   626:   add    $0x1,%eax
    4.73 :   629:   mov    %eax,-0x4(%rbp)
   common_while_1.c:6   19.54 :   62c:   mov    -0x4(%rbp),%eax
   common_while_1.c:6    4.54 :   62f:   cmp    $0xfffffff,%eax
  ...

Signed-off-by: Wei Li <liwei391@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 425859ff ("perf annotate: No need to calculate notes->start twice")
Link: http://lkml.kernel.org/r/20190221095716.39529-1-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

21 Feb, 2019 6 commits

perf tools: Make rm_rf() remove single file · b4409ae1

Authored by Jiri Olsa on Feb 20, 2019

Let rm_rf() remove a file if it's provided by path, not just
directories.
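
The behaviour can be pictured with a small Python analogue. This is a sketch of the idea only, not perf's C implementation: the return convention (0 on success, -1 on error) and the choice to treat a missing path as success are assumptions of this example.

```python
import os
import shutil


def rm_rf(path):
    """Remove path whether it is a directory tree or a single file.

    Returns 0 on success (including when path does not exist), -1 on error.
    """
    try:
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)  # directory: remove it with all its contents
        elif os.path.lexists(path):
            os.unlink(path)      # single file or symlink
        return 0
    except OSError:
        return -1
```

The same call then works for a plain data file and for a directory holding several files.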

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf cpumap: Increase debug level for cpu_map__snprint verbose output · deb83da1

Authored by Jiri Olsa on Feb 20, 2019

So it does not screw up single -v verbose output.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf bpf-event: Add missing new line into pr_debug call · b20fe106

Authored by Jiri Olsa on Feb 20, 2019

Add a missing newline to the pr_debug() call in perf_event__synthesize_bpf_events(),
so that the error message does not garble the verbose output.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lkml.kernel.org/r/20190220122800.864-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf script: Allow +- operator for type specific fields option · 6ef362fd

Authored by Jiri Olsa on Feb 20, 2019

Add support for adding and removing fields for specific event types in the
-F option. It is now possible to use '+' and '-' after the event type, like:

  # cat > test.c
  #include <stdio.h>

  int main(void)
  {
     printf("Hello world\n");
     while(1) {}
  }
  ^D
  # gcc -g -o test test.c
  # perf probe -x test 'test.c:5'
  # perf record -e '{cpu/cpu-cycles,period=10000/,probe_test:main}:S' ./test
  ...

  # perf script -Ftrace:+period,-cpu
            test  3859 396291.117343:      10275 cpu/cpu-cycles,period=10000/:      7f..
            test  3859 396291.118234:      11041 cpu/cpu-cycles,period=10000/:  ffffff..
            test  3859 396291.118234:          1              probe_test:main:
            test  3859 396291.118248:       8668 cpu/cpu-cycles,period=10000/:  ffffff..
            test  3859 396291.118263:      10139 cpu/cpu-cycles,period=10000/:  ffffff..
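
To make the '<type>:+field,-field' form concrete, here is a small Python sketch of how such a specifier could be split and applied. The helper names and the default field list used in the test are hypothetical, and the sketch handles only the '+'/'-' form; it is not perf's option parser.

```python
def parse_type_fields(spec):
    """Split a spec such as 'trace:+period,-cpu' into (type, added, removed)."""
    event_type, sep, fields = spec.partition(":")
    if not sep:
        raise ValueError("expected '<type>:<fields>'")
    added, removed = [], []
    for token in fields.split(","):
        if token.startswith("+"):
            added.append(token[1:])
        elif token.startswith("-"):
            removed.append(token[1:])
        else:
            raise ValueError("field %r lacks a '+' or '-' prefix" % token)
    return event_type, added, removed


def apply_fields(defaults, added, removed):
    """Return the default field list with removals dropped and additions appended."""
    kept = [f for f in defaults if f not in removed]
    return kept + [f for f in added if f not in kept]
```

For 'trace:+period,-cpu' the event type is 'trace', 'period' is added and 'cpu' is removed, which corresponds to the output above gaining a period column and losing the CPU column.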

Committer testing:

Couldn't make the test above work, but tested it with:

  # perf probe -x hello main
  Added new event:
    probe_hello:main     (on main in /home/acme/c/hello)

  You can now use it in all perf tools, such as:

	  perf record -e probe_hello:main -aR sleep 1

  # perf record -e probe_hello:main ./hello
  hello, world
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB perf.data (1 samples) ]
  # perf script
           hello 21454 [002] 254116.874005: probe_hello:main: (401126)
  #
  # perf script -Ftrace:+period,-cpu
           hello 21454 254116.874005:          1 probe_hello:main: (401126)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf evsel: Force sample_type for slave events · 6e7e8b9f

Authored by Jiri Olsa on Feb 20, 2019

Force sample_type setup for slave events in group leader sessions.

We don't get samples for slave events; we synthesize them when delivering
the group leader sample. Make the slave events follow the leader's
sample_type to simplify reporting.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Don't report zero period samples for slave events · 529c1a9e

Authored by Jiri Olsa on Feb 20, 2019

There's no reason to deliver a sample with zero period.  It means there
was no value for slave event since its last group leader sample.
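
A simplified model of the delivery loop may help. In this Python sketch the leader is a dict with 'name' and 'period', the slave values are a name-to-value mapping, and deliver is any callback; all of these names and shapes are assumptions for illustration, not the session code.

```python
def deliver_group_samples(leader, slave_values, deliver):
    """Deliver the leader sample, then one sample per slave with a non-zero period.

    slave_values maps slave event name -> value accumulated since the last
    group leader sample.
    """
    deliver(leader["name"], leader["period"])
    for name, period in slave_values.items():
        if period == 0:
            continue  # nothing counted since the last leader sample
        deliver(name, period)
```

A slave event that did not count anything between two leader samples produces no sample at all instead of a zero-period one.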

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

20 Feb, 2019 5 commits

perf trace: Allow dumping a BPF map after setting up BPF events · ff7a4f98

Authored by Arnaldo Carvalho de Melo on Feb 19, 2019

Initial use case:

Dumping the maps set up by tools/perf/examples/bpf/augmented_raw_syscalls.c,
which so far are just booleans, showing just non-zeroed entries:

  # cat ~/.perfconfig
  [llvm]
	dump-obj = true
	clang-opt = -g
  [trace]
	#add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
	add_events = /wb/augmented_raw_syscalls.o
  $ date
  Tue Feb 19 16:29:33 -03 2019
  $ ls -la /wb/augmented_raw_syscalls.o
  -rwxr-xr-x. 1 root root 14048 Jan 24 12:09 /wb/augmented_raw_syscalls.o
  $ file /wb/augmented_raw_syscalls.o
  /wb/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped
  $
  # trace -e recvmmsg,sendmmsg --map-dump foobar
  ERROR: BPF map "foobar" not found
  # trace -e recvmmsg,sendmmsg --map-dump filtered_pids
  ERROR: BPF map "filtered_pids" not found
  # trace -e recvmmsg,sendmmsg --map-dump pids_filtered
  [2583] = 1,
  [2267] = 1,
  ^Z
  [1]+  Stopped                 trace -e recvmmsg,sendmmsg --map-dump pids_filtered
  # pidof trace
  2267
  # ps ax|grep gnome-terminal|grep -v grep
  2583 ?        Ssl   58:33 /usr/libexec/gnome-terminal-server
  ^C
  # trace -e recvmmsg,sendmmsg --map-dump syscalls
  [299] = 1,
  [307] = 1,
  ^C
  # grep x64_recvmmsg arch/x86/entry/syscalls/syscall_64.tbl
  299	64	recvmmsg		__x64_sys_recvmmsg
  # grep x64_sendmmsg arch/x86/entry/syscalls/syscall_64.tbl
  307	64	sendmmsg		__x64_sys_sendmmsg
  #

Next step probably will be something like 'perf stat's --interval-print and
--interval-clear.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-ztxj25rtx37ixo9cfajt8ocy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf bpf: Add bpf_map dumper · d19f8564

Authored by Arnaldo Carvalho de Melo on Feb 19, 2019

At some point I'll suggest moving this to libbpf, for now I'll
experiment with ways to dump BPF maps set by events in 'perf trace',
starting with a very basic dumper for the current very limited needs
of the augmented_raw_syscalls code: dumping booleans.

Having functions that apply to the map keys and values and do table
lookup in things like syscall id to string tables should come next.
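
A very basic dumper of this kind, together with the suggested key lookup, might look as follows in Python. The map is modeled as a plain dict and the helper name and optional key_name table are hypothetical; the real dumper reads BPF maps from C and is not shown in this message.

```python
def dump_map(entries, key_name=None):
    """Format the non-zero entries of a key -> value mapping, one per line.

    key_name is an optional lookup table (for example syscall id -> name)
    applied to the keys when printing.
    """
    lines = []
    for key, value in entries.items():
        if value == 0:
            continue  # show just non-zeroed entries
        shown = key_name.get(key, key) if key_name else key
        lines.append("[%s] = %d," % (shown, value))
    return lines
```

With a table mapping 299 to recvmmsg and 307 to sendmmsg, the entries would print by name instead of by number.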

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-lz14w0esqyt1333aon05jpwc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test: Fix failure of 'evsel-tp-sched' test on s390 · 03d30971

Authored by Thomas Richter on Feb 19, 2019

Commit 489338a7 ("perf tests evsel-tp-sched: Fix bitwise operator")
causes test case 14 "Parse sched tracepoints fields" to fail on s390.

This test succeeds on x86.

In fact this test now fails on all architectures with type char treated
as type unsigned char.

The root cause is the signed-ness of character arrays in the tracepoints
sched_switch for structure members prev_comm and next_comm.

On s390 the output of:

 [root@m35lp76 perf]# cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
 name: sched_switch
 ID: 287
 format:
   field:unsigned short common_type; offset:0; size:2;	signed:0;
   ...
   field:char prev_comm[16]; offset:8; size:16;	signed:0;
   ...
   field:char next_comm[16]; offset:40; size:16; signed:0;

reveals the character arrays prev_comm and next_comm are per
default unsigned char and have values in the range of 0..255.

On x86 both fields are signed as this output shows:
 [root@f29]# cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
 name: sched_switch
 ID: 287
 format:
   field:unsigned short common_type; offset:0; size:2;	signed:0;
   ...
   field:char prev_comm[16]; offset:8; size:16;	signed:1;
   ...
   field:char next_comm[16]; offset:40; size:16; signed:1;

and the character arrays prev_comm and next_comm are per default signed
char and have values in the range of -128..127.  The implementation of
type char is architecture specific.
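
The difference only matters for bytes with the top bit set, as this small Python sketch shows. The helper name is hypothetical; it reinterprets a single 8-bit value as a signed or unsigned C char would be.

```python
def char_value(byte, char_is_signed):
    """Interpret one 8-bit value the way a C 'char' of the given signedness would."""
    return int.from_bytes(bytes([byte]), "little", signed=char_is_signed)
```

The byte 0xff reads as -1 when char is signed and as 255 when it is unsigned, while an ASCII byte such as 0x41 ('A') reads as 65 either way.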

Since the character arrays in both tracepoints sched_switch and
sched_wakeup should contain ascii characters, simply omit the check for
signedness in the test case.

Output before:

  [root@m35lp76 perf]# ./perf test -F 14
  14: Parse sched tracepoints fields                        :
  --- start ---
  sched:sched_switch: "prev_comm" signedness(0) is wrong, should be 1
  sched:sched_switch: "next_comm" signedness(0) is wrong, should be 1
  sched:sched_wakeup: "comm" signedness(0) is wrong, should be 1
  ---- end ----
  14: Parse sched tracepoints fields                        : FAILED!
  [root@m35lp76 perf]#

Output after:

  [root@m35lp76 perf]# ./perf test -Fv 14
  14: Parse sched tracepoints fields                        :
  --- start ---
  ---- end ----
  Parse sched tracepoints fields: Ok
  [root@m35lp76 perf]#

Fixes: 489338a7 ("perf tests evsel-tp-sched: Fix bitwise operator")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20190219153639.31267-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

03d30971

perf doc: Fix documentation of the Flags section in perf.data · 8c23a522

Committed by Jonas Rabenstein on Feb 19, 2019

According to the current documentation, the flags section is placed
directly after the file header, but the code expects to find the flags
section after the data section. This change updates the documentation to
match the code.
Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190219154515.3954-2-jonas.rabenstein@studium.uni-erlangen.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

8c23a522

perf doc: Fix HEADER_CMDLINE description in perf.data documentation · 7a663c0f

Committed by Jonas Rabenstein on Feb 19, 2019

The content of the HEADER_CMDLINE feature header is a perf_header_string_list
of the argument vector and not a perf_header_string of the commandline.
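
To make the difference concrete, here is a hedged sketch of reading
such a string list from memory. It assumes a 32-bit count followed by
that many records, each a 32-bit length plus a zero-terminated (possibly
zero-padded) string, all in host byte order. parse_string_list() is a
hypothetical helper, not a perf function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Parse a buffer laid out as: u32 nr, then nr records of { u32 len;
 * char string[len]; }. Stores a pointer to each string in 'out' and
 * returns the number of strings, or -1 on malformed input.
 */
static int parse_string_list(const unsigned char *buf, size_t size,
			     const char **out, uint32_t max_out)
{
	size_t pos = 0;
	uint32_t nr, len, i;

	assert(buf != NULL && out != NULL);
	if (size < sizeof(nr))
		return -1;
	memcpy(&nr, buf, sizeof(nr));
	pos += sizeof(nr);
	if (nr > max_out)
		return -1;

	for (i = 0; i < nr; i++) {
		if (size - pos < sizeof(len))
			return -1;
		memcpy(&len, buf + pos, sizeof(len));
		pos += sizeof(len);
		/* The record must fit and must end in a NUL byte. */
		if (len == 0 || len > size - pos || buf[pos + len - 1] != '\0')
			return -1;
		out[i] = (const char *)(buf + pos);
		pos += len;
	}
	return (int)nr;
}
```

With this layout, each element of the argument vector is recovered as a
separate string instead of one flattened command line.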
Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190219154515.3954-1-jonas.rabenstein@studium.uni-erlangen.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

7a663c0f

19 Feb, 2019 · 5 commits

perf report: Don't shadow inlined symbol with different addr range · 7346195e

Committed by He Kuang on Feb 19, 2019

We can't assume that inlined symbols with the same name are equal,
because their address ranges may differ. Otherwise, symbols with
different addresses are shadowed when they are added to the hist entry,
which leads to an ERANGE error when the symbol address is checked during
sample parsing: the addr should be within the range [sym.start, sym.end].

The error message is like: "0x36aea60f [0x8]: failed to process type: 68".

The second parameter of symbol__new() is the length of the fake symbol
for the inline frame, which is the subtraction of the end and start
address of base_sym.
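
The comparison rule described above can be sketched as follows. This is
an illustration only: struct sym_sketch is a simplified stand-in for
perf's symbol structure, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a symbol: an address range plus a name. */
struct sym_sketch {
	uint64_t start;
	uint64_t end;
	const char *name;
};

/* Inlined symbols are equal only if both the name and the range match. */
static bool inline_sym_equal(const struct sym_sketch *a,
			     const struct sym_sketch *b)
{
	assert(a != NULL && b != NULL);
	return a->start == b->start && a->end == b->end &&
	       strcmp(a->name, b->name) == 0;
}

/* A sample address must lie within [start, end] of its symbol. */
static bool addr_in_sym(const struct sym_sketch *s, uint64_t addr)
{
	return addr >= s->start && addr <= s->end;
}

/* Length of the fake inline symbol: end minus start of the base symbol. */
static uint64_t inline_sym_len(const struct sym_sketch *base)
{
	return base->end - base->start;
}
```

If two same-named inline frames at different addresses were treated as
one symbol, an address belonging to the second would fail the range
check against the first, which matches the failure mode in the text.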
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: aa441895 ("perf report: Compare symbol name for inlined frames when sorting")
Link: http://lkml.kernel.org/r/20190219130531.15692-1-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

7346195e

perf tools: Use sysfs__mountpoint() when reading cpu topology · e19a01c1

Committed by Jiri Olsa on Feb 19, 2019

Use sysfs__mountpoint() when reading sysfs files to obtain cpu/numa
topologies.

Also use scnprintf instead of sprintf as suggested by Namhyung.
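
A userspace sketch of the idea: build the path from a mount point that
is passed in rather than hard-coded, and use a bounded formatter whose
return value is the number of characters actually stored. Here
scnprintf_sketch() is a stand-in written on top of vsnprintf(), and
cpu_siblings_path() and the exact sysfs file chosen are assumptions made
for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Bounded formatter: returns the number of characters written to buf,
 * excluding the trailing NUL, and never more than size - 1.
 */
static int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL);
	if (size == 0)
		return 0;
	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	if (n < 0) {
		buf[0] = '\0';
		return 0;
	}
	return (size_t)n >= size ? (int)(size - 1) : n;
}

/* Build a per-cpu topology path below the given sysfs mount point. */
static int cpu_siblings_path(char *buf, size_t size,
			     const char *sysfs_mnt, int cpu)
{
	return scnprintf_sketch(buf, size,
		"%s/devices/system/cpu/cpu%d/topology/core_siblings_list",
		sysfs_mnt, cpu);
}
```

Unlike sprintf, this cannot write past the end of the buffer, and the
return value stays usable as an index even when the output is truncated.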
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190219095815.15931-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

e19a01c1

perf tools: Add numa_topology object · 48e6c5ac

Committed by Jiri Olsa on Feb 19, 2019

Add the numa_topology object to return the list of numa nodes together
with their cpus. It will replace the numa code in header.c and will be
used from 'perf record' in the following patches.

Add the following interface functions to load numa details:

  struct numa_topology *numa_topology__new(void);
  void numa_topology__delete(struct numa_topology *tp);

And replace the current (copied) local interface, with no functional
changes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190219095815.15931-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

48e6c5ac

perf tools: Add cpu_topology object · 5135d5ef

Committed by Jiri Olsa on Feb 19, 2019

Make struct cpu_topo global and rename it to 'struct cpu_topology', so
that it can be used from the 'perf record' command in the following
patches.

Add the following interface functions to load/free cpu topology details:

  struct cpu_topology *cpu_topology__new(void);
  void cpu_topology__delete(struct cpu_topology *tp);

Move it to a separate source file, cputopo.c, which will also hold the
NUMA-related object added in the following patches.

No functional change, the new interface will be used in upcoming changes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190219095815.15931-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

5135d5ef

perf header: Fix wrong node write in NUMA_TOPOLOGY feature · b00ccb27

Committed by Jiri Olsa on Feb 19, 2019

We are currently passing the node index instead of the real node number.
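
The distinction matters whenever node numbers are not contiguous. A
minimal sketch, assuming for illustration a machine whose online nodes
are 0 and 8 (struct node_record and fill_node_records() are
hypothetical names, not the perf code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node record as it would be stored in the file. */
struct node_record {
	uint32_t node;
};

/*
 * Fill one record per online node. The stored value must be the real
 * node number taken from the list, not the loop index.
 */
static void fill_node_records(struct node_record *out,
			      const uint32_t *online_nodes, size_t nr)
{
	size_t i;

	assert(out != NULL && online_nodes != NULL);
	for (i = 0; i < nr; i++)
		out[i].node = online_nodes[i];	/* not (uint32_t)i */
}
```

With online nodes {0, 8}, storing the index would record nodes 0 and 1,
so the second entry would name a node that does not exist.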
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: fbe96f29 ("perf tools: Make perf.data more self-descriptive (v8)")
Link: http://lkml.kernel.org/r/20190219095815.15931-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

b00ccb27

16 Feb, 2019 · 1 commit

perf tests shell: Skip trace+probe_vfs_getname.sh if built without trace support · 83244772

Committed by Tommi Rantala on Feb 15, 2019

If perf was built without trace support, the trace+probe_vfs_getname.sh
'perf test' entry fails:

  # perf trace -h
  perf: 'trace' is not a perf-command. See 'perf --help'

  # perf test 64
  64: Check open filename arg using perf trace + vfs_getname: FAILED!

Check trace support, so that we'll skip the test in that case:

  # perf test 64
  64: Check open filename arg using perf trace + vfs_getname: Skip
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190215134253.11454-1-tt.rantala@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

83244772

15 Feb, 2019 · 5 commits

Merge tag 'perf-core-for-mingo-5.1-20190214' of... · 43f4e627

Committed by Ingo Molnar on Feb 15, 2019

Merge tag 'perf-core-for-mingo-5.1-20190214' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf list:

Jiri Olsa:

- Display metric expressions for --details option

perf record:

Alexey Budankov:

- Implement --affinity=node|cpu option, leftover, the other patches
in this kit were already applied.

perf trace:

Arnaldo Carvalho de Melo:

- Fix segfaults due to not properly handling negative file descriptor syscall args.

- Fix segfault related to the 'waitid' 'options' prefix showing logic.

- Filter out 'gnome-terminal*' if it is a parent of 'perf trace', to reduce the
syscall feedback loop in system wide sessions.

BPF:

Song Liu:

- Silence "Couldn't synthesize bpf events" warning for EPERM.

Build system:

Arnaldo Carvalho de Melo:

- Fix the test-all.c feature detection fast path that was broken for
quite a while leading to longer build times.

Event parsing:

Jiri Olsa:

- Fix legacy events symbol separator parsing

cs-etm:

Mathieu Poirier:

- Fix some error path return errors and plug some memory leaks.

- Add proper header file for symbols

- Remove unused structure fields.

- Modularize auxtrace_buffer fetch, decoder and packet processing loop.

Vendor events:

Paul Clarke:

- Add assorted metrics for the Power8 and Power9 architectures.

perf report:

Thomas Richter:

- Add s390 diagnostic sampling descriptor size
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

43f4e627

tools build feature sched_getcpu: Undef _GNU_SOURCE at the end · 44ec8396

Committed by Arnaldo Carvalho de Melo on Feb 14, 2019

This feature test is included in test-all.c, the fast path of feature
detection (a single compile/link phase), so it must not leave any
defines behind, as they can affect the tests included after it.
Undefine _GNU_SOURCE at the end of the test.
Reported-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-lg3kpd9tzypc797vb1f42u6k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

44ec8396

perf header: Remove unused 'cpu_nr' field from 'struct cpu_topo' · aa4df30d

Committed by Jiri Olsa on Feb 13, 2019

Not used at all.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190213123246.4015-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

aa4df30d

perf header: Get rid of write_it label · a9aeb87b

Committed by Jiri Olsa on Feb 13, 2019

Simplifying the code a bit.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190213123246.4015-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

a9aeb87b

perf list: Display metric expressions for --details option · 33bbc571

Committed by Jiri Olsa on Feb 13, 2019

Display the metric expression itself when --details is specified.

Current list with no details:

  # perf list metrics
  ...
  TopDownL1:
    IPC
         [Instructions Per Cycle (per logical thread)]
    SLOTS
         [Total issue-pipeline slots]
  ...

Detailed output with metric formula:

  # perf list --details metrics
  ...
  TopDownL1:
    IPC
         [Instructions Per Cycle (per logical thread)]
         [inst_retired.any / cpu_clk_unhalted.thread]
    SLOTS
         [Total issue-pipeline slots]
         [4*(( cpu_clk_unhalted.thread_any / 2 ) if #smt_on else cycles)]
  ...
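
Read as arithmetic, the two formulas in the detailed output above
amount to the following. This is only a worked illustration of the
expressions as printed: the function names are made up, and the event
counts are passed in as plain numbers rather than read from hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* IPC: inst_retired.any / cpu_clk_unhalted.thread */
static double metric_ipc(double inst_retired_any,
			 double cpu_clk_unhalted_thread)
{
	assert(cpu_clk_unhalted_thread > 0.0);
	return inst_retired_any / cpu_clk_unhalted_thread;
}

/* SLOTS: 4 * ((cpu_clk_unhalted.thread_any / 2) if #smt_on else cycles) */
static double metric_slots(double cpu_clk_unhalted_thread_any,
			   double cycles, bool smt_on)
{
	return 4.0 * (smt_on ? cpu_clk_unhalted_thread_any / 2.0 : cycles);
}
```

For example, 200 retired instructions over 100 unhalted cycles give an
IPC of 2, and with SMT off, 600 cycles give 2400 issue slots.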
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190213123246.4015-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

33bbc571
