1. 22 Sep, 2017 4 commits
    • A
      tools: Update asm-generic/mman-common.h copy from the kernel · 492e05b0
      Arnaldo Carvalho de Melo authored
      To get the defines introduced in commit aafd4562 ("mm: arch:
      consolidate mmap hugetlb size encodings"), which doesn't bring anything
      interesting for tools/, and also the ones from d2cd9ede ("mm,fork:
      introduce MADV_WIPEONFORK"), which does, and ends up triggering an auto-update
      to the tools/perf/trace/beauty/generated/madvise_behavior_array.c file,
      supporting the newly introduced 'behavior' values.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/mman-common.h' differs from latest version at 'include/uapi/asm-generic/mman-common.h'
      
      Testing it:
      
        # cat madvise.c
        #include <sys/mman.h>
        #include <stdlib.h>
      
        #ifndef MADV_WIPEONFORK
        #define MADV_WIPEONFORK 18
        #endif
        #ifndef MADV_KEEPONFORK
        #define MADV_KEEPONFORK 19
        #endif
      
        int main(void)
        {
              void *ptr = mmap(NULL, 4096, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
      
              madvise(ptr, 4096, MADV_WIPEONFORK);
              madvise(ptr, 4096, MADV_KEEPONFORK);
      
              return 0;
        }
        [root@jouet c]# perf trace -e mmap,madvise ./madvise
           0.000 ( 0.013 ms): madvise/11732 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS            ) = 0x7fba6e015000
           0.047 ( 0.004 ms): madvise/11732 mmap(len: 160164, prot: READ, flags: PRIVATE, fd: 3                   ) = 0x7fba6dfed000
           0.084 ( 0.009 ms): madvise/11732 mmap(len: 4000096, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3   ) = 0x7fba6da20000
           0.109 ( 0.006 ms): madvise/11732 mmap(addr: 0x7fba6dde7000, len: 24576, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 1863680) = 0x7fba6dde7000
           0.125 ( 0.004 ms): madvise/11732 mmap(addr: 0x7fba6dded000, len: 14688, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS|FIXED) = 0x7fba6dded000
           0.150 ( 0.006 ms): madvise/11732 mmap(len: 12288, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS           ) = 0x7fba6dfea000
           0.288 ( 0.003 ms): madvise/11732 mmap(len: 4096, flags: PRIVATE|ANONYMOUS                              ) = 0x7fba6e014000
           0.292 ( 0.002 ms): madvise/11732 madvise(start: 0x7fba6e014000, len_in: 4096, behavior: MADV_WIPEONFORK) = 0
           0.295 ( 0.001 ms): madvise/11732 madvise(start: 0x7fba6e014000, len_in: 4096, behavior: MADV_KEEPONFORK) = 0
        # uname -a
        Linux jouet 4.13.0+ #2 SMP Mon Sep 18 17:22:46 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-yev9rexu02cl7cjeozzmrl9t@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
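The test program in the commit above only checks that the new 'behavior' values are decoded by 'perf trace'. As a complementary sketch, the following C function illustrates what MADV_WIPEONFORK itself is documented to do ("Zero memory on fork, child only"). The function name `child_sees_wiped_page()` and its return convention are made up for this illustration, and the fallback define assumes the value 18 from the header shown in this log.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_WIPEONFORK
#define MADV_WIPEONFORK 18	/* value from asm-generic/mman-common.h */
#endif

/*
 * Hypothetical helper, not part of perf or the kernel tree.
 *
 * Returns  1 if the child saw zeroed memory after fork(),
 *          0 if the child saw the parent's data,
 *         -1 if the setup failed (for instance, madvise() rejecting
 *            MADV_WIPEONFORK on a kernel that lacks it).
 */
int child_sees_wiped_page(void)
{
	const size_t len = 4096;
	int status;
	pid_t pid;
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

	if (p == MAP_FAILED)
		return -1;

	/* Fill the page in the parent so a wipe is observable. */
	memset(p, 0x5a, len);

	if (madvise(p, len, MADV_WIPEONFORK) != 0) {
		munmap(p, len);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		munmap(p, len);
		return -1;
	}
	if (pid == 0)	/* child: report what the first byte looks like */
		_exit(p[0] == 0 ? 1 : 0);

	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
		munmap(p, len);
		return -1;
	}

	munmap(p, len);
	return WEXITSTATUS(status);
}
```

Calling `child_sees_wiped_page()` from a small `main()` and running it under `perf trace -e madvise` should show the same decoded `MADV_WIPEONFORK` value as in the transcript above, provided the running kernel supports it.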
    • A
      perf trace beauty madvise: Generate 'behavior' string table from kernel headers · 5a54c2f5
      Arnaldo Carvalho de Melo authored
      This is one more case where the way syscall parameter values are
      defined in kernel headers is easy to parse using a shell script that
      then generates the string table used by the madvise 'behavior'
      argument beautifier.
      
      This way, as soon as the header synchronization mechanism in perf's build
      system detects a change in a copy of a kernel ABI header and that file
      is synchronized, we get 'perf trace' updated automagically.
      
      So, when we synchronize this:
      
        Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/mman-common.h' differs from latest version at 'include/uapi/asm-generic/mman-common.h'
      
      We'll get these:
      
        #define MADV_WIPEONFORK 18              /* Zero memory on fork, child only */
        #define MADV_KEEPONFORK 19              /* Undo MADV_WIPEONFORK */
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-dolb0ghds4ui7wc1npgkchvc@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • X
      perf tests: Remove Intel CQM perf test · 5c9295bf
      Xiaochen Shen authored
      The Intel CQM perf test is obsolete because the perf PMU code was removed in
      commit c39a0e2c ("x86/perf/cqm: Wipe out perf based cqm").
      Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Pei P Jia <pei.p.jia@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
      Link: http://lkml.kernel.org/r/1505797057-16300-1-git-send-email-xiaochen.shen@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf stat: Fix adding multiple event groups · 411bc316
      Andi Kleen authored
      The -M metric group parser threw away the events of earlier groups when
      multiple groups were specified. Fix this here by not overwriting the
      string incorrectly.
      
      Now this works correctly:
      
      % perf stat -M Summary,SMT --metric-only -a sleep 1
      
       Performance counter stats for 'system wide':
      
      Instructions CPI CLKS         CPU_Utilization GFLOPs SMT_2T_Utilization SMT_2T_Utilization Kernel_Utilization CoreIPC CORE_CLKS
      900907376.0  2.7 2398954144.0 0.1             0.0    0.2                0.2                0.1                0.4     2080822855.5
      
      while previously it would only show the SMT metrics.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170914205735.18431-1-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
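The bug class fixed above, where a later metric group replaces the events of an earlier one instead of being added to them, can be illustrated with a short C sketch. This is not the perf code: `events_append_group()` is a hypothetical helper that assumes the event list is kept in a caller-provided, NUL-terminated buffer.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: append one group's event list to 'buf' rather than
 * overwriting what earlier groups put there. Groups are comma separated.
 * 'buf' must already hold a NUL-terminated string shorter than 'size'.
 * Returns 0 on success, -1 if the result would not fit (buf is unchanged).
 */
int events_append_group(char *buf, size_t size, const char *group_events)
{
	size_t used = strlen(buf);
	int n = snprintf(buf + used, size - used, "%s%s",
			 used ? "," : "", group_events);

	if (n < 0 || (size_t)n >= size - used) {
		buf[used] = '\0';	/* undo the truncated append */
		return -1;
	}
	return 0;
}
```

Starting from an empty buffer, appending `"{a,b}:W"` and then `"{c}:W"` yields `"{a,b}:W,{c}:W"`, so the events of both groups survive.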
  2. 18 Sep, 2017 4 commits
  3. 13 Sep, 2017 32 commits
    • A
      perf vendor events: Add JSON metrics for Skylake server · 56de5b63
      Andi Kleen authored
      Add JSON metrics for Skylake server.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20170908180133.GA20128@tassilo.jf.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf vendor events: Add JSON metrics for Broadwell DE · 69e93213
      Andi Kleen authored
      Add JSON metrics for Broadwell DE.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20170908180133.GA20128@tassilo.jf.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf vendor events: Add JSON metrics for Broadwell Server · 6d75abd3
      Andi Kleen authored
      Add JSON metrics for Broadwell Server.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20170908180133.GA20128@tassilo.jf.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf vendor events: Add JSON metrics for Haswell EP · 5e49f732
      Andi Kleen authored
      Add JSON metrics for Haswell EP.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20170908180133.GA20128@tassilo.jf.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf vendor events: Add JSON metrics for Ivy Town · 43fd36a1
      Andi Kleen authored
      Add JSON metrics for Ivy Town.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20170908180133.GA20128@tassilo.jf.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf vendor events: Add JSON metrics for Haswell · 2099f51d
      Andi Kleen authored
      Add JSON metrics for Haswell.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20170908180133.GA20128@tassilo.jf.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf vendor events: Add JSON metrics for Ivy Bridge · 8853d2de
      Andi Kleen authored
      Add JSON metrics for Ivy Bridge.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20170908180133.GA20128@tassilo.jf.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf vendor events: Add JSON metrics for Sandy Bridge EP · 28bc0ddb
      Andi Kleen authored
      Add JSON metrics for Sandy Bridge EP.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20170908180133.GA20128@tassilo.jf.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf vendor events: Add JSON metrics for Sandy Bridge · 97dca671
      Andi Kleen authored
      Add JSON metrics for Sandy Bridge.
      
      Committer testing:
      
        # grep "model name" /proc/cpuinfo | head -1
        model name	: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
        # perf list metricgroup
      
        List of pre-defined events (to be used in -e):
      
        Metric Groups:
      
        DSB
        FLOPS
        Frontend
        Frontend_Bandwidth
        Pipeline
        Ports_Utilization
        Power
        SMT
        Summary
        TopDownL1
        # perf stat -M Power --metric-only -a sleep 1
      
         Performance counter stats for 'system wide':
      
        Turbo_Utilization  C3_Core_Residency  C6_Core_Residency  C7_Core_Residency  C2_Pkg_Residency  C3_Pkg_Residency  C6_Pkg_Residency  C7_Pkg_Residency
           0.8               0.0                98.1               0.0                0.0               0.0               23.4              0.0
      
             1.001153658 seconds time elapsed
      
        # perf stat -v -M Power --metric-only -a sleep 1
        Using CPUID GenuineIntel-6-2A
        metric expr cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc for Turbo_Utilization
        found event cpu_clk_unhalted.thread
        found event cpu_clk_unhalted.ref_tsc
        metric expr (cstate_core@c3\-residency@ / msr@tsc@) * 100 for C3_Core_Residency
        found event cstate_core/c3-residency/
        found event msr/tsc/
        metric expr (cstate_core@c6\-residency@ / msr@tsc@) * 100 for C6_Core_Residency
        found event cstate_core/c6-residency/
        found event msr/tsc/
        metric expr (cstate_core@c7\-residency@ / msr@tsc@) * 100 for C7_Core_Residency
        found event cstate_core/c7-residency/
        found event msr/tsc/
        metric expr (cstate_pkg@c2\-residency@ / msr@tsc@) * 100 for C2_Pkg_Residency
        found event cstate_pkg/c2-residency/
        found event msr/tsc/
        metric expr (cstate_pkg@c3\-residency@ / msr@tsc@) * 100 for C3_Pkg_Residency
        found event cstate_pkg/c3-residency/
        found event msr/tsc/
        metric expr (cstate_pkg@c6\-residency@ / msr@tsc@) * 100 for C6_Pkg_Residency
        found event cstate_pkg/c6-residency/
        found event msr/tsc/
        metric expr (cstate_pkg@c7\-residency@ / msr@tsc@) * 100 for C7_Pkg_Residency
        found event cstate_pkg/c7-residency/
        found event msr/tsc/
        adding {cpu_clk_unhalted.thread,cpu_clk_unhalted.ref_tsc}:W,{cstate_core/c3-residency/,msr/tsc/}:W,{cstate_core/c6-residency/,msr/tsc/}:W,{cstate_core/c7-residency/,msr/tsc/}:W,{cstate_pkg/c2-residency/,msr/tsc/}:W,{cstate_pkg/c3-residency/,msr/tsc/}:W,{cstate_pkg/c6-residency/,msr/tsc/}:W,{cstate_pkg/c7-residency/,msr/tsc/}:W
        cpu_clk_unhalted.thread -> cpu/event=0x3c/
        cpu_clk_unhalted.ref_tsc -> cpu/umask=0x3,period=2000003,event=0/
        Weak group for cstate_pkg/c2-residency//2 failed
        Weak group for cstate_pkg/c3-residency//2 failed
        Weak group for cstate_pkg/c6-residency//2 failed
        Weak group for cstate_pkg/c7-residency//2 failed
        cpu_clk_unhalted.thread: 5564185 4002833569 4002833569
        cpu_clk_unhalted.ref_tsc: 7325424 4002833569 4002833569
        cstate_core/c3-residency/: 68293 4003027101 4003027101
        msr/tsc/: 12451294472 4003027101 4003027101
        cstate_core/c6-residency/: 12238830163 4003260984 4003260984
        msr/tsc/: 12452017806 4003260984 4003260984
        cstate_core/c7-residency/: 0 4003489648 4003489648
        msr/tsc/: 12452725162 4003489648 4003489648
        cstate_pkg/c2-residency/: 1830054 1000913138 1000913138
        msr/tsc/: 12453441079 4003717513 4003717513
        cstate_pkg/c3-residency/: 0 1000973570 1000973570
        msr/tsc/: 12454177865 4003954758 4003954758
        cstate_pkg/c6-residency/: 2940448859 1001032370 1001032370
        msr/tsc/: 12454833890 4004166118 4004166118
        cstate_pkg/c7-residency/: 0 1001049818 1001049818
        msr/tsc/: 12454919470 4004194204 4004194204
      
         Performance counter stats for 'system wide':
      
        Turbo_Utilization  C3_Core_Residency  C6_Core_Residency  C7_Core_Residency  C2_Pkg_Residency  C3_Pkg_Residency  C6_Pkg_Residency  C7_Pkg_Residency
             0.8             0.0                98.3               0.0                0.0               0.0               23.6              0.0
      
               1.001126519 seconds time elapsed
      
        #
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20170905195235.GW2482@two.firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
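The verbose output above prints both the metric expressions and the raw counts, so the reported numbers can be checked by hand. The C sketch below redoes two of those calculations: the residency expressions of the form `(cstate_.../ msr@tsc@) * 100` and the `Turbo_Utilization` ratio. The function names are made up for this illustration, and the zero-denominator guard is an assumption of the sketch, not something the log states about perf.

```c
/*
 * Hypothetical helpers mirroring two metric expressions from the
 * verbose 'perf stat -M Power' output.
 */

/* (cstate residency count / msr/tsc/ count) * 100 */
double residency_percent(unsigned long long residency, unsigned long long tsc)
{
	return tsc ? 100.0 * (double)residency / (double)tsc : 0.0;
}

/* cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc */
double turbo_utilization(unsigned long long thread, unsigned long long ref_tsc)
{
	return ref_tsc ? (double)thread / (double)ref_tsc : 0.0;
}
```

With the counts from the verbose run, `residency_percent(12238830163ULL, 12452017806ULL)` is about 98.3 (C6_Core_Residency), `residency_percent(2940448859ULL, 12454833890ULL)` is about 23.6 (C6_Pkg_Residency), and `turbo_utilization(5564185ULL, 7325424ULL)` is about 0.8, matching the table printed after the verbose lines.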
    • A
      perf vendor events: Add JSON metrics for Skylake · 2e006a24
      Andi Kleen authored
      Add JSON metrics for Skylake.
      
      Committer testing:
      
        # grep "model name" /proc/cpuinfo | head -1
        model name	: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
        # uname -a
        Linux seventh 4.12.0-rc6+ #1 SMP Fri Jun 30 16:40:55 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
        # perf stat --metric-only -M Summary -a sleep 1
      
         Performance counter stats for 'system wide':
      
        Instructions         CPI                  CLKS                 CPU_Utilization      GFLOPs               SMT_2T_Utilization   Kernel_Utilization
        34021097.0               0.0            119424171.0              0.0                 0.0                 0.0                 0.0
      
               1.001001793 seconds time elapsed
      
        # perf list metricgroup
      
        List of pre-defined events (to be used in -e):
      
        Metric Groups:
      
        DSB
        FLOPS
        Frontend
        Frontend_Bandwidth
        Memory_BW
        Memory_Bound
        Memory_Lat
        Pipeline
        Ports_Utilization
        Power
        SMT
        Summary
        TLB
        TopDownL1
        Unknown_Branches
        # perf stat --metric-only -M Ports_Utilization -a sleep 1
      
         Performance counter stats for 'system wide':
      
        ILP
        1475828.0
      
             1.000688547 seconds time elapsed
      
        # perf stat -v --metric-only -M Ports_Utilization -a sleep 1
        Using CPUID GenuineIntel-6-9E
        metric expr uops_executed.thread / ( uops_executed.core_cycles_ge_1 / 2) if #smt_on else uops_executed.core_cycles_ge_1 for ILP
        found event uops_executed.thread
        found event uops_executed.core_cycles_ge_1
        adding {uops_executed.thread,uops_executed.core_cycles_ge_1}:W
        uops_executed.thread -> cpu/umask=0x1,period=2000003,event=0xb1/
        uops_executed.core_cycles_ge_1 -> cpu/umask=0x2,period=2000003,cmask=1,event=0xb1/
        uops_executed.thread: 8115271 4002547654 4002547654
        uops_executed.core_cycles_ge_1: 3282969 4002547654 4002547654
      
         Performance counter stats for 'system wide':
      
        ILP
        3282969.0
      
               1.000719870 seconds time elapsed
      
        #
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20170905195235.GW2482@two.firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf vendor events: Add JSON metrics for Broadwell · cf979623
      Andi Kleen authored
      Add JSON metrics for Broadwell.
      
      Committer testing:
      
        # uname -a
        Linux jouet 4.13.0-rc7+ #3 SMP Sat Sep 2 09:04:44 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
        # grep "model name" /proc/cpuinfo  | head -1
        model name	: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
        # perf list metricgroup
      
        List of pre-defined events (to be used in -e):
      
        Metric Groups:
      
        DSB
        FLOPS
        Frontend
        Frontend_Bandwidth
        Memory_BW
        Memory_Bound
        Memory_Lat
        Pipeline
        Ports_Utilization
        Power
        SMT
        Summary
        TLB
        TopDownL1
        Unknown_Branches
        # perf stat -M Power --metric-only -a sleep 1
      
         Performance counter stats for 'system wide':
      
        Turbo_Utilization  C3_Core_Residency  C6_Core_Residency  C7_Core_Residency  C2_Pkg_Residency  C3_Pkg_Residency  C6_Pkg_Residency  C7_Pkg_Residency
             1.1               0.0                 0.0               0.0                0.0               0.0               0.0               0.0
      
               1.003502904 seconds time elapsed
      
        #
        # perf stat -M Memory_BW --metric-only -a sleep 1
      
         Performance counter stats for 'system wide':
      
        MLP
             1.7
      
               1.001364525 seconds time elapsed
      
        #
        # perf stat -M TLB --metric-only -a sleep 1
      
         Performance counter stats for 'system wide':
      
        Page_Walks_Utilization
             0.1
      
               1.005962198 seconds time elapsed
      
        #
        # perf stat -M Summary --metric-only -a sleep 1
      
         Performance counter stats for 'system wide':
      
        Instructions   CPI          CLKS          CPU_Utilization   GFLOPs  SMT_2T_Utilization  Kernel_Utilization
        7281856697.0       0.0    11150898087.0     1.0              0.0    1.0                 0.7
      
               1.012134025 seconds time elapsed
      
        #
      
      Running in verbose mode shows which counters and expressions are being
      used:
      
        # perf stat -v -M Summary --metric-only -a sleep 1
        Using CPUID GenuineIntel-6-3D
        metric expr 1 / inst_retired.any / cycles for CPI
        found event inst_retired.any
        found event cycles
        metric expr cpu_clk_unhalted.thread for CLKS
        found event cpu_clk_unhalted.thread
        metric expr inst_retired.any for Instructions
        found event inst_retired.any
        metric expr cpu_clk_unhalted.ref_tsc / msr@tsc@ for CPU_Utilization
        found event cpu_clk_unhalted.ref_tsc
        found event msr/tsc/
        metric expr ( 1*( fp_arith_inst_retired.scalar_single + fp_arith_inst_retired.scalar_double ) + 2* fp_arith_inst_retired.128b_packed_double + 4*( fp_arith_inst_retired.128b_packed_single + fp_arith_inst_retired.256b_packed_double ) + 8* fp_arith_inst_retired.256b_packed_single ) / 1000000000 / duration_time for GFLOPs
        found event fp_arith_inst_retired.scalar_single
        found event fp_arith_inst_retired.scalar_double
        found event fp_arith_inst_retired.128b_packed_double
        found event fp_arith_inst_retired.128b_packed_single
        found event fp_arith_inst_retired.256b_packed_double
        found event fp_arith_inst_retired.256b_packed_single
        found event duration_time
        metric expr 1 - cpu_clk_thread_unhalted.one_thread_active / ( cpu_clk_thread_unhalted.ref_xclk_any / 2 ) if #smt_on else 0 for SMT_2T_Utilization
        found event cpu_clk_thread_unhalted.one_thread_active
        found event cpu_clk_thread_unhalted.ref_xclk_any
        metric expr cpu_clk_unhalted.ref_tsc:u / cpu_clk_unhalted.ref_tsc for Kernel_Utilization
        found event cpu_clk_unhalted.ref_tsc:u
        found event cpu_clk_unhalted.ref_tsc
        adding {inst_retired.any,cycles}:W,{cpu_clk_unhalted.thread}:W,{inst_retired.any}:W,{cpu_clk_unhalted.ref_tsc,msr/tsc/}:W,{fp_arith_inst_retired.scalar_single,fp_arith_inst_retired.scalar_double,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.256b_packed_double,fp_arith_inst_retired.256b_packed_single,duration_time}:W,{cpu_clk_thread_unhalted.one_thread_active,cpu_clk_thread_unhalted.ref_xclk_any}:W,{cpu_clk_unhalted.ref_tsc:u,cpu_clk_unhalted.ref_tsc}:W
        inst_retired.any -> cpu/event=0xc0/
        cpu_clk_unhalted.thread -> cpu/event=0x3c/
        inst_retired.any -> cpu/event=0xc0/
        cpu_clk_unhalted.ref_tsc -> cpu/umask=0x3,period=2000003,event=0/
        fp_arith_inst_retired.scalar_single -> cpu/umask=0x2,period=2000003,event=0xc7/
        fp_arith_inst_retired.scalar_double -> cpu/umask=0x1,period=2000003,event=0xc7/
        fp_arith_inst_retired.128b_packed_double -> cpu/umask=0x4,period=2000003,event=0xc7/
        fp_arith_inst_retired.128b_packed_single -> cpu/umask=0x8,period=2000003,event=0xc7/
        fp_arith_inst_retired.256b_packed_double -> cpu/umask=0x10,period=2000003,event=0xc7/
        fp_arith_inst_retired.256b_packed_single -> cpu/umask=0x20,period=2000003,event=0xc7/
        cpu_clk_thread_unhalted.one_thread_active -> cpu/umask=0x2,period=2000003,event=0x3c/
        cpu_clk_thread_unhalted.ref_xclk_any -> cpu/umask=0x1,any=1,period=2000003,event=0x3c/
        cpu_clk_unhalted.ref_tsc -> cpu/umask=0x3,period=2000003,event=0/
        cpu_clk_unhalted.ref_tsc -> cpu/umask=0x3,period=2000003,event=0/
        Weak group for fp_arith_inst_retired.scalar_single/7 failed
        Weak group for cpu_clk_unhalted.ref_tsc:u/2 failed
        inst_retired.any: 8704146437 4026374016 619883741
        cycles: 11180800018 4026374016 619883741
        cpu_clk_unhalted.thread: 11140030295 4026323772 931621933
        inst_retired.any: 8643115117 4026260510 1243595906
        cpu_clk_unhalted.ref_tsc: 10201638510 4026184297 1247351077
        msr/tsc/: 10378022785 4026184297 1247351077
        fp_arith_inst_retired.scalar_single: 134697 4026102728 1559210545
        fp_arith_inst_retired.scalar_double: 274339 4026007348 1870014984
        fp_arith_inst_retired.128b_packed_double: 1639 4025886054 1866736918
        fp_arith_inst_retired.128b_packed_single: 0 4025776614 2175106569
        fp_arith_inst_retired.256b_packed_double: 0 4025681734 1235551129
        fp_arith_inst_retired.256b_packed_single: 0 4025582962 1232398454
        duration_time: 0 4025552913 4025552913
        cpu_clk_thread_unhalted.one_thread_active: 10505 4025474649 923893076
        cpu_clk_thread_unhalted.ref_xclk_any: 394992110 4025474649 923893076
        cpu_clk_unhalted.ref_tsc:u: 5341421014 4025360315 1231634198
        cpu_clk_unhalted.ref_tsc: 10258278508 4025252611 307909362
      
         Performance counter stats for 'system wide':
      
        Instructions         CPI                  CLKS                 CPU_Utilization      GFLOPs               SMT_2T_Utilization   Kernel_Utilization
        8704146437.0             0.0            11140030295.0            1.0                 0.0                 1.0                 0.5
      
               1.006783654 seconds time elapsed
      
        #
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20170905195235.GW2482@two.firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf stat: Fall weak group back even for EBADF · 35c1980e
      Andi Kleen authored
      It's not possible to run a package event and a per-CPU event in the same
      group, yet some of the power metrics combine the two. They work correctly
      when not using a group.
      
      Normally weak groups should handle that, but in this case EBADF is
      returned instead of the normal EINVAL.
      
        $ strace -e perf_event_open ./perf stat -v -e '{cstate_pkg/c2-residency/,msr/tsc/}:W' -a sleep 1
        Using CPUID GenuineIntel-6-3E
        perf_event_open({type=0x17 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 0, -1, PERF_FLAG_FD_CLOEXEC) = -1 EINVAL (Invalid argument)
        perf_event_open({type=0x17 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 0, -1, 0) = -1 EINVAL (Invalid argument)
        perf_event_open({type=0x17 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 0, -1, 0) = -1 EINVAL (Invalid argument)
        perf_event_open({type=0x17 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 0, -1, 0) = -1 EINVAL (Invalid argument)
        perf_event_open({type=0x17 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 0, -1, 0) = 3
        perf_event_open({type=0x7 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 0, 3, 0) = 4
        perf_event_open({type=0x7 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 1, 0, 0) = -1 EBADF (Bad file descriptor)
      
      and perf errors out.
      
      Make weak groups trigger a fall back for EBADF too. Then this case works correctly:
      
        $ perf stat -v -e '{cstate_pkg/c2-residency/,msr/tsc/}:W' -a sleep 1
        Using CPUID GenuineIntel-6-3E
        Weak group for cstate_pkg/c2-residency//2 failed
        cstate_pkg/c2-residency/: 476709882 1000598460 1000598460
        msr/tsc/: 39625837911 12007369110 12007369110
      
         Performance counter stats for 'system wide':
      
               476,709,882      cstate_pkg/c2-residency/
            39,625,837,911      msr/tsc/
      
               1.000697588 seconds time elapsed
      
      This fixes perf stat -M Power ...
      
        $ perf stat -M Power --metric-only -a sleep 1
      
         Performance counter stats for 'system wide':
      
        Turbo_Utilization  C3_Core_Residency  C6_Core_Residency C7_Core_Residency  C2_Pkg_Residency   C3_Pkg_Residency  C6_Pkg_Residency  C7_Pkg_Residency
             1.0                 0.7                30.0               0.0               0.9                 0.1               0.4                 0.0
      
               1.001240740 seconds time elapsed
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170905211324.32427-1-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
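The decision described above, treating EBADF like EINVAL when a weak group fails to open, can be sketched as a small predicate. This is an illustration only: `weak_group_should_fall_back()` and its parameters are hypothetical and do not reproduce the actual perf stat code.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical predicate: should a failed perf_event_open() on a group
 * member cause the group to be broken up and its events opened one by one?
 *
 * weak_group  the group was requested with the ':W' modifier
 * nr_members  number of events in the group
 * err         errno value returned by the failed open
 */
bool weak_group_should_fall_back(bool weak_group, int nr_members, int err)
{
	/* Only weak groups with more than one member can be split. */
	if (!weak_group || nr_members < 2)
		return false;

	/* EINVAL was the usual case; the commit adds EBADF as a trigger. */
	return err == EINVAL || err == EBADF;
}
```

In the strace output quoted in the commit message, the final open fails with EBADF, which under this rule would lead to the "Weak group for ... failed" fallback rather than an error exit.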
    • A
      perf tools: Make copyfile_offset() static · c23c2a0f
      Arnaldo Carvalho de Melo authored
      There is no usage outside util.c and this is the only remaining reason
      for fcntl.h to be included in util.h, to get the loff_t definition in
      Alpine Linux, so make it static.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-2dzlsao7k6ihozs5karw6kpx@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • T
      perf config: Allow creating empty config set for config file autogeneration · 55421b4f
      Taeung Song authored
      When there isn't a config file (e.g. ~/.perfconfig) or it is empty,
      the config set wasn't created.
      
      If the config set does not exist, a config file can't be autogenerated.
      
      So allow creating an empty config set in the above case,
      so that config file autogeneration can be supported.
      
      Before:
      
        $ rm -f ~/.perfconfig
        $ perf config --user report.children=false
      
        $ cat ~/.perfconfig
        cat: /root/.perfconfig: No such file or directory
      
      But I think it should work even if there isn't a config file.
      
      After:
      
        $ rm -f ~/.perfconfig
        $ perf config --user report.children=false
      
        $ cat ~/.perfconfig
        # this file is auto-generated.
        [report]
            children = false
      
      NOTE:
      
      As a result, if perf_config_set__init() fails, it looks as if the config
      set isn't freed. That isn't a problem, because the config set will be
      freed by perf_config_set__delete() at the end of cmd_config().
      Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1504754336-9824-1-git-send-email-treeze.taeung@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • T
      perf config: Write a config file just once · 5c261555
      Taeung Song authored
      Currently set_config() can be called repeatedly, once for each input
      config, in the case below:
      
        $ perf config kmem.default=slab report.children=false ...
      
      But that's a waste, so gather all the given config key=value pairs and
      write the config file just once.
      Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1504754331-9776-1-git-send-email-treeze.taeung@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • K
      perf tools: Use scandir() to replace readdir() · ecdad24d
      Kan Liang authored
      In perf_event__synthesize_threads(), perf goes through all proc files
      serially using readdir().
      
      scandir() takes a snapshot of /proc, which is multithreading friendly.
      
      It's possible that some threads are added during event synthesis and are
      therefore missed, but the number of lost threads should be small and
      should not impact the final analysis.
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1504806954-150842-3-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
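To make the snapshot idea concrete, here is a minimal C sketch of using scandir() with a filter to collect the numeric (PID) entries of a directory such as /proc in one call. It is not the perf implementation: `is_pid_entry()` and `count_proc_pids()` are names invented for this example, and the sketch only counts the entries instead of synthesizing events.

```c
#define _DEFAULT_SOURCE
#include <ctype.h>
#include <dirent.h>
#include <stdlib.h>

/* Filter for scandir(): keep only entries whose name is all digits. */
static int is_pid_entry(const struct dirent *d)
{
	const char *p = d->d_name;

	if (!*p)
		return 0;
	for (; *p; p++)
		if (!isdigit((unsigned char)*p))
			return 0;
	return 1;
}

/*
 * Hypothetical helper: take a one-shot snapshot of the PID directories
 * under 'procdir' and return how many there are, or -1 on error.
 * With the whole list in memory, ranges of it could be handed to
 * separate worker threads, which is harder with a shared readdir() stream.
 */
int count_proc_pids(const char *procdir)
{
	struct dirent **namelist;
	int i, n = scandir(procdir, &namelist, is_pid_entry, alphasort);

	if (n < 0)
		return -1;

	for (i = 0; i < n; i++)
		free(namelist[i]);
	free(namelist);
	return n;
}
```

Processes created after the scandir() call are not in the snapshot, which is the small loss the commit message accepts.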
    • J
      perf ui progress: Add size info into progress bar · 8233822f
      Jiri Olsa authored
      Add the size values '[current/total]' to the progress bar, to show more
      detailed progress of data reading.
      
      Add a new ui_progress__init_size() function to specify that the size
      should be displayed.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170908120510.22515-5-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8233822f
    • J
      perf ui progress: Add ui specific init function · 25cc4eb4
      Jiri Olsa committed
      Add a UI-specific init function that allows setting up the progress bar
      width based on the current screen scales.
      
      Add a TUI init function to get finer-grained updates of the progress bar.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170908120510.22515-4-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      25cc4eb4
    • J
      perf tools: Add python-clean target · 80f87355
      Jiri Olsa committed
      To be able to clean up only Python-related binaries.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170908084621.31595-3-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      80f87355
    • A
      perf script: Support user regs · b1491ace
      Andi Kleen committed
      Teach perf script to print user regs.
      
        % perf record --user-regs=ip,sp ...
        % perf script -F ip,sym,uregs
        ...
         ffffffff9e060c24 native_write_msr ABI:2    SP:0x7ffd0ea06c38    IP:0x7fe77f55b637
         ffffffff9e060c24 native_write_msr ABI:2    SP:0x7ffd0ea06c38    IP:0x7fe77f55b637
         ffffffff9e060c24 native_write_msr ABI:2    SP:0x7ffd0ea06c38    IP:0x7fe77f55b637
         ffffffff9e060c24 native_write_msr ABI:2    SP:0x7ffd0ea06c38    IP:0x7fe77f55b637
         ffffffff9e00cc12 intel_pmu_handle_irq ABI:2    SP:0x7ffd0ea06c38    IP:0x7fe77f55b637
      
      v2: Rebased on top of phys-addr patches
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170905184057.26135-1-andi@firstfloor.org
      [ Use PRIu64 for regs->abi in print_sample_uregs() ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b1491ace
    • A
      perf record: Support direct --user-regs arguments · 84c41742
      Andi Kleen committed
      USER_REGS can currently only be collected implicitly with call graph
      recording. Sometimes it is useful to see them separately, and filter
      them. Add a new --user-regs option to record that is similar to
      --intr-regs, but acts on user regs.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170905170029.19722-1-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      84c41742
    • A
      perf stat: Update walltime_nsecs_stats in interval mode · b90f1333
      Andi Kleen committed
      Some metrics (like GFLOPs) need walltime_nsecs_stats for each interval.
      Compute it for each interval instead of only at the end.
      
      Pointed out by Jiri.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170831194036.30146-12-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b90f1333
    • A
      perf stat: Hide internal duration_time counter · e864c5ca
      Andi Kleen committed
      Some perf stat metrics use an internal "duration_time" metric. It is not
      correctly printed however. So hide it during output to avoid confusing
      users with 0 counts.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170831194036.30146-11-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e864c5ca
    • A
      perf stat: Support duration_time for metrics · fd48aad9
      Andi Kleen committed
      Some of the metrics formulas (like GFLOPs) need to know how long the
      measurement period is. Support an internal event called duration_time,
      which reports time in seconds. It maps to the dummy event, but is
      special-cased for statistics to report the walltime duration.
      
      So far it is not printed, but only used internally for metrics.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170831194036.30146-10-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fd48aad9
    • A
      perf stat: Don't use ctx for saved values lookup · 4e1a0963
      Andi Kleen committed
      We don't need to use ctx to look up events for saved values.  The
      context is already part of the evsel pointer, which is the primary key.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170831194036.30146-9-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4e1a0963
    • A
      perf list: Add metric groups to perf list · 71b0acce
      Andi Kleen committed
      Add code to perf list to print metric groups, and metrics
      that don't have an event name. The metricgroup code collects
      the eventgroups and events into a rblist, and then prints
      them according to the configured filters.
      
      The metricgroups are printed by default, but can be
      limited by perf list metric or perf list metricgroup
      
        % perf list metricgroup
        ..
        Metric Groups:
      
        DSB:
          DSB_Coverage
                [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
        FLOPS:
          GFLOPs
                [Giga Floating Point Operations Per Second]
        Frontend:
          IFetch_Line_Utilization
                [Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions]
        Frontend_Bandwidth:
          DSB_Coverage
                [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
        Memory_BW:
          MLP
                [Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)]
      
      v2: Check return value of asprintf to fix warning on FC26
      Fix key in lookup/addition for the groups list
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170831194036.30146-8-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      71b0acce
    • A
      perf stat: Support JSON metrics in perf stat · b18f3e36
      Andi Kleen committed
      Add generic support for standalone metrics specified in JSON files to
      perf stat. A metric is a formula that uses multiple events to compute a
      higher level result (e.g. IPC).
      
      Previously metrics were always tied to an event and automatically
      enabled with that event. Now change this so that we can have standalone
      metrics. They are in the same JSON data structure as events, but don't
      have an event name.
      
      Metrics can also be organized into metric groups, which provide a
      shortcut to select several related metrics at once.
      
      Add a new -M / --metrics option to perf stat that adds the metrics or
      metric groups specified.
      
      Add the core code to manage and parse the metric groups. They are
      collected from the JSON data structures into a separate rblist.  When
      computing shadow values look for metrics in that list.  Then they are
      computed using the existing saved values infrastructure in stat-shadow.c
      
      The actual JSON metrics are in a separate pull request.
      
        % perf stat -M Summary --metric-only -a sleep 1
      
         Performance counter stats for 'system wide':
      
        Instructions   CLKS          CPU_Utilization  GFLOPs   SMT_2T_Utilization   Kernel_Utilization
        317614222.0    1392930775.0  0.0              0.0      0.2                  0.1
      
             1.001497549 seconds time elapsed
      
        % perf stat -M GFLOPs flops
      
         Performance counter stats for 'flops':
      
           3,999,541,471  fp_comp_ops_exe.sse_scalar_single #  1.2 GFLOPs   (66.65%)
                      14  fp_comp_ops_exe.sse_scalar_double                 (66.65%)
                       0  fp_comp_ops_exe.sse_packed_double                 (66.67%)
                       0  fp_comp_ops_exe.sse_packed_single                 (66.70%)
                       0  simd_fp_256.packed_double                         (66.70%)
                       0  simd_fp_256.packed_single                         (66.67%)
                       0  duration_time
      
             3.238372845 seconds time elapsed
      
      v2: Add missing header file
      v3: Move find_map to pmu.c
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170831194036.30146-7-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b18f3e36
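To make the role of the measurement duration concrete, here is a minimal sketch of a GFLOPs-style calculation. The actual metric expressions live in the JSON files, which are not part of this commit, so the formula below is a simplifying assumption: every counted operation is treated as one floating point operation, and `gflops()` is a hypothetical name.

```c
#include <assert.h>

/*
 * Simplified assumption: each counted operation is one floating
 * point operation and duration_sec is wall-clock time in seconds.
 * Returns billions of operations per second.
 */
double gflops(unsigned long long fp_ops, double duration_sec)
{
	assert(duration_sec > 0.0);
	return (double)fp_ops / duration_sec / 1e9;
}
```

Plugging in the two non-zero counts from the `perf stat -M GFLOPs flops` run above (3,999,541,471 + 14 operations over 3.238372845 seconds) gives a value between 1.23 and 1.24, in line with the `1.2 GFLOPs` shown in that output.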
    • A
      perf pmu: Extract function to get JSON alias map · d77ade9f
      Andi Kleen committed
      Extract the code to get the per cpu JSON alias into a separate function
      for reuse. No behavior changes.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170831194036.30146-6-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d77ade9f
    • A
      perf stat: Print generic metric header even for failed expressions · 4ed962eb
      Andi Kleen committed
      Print the generic metric header even when the expression evaluation
      failed. Otherwise an expression that fails on the first collections due
      to division by zero may suddenly reappear later without a header.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170831194036.30146-5-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4ed962eb
    • A
      perf stat: Factor out generic metric printing · bba49af8
      Andi Kleen committed
      The 'perf stat' shadow metric printing already supports generic metrics.
      Factor out the code doing that into a separate function that can be
      re-used in a later patch.
      
      No behavior changes.
      
      v2: Fix indentation
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170831194036.30146-4-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      bba49af8
    • A
      perf vendor events: Support metric_group and no event name in JSON parser · 3ba36d36
      Andi Kleen committed
      Some enhancements to the JSON parser to prepare for metrics support
      
      - Parse the new MetricGroup field
      - Support JSON events with no event name, that have only MetricName.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170831194036.30146-3-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3ba36d36
    • A
      perf tools: Support weak groups in 'perf stat' · 5a5dfe4b
      Andi Kleen committed
      Setting up groups can be complicated due to the complicated scheduling
      restrictions of different PMUs.
      
      User tools usually don't understand all these restrictions.
      
      Still in many cases it is useful to set up groups and they work most of
      the time. However if the group is set up wrong some members will not
      report any value because they never get scheduled.
      
      Add a concept of a 'weak group': try to set up a group, but if it's not
      schedulable, fall back to not using a group. That gives us the best of
      both worlds: groups if they work, but still a usable fallback if they
      don't.
      
      In theory it would be possible to have more complex fallback strategies
      (e.g. try to split the group in half), but the simple fallback of not
      using a group seems to work for now.
      
      So far the weak group is only implemented for perf stat, not for record.
      
      Here's an unschedulable group (on IvyBridge with SMT on)
      
        % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1
      
              73,806,067      branches
               4,848,144      branch-misses             #    6.57% of all branches
              14,754,458      l1d.replacement
              24,905,558      l2_lines_in.all
         <not supported>      l2_rqsts.all_code_rd         <------- will never report anything
      
      With the weak group:
      
        % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}:W' -a sleep 1
      
             125,366,055      branches                                                      (80.02%)
               9,208,402      branch-misses             #    7.35% of all branches          (80.01%)
              24,560,249      l1d.replacement                                               (80.00%)
              43,174,971      l2_lines_in.all                                               (80.05%)
              31,891,457      l2_rqsts.all_code_rd                                          (79.92%)
      
      The extra event is scheduled with some extra multiplexing.
      
      v2: Move fallback code to separate function.
      Add comment on for_each_group_member
      Adjust to new perf_evsel__close interface
      v3: Fix debug print out.
      
      Committer testing:
      
      Before:
      
        # perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1
      
         Performance counter stats for 'system wide':
      
           <not counted>      branches
           <not counted>      branch-misses
           <not counted>      l1d.replacement
           <not counted>      l2_lines_in.all
         <not supported>      l2_rqsts.all_code_rd
      
             1.002147212 seconds time elapsed
      
        # perf stat -e '{branches,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1
      
         Performance counter stats for 'system wide':
      
              83,207,892      branches
              11,065,444      l1d.replacement
              28,484,024      l2_lines_in.all
              12,186,179      l2_rqsts.all_code_rd
      
             1.001739493 seconds time elapsed
      
      After:
      
        # perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}':W -a sleep 1
      
         Performance counter stats for 'system wide':
      
             543,323,909      branches                                                      (80.01%)
              27,100,512      branch-misses             #    4.99% of all branches          (80.02%)
              50,402,905      l1d.replacement                                               (80.03%)
              67,385,892      l2_lines_in.all                                               (80.01%)
              21,352,885      l2_rqsts.all_code_rd                                          (79.94%)
      
             1.001086658 seconds time elapsed
      
        #
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/r/20170831194036.30146-2-andi@firstfloor.org
      [ Add a "'perf stat' only, for now" comment in the man page, suggested by Jiri ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5a5dfe4b
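The weak-group fallback described in the commit above can be sketched as a simple two-step decision. Everything here is hypothetical: `group_is_schedulable()` stands in for the kernel's answer and assumes that the only constraint is the number of available counters, which is a strong simplification of real PMU scheduling rules. The names and the enum are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

enum open_mode { OPENED_AS_GROUP, OPENED_UNGROUPED };

/*
 * Hypothetical stand-in for the kernel's answer: a group is
 * schedulable only if all members fit on the PMU at once.
 */
static bool group_is_schedulable(int nr_events, int nr_counters)
{
	return nr_events <= nr_counters;
}

/*
 * Weak group: try the group first; if it cannot be scheduled,
 * fall back to opening every event on its own, which lets the
 * events be multiplexed instead of never being counted.
 */
enum open_mode open_weak_group(int nr_events, int nr_counters)
{
	assert(nr_events > 0 && nr_counters > 0);
	if (group_is_schedulable(nr_events, nr_counters))
		return OPENED_AS_GROUP;
	return OPENED_UNGROUPED;
}
```

Under the assumed limit of four counters, the five-event group from the example would fall back to ungrouped events, while the four-event group from the committer test would stay a group.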