1. 14 Jan, 2020 2 commits
  2. 06 Jan, 2020 16 commits
  3. 21 Dec, 2019 3 commits
  4. 17 Dec, 2019 2 commits
  5. 11 Dec, 2019 5 commits
    • M
      perf header: Fix false warning when there are no duplicate cache entries · 28707826
      Committed by Michael Petlan
      Before this patch, perf expected that there might be NPROC*4 unique
      cache entries at max, however, it also expected that some of them would
      be shared and/or of the same size, thus the final number of entries
      would be reduced to be lower than NPROC*4. In case the number of entries
      hadn't been reduced (was NPROC*4), the warning was printed.
      
      However, some systems might have unusual cache topology, such as the
      following two-processor KVM guest:
      
      	cpu  level  shared_cpu_list  size
      	  0     1         0           32K
      	  0     1         0           64K
      	  0     2         0           512K
      	  0     3         0           8192K
      	  1     1         1           32K
      	  1     1         1           64K
      	  1     2         1           512K
      	  1     3         1           8192K
      
      This KVM guest has 8 (NPROC*4) unique cache entries, which used to make
      perf print the message, although there actually aren't "way too many
      cpu caches".
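
The counting behind this can be sketched in C. This is a simplified illustration, not the code from the patch: `struct cache_entry`, `unique_caches()` and `old_check_warns()` are hypothetical names, and the value 4 for `MAX_CACHE_LVL` is assumed from the NPROC*4 bound described above. The sketch de-duplicates the guest's cache table and shows that a "table is full" test fires even though every entry is legitimate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHE_LVL 4	/* assumed: at most 4 cache entries per CPU */

/* Hypothetical, simplified stand-in for a per-cache record. */
struct cache_entry {
	int level;
	const char *shared_cpu_list;
	const char *size;
};

static bool same_cache(const struct cache_entry *a, const struct cache_entry *b)
{
	return a->level == b->level &&
	       strcmp(a->shared_cpu_list, b->shared_cpu_list) == 0 &&
	       strcmp(a->size, b->size) == 0;
}

/*
 * Copy the unique entries of in[0..n) to out[] and return how many there are.
 * out[] must have room for n entries.
 */
static size_t unique_caches(const struct cache_entry *in, size_t n,
			    struct cache_entry *out)
{
	size_t cnt = 0;

	assert(in && out);

	for (size_t i = 0; i < n; i++) {
		size_t j;

		for (j = 0; j < cnt; j++)
			if (same_cache(&in[i], &out[j]))
				break;
		if (j == cnt)		/* not seen before: keep it */
			out[cnt++] = in[i];
	}
	return cnt;
}

/* Old-style check: treats a completely full table as "too many caches". */
static bool old_check_warns(size_t cnt, size_t nr_cpus)
{
	return cnt == nr_cpus * MAX_CACHE_LVL;
}

/* The two-processor KVM guest from the table above: nothing is shared. */
static const struct cache_entry kvm_guest[] = {
	{ 1, "0", "32K" }, { 1, "0", "64K" }, { 2, "0", "512K" }, { 3, "0", "8192K" },
	{ 1, "1", "32K" }, { 1, "1", "64K" }, { 2, "1", "512K" }, { 3, "1", "8192K" },
};
```

Calling `unique_caches(kvm_guest, 8, out)` leaves all 8 entries in place, so `old_check_warns(8, 2)` is true although the table has not overflowed.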
      
      v2: Removing unused argument.
      
      v3: Unifying the way we obtain number of cpus.
      
      v4: Removed '& UINT_MAX' construct which is redundant.

      Signed-off-by: Michael Petlan <mpetlan@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      LPU-Reference: 20191208162056.20772-1-mpetlan@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      28707826
    • K
      perf metricgroup: Fix printing event names of metric group with multiple events · eb573e74
      Committed by Kajol Jain
      Commit f01642e4 ("perf metricgroup: Support multiple events for
      metricgroup") introduced support for multiple events in a metric group.
      However, with current upstream, metric event names are not printed
      properly.
      
      In power9 platform:
      
      command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
           1.000208486
           2.000368863
           2.001400558
      
      Similarly in skylake platform:
      
      command:./perf stat --metric-only -M Power -I 1000
           1.000579994
           2.002189493
      
      With the current upstream version, the issue is in the event name
      comparison logic in find_evsel_group(). The current logic compares the
      events belonging to a metric group with the events in perf_evlist.
      Since the break statement is missing in the loop used for that
      comparison, the loop continues to execute even after getting a pattern
      match, and ends up discarding the matches.
      
      When only a single metric event belongs to the metric group it works
      fine, because with a single event the loop reaches the end of
      perf_evlist once it has compared all events.
      
      Example for single metric event in power9 platform:
      
      command:# ./perf stat --metric-only  -M branches_per_inst -I 1000 sleep 1
           1.000094653                  0.2
           1.001337059                  0.0
      
      This patch fixes the issue by adding the corresponding condition, so
      that once all events belonging to the metric group have been matched in
      find_evsel_group(), we break out of that loop.
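
The effect of the added break can be sketched as follows. This is a hypothetical simplification rather than the real find_evsel_group(): it works on plain strings instead of perf's event structures, and `find_group_start()` is an illustrative name. It looks for the metric group's event names as a consecutive run in the event list and stops as soon as the last name has matched.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical simplification of the matching loop: look for the idnum names
 * in ids[] as a consecutive run inside evlist[0..nr). Returns the index where
 * the run starts, or -1 if there is no such run.
 */
static int find_group_start(const char *const evlist[], int nr,
			    const char *const ids[], int idnum)
{
	int start = -1;
	int ind = 0;

	assert(idnum > 0);

	for (int i = 0; i < nr; i++) {
		if (strcmp(evlist[i], ids[ind]) == 0) {
			if (start < 0)
				start = i;
			ind++;
			/*
			 * Every id matched: stop here so that later events
			 * cannot discard the match.
			 */
			if (ind == idnum)
				break;
		} else {
			/* Unrelated event: drop the partial match. */
			ind = 0;
			start = -1;
		}
	}
	return ind == idnum ? start : -1;
}
```

Without the `ind == idnum` test inside the loop, any event following a complete match would take the else branch and reset the match, which is the behaviour the commit message describes for groups with several events.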
      
      With this patch:
      In power9 platform:
      
      command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
      result:#
                  time  derat_4k_miss_rate_percent  derat_4k_miss_ratio derat_miss_ratio derat_64k_miss_rate_percent  derat_64k_miss_ratio dslb_miss_rate_percent islb_miss_rate_percent
           1.000135672                         0.0                  0.3              1.0                         0.0                   0.2                    0.0                    0.0
           2.000380617                         0.0                  0.0              0.0                         0.0                   0.0                    0.0                    0.0
      
      Similarly in skylake platform:
      
      command:# ./perf stat --metric-only -M Power -I 1000
      result:#
                  time    Turbo_Utilization    C3_Core_Residency  C6_Core_Residency  C7_Core_Residency    C2_Pkg_Residency  C3_Pkg_Residency     C6_Pkg_Residency   C7_Pkg_Residency
           1.000563580                  0.3                  0.0                2.6               44.2                21.9               0.0                  0.0               0.0
           2.002235027                  0.4                  0.0                2.7               43.0                20.7               0.0                  0.0               0.0
      
      Committer testing:
      
        Before:
      
        [root@seventh ~]# perf stat --metric-only -M Power -I 1000
        #           time
             1.000383223
             2.001168182
             3.001968545
             4.002741200
             5.003442022
        ^C     5.777687244
      
        [root@seventh ~]#
      
        After the patch:
      
        [root@seventh ~]# perf stat --metric-only -M Power -I 1000
        #           time    Turbo_Utilization    C3_Core_Residency    C6_Core_Residency    C7_Core_Residency     C2_Pkg_Residency     C3_Pkg_Residency     C6_Pkg_Residency     C7_Pkg_Residency
             1.000406577                  0.4                  0.1                  1.4                 97.0                  0.0                  0.0                  0.0                  0.0
             2.001481572                  0.3                  0.0                  0.6                 97.9                  0.0                  0.0                  0.0                  0.0
             3.002332585                  0.2                  0.0                  1.0                 97.5                  0.0                  0.0                  0.0                  0.0
             4.003196624                  0.2                  0.0                  0.3                 98.6                  0.0                  0.0                  0.0                  0.0
             5.004063851                  0.3                  0.0                  0.7                 97.7                  0.0                  0.0                  0.0                  0.0
        ^C     5.471260276                  0.2                  0.0                  0.5                 49.3                  0.0                  0.0                  0.0                  0.0
      
        [root@seventh ~]#
        [root@seventh ~]# dmesg | grep -i skylake
        [    0.187807] Performance Events: PEBS fmt3+, Skylake events, 32-deep LBR, full-width counters, Intel PMU driver.
        [root@seventh ~]#
      
      Fixes: f01642e4 ("perf metricgroup: Support multiple events for metricgroup")
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191120084059.24458-1-kjain@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eb573e74
    • R
      perf/x86/pmu-events: Fix Kernel_Utilization metric · 0dd674ef
      Committed by Ravi Bangoria
      Kernel Utilization should divide the ref cycles spent in the kernel by
      the total ref cycles.
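
As a small illustration of that ratio, here is a sketch in C. The function name and the zero-denominator handling are assumptions made for the example; the actual metric is an expression in the pmu-events data, not C code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: fraction of reference cycles spent in kernel mode.
 * Returns 0.0 when no reference cycles were counted at all.
 */
static double kernel_utilization(uint64_t kernel_ref_cycles,
				 uint64_t total_ref_cycles)
{
	if (total_ref_cycles == 0)
		return 0.0;
	return (double)kernel_ref_cycles / (double)total_ref_cycles;
}
```

For instance, 250 kernel ref cycles out of 1000 total ref cycles gives a utilization of 0.25.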

      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Haiyan Song <haiyanx.song@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20191204162121.29998-1-ravi.bangoria@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0dd674ef
    • A
      perf top: Do not bail out when perf_env__read_cpuid() returns ENOSYS · 61208e6e
      Committed by Arnaldo Carvalho de Melo
      'perf top' stopped working on hw architectures that do not provide a
      get_cpuid() implementation and thus fall back to the weak get_cpuid()
      default function.
      
      This is done because at annotation time we may need it in the arch
      specific annotation init routine, but that is only being used by arches
      that do provide a get_cpuid() implementation:
      
        $ find tools/  -name "*.[ch]" | xargs grep 'evlist->env'
        tools/perf/builtin-top.c:	top.evlist->env = &perf_env;
        tools/perf/util/evsel.c:		return evsel->evlist->env;
        tools/perf/util/s390-cpumsf.c:	sf->machine_type = s390_cpumsf_get_type(session->evlist->env->cpuid);
        tools/perf/util/header.c:	session->evlist->env = &header->env;
        tools/perf/util/sample-raw.c:	const char *arch_pf = perf_env__arch(evlist->env);
        $
      
        $ find tools/perf/arch  -name "*.[ch]" | xargs grep -w get_cpuid
        tools/perf/arch/x86/util/auxtrace.c:	ret = get_cpuid(buffer, sizeof(buffer));
        tools/perf/arch/x86/util/header.c:get_cpuid(char *buffer, size_t sz)
        tools/perf/arch/powerpc/util/header.c:get_cpuid(char *buffer, size_t sz)
        tools/perf/arch/s390/util/header.c: * Implementation of get_cpuid().
        tools/perf/arch/s390/util/header.c:int get_cpuid(char *buffer, size_t sz)
        tools/perf/arch/s390/util/header.c:	if (buf && get_cpuid(buf, 128))
        $
      
      For 'report' or 'script', i.e. tools working on perf.data files, that is
      set up while reading the header; it's just top that needs to explicitly
      read it at tool start.
      
      Fixes: 608127f7 ("perf top: Initialize perf_env->cpuid, needed by the per arch annotation init routine")
      Reported-by: John Garry <john.garry@huawei.com>
      Analysed-by: Jiri Olsa <jolsa@kernel.org>
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: John Garry <john.garry@huawei.com> # arm64
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lkml.kernel.org/n/tip-lxwjr0cd2eggzx04a780ffrv@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      61208e6e
    • A
      perf arch: Make the default get_cpuid() return compatible error · 05267c7e
      Committed by Arnaldo Carvalho de Melo
      Some of the functions calling get_cpuid() propagate back the error it
      returns, and all of them use (positive) errno values. Make the weak
      default get_cpuid() function return ENOSYS, both to be consistent and to
      allow checking whether this is an arch not providing this function or
      whether a provided one is having trouble getting the cpuid, so that the
      caller can decide if a warning should be shown to the user or just a
      debug message should be emitted.
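
The pattern can be sketched as below. This is an illustration under assumptions, not the perf source: `enum cpuid_status`, `classify_cpuid_error()` and `read_cpuid()` are hypothetical names, and `__attribute__((weak))` is the GCC/Clang spelling of a weak definition that an arch-specific get_cpuid() would override at link time.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Weak fallback, used only when no arch-specific get_cpuid() is linked in.
 * It reports "not implemented" with a positive errno value.
 */
__attribute__((weak)) int get_cpuid(char *buffer, size_t sz)
{
	(void)buffer;
	(void)sz;
	return ENOSYS;
}

/* Hypothetical result type for the caller's decision. */
enum cpuid_status {
	CPUID_OK,		/* cpuid string was read */
	CPUID_UNSUPPORTED,	/* arch has no get_cpuid(): debug message only */
	CPUID_FAILED,		/* arch implementation failed: warn the user */
};

static enum cpuid_status classify_cpuid_error(int err)
{
	if (err == 0)
		return CPUID_OK;
	return err == ENOSYS ? CPUID_UNSUPPORTED : CPUID_FAILED;
}

static enum cpuid_status read_cpuid(char *buffer, size_t sz)
{
	return classify_cpuid_error(get_cpuid(buffer, sz));
}
```

With only the weak default linked in, `read_cpuid()` yields `CPUID_UNSUPPORTED`, which a tool such as 'perf top' could treat as non-fatal.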

      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: John Garry <john.garry@huawei.com> # arm64
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lkml.kernel.org/n/tip-lxwjr0cd2eggzx04a780ffrv@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      05267c7e
  6. 04 Dec, 2019 4 commits
  7. 03 Dec, 2019 2 commits
  8. 02 Dec, 2019 1 commit
    • A
      perf bench: Update the copies of x86's mem{cpy,set}_64.S · bd5c6b81
      Committed by Arnaldo Carvalho de Melo
      And update linux/linkage.h, which requires in turn that we make these
      files switch from ENTRY()/ENDPROC() to SYM_FUNC_START()/SYM_FUNC_END():
      
        tools/perf/arch/arm64/tests/regs_load.S
        tools/perf/arch/arm/tests/regs_load.S
        tools/perf/arch/powerpc/tests/regs_load.S
        tools/perf/arch/x86/tests/regs_load.S
      
      We also need to switch SYM_FUNC_START_LOCAL() to SYM_FUNC_START() for
      the functions used directly by 'perf bench', and update
      tools/perf/check_headers.sh to ignore those changes when checking if the
      kernel original files drifted from the copies we carry.
      
      This is to get the changes from:
      
        6dcc5627 ("x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_*")
        ef1e0315 ("x86/asm: Make some functions local")
        e9b9d020 ("x86/asm: Annotate aliases")
      
      And address these tools/perf build warnings:
      
        Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
        diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
        Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
        diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-tay3l8x8k11p7y3qcpqh9qh5@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      bd5c6b81
  9. 30 Nov, 2019 1 commit
  10. 29 Nov, 2019 4 commits