1. 30 7月, 2019 11 次提交
  2. 23 7月, 2019 1 次提交
    • J
      perf stat: Fix segfault for event group in repeat mode · 08ef3af1
      Jiri Olsa 提交于
      Numfor Mbiziwo-Tiapo reported segfault on stat of event group in repeat
      mode:
      
        # perf stat -e '{cycles,instructions}' -r 10 ls
      
      It's caused by memory corruption due to not cleaned evsel's id array and
      index, which needs to be rebuilt in every stat iteration. Currently the
      ids index grows, while the array (which is also not freed) has the same
      size.
      
      Fixing this by releasing id array and zeroing ids index in
      perf_evsel__close function.
      
      We also need to keep the evsel_list alive for stat record (which is
      disabled in repeat mode).
      Reported-by: NNumfor Mbiziwo-Tiapo <nums@google.com>
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Drayton <mbd@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20190715142121.GC6032@kravaSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      08ef3af1
  3. 09 7月, 2019 3 次提交
  4. 26 6月, 2019 1 次提交
    • A
      tools perf: Move from sane_ctype.h obtained from git to the Linux's original · 3052ba56
      Arnaldo Carvalho de Melo 提交于
      We got the sane_ctype.h headers from git and kept using it so far, but
      since that code originally came from the kernel sources to the git
      sources, perhaps its better to just use the one in the kernel, so that
      we can leverage tools/perf/check_headers.sh to be notified when our copy
      gets out of sync, i.e. when fixes or goodies are added to the code we've
      copied.
      
      This will help with things like tools/lib/string.c where we want to have
      more things in common with the kernel, such as strim(), skip_spaces(),
      etc so as to go on removing the things that we have in tools/perf/util/
      and instead using the code in the kernel, indirectly and removing things
      like EXPORT_SYMBOL(), etc, getting notified when fixes and improvements
      are made to the original code.
      
      Hopefully this also should help with reducing the difference of code
      hosted in tools/ to the one in the kernel proper.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-7k9868l713wqtgo01xxygn12@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3052ba56
  5. 11 6月, 2019 1 次提交
    • K
      perf stat: Support per-die aggregation · db5742b6
      Kan Liang 提交于
      It is useful to aggregate counts per die. E.g. Uncore becomes die-scope
      on Xeon Cascade Lake-AP.
      
      Introduce a new option "--per-die" to support per-die aggregation.
      
      The global id for each core has been changed to socket + die id + core
      id. The global id for each die is socket + die id.
      
      Add die information for per-core aggregation. The output of per-core
      aggregation will be changed from "S0-C0" to "S0-D0-C0". Any scripts
      which rely on the output format of per-core aggregation probably be
      broken.
      
      For 'perf stat record/report', there is no die information when
      processing the old perf.data. The per-die result will be the same as
      per-socket.
      
      Committer notes:
      
      Renamed 'die' variable to 'die_id' to fix the build in some systems:
      
          CC       /tmp/build/perf/builtin-script.o
        cc1: warnings being treated as errors
        builtin-stat.c: In function 'perf_env__get_die':
        builtin-stat.c:963: error: declaration of 'die' shadows a global declaration
        util/util.h:19: error: shadowed declaration is here
        mv: cannot stat `/tmp/build/perf/.builtin-stat.o.tmp': No such file or directory
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/n/tip-bsnhx7vgsuu6ei307mw60mbj@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      db5742b6
  6. 05 6月, 2019 1 次提交
  7. 17 5月, 2019 1 次提交
    • J
      perf stat: Support 'percore' event qualifier · 4fc4d8df
      Jin Yao 提交于
      With this patch, we can use the 'percore' event qualifier in perf-stat.
      
        root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/ -a -A -I1000
          1.000773050 S0-C0   98,352,832 cpu/event=0,umask=0x3,percore=1/  (50.01%)
          1.000773050 S0-C1  103,763,057 cpu/event=0,umask=0x3,percore=1/  (50.02%)
          1.000773050 S0-C2  196,776,995 cpu/event=0,umask=0x3,percore=1/  (50.02%)
          1.000773050 S0-C3  176,493,779 cpu/event=0,umask=0x3,percore=1/  (50.02%)
          1.000773050 CPU0    47,699,641 cpu/event=0,umask=0x3/            (50.02%)
          1.000773050 CPU1    49,052,451 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU2   102,771,422 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU3   100,784,662 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU4    43,171,342 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU5    54,152,158 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU6    93,618,410 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU7    74,477,589 cpu/event=0,umask=0x3/            (49.99%)
      
      In this example, we count the event 'ref-cycles' per-core and per-CPU in
      one perf stat command-line. From the output, we can see:
      
        S0-C0 = CPU0 + CPU4
        S0-C1 = CPU1 + CPU5
        S0-C2 = CPU2 + CPU6
        S0-C3 = CPU3 + CPU7
      
      So the result is expected (tiny difference is ignored).
      
      Note that, the 'percore' event qualifier needs to use with option '-A'.
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Tested-by: NRavi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1555077590-27664-4-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4fc4d8df
  8. 16 4月, 2019 1 次提交
  9. 02 4月, 2019 1 次提交
    • A
      perf stat: Implement duration_time as a proper event · f0fbb114
      Andi Kleen 提交于
      The perf metric expression use 'duration_time' internally to normalize
      events.  Normal 'perf stat' without -x also prints the duration time.
      But when using -x, the interval is not output anywhere, which is
      inconvenient for any post processing which often wants to normalize
      values to time.
      
      So implement 'duration_time' as a proper perf event that can be
      specified explicitely with -e.
      
      The previous implementation of 'duration_time' only worked for metric
      processing. This adds the concept of a tool event that is handled by the
      tool. On the kernel level it is still mapped to the dummy software
      event, but the values are not read anymore, but instead computed by the
      tool.
      
      Add proper plumbing to handle this in the event parser, and display it
      in 'perf stat'. We don't want 'duration_time' to be added up, so it's
      only printed for the first CPU.
      
      % perf stat -e duration_time,cycles true
      
       Performance counter stats for 'true':
      
                 555,476 ns   duration_time
                 771,958      cycles
      
             0.000555476 seconds time elapsed
      
             0.000644000 seconds user
             0.000000000 seconds sys
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20190326221823.11518-3-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f0fbb114
  10. 20 3月, 2019 1 次提交
    • A
      perf stat: Fix --no-scale · 75998bb2
      Andi Kleen 提交于
      The -c option to enable multiplex scaling has been useless for quite
      some time because scaling is default.
      
      It's only useful as --no-scale to disable scaling. But the non scaling
      code path has bitrotted and doesn't print anything because perf output
      code relies on value run/ena information.
      
      Also even when we don't want to scale a value it's still useful to show
      its multiplex percentage.
      
      This patch:
        - Fixes help and documentation to show --no-scale instead of -c
        - Removes -c, only keeps the long option because -c doesn't support negatives.
        - Enables running/enabled even with --no-scale
        - And fixes some other problems in the no-scale output.
      
      Before:
      
        $ perf stat --no-scale -e cycles true
      
         Performance counter stats for 'true':
      
             <not counted>      cycles
      
               0.000984154 seconds time elapsed
      
      After:
      
        $ ./perf stat --no-scale -e cycles true
      
         Performance counter stats for 'true':
      
                   706,070      cycles
      
               0.001219821 seconds time elapsed
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      LPU-Reference: 20190314225002.30108-9-andi@firstfloor.org
      Link: https://lkml.kernel.org/n/tip-xggjvwcdaj2aqy8ib3i4b1g6@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      75998bb2
  11. 23 2月, 2019 1 次提交
    • J
      perf data: Add global path holder · 2d4f2799
      Jiri Olsa 提交于
      Add a 'path' member to 'struct perf_data'. It will keep the configured
      path for the data (const char *). The path in struct perf_data_file is
      now dynamically allocated (duped) from it.
      
      This scheme is useful/used in following patches where struct
      perf_data::path holds the 'configure' directory path and struct
      perf_data_file::path holds the allocated path for specific files.
      
      Also it actually makes the code little simpler.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
      [ Fixup data-convert-bt.c missing conversion ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2d4f2799
  12. 06 2月, 2019 1 次提交
  13. 22 1月, 2019 1 次提交
  14. 03 1月, 2019 1 次提交
    • J
      perf stat: Fix endless wait for child process · 8a99255a
      Jin Yao 提交于
      We hit a 'perf stat' issue by using following script:
      
        #!/bin/bash
      
        sleep 1000 &
        exec perf stat -a -e cycles -I1000 -- sleep 5
      
      Since "perf stat" is launched by exec, the "sleep 1000" would be the
      child process of "perf stat". The wait4() call will not return because
      it's waiting for the child process "sleep 1000" to end. So 'perf stat'
      doesn't return even after 5s passes.
      
      This patch lets 'perf stat' return when the specified child process ends
      (in this case, the specified child process is "sleep 5").
      
      Committer testing:
      
        # cat test.sh
        #!/bin/bash
      
        sleep 10 &
        exec perf stat -a -e cycles -I1000 -- sleep 5
        #
      
      Before:
      
        # time ./test.sh
        #           time             counts unit events
             1.001113090        108,453,351      cycles
             2.002062196        142,075,435      cycles
             3.002896194        164,801,068      cycles
             4.003731666        107,062,140      cycles
             5.002068867        112,241,832      cycles
      
        real	0m10.066s
        user	0m0.016s
        sys	0m0.101s
        #
      
      After:
      
        # time ./test.sh
        #           time             counts unit events
             1.001016096         91,412,027      cycles
             2.002014963        124,063,708      cycles
             3.002883964        125,993,929      cycles
             4.003706470        120,465,734      cycles
             5.002006778        163,560,355      cycles
      
        real	0m5.123s
        user	0m0.014s
        sys	0m0.105s
        #
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1546501245-4512-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8a99255a
  15. 18 12月, 2018 1 次提交
  16. 06 11月, 2018 1 次提交
  17. 22 10月, 2018 1 次提交
  18. 19 9月, 2018 1 次提交
  19. 31 8月, 2018 10 次提交