1. 26 Nov, 2020 4 commits
    • perf arm-spe: Refactor address packet handling · 09935ca7
      Committed by Leo Yan
      This patch refactors address packet handling: it defines macros for
      the address packet's header and payload, which are used by the decoder
      and the dump flow.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Andre Przywara <andre.przywara@arm.com>
      Acked-by: Will Deacon <will@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wei Li <liwei391@huawei.com>
      Link: https://lore.kernel.org/r/20201119152441.6972-5-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Add new function arm_spe_pkt_desc_addr() · ab2aa439
      Committed by Leo Yan
      This patch moves the address parsing code out of arm_spe_pkt_desc()
      and into the newly introduced function arm_spe_pkt_desc_addr(), which
      processes address packets.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Andre Przywara <andre.przywara@arm.com>
      Acked-by: Will Deacon <will@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wei Li <liwei391@huawei.com>
      Link: https://lore.kernel.org/r/20201119152441.6972-4-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Refactor packet header parsing · 11695142
      Committed by Leo Yan
      The packet header parsing uses hard-coded values and nested if-else
      statements.

      To improve readability, this patch refactors the macros for the packet
      header format so that the hard-coded values are removed.  Furthermore,
      based on the new mask macros, it replaces the nested if-else statements
      with flat condition checks; this is more direct and maps easily to the
      descriptions in the ARMv8-A Architecture Reference Manual
      (ARM DDI 0487E.a), chapter 'D10.1.5 Statistical Profiling Extension
      protocol packet headers'.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Andre Przywara <andre.przywara@arm.com>
      Acked-by: Will Deacon <will@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wei Li <liwei391@huawei.com>
      Link: https://lore.kernel.org/r/20201119152441.6972-3-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
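The "flat condition checks" idea described in the commit above can be sketched as follows. This is only an illustration, not the code from the patch: all `DEMO_` names are hypothetical, the extended header format is left out, and the mask and value constants reflect one reading of the SPE packet header encodings that should be checked against the architecture reference manual before being relied on.

```c
#include <stdint.h>

/* Hypothetical packet kinds for this sketch. */
enum demo_pkt_type {
	DEMO_PKT_BAD,
	DEMO_PKT_PAD,
	DEMO_PKT_END,
	DEMO_PKT_TIMESTAMP,
	DEMO_PKT_EVENTS,
	DEMO_PKT_DATA_SOURCE,
	DEMO_PKT_CONTEXT,
	DEMO_PKT_OP_TYPE,
	DEMO_PKT_ADDRESS,
	DEMO_PKT_COUNTER,
};

/* Assumed header encodings; verify against the manual. */
#define DEMO_HDR_PAD		0x00
#define DEMO_HDR_END		0x01
#define DEMO_HDR_TIMESTAMP	0x71
#define DEMO_HDR_MASK1		0xcf	/* bits [7:6] and [3:0] */
#define DEMO_HDR_EVENTS		0x42
#define DEMO_HDR_SOURCE		0x43
#define DEMO_HDR_MASK2		0xfc	/* bits [7:2] */
#define DEMO_HDR_CONTEXT	0x64
#define DEMO_HDR_OP_TYPE	0x48
#define DEMO_HDR_MASK3		0xf8	/* bits [7:3] */
#define DEMO_HDR_ADDRESS	0xb0
#define DEMO_HDR_COUNTER	0x98

/* One flat check per header format instead of nested if-else blocks. */
enum demo_pkt_type demo_classify_header(uint8_t hdr)
{
	if (hdr == DEMO_HDR_PAD)
		return DEMO_PKT_PAD;
	if (hdr == DEMO_HDR_END)
		return DEMO_PKT_END;
	if (hdr == DEMO_HDR_TIMESTAMP)
		return DEMO_PKT_TIMESTAMP;
	if ((hdr & DEMO_HDR_MASK1) == DEMO_HDR_EVENTS)
		return DEMO_PKT_EVENTS;
	if ((hdr & DEMO_HDR_MASK1) == DEMO_HDR_SOURCE)
		return DEMO_PKT_DATA_SOURCE;
	if ((hdr & DEMO_HDR_MASK2) == DEMO_HDR_CONTEXT)
		return DEMO_PKT_CONTEXT;
	if ((hdr & DEMO_HDR_MASK2) == DEMO_HDR_OP_TYPE)
		return DEMO_PKT_OP_TYPE;
	if ((hdr & DEMO_HDR_MASK3) == DEMO_HDR_ADDRESS)
		return DEMO_PKT_ADDRESS;
	if ((hdr & DEMO_HDR_MASK3) == DEMO_HDR_COUNTER)
		return DEMO_PKT_COUNTER;
	return DEMO_PKT_BAD;
}
```

Each condition corresponds to one row of a header-format table, which is what makes this layout easy to compare against a specification.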
    • perf arm-spe: Refactor printing string to buffer · 75eeaddd
      Committed by Leo Yan
      When outputting strings to the decoding buffer with snprintf(), the
      SPE decoder needs to detect whether snprintf() returned an error and,
      if so, bail out directly.  If snprintf() succeeds, the decoder needs
      to advance the buffer pointer and reduce the buffer length so that it
      can output the next string into the subsequent memory space.

      This complex logic is spread throughout arm_spe_pkt_desc(), so there
      is a lot of duplicated code for detecting errors, incrementing the
      buffer pointer and decrementing the buffer size.

      To avoid the duplicated code, this patch introduces a new helper
      function, arm_spe_pkt_out_string(), which wraps up the complex logic
      and is used by the caller arm_spe_pkt_desc().  This patch also makes
      the variable 'blen' a function-local variable, which allows the
      unnecessary braces to be removed and improves readability.

      This patch simplifies the return value of arm_spe_pkt_desc(): '0' means
      success and any other value means an error has occurred.  To achieve
      this, it relies on the 'err' parameter of arm_spe_pkt_out_string();
      'err' is a cumulative value that holds the final result whether the
      buffer is printed to once or multiple times.  Finally, the error is
      handled in one central place: rather than bailing out directly in the
      switch cases, arm_spe_pkt_desc() returns the error at its end.

      This patch changes the caller arm_spe_dump() to respect the updated
      return value semantics of arm_spe_pkt_desc().
      Suggested-by: Dave Martin <Dave.Martin@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Andre Przywara <andre.przywara@arm.com>
      Reviewed-by: Dave Martin <Dave.Martin@arm.com>
      Acked-by: Will Deacon <will@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wei Li <liwei391@huawei.com>
      Link: https://lore.kernel.org/r/20201119152441.6972-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
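The cumulative-error pattern described in the commit above might look roughly like the sketch below. It is not the patch's code: `demo_out_string()` and `demo_describe()` are hypothetical names, and the choice of `-1` as the error for a truncated write is an assumption made for this example.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Append formatted text to *buf_p. On success, advance the pointer and
 * shrink the remaining length. On failure, record the error in *err;
 * once *err is set, later calls do nothing, so callers can chain calls
 * and check the error once at the end.
 */
static int demo_out_string(int *err, char **buf_p, size_t *blen,
			   const char *fmt, ...)
{
	va_list ap;
	int ret;

	/* An earlier call already failed: keep that error, write nothing. */
	if (*err)
		return *err;

	va_start(ap, fmt);
	ret = vsnprintf(*buf_p, *blen, fmt, ap);
	va_end(ap);

	if (ret < 0) {
		*err = ret;		/* formatting error */
	} else if ((size_t)ret >= *blen) {
		*err = -1;		/* output truncated (assumed error code) */
	} else {
		*buf_p += ret;
		*blen -= (size_t)ret;
	}
	return ret;
}

/* Returns 0 on success and non-zero on error, checked in one place. */
int demo_describe(char *buf, size_t buf_len, unsigned int idx,
		  unsigned long long payload)
{
	int err = 0;
	size_t blen = buf_len;

	demo_out_string(&err, &buf, &blen, "PKT %u", idx);
	demo_out_string(&err, &buf, &blen, " 0x%llx", payload);
	return err;
}
```

The caller never inspects the result of an individual write; it only returns `err` at the end, which is the "central place" for error handling that the commit message refers to.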
  2. 17 Nov, 2020 3 commits
    • perf expr: Force encapsulation on expr_id_data · 29396cd5
      Committed by Ian Rogers
      This patch resolves some undefined behavior where variables in
      expr_id_data were accessed (for debugging) without being defined. To
      better enforce the tagged union behavior, the struct is moved into
      expr.c and accessors provided. Tag values (kinds) are explicitly
      identified.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20200826153055.2067780-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
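A minimal sketch of the encapsulation idea from the commit above: a tagged union whose layout is private, with accessors that check the tag before returning a member. The `demo_` names, the two kinds and the member types are all hypothetical and do not reflect the actual layout of expr_id_data; in a real project the struct definition would live in the .c file and only the declarations would be exported in a header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Explicit tag values (kinds). */
enum demo_id_kind {
	DEMO_KIND_VALUE,	/* holds a number */
	DEMO_KIND_REF,		/* holds a reference to another name */
};

/* Private layout: callers are expected to go through the accessors. */
struct demo_id_data {
	enum demo_id_kind kind;
	union {
		double val;
		const char *ref_name;
	} u;
};

struct demo_id_data *demo_id_data_new_value(double val)
{
	struct demo_id_data *d = malloc(sizeof(*d));

	if (d) {
		d->kind = DEMO_KIND_VALUE;
		d->u.val = val;
	}
	return d;
}

struct demo_id_data *demo_id_data_new_ref(const char *name)
{
	struct demo_id_data *d = malloc(sizeof(*d));

	if (d) {
		d->kind = DEMO_KIND_REF;
		d->u.ref_name = name;
	}
	return d;
}

/* Each accessor refuses to read a member that the tag says is not set. */
bool demo_id_data_value(const struct demo_id_data *d, double *val)
{
	if (!d || d->kind != DEMO_KIND_VALUE)
		return false;
	*val = d->u.val;
	return true;
}

bool demo_id_data_ref(const struct demo_id_data *d, const char **name)
{
	if (!d || d->kind != DEMO_KIND_REF)
		return false;
	*name = d->u.ref_name;
	return true;
}
```

Because every read goes through a tag check, a debug print can no longer pick up a union member that was never written, which is the class of undefined behavior the commit message describes.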
    • perf vendor events: Update Skylake client events to v50 · 3d05181a
      Committed by Jin Yao
      - Update Skylake events to v50.
      - Update Skylake JSON metrics from TMAM 4.0.
      - Fix the issue in DRAM_Parallel_Reads
      - Fix the perf test warning
      
      Before:
      
        root@kbl-ppc:~# perf stat -M DRAM_Parallel_Reads -- sleep 1
        event syntax error: '{arb/event=0x80,umask=0x2/,arb/event=0x80,umask=0x2,thresh=1/}:W'
                             \___ unknown term 'thresh' for pmu 'uncore_arb'
      
        valid terms: event,edge,inv,umask,cmask,config,config1,config2,name,period,percore
      
        Initial error:
        event syntax error: '..umask=0x2/,arb/event=0x80,umask=0x2,thresh=1/}:W'
                                          \___ Cannot find PMU `arb'. Missing kernel support?
      
        root@kbl-ppc:~# perf test metrics
        10: PMU events                                 :
        10.3: Parsing of PMU event table metrics               : Skip (some metrics failed)
        10.4: Parsing of PMU event table metrics with fake PMUs: Ok
        67: Parse and process metrics                  : Ok
      
      After:
      
        root@kbl-ppc:~# perf stat -M MEM_Parallel_Reads -- sleep 1
      
         Performance counter stats for 'system wide':
      
                 4,951,646      arb/event=0x80,umask=0x2/ #    26.30 MEM_Parallel_Reads       (50.04%)
                   188,251      arb/event=0x80,umask=0x2,cmask=1/                                     (49.96%)
      
               1.000867010 seconds time elapsed
      
        root@kbl-ppc:~# perf test metrics
        10: PMU events                                 :
        10.3: Parsing of PMU event table metrics               : Ok
        10.4: Parsing of PMU event table metrics with fake PMUs: Ok
        67: Parse and process metrics                  : Ok
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/lkml/93fae76f-ce2b-ab0b-3ae9-cc9a2b4cbaec@linux.intel.com/
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf data: Allow to use stdio functions for pipe mode · 60136667
      Committed by Namhyung Kim
      When perf data is in a pipe, perf reads each event separately using
      the read(2) syscall.  This is a huge performance bottleneck when
      processing large data, as in perf inject.  perf inject also needs to
      use the write(2) syscall for its output.

      So convert it to use the buffered I/O functions in the stdio library
      for pipe data.  This makes the inject-build-id bench time drop from
      20ms to 8ms.
      
        $ perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.074 msec (+- 0.013 msec)
          Average time per event: 0.792 usec (+- 0.001 usec)
          Average memory usage: 8328 KB (+- 0 KB)
          Average build-id-all injection took: 5.490 msec (+- 0.008 msec)
          Average time per event: 0.538 usec (+- 0.001 usec)
          Average memory usage: 7563 KB (+- 0 KB)
      
      This patch enables it only for perf inject when used with a pipe (which
      is its default behavior).  Maybe we could do it for perf record and/or
      report later.
      
      Committer testing:
      
      Before:
      
        $ perf stat -r 5 perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.605 msec (+- 0.064 msec)
          Average time per event: 1.334 usec (+- 0.006 usec)
          Average memory usage: 12220 KB (+- 7 KB)
          Average build-id-all injection took: 11.458 msec (+- 0.058 msec)
          Average time per event: 1.123 usec (+- 0.006 usec)
          Average memory usage: 11546 KB (+- 8 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.673 msec (+- 0.057 msec)
          Average time per event: 1.341 usec (+- 0.006 usec)
          Average memory usage: 12508 KB (+- 8 KB)
          Average build-id-all injection took: 11.437 msec (+- 0.046 msec)
          Average time per event: 1.121 usec (+- 0.004 usec)
          Average memory usage: 11812 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.641 msec (+- 0.069 msec)
          Average time per event: 1.337 usec (+- 0.007 usec)
          Average memory usage: 12302 KB (+- 8 KB)
          Average build-id-all injection took: 10.820 msec (+- 0.106 msec)
          Average time per event: 1.061 usec (+- 0.010 usec)
          Average memory usage: 11616 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.379 msec (+- 0.074 msec)
          Average time per event: 1.312 usec (+- 0.007 usec)
          Average memory usage: 12334 KB (+- 8 KB)
          Average build-id-all injection took: 11.288 msec (+- 0.071 msec)
          Average time per event: 1.107 usec (+- 0.007 usec)
          Average memory usage: 11657 KB (+- 8 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.534 msec (+- 0.058 msec)
          Average time per event: 1.327 usec (+- 0.006 usec)
          Average memory usage: 12264 KB (+- 8 KB)
          Average build-id-all injection took: 11.557 msec (+- 0.076 msec)
          Average time per event: 1.133 usec (+- 0.007 usec)
          Average memory usage: 11593 KB (+- 8 KB)
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  4,060.05 msec task-clock:u              #    1.566 CPUs utilized            ( +-  0.65% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                   101,888      page-faults:u             #    0.025 M/sec                    ( +-  0.12% )
             3,745,833,163      cycles:u                  #    0.923 GHz                      ( +-  0.10% )  (83.22%)
               194,346,613      stalled-cycles-frontend:u #    5.19% frontend cycles idle     ( +-  0.57% )  (83.30%)
               708,495,034      stalled-cycles-backend:u  #   18.91% backend cycles idle      ( +-  0.48% )  (83.48%)
             5,629,328,628      instructions:u            #    1.50  insn per cycle
                                                          #    0.13  stalled cycles per insn  ( +-  0.21% )  (83.57%)
             1,236,697,927      branches:u                #  304.602 M/sec                    ( +-  0.16% )  (83.44%)
                17,564,877      branch-misses:u           #    1.42% of all branches          ( +-  0.23% )  (82.99%)
      
                    2.5934 +- 0.0128 seconds time elapsed  ( +-  0.49% )
      
        $
      
      After:
      
        $ perf stat -r 5 perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.560 msec (+- 0.125 msec)
          Average time per event: 0.839 usec (+- 0.012 usec)
          Average memory usage: 12520 KB (+- 8 KB)
          Average build-id-all injection took: 5.789 msec (+- 0.054 msec)
          Average time per event: 0.568 usec (+- 0.005 usec)
          Average memory usage: 11919 KB (+- 9 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.639 msec (+- 0.111 msec)
          Average time per event: 0.847 usec (+- 0.011 usec)
          Average memory usage: 12732 KB (+- 8 KB)
          Average build-id-all injection took: 5.647 msec (+- 0.069 msec)
          Average time per event: 0.554 usec (+- 0.007 usec)
          Average memory usage: 12093 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.551 msec (+- 0.096 msec)
          Average time per event: 0.838 usec (+- 0.009 usec)
          Average memory usage: 12739 KB (+- 8 KB)
          Average build-id-all injection took: 5.617 msec (+- 0.061 msec)
          Average time per event: 0.551 usec (+- 0.006 usec)
          Average memory usage: 12105 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.403 msec (+- 0.097 msec)
          Average time per event: 0.824 usec (+- 0.010 usec)
          Average memory usage: 12770 KB (+- 8 KB)
          Average build-id-all injection took: 5.611 msec (+- 0.085 msec)
          Average time per event: 0.550 usec (+- 0.008 usec)
          Average memory usage: 12134 KB (+- 8 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.518 msec (+- 0.102 msec)
          Average time per event: 0.835 usec (+- 0.010 usec)
          Average memory usage: 12518 KB (+- 10 KB)
          Average build-id-all injection took: 5.503 msec (+- 0.073 msec)
          Average time per event: 0.540 usec (+- 0.007 usec)
          Average memory usage: 11882 KB (+- 8 KB)
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  2,394.88 msec task-clock:u              #    1.577 CPUs utilized            ( +-  0.83% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                   103,181      page-faults:u             #    0.043 M/sec                    ( +-  0.11% )
             3,548,172,030      cycles:u                  #    1.482 GHz                      ( +-  0.30% )  (83.26%)
                81,537,700      stalled-cycles-frontend:u #    2.30% frontend cycles idle     ( +-  1.54% )  (83.24%)
               876,631,544      stalled-cycles-backend:u  #   24.71% backend cycles idle      ( +-  1.14% )  (83.45%)
             5,960,361,707      instructions:u            #    1.68  insn per cycle
                                                          #    0.15  stalled cycles per insn  ( +-  0.27% )  (83.26%)
             1,269,413,491      branches:u                #  530.054 M/sec                    ( +-  0.10% )  (83.48%)
                11,372,453      branch-misses:u           #    0.90% of all branches          ( +-  0.52% )  (83.31%)
      
                   1.51874 +- 0.00642 seconds time elapsed  ( +-  0.42% )
      
        $
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20201030054742.87740-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
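To illustrate why buffered stdio helps in the commit above, here is a minimal sketch of reading length-prefixed records through a `FILE *` rather than issuing one read(2) per record. It is not the patch's code: `struct demo_event_header` and `demo_read_event()` are hypothetical and only loosely modeled on a "type, misc, size" record header.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record header: 'size' covers the header plus the payload. */
struct demo_event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/*
 * Read one record from a stdio stream. stdio pulls data from the
 * underlying descriptor in large chunks, so most calls are served from
 * the user-space buffer without entering the kernel.
 *
 * Returns 1 on success, 0 on clean end of file, -1 on a short read,
 * a malformed size, or a payload larger than payload_cap.
 */
int demo_read_event(FILE *fp, struct demo_event_header *hdr,
		    void *payload, size_t payload_cap)
{
	size_t n = fread(hdr, 1, sizeof(*hdr), fp);
	size_t body;

	if (n == 0 && feof(fp))
		return 0;
	if (n != sizeof(*hdr))
		return -1;
	if (hdr->size < sizeof(*hdr))
		return -1;

	body = hdr->size - sizeof(*hdr);
	if (body > payload_cap)
		return -1;
	if (body && fread(payload, 1, body, fp) != body)
		return -1;
	return 1;
}
```

On a POSIX system, a pipe file descriptor can be wrapped with `fdopen(fd, "r")` to obtain the `FILE *` passed to this function; the output side can use `fwrite()` on a stream in the same way instead of calling write(2) per record.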
  3. 12 Nov, 2020 5 commits
  4. 11 Nov, 2020 9 commits
  5. 04 Nov, 2020 19 commits