1. 20 Jul, 2022 1 commit
    • perf inject: Add support for injecting guest sideband events · 97406a7e
      Adrian Hunter committed
      Inject events from a perf.data file recorded in a virtual machine into
      a perf.data file recorded on the host at the same time.
      
      Only sideband events (e.g. mmap, comm, fork, exit) and build IDs are
      injected.  Additionally, the guest kcore_dir is copied as kcore_dir__
      followed by the machine PID.
      
      This is non-trivial because:
       o It is not possible to process 2 sessions simultaneously so instead
       events are first written to a temporary file.
       o To avoid conflict, guest sample IDs are replaced with new unused sample
       IDs.
       o Guest event's CPU is changed to be the host CPU because it is more
       useful for reporting and analysis.
       o Sample ID is mapped to machine PID which is recorded with VCPU in the
       id index. This is important to allow guest events to be related to the
       guest machine and VCPU.
       o Timestamps must be converted.
       o Events are inserted to obey finished-round ordering.
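
As a rough illustration of the sample ID handling listed above, the Python sketch below allocates unused IDs for guest sample IDs and records the machine PID and VCPU for each new ID. It is not the perf implementation: `GuestId`, `remap_guest_ids`, and the allocation strategy (consecutive integers above the largest host ID) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuestId:
    """What a remapped sample ID refers to (hypothetical structure)."""
    machine_pid: int
    vcpu: int


def remap_guest_ids(host_ids, guest_ids, machine_pid):
    """Map guest sample IDs to IDs that the host file does not use.

    host_ids:  iterable of sample IDs already present in the host file
    guest_ids: dict of guest sample ID -> VCPU number
    Returns (id_map, id_index): guest ID -> new ID, and new ID -> GuestId.
    """
    used = set(host_ids)
    # Assumed strategy: start just above the largest host ID, so every
    # newly allocated ID is guaranteed not to collide with a host ID.
    next_id = max(used, default=0) + 1
    id_map = {}
    id_index = {}
    for guest_id, vcpu in sorted(guest_ids.items()):
        id_map[guest_id] = next_id
        id_index[next_id] = GuestId(machine_pid, vcpu)
        next_id += 1
    return id_map, id_index
```

With the new ID in hand, a reader can look up `id_index` to relate an injected guest event back to its guest machine and VCPU.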
      
      The anticipated use-case is:
       - start recording sideband events in a guest machine
       - start recording an AUX area trace on the host which can also trace the
       guest (e.g. Intel PT)
       - run test case on the guest
       - stop recording on the host
       - stop recording on the guest
       - copy the guest perf.data file to the host
       - inject the guest perf.data file sideband events into the host perf.data
       file using perf inject
       - the resulting perf.data file can now be used
      
      Subsequent patches provide Intel PT support for this.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: kvm@vger.kernel.org
      Link: https://lore.kernel.org/r/20220711093218.10967-25-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 26 Jun, 2022 2 commits
  3. 25 Jun, 2022 1 commit
    • perf header: Record non-CPU PMU capabilities · 2139f742
      Ravi Bangoria committed
      PMUs advertise their capabilities via sysfs attribute files but
      the perf tool currently parses only core (CPU) or hybrid core PMU
      capabilities. Add support for recording non-core PMU capabilities
      into the perf.data header.
      
      Note that a newly proposed HEADER_PMU_CAPS is replacing the existing
      HEADER_HYBRID_CPU_PMU_CAPS. Special care is taken for hybrid core
      PMUs by writing their capabilities first in the perf.data header
      to make sure a new perf.data file being read by an old perf tool does
      not break.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rrichter@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: like.xu.linux@gmail.com
      Cc: x86@kernel.org
      Link: https://lore.kernel.org/r/20220604044519.594-6-ravi.bangoria@amd.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 23 Jun, 2022 1 commit
    • perf record: Add finished init event · 3812d298
      Adrian Hunter committed
      In preparation for recording sideband events in a virtual machine guest so
      that they can be injected into a host perf.data file.
      
      This is needed to enable injecting events after the initial synthesized
      user events (that have an all zero id sample) but before regular events.
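
A minimal Python sketch of why such a marker helps: with a marker event in the stream, an injector can place extra events after the initial synthesized events and before the regular ones. The event representation (a list of `(type, payload)` tuples) and the helper `inject_after_init` are hypothetical and only illustrate the ordering, not how perf inject is written.

```python
FINISHED_INIT = "PERF_RECORD_FINISHED_INIT"


def inject_after_init(events, injected):
    """Return a new event list with `injected` placed right after the
    first FINISHED_INIT marker, i.e. after the initial synthesized
    events and before any regular events."""
    for pos, (ev_type, _payload) in enumerate(events):
        if ev_type == FINISHED_INIT:
            return events[:pos + 1] + list(injected) + events[pos + 1:]
    # Without the marker there is no well-defined insertion point.
    raise ValueError("no PERF_RECORD_FINISHED_INIT event found")
```

Raising an error when the marker is missing is a design choice of this sketch; a real tool might fall back to another position.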
      
      Committer notes:
      
      Add entry about PERF_RECORD_FINISHED_INIT to
      tools/perf/Documentation/perf.data-file-format.txt.
      
      Committer testing:
      
      Before:
      
        # perf report -D | grep FINISHED
        0 0x5910 [0x8]: PERF_RECORD_FINISHED_ROUND
          FINISHED_ROUND events:          1  ( 0.5%)
        #
      
      After:
      
        # perf record -- sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.020 MB perf.data (7 samples) ]
        # perf report -D | grep FINISHED
        0 0x5068 [0x8]: PERF_RECORD_FINISHED_INIT: unhandled!
        0 0x5390 [0x8]: PERF_RECORD_FINISHED_ROUND
          FINISHED_ROUND events:          1  ( 0.5%)
           FINISHED_INIT events:          1  ( 0.5%)
        #
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220610113316.6682-5-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 23 May, 2022 2 commits
    • perf inject: Keep a copy of kcore_dir · d8fc0855
      Adrian Hunter committed
      If the input perf.data has a kcore_dir, copy it into the output, since
      at least the kallsyms in the kcore_dir will be useful to the output.
      
      Example:
      
       Before:
      
        $ ls -lR perf.data-from-desktop
        perf.data-from-desktop:
        total 916
        -rw------- 1 user user 931756 May 19 09:55 data
        drwx------ 2 user user   4096 May 19 09:55 kcore_dir
      
        perf.data-from-desktop/kcore_dir:
        total 42952
        -r-------- 1 user user  7582467 May 19 09:55 kallsyms
        -r-------- 1 user user 36388864 May 19 09:55 kcore
        -r-------- 1 user user     4828 May 19 09:55 modules
      
        $ perf inject -i perf.data-from-desktop -o injected-perf.data
      
        $ ls -lR injected-perf.data
        -rw------- 1 user user 931320 May 20 15:08 injected-perf.data
      
       After:
      
        $ perf inject -i perf.data-from-desktop -o injected-perf.data
      
        $ ls -lR injected-perf.data
        injected-perf.data:
        total 916
        -rw------- 1 user user 931320 May 20 15:21 data
        drwx------ 2 user user   4096 May 20 15:21 kcore_dir
      
        injected-perf.data/kcore_dir:
        total 42952
        -r-------- 1 user user  7582467 May 20 15:21 kallsyms
        -r-------- 1 user user 36388864 May 20 15:21 kcore
        -r-------- 1 user user     4828 May 20 15:21 modules
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220520132404.25853-6-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf inject: Keep some features sections from input file · 180b3d06
      Adrian Hunter committed
      perf inject overwrites feature sections with information from the current
      machine. It makes more sense to keep original information that describes
      the machine or software when perf record was run.
      
      Example: perf.data from "Desktop" injected on "nuc11"
      
       Before:
      
        $ perf script --header-only -i perf.data-from-desktop | head -15
        # ========
        # captured on    : Thu May 19 09:55:50 2022
        # header version : 1
        # data offset    : 1208
        # data size      : 837480
        # feat offset    : 838688
        # hostname : Desktop
        # os release : 5.13.0-41-generic
        # perf version : 5.18.rc5.gac837f7ca7ed
        # arch : x86_64
        # nrcpus online : 28
        # nrcpus avail : 28
        # cpudesc : Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz
        # cpuid : GenuineIntel,6,85,4
        # total memory : 65548656 kB
      
        $ perf inject -i perf.data-from-desktop -o injected-perf.data
      
        $ perf script --header-only -i injected-perf.data | head -15
        # ========
        # captured on    : Fri May 20 15:06:55 2022
        # header version : 1
        # data offset    : 1208
        # data size      : 837480
        # feat offset    : 838688
        # hostname : nuc11
        # os release : 5.17.5-local
        # perf version : 5.18.rc5.g0f828fdeb9af
        # arch : x86_64
        # nrcpus online : 8
        # nrcpus avail : 8
        # cpudesc : 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
        # cpuid : GenuineIntel,6,140,1
        # total memory : 16012124 kB
      
       After:
      
        $ perf inject -i perf.data-from-desktop -o injected-perf.data
      
        $ perf script --header-only -i injected-perf.data | head -15
        # ========
        # captured on    : Fri May 20 15:08:54 2022
        # header version : 1
        # data offset    : 1208
        # data size      : 837480
        # feat offset    : 838688
        # hostname : Desktop
        # os release : 5.13.0-41-generic
        # perf version : 5.18.rc5.gac837f7ca7ed
        # arch : x86_64
        # nrcpus online : 28
        # nrcpus avail : 28
        # cpudesc : Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz
        # cpuid : GenuineIntel,6,85,4
        # total memory : 65548656 kB
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220520132404.25853-4-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  6. 12 Feb, 2022 1 commit
    • perf namespaces: Add functions to access nsinfo · bcaf0a97
      Ian Rogers committed
      Having functions to access nsinfo reduces the places where reference
      counting checking needs to be added.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: André Almeida <andrealmeid@collabora.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: http://lore.kernel.org/lkml/20220211103415.2737789-14-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  7. 11 Feb, 2022 2 commits
  8. 23 Jan, 2022 1 commit
  9. 18 Dec, 2021 2 commits
    • perf inject: Fix segfault due to perf_data__fd() without open · c271a55b
      Adrian Hunter committed
      The fixed commit attempts to get the output file descriptor even if the
      file was never opened e.g.
      
        $ perf record uname
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.002 MB perf.data (7 samples) ]
        $ perf inject -i perf.data --vm-time-correlation=dry-run
        Segmentation fault (core dumped)
        $ gdb --quiet perf
        Reading symbols from perf...
        (gdb) r inject -i perf.data --vm-time-correlation=dry-run
        Starting program: /home/ahunter/bin/perf inject -i perf.data --vm-time-correlation=dry-run
        [Thread debugging using libthread_db enabled]
        Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
      
        Program received signal SIGSEGV, Segmentation fault.
        __GI___fileno (fp=0x0) at fileno.c:35
        35      fileno.c: No such file or directory.
        (gdb) bt
        #0  __GI___fileno (fp=0x0) at fileno.c:35
        #1  0x00005621e48dd987 in perf_data__fd (data=0x7fff4c68bd08) at util/data.h:72
        #2  perf_data__fd (data=0x7fff4c68bd08) at util/data.h:69
        #3  cmd_inject (argc=<optimized out>, argv=0x7fff4c69c1f0) at builtin-inject.c:1017
        #4  0x00005621e4936783 in run_builtin (p=0x5621e4ee6878 <commands+600>, argc=4, argv=0x7fff4c69c1f0) at perf.c:313
        #5  0x00005621e4897d5c in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:365
        #6  run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:409
        #7  main (argc=4, argv=0x7fff4c69c1f0) at perf.c:539
        (gdb)
      
      Fixes: 0ae03893 ("perf tools: Pass a fd to perf_file_header__read_pipe()")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: stable@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20211213084829.114772-3-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf inject: Fix segfault due to close without open · 0c8e32fe
      Adrian Hunter committed
      The fixed commit attempts to close inject.output even if it was never
      opened e.g.
      
        $ perf record uname
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.002 MB perf.data (7 samples) ]
        $ perf inject -i perf.data --vm-time-correlation=dry-run
        Segmentation fault (core dumped)
        $ gdb --quiet perf
        Reading symbols from perf...
        (gdb) r inject -i perf.data --vm-time-correlation=dry-run
        Starting program: /home/ahunter/bin/perf inject -i perf.data --vm-time-correlation=dry-run
        [Thread debugging using libthread_db enabled]
        Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
      
        Program received signal SIGSEGV, Segmentation fault.
        0x00007eff8afeef5b in _IO_new_fclose (fp=0x0) at iofclose.c:48
        48      iofclose.c: No such file or directory.
        (gdb) bt
        #0  0x00007eff8afeef5b in _IO_new_fclose (fp=0x0) at iofclose.c:48
        #1  0x0000557fc7b74f92 in perf_data__close (data=data@entry=0x7ffcdafa6578) at util/data.c:376
        #2  0x0000557fc7a6b807 in cmd_inject (argc=<optimized out>, argv=<optimized out>) at builtin-inject.c:1085
        #3  0x0000557fc7ac4783 in run_builtin (p=0x557fc8074878 <commands+600>, argc=4, argv=0x7ffcdafb6a60) at perf.c:313
        #4  0x0000557fc7a25d5c in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:365
        #5  run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:409
        #6  main (argc=4, argv=0x7ffcdafb6a60) at perf.c:539
        (gdb)
      
      Fixes: 02e6246f ("perf inject: Close inject.output on exit")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: stable@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20211213084829.114772-2-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  10. 07 Dec, 2021 1 commit
  11. 07 Nov, 2021 1 commit
  12. 20 Oct, 2021 1 commit
  13. 02 Aug, 2021 4 commits
  14. 16 Jul, 2021 2 commits
  15. 26 May, 2021 2 commits
  16. 12 May, 2021 2 commits
  17. 05 Apr, 2021 1 commit
  18. 04 Feb, 2021 1 commit
    • perf inject jit: Add namespaces support · 67dec926
      Yonatan Goldschmidt committed
      This patch fixes "perf inject --jit" to properly operate on
      namespaced/containerized processes:
      
      * jitdump files are generated by the process, thus they should be
        looked up in its mount NS.
      
      * DSOs of injected MMAP events will later be looked up in the process
        mount NS, so write them into its NS.
      
      * PIDs & TIDs from jitdump events need to be translated to the PID as
        seen by "perf record" before being written into MMAP events.
      
      For a process in a different PID NS, the TID & PID given in the jitdump
      event are actually ignored; I use the TID & PID of the thread which
      mmap()ed the jitdump file. This is simplified and won't do for forks of
      the initial process, if they continue using the same jitdump file.
      Future patches might improve it.
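
The simplified PID/TID choice described above can be sketched in Python as follows. The function name and the tuple representation are hypothetical; the sketch only restates the rule from this message and, like the patch, does not handle forks that keep using the same jitdump file.

```python
def ids_for_mmap_event(jit_ids, mmap_thread_ids, same_pid_namespace):
    """Pick the (pid, tid) pair to write into an injected MMAP event.

    jit_ids:         (pid, tid) as recorded in the jitdump event
    mmap_thread_ids: (pid, tid) of the thread that mmap()ed the jitdump
                     file, as seen by "perf record"
    """
    # In a different PID namespace the jitdump values are meaningless to
    # perf, so they are ignored in favour of the mmap()ing thread's IDs.
    return jit_ids if same_pid_namespace else mmap_thread_ids
```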
      
      This was tested by recording a NodeJS process running with
      "--perf-prof", inside a Docker container, and by recording another
      NodeJS process running in the same namespaces as perf itself, to make
      sure it's not broken for non-containerized processes.
      Signed-off-by: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20201105015604.1726943-1-yonatan.goldschmidt@granulate.io
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  19. 17 Nov, 2020 2 commits
    • perf inject: Fix file corruption due to event deletion · 1c756cd4
      Al Grant committed
      "perf inject" can create corrupt files when synthesizing sample events from AUX
      data. This happens when in the input file, the first event (for the AUX data)
      has a different sample_type from the second event (generally dummy).
      
      Specifically, they differ in the bits that indicate the standard fields
      appended to perf records in the mmap buffer. "perf inject" deletes the first
      event and moves up the second event to first position.
      
      The problem is with the synthetic PERF_RECORD_MMAP (etc.) events created
      by "perf record".
      
      Since these are synthetic versions of events which are normally produced
      by the kernel, they have to have the standard fields appended as
      described by sample_type.
      
      "perf record" fills these in with zeroes, including the IDENTIFIER
      field; perf readers interpret records with zero IDENTIFIER using the
      descriptor for the first event in the file.
      
      Since "perf inject" changes the first event, these synthetic records are
      then processed with the wrong value of sample_type, and the perf reader
      reads bad data, reports on incorrect length records etc.
      
      Mismatching sample_types are seen with "perf record -e cs_etm//", where the AUX
      event has TID|TIME|CPU|IDENTIFIER and the dummy event has TID|TIME|IDENTIFIER.
      
      Perhaps they could be the same, but it isn't normally a problem if they aren't
      - perf has no problems reading the file.
      
      The sample_types have to agree on the position of IDENTIFIER, because
      that's how perf finds the right event descriptor in the first place, but
      they don't normally have to agree on other fields, and perf doesn't
      check that they do.
      
      The problem is specific to the way "perf inject" reorganizes the events
      and the way synthetic MMAP events are recorded with a zero identifier. A
      simple solution is to stop "perf inject" deleting the tracing event.
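
The failure mode described above can be modelled with a short Python sketch. The descriptor layout (a list of `(sample_ids, fields)` pairs), the sample ID values, and the helper `fields_for` are invented for illustration; only the rule "a zero identifier resolves to the first event's descriptor" and the two field sets come from this message.

```python
AUX_FIELDS = ("TID", "TIME", "CPU", "IDENTIFIER")
DUMMY_FIELDS = ("TID", "TIME", "IDENTIFIER")


def fields_for(record_id, descriptors):
    """Return the trailing sample fields a reader expects for a record.

    descriptors: list of (sample_ids, fields) in file order.
    A zero identifier falls back to the first descriptor in the file.
    """
    if record_id == 0:
        return descriptors[0][1]
    for sample_ids, fields in descriptors:
        if record_id in sample_ids:
            return fields
    raise KeyError(record_id)


# Input file: AUX event first, dummy event second (IDs are made up).
original = [({101}, AUX_FIELDS), ({202}, DUMMY_FIELDS)]
# What the reader sees if the first event is deleted.
after_deletion = original[1:]

written_with = fields_for(0, original)        # layout used when recording
read_with = fields_for(0, after_deletion)     # layout assumed when reading
```

Here `written_with` has four fields and `read_with` only three, so a synthetic record with a zero identifier would be parsed with the wrong layout once the first event is gone.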
      
      Committer testing
      
      Removed the now unused 'evsel' variable, update the comment about the
      evsel removal not being performed anymore, and apply the patch manually
      as it failed with this warning:
      
        warning: Patch sent with format=flowed; space at the end of lines might be lost.
      
      Testing it with:
      
        $ perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.543 msec (+- 0.130 msec)
          Average time per event: 0.838 usec (+- 0.013 usec)
          Average memory usage: 12717 KB (+- 9 KB)
          Average build-id-all injection took: 5.710 msec (+- 0.058 msec)
          Average time per event: 0.560 usec (+- 0.006 usec)
          Average memory usage: 12079 KB (+- 7 KB)
        $
      Signed-off-by: Al Grant <al.grant@arm.com>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LPU-Reference: b9cf5611-daae-2390-3439-6617f8f0a34b@foss.arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf data: Allow to use stdio functions for pipe mode · 60136667
      Namhyung Kim committed
      When perf data is in a pipe, it reads each event separately using
      read(2) syscall.  This is a huge performance bottleneck when
      processing large data like in perf inject.  Also perf inject needs to
      use write(2) syscall for the output.
      
      So convert it to use buffered I/O functions from the stdio library for pipe
      data.  This makes the inject-build-id bench time drop from 20ms to 8ms.
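
To illustrate the idea in a language-neutral way, the Python sketch below reads length-prefixed records from a pipe through a buffered reader, so that many small events are served from one large underlying read instead of one read per event. The 8-byte header layout (u32 type, u16 misc, u16 size, with size including the header) follows `struct perf_event_header`; the little-endian byte order, the buffer size, and the helper names are assumptions of this example, and the patch itself uses C stdio rather than Python.

```python
import os
import struct

# type (u32), misc (u16), size (u16); size counts the header itself.
HEADER = struct.Struct("<IHH")


def make_event(ev_type, payload):
    """Build one record: header followed by the payload bytes."""
    return HEADER.pack(ev_type, 0, HEADER.size + len(payload)) + payload


def read_events(stream):
    """Yield (type, payload) from a buffered binary stream until EOF."""
    while True:
        hdr = stream.read(HEADER.size)
        if not hdr:
            return  # clean end of stream
        if len(hdr) < HEADER.size:
            raise EOFError("truncated event header")
        ev_type, _misc, size = HEADER.unpack(hdr)
        payload = stream.read(size - HEADER.size)
        if len(payload) != size - HEADER.size:
            raise EOFError("truncated event payload")
        yield ev_type, payload


def read_events_from_pipe(fd, bufsize=64 * 1024):
    """Wrap a pipe file descriptor in a buffered reader and parse it."""
    with os.fdopen(fd, "rb", buffering=bufsize) as stream:
        return list(read_events(stream))
```

A usage example would create a pipe with `os.pipe()`, write a few `make_event(...)` records to the write end, close it, and pass the read end to `read_events_from_pipe`.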
      
        $ perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.074 msec (+- 0.013 msec)
          Average time per event: 0.792 usec (+- 0.001 usec)
          Average memory usage: 8328 KB (+- 0 KB)
          Average build-id-all injection took: 5.490 msec (+- 0.008 msec)
          Average time per event: 0.538 usec (+- 0.001 usec)
          Average memory usage: 7563 KB (+- 0 KB)
      
      This patch enables it only for perf inject when used with a pipe (which
      is its default behavior).  Maybe we could do it for perf record and/or
      report later.
      
      Committer testing:
      
      Before:
      
        $ perf stat -r 5 perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.605 msec (+- 0.064 msec)
          Average time per event: 1.334 usec (+- 0.006 usec)
          Average memory usage: 12220 KB (+- 7 KB)
          Average build-id-all injection took: 11.458 msec (+- 0.058 msec)
          Average time per event: 1.123 usec (+- 0.006 usec)
          Average memory usage: 11546 KB (+- 8 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.673 msec (+- 0.057 msec)
          Average time per event: 1.341 usec (+- 0.006 usec)
          Average memory usage: 12508 KB (+- 8 KB)
          Average build-id-all injection took: 11.437 msec (+- 0.046 msec)
          Average time per event: 1.121 usec (+- 0.004 usec)
          Average memory usage: 11812 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.641 msec (+- 0.069 msec)
          Average time per event: 1.337 usec (+- 0.007 usec)
          Average memory usage: 12302 KB (+- 8 KB)
          Average build-id-all injection took: 10.820 msec (+- 0.106 msec)
          Average time per event: 1.061 usec (+- 0.010 usec)
          Average memory usage: 11616 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.379 msec (+- 0.074 msec)
          Average time per event: 1.312 usec (+- 0.007 usec)
          Average memory usage: 12334 KB (+- 8 KB)
          Average build-id-all injection took: 11.288 msec (+- 0.071 msec)
          Average time per event: 1.107 usec (+- 0.007 usec)
          Average memory usage: 11657 KB (+- 8 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.534 msec (+- 0.058 msec)
          Average time per event: 1.327 usec (+- 0.006 usec)
          Average memory usage: 12264 KB (+- 8 KB)
          Average build-id-all injection took: 11.557 msec (+- 0.076 msec)
          Average time per event: 1.133 usec (+- 0.007 usec)
          Average memory usage: 11593 KB (+- 8 KB)
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  4,060.05 msec task-clock:u              #    1.566 CPUs utilized            ( +-  0.65% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                   101,888      page-faults:u             #    0.025 M/sec                    ( +-  0.12% )
             3,745,833,163      cycles:u                  #    0.923 GHz                      ( +-  0.10% )  (83.22%)
               194,346,613      stalled-cycles-frontend:u #    5.19% frontend cycles idle     ( +-  0.57% )  (83.30%)
               708,495,034      stalled-cycles-backend:u  #   18.91% backend cycles idle      ( +-  0.48% )  (83.48%)
             5,629,328,628      instructions:u            #    1.50  insn per cycle
                                                          #    0.13  stalled cycles per insn  ( +-  0.21% )  (83.57%)
             1,236,697,927      branches:u                #  304.602 M/sec                    ( +-  0.16% )  (83.44%)
                17,564,877      branch-misses:u           #    1.42% of all branches          ( +-  0.23% )  (82.99%)
      
                    2.5934 +- 0.0128 seconds time elapsed  ( +-  0.49% )
      
        $
      
      After:
      
        $ perf stat -r 5 perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.560 msec (+- 0.125 msec)
          Average time per event: 0.839 usec (+- 0.012 usec)
          Average memory usage: 12520 KB (+- 8 KB)
          Average build-id-all injection took: 5.789 msec (+- 0.054 msec)
          Average time per event: 0.568 usec (+- 0.005 usec)
          Average memory usage: 11919 KB (+- 9 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.639 msec (+- 0.111 msec)
          Average time per event: 0.847 usec (+- 0.011 usec)
          Average memory usage: 12732 KB (+- 8 KB)
          Average build-id-all injection took: 5.647 msec (+- 0.069 msec)
          Average time per event: 0.554 usec (+- 0.007 usec)
          Average memory usage: 12093 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.551 msec (+- 0.096 msec)
          Average time per event: 0.838 usec (+- 0.009 usec)
          Average memory usage: 12739 KB (+- 8 KB)
          Average build-id-all injection took: 5.617 msec (+- 0.061 msec)
          Average time per event: 0.551 usec (+- 0.006 usec)
          Average memory usage: 12105 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.403 msec (+- 0.097 msec)
          Average time per event: 0.824 usec (+- 0.010 usec)
          Average memory usage: 12770 KB (+- 8 KB)
          Average build-id-all injection took: 5.611 msec (+- 0.085 msec)
          Average time per event: 0.550 usec (+- 0.008 usec)
          Average memory usage: 12134 KB (+- 8 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.518 msec (+- 0.102 msec)
          Average time per event: 0.835 usec (+- 0.010 usec)
          Average memory usage: 12518 KB (+- 10 KB)
          Average build-id-all injection took: 5.503 msec (+- 0.073 msec)
          Average time per event: 0.540 usec (+- 0.007 usec)
          Average memory usage: 11882 KB (+- 8 KB)
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  2,394.88 msec task-clock:u              #    1.577 CPUs utilized            ( +-  0.83% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                   103,181      page-faults:u             #    0.043 M/sec                    ( +-  0.11% )
             3,548,172,030      cycles:u                  #    1.482 GHz                      ( +-  0.30% )  (83.26%)
                81,537,700      stalled-cycles-frontend:u #    2.30% frontend cycles idle     ( +-  1.54% )  (83.24%)
               876,631,544      stalled-cycles-backend:u  #   24.71% backend cycles idle      ( +-  1.14% )  (83.45%)
             5,960,361,707      instructions:u            #    1.68  insn per cycle
                                                          #    0.15  stalled cycles per insn  ( +-  0.27% )  (83.26%)
             1,269,413,491      branches:u                #  530.054 M/sec                    ( +-  0.10% )  (83.48%)
                11,372,453      branch-misses:u           #    0.90% of all branches          ( +-  0.52% )  (83.31%)
      
                   1.51874 +- 0.00642 seconds time elapsed  ( +-  0.42% )
      
        $
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20201030054742.87740-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  20. 14 Oct, 2020 2 commits
  21. 13 Oct, 2020 5 commits
    • perf inject: Add --buildid-all option · 27c9c342
      Namhyung Kim committed
      As with 'perf record', build-id processing can be sped up further by simply
      using all DSOs.  Then we don't need to look at all the sample events
      anymore.  The following patch will update 'perf bench' to show the result
      of the --buildid-all option too.
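
The trade-off can be sketched in Python as below: the default approach inspects each sample to find the DSOs that were actually hit, while the "all" approach takes every DSO seen in the mmap events and never looks at samples. The data layout (`(start, end, dso)` tuples and a list of sample addresses) and both function names are hypothetical.

```python
def dsos_hit_by_samples(mmaps, sample_addrs):
    """Default idea: only DSOs that contain at least one sample address.
    Cost grows with the number of samples."""
    hit = {}
    for addr in sample_addrs:
        for start, end, dso in mmaps:
            if start <= addr < end:
                hit[dso] = True
                break
    return list(hit)


def all_dsos(mmaps):
    """--buildid-all idea: every mapped DSO, in first-seen order.
    Samples are not examined at all."""
    return list(dict.fromkeys(dso for _start, _end, dso in mmaps))
```

The second function may select DSOs that no sample touched, which is the price paid for skipping the per-sample work.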
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Original-patch-by: NStephane Eranian <eranian@google.com>
      Acked-by: NIan Rogers <irogers@google.com>
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20201012070214.2074921-6-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf inject: Do not load map/dso when injecting build-id · e7b60c5a
      Namhyung Kim committed
      There is no need to load symbols in a DSO when injecting a build-id.  I
      guess the reason was to check whether the DSO is a special file such as
      an anon file.  Use some helper functions in map.c to check for that
      before reading the build-id.  Also pass the sample event's cpumode to
      the new build-id event.
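      
      As a minimal sketch of the kind of name-based check described above, the
      following C fragment classifies a mapping before any attempt to read a
      build-id.  The function mapping_may_have_build_id() and the set of names
      it rejects are illustrative assumptions, not the actual helpers in map.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if 's' begins with 'prefix'. */
static bool starts_with(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/*
 * Hypothetical check (not the perf API): decide from the mapping name
 * alone whether it is worth trying to read a build-id from it.
 */
bool mapping_may_have_build_id(const char *filename)
{
	assert(filename != NULL);

	if (filename[0] == '\0')
		return false;
	/* Anonymous memory as it appears in mmap events. */
	if (starts_with(filename, "//anon"))
		return false;
	/* Assumed examples of pseudo-mappings with no backing file. */
	if (starts_with(filename, "[heap]") || starts_with(filename, "[stack"))
		return false;
	return true;
}
```

      With such a check, a "//anon" or "[heap]" mapping is skipped without
      loading any map or DSO, while a regular path such as a shared library
      still goes on to the build-id read.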
      
      It brought a speedup in the benchmark of 25 -> 21 msec on my laptop.
      Also the memory usage (Max RSS) went down by ~200 KB.
      
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 21.389 msec (+- 0.138 msec)
          Average time per event: 2.097 usec (+- 0.014 usec)
          Average memory usage: 8225 KB (+- 0 KB)
      
      Committer notes:
      
      Before:
      
        $ perf stat -r5 perf bench internals inject-build-id > /dev/null
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  4,020.56 msec task-clock:u              #    1.271 CPUs utilized            ( +-  0.74% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                   123,354      page-faults:u             #    0.031 M/sec                    ( +-  0.81% )
             7,119,951,568      cycles:u                  #    1.771 GHz                      ( +-  1.74% )  (83.27%)
               230,086,969      stalled-cycles-frontend:u #    3.23% frontend cycles idle     ( +-  1.97% )  (83.41%)
             1,168,298,765      stalled-cycles-backend:u  #   16.41% backend cycles idle      ( +-  1.13% )  (83.44%)
            11,173,083,669      instructions:u            #    1.57  insn per cycle
                                                          #    0.10  stalled cycles per insn  ( +-  1.58% )  (83.31%)
             2,413,908,936      branches:u                #  600.392 M/sec                    ( +-  1.69% )  (83.26%)
                46,576,289      branch-misses:u           #    1.93% of all branches          ( +-  2.20% )  (83.31%)
      
                    3.1638 +- 0.0309 seconds time elapsed  ( +-  0.98% )
      
        $
      
      After:
      
        $ perf stat -r5 perf bench internals inject-build-id > /dev/null
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  2,379.94 msec task-clock:u              #    1.473 CPUs utilized            ( +-  0.18% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                    62,584      page-faults:u             #    0.026 M/sec                    ( +-  0.07% )
             2,372,389,668      cycles:u                  #    0.997 GHz                      ( +-  0.29% )  (83.14%)
               106,937,862      stalled-cycles-frontend:u #    4.51% frontend cycles idle     ( +-  4.89% )  (83.20%)
               581,697,915      stalled-cycles-backend:u  #   24.52% backend cycles idle      ( +-  0.71% )  (83.47%)
             3,659,692,199      instructions:u            #    1.54  insn per cycle
                                                          #    0.16  stalled cycles per insn  ( +-  0.10% )  (83.63%)
               791,372,961      branches:u                #  332.518 M/sec                    ( +-  0.27% )  (83.39%)
                10,648,083      branch-misses:u           #    1.35% of all branches          ( +-  0.22% )  (83.16%)
      
                   1.61570 +- 0.00172 seconds time elapsed  ( +-  0.11% )
      
        $
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Original-patch-by: Stephane Eranian <eranian@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20201012070214.2074921-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf inject: Enter namespace when reading build-id · 336c95b2
      Namhyung Kim committed
      perf inject should be in the proper mnt namespace when accessing the file.
      
      I think this caused no problem so far because the build-id was already
      read via map__load() -> dso__load().  But I'd like to change that in the
      following commit.
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20201012070214.2074921-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf inject: Add missing callbacks in perf_tool · 2946eced
      Namhyung Kim committed
      I found some events (like PERF_RECORD_CGROUP) are not copied by perf
      inject due to the missing callbacks.  Let's add them.
      
      While at it, I've changed the order of the callbacks to match with
      struct perf_tool so that we can compare them easily.
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20201012070214.2074921-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf bench: Add build-id injection benchmark · 0bf02a0d
      Namhyung Kim committed
      Sometimes I can see that 'perf record' piped with 'perf inject' takes a
      long time processing build-ids.
      
      So introduce an inject-build-id benchmark to the internals benchmark
      suite to measure its overhead regularly.
      
      It runs the 'perf inject' command internally and feeds the given number
      of synthesized events (MMAP2 + SAMPLE basically).
      
        Usage: perf bench internals inject-build-id <options>
      
          -i, --iterations <n>  Number of iterations used to compute average (default: 100)
          -m, --nr-mmaps <n>    Number of mmap events for each iteration (default: 100)
          -n, --nr-samples <n>  Number of sample events per mmap event (default: 100)
          -v, --verbose         be more verbose (show iteration count, DSO name, etc)
      
      By default, it measures average processing time of 100 MMAP2 events
      and 10000 SAMPLE events.  Below is a result on my laptop.
      
        $ perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 25.789 msec (+- 0.202 msec)
          Average time per event: 2.528 usec (+- 0.020 usec)
          Average memory usage: 8411 KB (+- 7 KB)
      
      Committer testing:
      
        $ perf bench
        Usage:
        	perf bench [<common options>] <collection> <benchmark> [<options>]
      
                # List of all available benchmark collections:
      
                 sched: Scheduler and IPC benchmarks
               syscall: System call benchmarks
                   mem: Memory access benchmarks
                  numa: NUMA scheduling and MM benchmarks
                 futex: Futex stressing benchmarks
                 epoll: Epoll stressing benchmarks
             internals: Perf-internals benchmarks
                   all: All benchmarks
      
        $ perf bench internals
      
                # List of available benchmarks for collection 'internals':
      
            synthesize: Benchmark perf event synthesis
        kallsyms-parse: Benchmark kallsyms parsing
        inject-build-id: Benchmark build-id injection
      
        $ perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 14.202 msec (+- 0.059 msec)
          Average time per event: 1.392 usec (+- 0.006 usec)
          Average memory usage: 12650 KB (+- 10 KB)
          Average build-id-all injection took: 12.831 msec (+- 0.071 msec)
          Average time per event: 1.258 usec (+- 0.007 usec)
          Average memory usage: 11895 KB (+- 10 KB)
        $
      
        $ perf stat -r5 perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 14.380 msec (+- 0.056 msec)
          Average time per event: 1.410 usec (+- 0.006 usec)
          Average memory usage: 12608 KB (+- 11 KB)
          Average build-id-all injection took: 11.889 msec (+- 0.064 msec)
          Average time per event: 1.166 usec (+- 0.006 usec)
          Average memory usage: 11838 KB (+- 10 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 14.246 msec (+- 0.065 msec)
          Average time per event: 1.397 usec (+- 0.006 usec)
          Average memory usage: 12744 KB (+- 10 KB)
          Average build-id-all injection took: 12.019 msec (+- 0.066 msec)
          Average time per event: 1.178 usec (+- 0.006 usec)
          Average memory usage: 11963 KB (+- 10 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 14.321 msec (+- 0.067 msec)
          Average time per event: 1.404 usec (+- 0.007 usec)
          Average memory usage: 12690 KB (+- 10 KB)
          Average build-id-all injection took: 11.909 msec (+- 0.041 msec)
          Average time per event: 1.168 usec (+- 0.004 usec)
          Average memory usage: 11938 KB (+- 10 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 14.287 msec (+- 0.059 msec)
          Average time per event: 1.401 usec (+- 0.006 usec)
          Average memory usage: 12864 KB (+- 10 KB)
          Average build-id-all injection took: 11.862 msec (+- 0.058 msec)
          Average time per event: 1.163 usec (+- 0.006 usec)
          Average memory usage: 12103 KB (+- 10 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 14.402 msec (+- 0.053 msec)
          Average time per event: 1.412 usec (+- 0.005 usec)
          Average memory usage: 12876 KB (+- 10 KB)
          Average build-id-all injection took: 11.826 msec (+- 0.061 msec)
          Average time per event: 1.159 usec (+- 0.006 usec)
          Average memory usage: 12111 KB (+- 10 KB)
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  4,267.48 msec task-clock:u              #    1.502 CPUs utilized            ( +-  0.14% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                   102,092      page-faults:u             #    0.024 M/sec                    ( +-  0.08% )
             3,894,589,578      cycles:u                  #    0.913 GHz                      ( +-  0.19% )  (83.49%)
               140,078,421      stalled-cycles-frontend:u #    3.60% frontend cycles idle     ( +-  0.77% )  (83.34%)
               948,581,189      stalled-cycles-backend:u  #   24.36% backend cycles idle      ( +-  0.46% )  (83.25%)
             5,835,587,719      instructions:u            #    1.50  insn per cycle
                                                          #    0.16  stalled cycles per insn  ( +-  0.21% )  (83.24%)
             1,267,423,636      branches:u                #  296.996 M/sec                    ( +-  0.22% )  (83.12%)
                17,484,290      branch-misses:u           #    1.38% of all branches          ( +-  0.12% )  (83.55%)
      
                   2.84176 +- 0.00222 seconds time elapsed  ( +-  0.08% )
      
        $
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20201012070214.2074921-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  22. 09 Jul, 2020 1 commit
    • S
      perf inject jit: Remove //anon mmap events · c8f6ae1f
      Steve MacLean committed
      **perf-<pid>.map and jit-<pid>.dump designs:
      
      When a JIT generates code to be executed, it must allocate memory and
      mark it executable using an mmap call.
      
      *** perf-<pid>.map design
      
      The perf-<pid>.map assumes that any sample recorded in an anonymous
      memory page is JIT code. It then tries to resolve the symbol name by
      looking at the process' perf-<pid>.map.
      
      *** jit-<pid>.dump design
      
      The jit-<pid>.dump mechanism takes a different approach. It requires a
      JIT to write a `<path>/jit-<pid>.dump` file. This file must also be
      mmapped so that `perf inject --jit` can find the file. The JIT must also
      add JIT_CODE_LOAD records for any functions it generates. The records
      are timestamped using a clock which can be correlated to the perf record
      clock.
      
      After perf record, the `perf inject --jit` pass parses the recording
      looking for a `<path>/jit-<pid>.dump` file. When it finds the file, it
      parses it and for each JIT_CODE_LOAD record:
      * creates an ELF file `<path>/jitted-<pid>-<code_index>.so`
      * injects a new mmap record mapping the new ELF file into the process.
      
      *** Coexistence design
      
      The kernel and perf support both of these mechanisms. We need to make
      sure perf works on an app supporting either or both of these mechanisms.
      Both designs rely on mmap records to determine how to resolve an ip
      address.
      
      The mmap records of both techniques by definition overlap. When the JIT
      compiles a method, it must:
      
      * allocate memory (mmap)
      * add execution privilege (mprotect or mmap; either will
      generate an mmap event from the kernel to perf)
      * compile code into memory
      * add a function record to perf-<pid>.map and/or jit-<pid>.dump
      
      Because the jit-<pid>.dump mechanism supports greater capabilities, perf
      prefers the symbols from jit-<pid>.dump. It implements this based on
      timestamp ordering of events. There is an implicit ASSUMPTION that the
      JIT_CODE_LOAD record timestamp will be after the // anon mmap event that
      was generated during memory allocation or adding the execution privilege setting.
      
      *** Problems with the ASSUMPTION
      
      The ASSUMPTION made in the Coexistence design section above is violated
      in the following scenario.
      
      *** Scenario
      
      While a JIT is jitting code it will eventually need to commit more
      pages and change these pages to executable permissions. Typically the
      JIT will want these collocated to minimize branch displacements.
      
      The kernel will coalesce these anonymous mappings with identical
      permissions before sending an MMAP event for the new pages. The address
      range of the new mmap will not be just the most recently mmapped pages.
      It will include the entire coalesced mmap region.
      
      See mm/mmap.c
      
      unsigned long mmap_region(struct file *file, unsigned long addr,
                      unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
                      struct list_head *uf)
      {
      ...
              /*
               * Can we just expand an old mapping?
               */
      ...
              perf_event_mmap(vma);
      ...
      }
      
      *** Symptoms
      
      The coalesced // anon mmap event will be timestamped after the
      JIT_CODE_LOAD records. This means it will be used as the most recent
      mapping for that entire address range. For remaining events it will look
      at the inferior perf-<pid>.map for symbols.
      
      If both mechanisms are supported, the symbol will appear twice with
      different module names. This causes weird behavior in reporting.
      
      If only jit-<pid>.dump is supported, the symbol will no longer be resolved.
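      
      The shadowing effect can be sketched as a lookup that always prefers
      the most recent mmap record covering an address.  struct mmap_rec and
      resolve_map() are hypothetical simplifications for illustration, not
      perf's real data structures or lookup code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for an mmap record; not perf's real event layout. */
struct mmap_rec {
	uint64_t time;    /* timestamp of the record */
	uint64_t start;   /* first mapped address */
	uint64_t len;     /* length of the mapping in bytes */
	const char *name; /* e.g. "//anon" or a jitted-<pid>-<n>.so path */
};

/*
 * Return the name of the most recent record at or before 'when' that
 * covers 'ip', or NULL if no record covers it.
 */
const char *resolve_map(const struct mmap_rec *recs, size_t n,
			uint64_t ip, uint64_t when)
{
	const struct mmap_rec *best = NULL;

	assert(recs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		const struct mmap_rec *r = &recs[i];

		if (r->time > when)
			continue;	/* record is newer than the sample */
		if (ip < r->start || ip - r->start >= r->len)
			continue;	/* address outside this mapping */
		if (best == NULL || r->time >= best->time)
			best = r;	/* the later record wins */
	}
	return best ? best->name : NULL;
}
```

      Given an initial "//anon" record, a later jitted .so record inside it,
      and a still later coalesced "//anon" record spanning the whole region,
      a sample taken between the last two resolves to the jitted file, while
      a sample taken after the coalesced record resolves to "//anon" again.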
      
      ** Implemented solution
      
      This patch solves the issue by removing // anon mmap events for any
      process which has a valid jit-<pid>.dump file.
      
      It tracks on a per process basis to handle the case where some running
      apps support jit-<pid>.dump, but some only support perf-<pid>.map.
      
      It adds new assumptions:
      * // anon mmap events are only required for perf-<pid>.map support.
      * An app that uses jit-<pid>.dump, no longer needs
      perf-<pid>.map support. It assumes that any perf-<pid>.map info is
      inferior.
      
      *** Details
      
      Use thread->priv to store whether a jitdump file has been processed.
      
      During "perf inject --jit", discard "//anon*" mmap events for any pid which
      has successfully processed a jitdump file.
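      
      A minimal sketch of this per-process filter follows.  The patch itself
      keeps the flag in thread->priv; the fixed-size pid table and the names
      jit_mark_processed() and should_drop_mmap() below are assumptions made
      only to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define MAX_JIT_PIDS 64	/* arbitrary limit for this sketch */

static pid_t jit_pids[MAX_JIT_PIDS];
static size_t nr_jit_pids;

static bool jit_was_processed(pid_t pid)
{
	for (size_t i = 0; i < nr_jit_pids; i++)
		if (jit_pids[i] == pid)
			return true;
	return false;
}

/* Remember that a jitdump file was processed for 'pid'. */
bool jit_mark_processed(pid_t pid)
{
	if (jit_was_processed(pid))
		return true;
	if (nr_jit_pids == MAX_JIT_PIDS)
		return false;	/* table full */
	jit_pids[nr_jit_pids++] = pid;
	return true;
}

/* Drop "//anon*" mmap events only for pids with a processed jitdump. */
bool should_drop_mmap(pid_t pid, const char *filename)
{
	assert(filename != NULL);
	return strncmp(filename, "//anon", 6) == 0 && jit_was_processed(pid);
}
```

      Tracking per pid keeps "//anon" events for processes that only provide
      perf-<pid>.map, which matches the "no jitdump case" in the testing
      steps.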
      
      ** Testing:
      
      // jitdump case
      
        perf record <app with jitdump>
        perf inject --jit --input perf.data --output perfjit.data
      
      // verify mmap "//anon" events present initially
      
        perf script --input perf.data --show-mmap-events | grep '//anon'
      
      // verify mmap "//anon" events removed
      
        perf script --input perfjit.data --show-mmap-events | grep '//anon'
      
      // no jitdump case
      
        perf record <app without jitdump>
        perf inject --jit --input perf.data --output perfjit.data
      
      // verify mmap "//anon" events present initially
      
        perf script --input perf.data --show-mmap-events | grep '//anon'
      
      // verify mmap "//anon" events not removed
      
        perf script --input perfjit.data --show-mmap-events | grep '//anon'
      
      ** Repro:
      
      This issue was discovered while testing the initial CoreCLR jitdump
      implementation. https://github.com/dotnet/coreclr/pull/26897.
      
      ** Alternate solutions considered
      
      These were also briefly considered:
      
      * Change kernel to not coalesce mmap regions.
      
      * Change kernel reporting of coalesced mmap regions to perf. Only
      include newly mapped memory.
      
      * Only strip parts of // anon mmap events overlapping existing
      jitted-<pid>-<code_index>.so mmap events.
      Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/1590544271-125795-1-git-send-email-steve.maclean@linux.microsoft.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  23. 28 May, 2020 1 commit
    • G
      perf tools: Replace zero-length array with flexible-array · 6549a8c0
      Gustavo A. R. Silva committed
      The current codebase makes use of the zero-length array language
      extension to the C90 standard, but the preferred mechanism to declare
      variable-length types such as these ones is a flexible array
      member[1][2], introduced in C99:
      
      struct foo {
              int stuff;
              struct boo array[];
      };
      
      By making use of the mechanism above, we will get a compiler warning in
      case the flexible array does not occur last in the structure, which will
      help us prevent some kind of undefined behavior bugs from being
      inadvertently introduced[3] to the codebase from now on.
      
      Also, notice that dynamic memory allocations won't be affected by this
      change:
      
      "Flexible array members have incomplete type, and so the sizeof operator
      may not be applied. As a quirk of the original implementation of
      zero-length arrays, sizeof evaluates to zero."[1]
      
      sizeof(flexible-array-member) triggers a warning because flexible array
      members have incomplete type[1]. There are some instances of code in
      which the sizeof operator is being incorrectly/erroneously applied to
      zero-length arrays and the result is zero. Such instances may be hiding
      some bugs. So, this work (flexible-array member conversions) will also
      help to get completely rid of those sorts of issues.
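      
      As a small illustration of why allocations are unaffected, the sketch
      below sizes an object as the fixed header plus the trailing elements,
      which is the same arithmetic used with a zero-length array.  The type
      struct sample_buf and both functions are hypothetical names, not taken
      from the perf sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct sample_buf {
	size_t count;
	int values[];	/* flexible array member: must come last */
};

/* Allocate a zeroed buffer with room for 'count' trailing values. */
struct sample_buf *sample_buf_alloc(size_t count)
{
	struct sample_buf *buf;

	/* Reject sizes whose byte count would overflow size_t. */
	if (count > (SIZE_MAX - sizeof(*buf)) / sizeof(buf->values[0]))
		return NULL;

	/* sizeof(*buf) excludes values[], so add the elements explicitly. */
	buf = calloc(1, sizeof(*buf) + count * sizeof(buf->values[0]));
	if (buf != NULL)
		buf->count = count;
	return buf;
}

/* Bounds-checked read of one element. */
int sample_buf_get(const struct sample_buf *buf, size_t i)
{
	assert(buf != NULL && i < buf->count);
	return buf->values[i];
}
```

      Writing sizeof(buf->values) here instead would be rejected by the
      compiler, whereas with a zero-length array it would silently evaluate
      to zero.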
      
      This issue was found with the help of Coccinelle.
      
      [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
      [2] https://github.com/KSPP/linux/issues/21
      [3] commit 76497732 ("cxgb3/l2t: Fix undefined behaviour")
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200515172926.GA31976@embeddedor
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  24. 06 May, 2020 1 commit