1. 23 Feb, 2023 1 commit
    • perf inject: Fix --buildid-all not to eat up MMAP2 · ce9f1c05
      Namhyung Kim committed
      When MMAP2 has the PERF_RECORD_MISC_MMAP_BUILD_ID flag, it means the
      record already has the build-id info.  So it marks the DSO as hit, so
      that the same DSO is skipped if a later record for it happens to miss
      the build-id.
      
      But it failed to copy the MMAP2 record itself, so it would fail to
      symbolize samples for those regions.
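
The behaviour described above can be modelled in a few lines of C. This is a minimal sketch based only on the description in this message, not the code from the patch: every `sketch_` name is hypothetical, and the flag value merely models PERF_RECORD_MISC_MMAP_BUILD_ID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not perf's real types. */
#define SKETCH_MISC_MMAP_BUILD_ID (1u << 14) /* models PERF_RECORD_MISC_MMAP_BUILD_ID */

struct sketch_dso {
	bool hit; /* set once a build-id is known for this DSO */
};

struct sketch_mmap2 {
	unsigned int misc;
	struct sketch_dso *dso;
};

struct sketch_output {
	size_t nr_mmap2; /* MMAP2 records written to the output */
};

/*
 * Handle one MMAP2 record in a --buildid-all style pass.
 * Returns true when the caller would still have to inject a separate
 * build-id record for the DSO.
 */
static bool sketch_handle_mmap2(struct sketch_output *out,
				const struct sketch_mmap2 *ev)
{
	bool inject = false;

	if (ev->misc & SKETCH_MISC_MMAP_BUILD_ID) {
		/* The record already carries the build-id. */
		ev->dso->hit = true;
	} else if (!ev->dso->hit) {
		/* First sighting without a build-id: inject one, once. */
		ev->dso->hit = true;
		inject = true;
	}

	/* The point of the fix: the MMAP2 record is copied in every case. */
	out->nr_mmap2++;
	return inject;
}
```

Feeding the same DSO through twice, once with the flag and once without, yields no injected build-id record but two copied MMAP2 records, which matches the "number of events should not change" expectation.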
      
      For example, the following generates 249 MMAP2 events.
      
        $ perf record --buildid-mmap -o- true | perf report --stat -i- | grep MMAP2
                 MMAP2 events:        249  (86.8%)
      
      Adding perf inject should not change the number of events, like this:
      
        $ perf record --buildid-mmap -o- true | perf inject -b | \
        > perf report --stat -i- | grep MMAP2
                 MMAP2 events:        249  (86.5%)
      
      But when --buildid-all is used, it eats most of the MMAP2 events.
      
        $ perf record --buildid-mmap -o- true | perf inject -b --buildid-all | \
        > perf report --stat -i- | grep MMAP2
                 MMAP2 events:          1  ( 2.5%)
      
      With this patch, it shows the original number now.
      
        $ perf record --buildid-mmap -o- true | perf inject -b --buildid-all | \
        > perf report --stat -i- | grep MMAP2
                 MMAP2 events:        249  (86.5%)
      
      Committer testing:
      
      Before:
      
        $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b | perf report --stat -i- | grep MMAP2
                 MMAP2 events:         58  (36.2%)
        $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf report --stat -i- | grep MMAP2
                 MMAP2 events:         58  (36.2%)
        $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b --buildid-all | perf report --stat -i- | grep MMAP2
                 MMAP2 events:          2  ( 1.9%)
        $
      
      After:
      
        $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b | perf report --stat -i- | grep MMAP2
                 MMAP2 events:         58  (29.3%)
        $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf report --stat -i- | grep MMAP2
                 MMAP2 events:         58  (34.3%)
        $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b --buildid-all | perf report --stat -i- | grep MMAP2
                 MMAP2 events:         58  (38.4%)
        $
      
      Fixes: f7fc0d1c ("perf inject: Do not inject BUILD_ID record if MMAP2 has it")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20230223070155.54251-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 02 Feb, 2023 1 commit
  3. 14 Dec, 2022 1 commit
    • perf build: Use libtraceevent from the system · 378ef0f5
      Ian Rogers committed
      Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command
      line variables.
      
      If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the
      build, don't compile in libtraceevent and libtracefs support.
      
      This also disables CONFIG_TRACE that controls "perf trace".
      
      CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles,
      HAVE_LIBTRACEEVENT is used in C code.
      
      Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the
      commands kmem, kwork, lock, sched and timechart are removed.  The
      majority of commands continue to work including "perf test".
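
The split between the build-time CONFIG_LIBTRACEEVENT and the C-level HAVE_LIBTRACEEVENT can be illustrated with a small sketch. It is not taken from the perf sources: the `sketch_` names are hypothetical, and only the macro name and the five command names come from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The build system is assumed to pass -DHAVE_LIBTRACEEVENT when the
 * library was detected and NO_LIBTRACEEVENT=1 was not given. */
#ifdef HAVE_LIBTRACEEVENT
static const bool sketch_tracepoints_enabled = true;
#else
static const bool sketch_tracepoints_enabled = false;
#endif

/* Hypothetical lookup: the commands named above depend on tracepoints. */
static bool sketch_command_available(const char *name)
{
	static const char *const needs_traceevent[] = {
		"kmem", "kwork", "lock", "sched", "timechart",
	};
	size_t i;

	for (i = 0; i < sizeof(needs_traceevent) / sizeof(needs_traceevent[0]); i++) {
		if (strcmp(name, needs_traceevent[i]) == 0)
			return sketch_tracepoints_enabled;
	}
	return true; /* everything else keeps working */
}
```

Compiled without -DHAVE_LIBTRACEEVENT, `sketch_command_available("sched")` is false while `sketch_command_available("report")` stays true.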
      
      Committer notes:
      
      Fixed up a tools/perf/util/Build reject and added:
      
        #include <traceevent/event-parse.h>
      
      to tools/perf/util/scripting-engines/trace-event-perl.c.
      
      Committer testing:
      
        $ rpm -qi libtraceevent-devel
        Name        : libtraceevent-devel
        Version     : 1.5.3
        Release     : 2.fc36
        Architecture: x86_64
        Install Date: Mon 25 Jul 2022 03:20:19 PM -03
        Group       : Unspecified
        Size        : 27728
        License     : LGPLv2+ and GPLv2+
        Signature   : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4
        Source RPM  : libtraceevent-1.5.3-2.fc36.src.rpm
        Build Date  : Fri 15 Apr 2022 10:57:01 AM -03
        Build Host  : buildvm-x86-05.iad2.fedoraproject.org
        Packager    : Fedora Project
        Vendor      : Fedora Project
        URL         : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/
        Bug URL     : https://bugz.fedoraproject.org/libtraceevent
        Summary     : Development headers of libtraceevent
        Description :
        Development headers of libtraceevent-libs
        $
      
      Default build:
      
        $ ldd ~/bin/perf | grep tracee
        	libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000)
        $
      
        # perf trace -e sched:* --max-events 10
             0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1)
             0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1)
             0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120)
             1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120)
             1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120)
             0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2)
             0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2)
             0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120)
             1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1)
             1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120)
        #
      
      Had to tweak tools/perf/util/setup.py to make sure the python binding
      shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is
      present in CFLAGS.
      
      Building with NO_LIBTRACEEVENT=1 uncovered some more build failures:
      
      - Make building of data-convert-bt.c conditional on CONFIG_LIBTRACEEVENT=y
      
      - perf-$(CONFIG_LIBTRACEEVENT) += scripts/
      
      - bpf_kwork.o also needs to depend on CONFIG_LIBTRACEEVENT=y
      
      - The python binding needed some fixups and util/trace-event.c can't be
        built and linked with the python binding shared object, so remove it
        in tools/perf/util/setup.py and exclude it from the list of
        dependencies in the python/perf.so Makefile.perf target.
      
      Building without libtraceevent-devel installed uncovered more build
      failures:
      
      - The python binding tools/perf/util/python.c was assuming that
        traceevent/parse-events.h was always available, which was the case
        when we defaulted to using the in-kernel tools/lib/traceevent/ files.
        Now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like
        the other parts of it that deal with tracepoints.
      
      - We have to ifdef the rules in the Build files with
        CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and
        tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when
        setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't
        detect libtraceevent-devel installed in the system. The clean way is
        to simplify this, avoiding these two ways of disabling builtin-trace.c
        and not setting CONFIG_TRACE=y when libtraceevent-devel isn't
        installed.
      
      From Athira:
      
      <quote>
      tools/perf/arch/powerpc/util/Build
      -perf-y += kvm-stat.o
      +perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
      </quote>
      
      Then, ditto for arm64 and s390, detected by container cross build tests.
      
      - s/390 uses test__checkevent_tracepoint() that is now only available if
        HAVE_LIBTRACEEVENT is defined, so enclose the callsite with ifdef
        HAVE_LIBTRACEEVENT.
      
      Also from Athira:
      
      <quote>
      With this change, I could successfully compile in these environment:
      - Without libtraceevent-devel installed
      - With libtraceevent-devel installed
      - With “make NO_LIBTRACEEVENT=1”
      </quote>
      
      Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for
      consistency with other libraries detected in tools/perf/.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 04 Oct, 2022 2 commits
    • perf dso: Hold lock when accessing nsinfo · e54dea69
      Ian Rogers committed
      There may be threads racing to update dso->nsinfo:
      
        https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
      
      Holding the dso->lock avoids use-after-free, memory leaks and other such
      bugs. Apply the fix in:
      
        https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
      
      of there being a missing nsinfo__put now that the accesses are data race
      free. Fixes test "Lookup mmap thread" when compiled with address
      sanitizer.
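
The locking pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the perf implementation: the `sketch_` types are hypothetical, a pthread mutex stands in for dso->lock, and a plain int stands in for perf's reference count.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for nsinfo and dso. */
struct sketch_nsinfo {
	int refcnt; /* plain int: only touched under the dso lock in this sketch */
};

struct sketch_dso {
	pthread_mutex_t lock;
	struct sketch_nsinfo *nsinfo;
};

static struct sketch_nsinfo *sketch_nsinfo_new(void)
{
	struct sketch_nsinfo *nsi = calloc(1, sizeof(*nsi));

	if (nsi)
		nsi->refcnt = 1;
	return nsi;
}

/* Drop one reference; free the object when the last one goes away. */
static void sketch_nsinfo_put(struct sketch_nsinfo *nsi)
{
	if (nsi && --nsi->refcnt == 0)
		free(nsi);
}

/*
 * Replace dso->nsinfo while holding the lock. The old pointer is put
 * inside the critical section, so a racing updater can neither use it
 * after it was freed nor leak it by overwriting it without a put.
 */
static void sketch_dso_set_nsinfo(struct sketch_dso *dso,
				  struct sketch_nsinfo *nsi)
{
	pthread_mutex_lock(&dso->lock);
	sketch_nsinfo_put(dso->nsinfo);
	dso->nsinfo = nsi;
	pthread_mutex_unlock(&dso->lock);
}
```

Routing every update through one helper is what makes the otherwise easy-to-miss put of the old value hard to forget.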
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexandre Truong <alexandre.truong@arm.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andres Freund <andres@anarazel.de>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: André Almeida <andrealmeid@igalia.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: Colin Ian King <colin.king@intel.com>
      Cc: Dario Petrillo <dario.pk1@gmail.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Dave Marchevsky <davemarchevsky@fb.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Fangrui Song <maskray@google.com>
      Cc: Hewenliang <hewenliang4@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jason Wang <wangborong@cdjrlc.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Monnet <quentin@isovalent.com>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Remi Bernon <rbernon@codeweavers.com>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Tom Rix <trix@redhat.com>
      Cc: Weiguo Li <liwg06@foxmail.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: William Cohen <wcohen@redhat.com>
      Cc: Zechuan Chen <chenzechuan1@huawei.com>
      Cc: bpf@vger.kernel.org
      Cc: llvm@lists.linux.dev
      Cc: yaowenbin <yaowenbin1@huawei.com>
      Link: https://lore.kernel.org/r/20220826164242.43412-15-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf inject: Add a command line option to specify build ids. · 8012243e
      Raul Silvera committed
      This commit adds the option --known-build-ids to perf inject.
      It allows the user to explicitly specify the build id for a given
      path, instead of retrieving it from the current system. This is
      useful in cases where a perf.data file is processed on a different
      system from where it was collected, or if some of the binaries are
      no longer available.
      
      The build ids and paths are specified in pairs in the command line.
      Using the file:// specifier, build ids can be loaded from a file
      directly generated by perf buildid-list. This is convenient to copy
      build ids from one perf.data file to another.
      
      ** Example: In this example we use perf record to create two
      perf.data files, one with build ids and another without, and use
      perf buildid-list and perf inject to copy the build ids from the
      first file to the second.
      
       $ perf record ls /tmp
       $ perf record --no-buildid -o perf.data.no-buildid ls /tmp
       $ perf buildid-list > build-ids.txt
       $ perf inject -b --known-build-ids='file://build-ids.txt' \
              -i perf.data.no-buildid -o perf.data.buildid
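
To make the file:// input more concrete, here is a sketch of parsing one line of such a file. It assumes each line has the form "<hex build id> <path>", which is how perf buildid-list output is understood here. The `sketch_` names and size limits are hypothetical, and this is not the parser from the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUILD_ID_HEX_MAX 40 /* assumption: up to 20-byte build ids */
#define SKETCH_PATH_MAX 256        /* arbitrary limit for the sketch */

struct sketch_known_build_id {
	char hex[SKETCH_BUILD_ID_HEX_MAX + 1];
	char path[SKETCH_PATH_MAX];
};

/* Parse "<hex build id> <path>\n"; returns false on malformed input. */
static bool sketch_parse_build_id_line(const char *line,
				       struct sketch_known_build_id *out)
{
	size_t n = 0;
	const char *path;
	size_t len;

	while (isxdigit((unsigned char)line[n]))
		n++;
	/* Need a non-empty, whole number of hex byte pairs. */
	if (n == 0 || n > SKETCH_BUILD_ID_HEX_MAX || n % 2 != 0)
		return false;
	if (line[n] != ' ')
		return false;

	path = line + n + 1;
	len = strcspn(path, "\n");
	if (len == 0 || len >= SKETCH_PATH_MAX)
		return false;

	memcpy(out->hex, line, n);
	out->hex[n] = '\0';
	memcpy(out->path, path, len);
	out->path[len] = '\0';
	return true;
}
```

A line such as "0123abcd /usr/bin/ls" parses into the pair ("0123abcd", "/usr/bin/ls"), while a line with no leading hex id is rejected.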
      
      Signed-off-by: Raul Silvera <rsilvera@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220815225922.2118745-1-rsilvera@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 27 Jul, 2022 1 commit
  6. 20 Jul, 2022 1 commit
    • perf inject: Add support for injecting guest sideband events · 97406a7e
      Adrian Hunter committed
      Inject events from a perf.data file recorded in a virtual machine into
      a perf.data file recorded on the host at the same time.
      
      Only side band events (e.g. mmap, comm, fork, exit etc) and build IDs are
      injected.  Additionally, the guest kcore_dir is copied as kcore_dir__
      followed by the machine PID.
      
      This is non-trivial because:
       o It is not possible to process 2 sessions simultaneously so instead
       events are first written to a temporary file.
       o To avoid conflict, guest sample IDs are replaced with new unused sample
       IDs.
       o Guest event's CPU is changed to be the host CPU because it is more
       useful for reporting and analysis.
       o Sample ID is mapped to machine PID which is recorded with VCPU in the
       id index. This is important to allow guest events to be related to the
       guest machine and VCPU.
       o Timestamps must be converted.
       o Events are inserted to obey finished-round ordering.
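
The sample ID replacement step in the list above can be sketched as a small lookup table. This only illustrates the idea of handing out unused IDs; the `sketch_` names, the fixed table size and the use of 0 as a failure value are assumptions, not details of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_IDS 64 /* arbitrary capacity for the sketch */

struct sketch_id_map {
	uint64_t guest_id[SKETCH_MAX_IDS];
	uint64_t host_id[SKETCH_MAX_IDS];
	size_t nr;
	uint64_t next_unused; /* first ID not used by the host perf.data */
};

/*
 * Return the replacement ID for a guest sample ID, allocating a new
 * unused one the first time a guest ID is seen. Returns 0 when the
 * table is full (0 is treated as "no ID" in this sketch only).
 */
static uint64_t sketch_map_guest_id(struct sketch_id_map *map, uint64_t guest_id)
{
	size_t i;

	for (i = 0; i < map->nr; i++) {
		if (map->guest_id[i] == guest_id)
			return map->host_id[i];
	}
	if (map->nr == SKETCH_MAX_IDS)
		return 0;

	map->guest_id[map->nr] = guest_id;
	map->host_id[map->nr] = map->next_unused++;
	return map->host_id[map->nr++];
}
```

Because the mapping is remembered, every event carrying the same guest ID ends up with the same replacement, so guest events stay grouped after injection.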
      
      The anticipated use-case is:
       - start recording sideband events in a guest machine
       - start recording an AUX area trace on the host which can also trace the
       guest (e.g. Intel PT)
       - run test case on the guest
       - stop recording on the host
       - stop recording on the guest
       - copy the guest perf.data file to the host
       - inject the guest perf.data file sideband events into the host perf.data
       file using perf inject
       - the resulting perf.data file can now be used
      
      Subsequent patches provide Intel PT support for this.
      
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: kvm@vger.kernel.org
      Link: https://lore.kernel.org/r/20220711093218.10967-25-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  7. 26 Jun, 2022 2 commits
  8. 25 Jun, 2022 1 commit
    • perf header: Record non-CPU PMU capabilities · 2139f742
      Ravi Bangoria committed
      PMUs advertise their capabilities via sysfs attribute files but
      the perf tool currently parses only core (CPU) or hybrid core PMU
      capabilities. Add support for recording non-core PMU capabilities
      in the perf.data header.
      
      Note that a newly proposed HEADER_PMU_CAPS is replacing the existing
      HEADER_HYBRID_CPU_PMU_CAPS. Special care is taken for hybrid core
      PMUs by writing their capabilities first in the perf.data header
      to make sure that a new perf.data file read by an old perf tool does
      not break.
      
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rrichter@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: like.xu.linux@gmail.com
      Cc: x86@kernel.org
      Link: https://lore.kernel.org/r/20220604044519.594-6-ravi.bangoria@amd.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  9. 23 Jun, 2022 1 commit
    • perf record: Add finished init event · 3812d298
      Adrian Hunter committed
      Add a finished init event in preparation for recording sideband events
      in a virtual machine guest so that they can be injected into a host
      perf.data file.
      
      This is needed to enable injecting events after the initial synthesized
      user events (that have an all zero id sample) but before regular events.
      
      Committer notes:
      
      Add entry about PERF_RECORD_FINISHED_INIT to
      tools/perf/Documentation/perf.data-file-format.txt.
      
      Committer testing:
      
      Before:
      
        # perf report -D | grep FINISHED
        0 0x5910 [0x8]: PERF_RECORD_FINISHED_ROUND
          FINISHED_ROUND events:          1  ( 0.5%)
        #
      
      After:
      
        # perf record -- sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.020 MB perf.data (7 samples) ]
        # perf report -D | grep FINISHED
        0 0x5068 [0x8]: PERF_RECORD_FINISHED_INIT: unhandled!
        0 0x5390 [0x8]: PERF_RECORD_FINISHED_ROUND
          FINISHED_ROUND events:          1  ( 0.5%)
           FINISHED_INIT events:          1  ( 0.5%)
        #
      
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220610113316.6682-5-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  10. 23 May, 2022 2 commits
    • perf inject: Keep a copy of kcore_dir · d8fc0855
      Adrian Hunter committed
      If the input perf.data has a kcore_dir, copy it into the output, since
      at least the kallsyms in the kcore_dir will be useful to the output.
      
      Example:
      
       Before:
      
        $ ls -lR perf.data-from-desktop
        perf.data-from-desktop:
        total 916
        -rw------- 1 user user 931756 May 19 09:55 data
        drwx------ 2 user user   4096 May 19 09:55 kcore_dir
      
        perf.data-from-desktop/kcore_dir:
        total 42952
        -r-------- 1 user user  7582467 May 19 09:55 kallsyms
        -r-------- 1 user user 36388864 May 19 09:55 kcore
        -r-------- 1 user user     4828 May 19 09:55 modules
      
        $ perf inject -i perf.data-from-desktop -o injected-perf.data
      
        $ ls -lR injected-perf.data
        -rw------- 1 user user 931320 May 20 15:08 injected-perf.data
      
       After:
      
        $ perf inject -i perf.data-from-desktop -o injected-perf.data
      
        $ ls -lR injected-perf.data
        injected-perf.data:
        total 916
        -rw------- 1 user user 931320 May 20 15:21 data
        drwx------ 2 user user   4096 May 20 15:21 kcore_dir
      
        injected-perf.data/kcore_dir:
        total 42952
        -r-------- 1 user user  7582467 May 20 15:21 kallsyms
        -r-------- 1 user user 36388864 May 20 15:21 kcore
        -r-------- 1 user user     4828 May 20 15:21 modules
      
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220520132404.25853-6-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf inject: Keep some features sections from input file · 180b3d06
      Adrian Hunter committed
      perf inject overwrites feature sections with information from the current
      machine. It makes more sense to keep original information that describes
      the machine or software when perf record was run.
      
      Example: perf.data from "Desktop" injected on "nuc11"
      
       Before:
      
        $ perf script --header-only -i perf.data-from-desktop | head -15
        # ========
        # captured on    : Thu May 19 09:55:50 2022
        # header version : 1
        # data offset    : 1208
        # data size      : 837480
        # feat offset    : 838688
        # hostname : Desktop
        # os release : 5.13.0-41-generic
        # perf version : 5.18.rc5.gac837f7ca7ed
        # arch : x86_64
        # nrcpus online : 28
        # nrcpus avail : 28
        # cpudesc : Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz
        # cpuid : GenuineIntel,6,85,4
        # total memory : 65548656 kB
      
        $ perf inject -i perf.data-from-desktop -o injected-perf.data
      
        $ perf script --header-only -i injected-perf.data | head -15
        # ========
        # captured on    : Fri May 20 15:06:55 2022
        # header version : 1
        # data offset    : 1208
        # data size      : 837480
        # feat offset    : 838688
        # hostname : nuc11
        # os release : 5.17.5-local
        # perf version : 5.18.rc5.g0f828fdeb9af
        # arch : x86_64
        # nrcpus online : 8
        # nrcpus avail : 8
        # cpudesc : 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
        # cpuid : GenuineIntel,6,140,1
        # total memory : 16012124 kB
      
       After:
      
        $ perf inject -i perf.data-from-desktop -o injected-perf.data
      
        $ perf script --header-only -i injected-perf.data | head -15
        # ========
        # captured on    : Fri May 20 15:08:54 2022
        # header version : 1
        # data offset    : 1208
        # data size      : 837480
        # feat offset    : 838688
        # hostname : Desktop
        # os release : 5.13.0-41-generic
        # perf version : 5.18.rc5.gac837f7ca7ed
        # arch : x86_64
        # nrcpus online : 28
        # nrcpus avail : 28
        # cpudesc : Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz
        # cpuid : GenuineIntel,6,85,4
        # total memory : 65548656 kB
      
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220520132404.25853-4-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  11. 12 Feb, 2022 1 commit
    • perf namespaces: Add functions to access nsinfo · bcaf0a97
      Ian Rogers committed
      Having functions to access nsinfo reduces the places where reference
      counting checking needs to be added.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: André Almeida <andrealmeid@collabora.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: http://lore.kernel.org/lkml/20220211103415.2737789-14-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  12. 11 Feb, 2022 2 commits
  13. 23 Jan, 2022 1 commit
  14. 18 Dec, 2021 2 commits
    • perf inject: Fix segfault due to perf_data__fd() without open · c271a55b
      Adrian Hunter committed
      The fixed commit attempts to get the output file descriptor even if the
      file was never opened, e.g.:
      
        $ perf record uname
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.002 MB perf.data (7 samples) ]
        $ perf inject -i perf.data --vm-time-correlation=dry-run
        Segmentation fault (core dumped)
        $ gdb --quiet perf
        Reading symbols from perf...
        (gdb) r inject -i perf.data --vm-time-correlation=dry-run
        Starting program: /home/ahunter/bin/perf inject -i perf.data --vm-time-correlation=dry-run
        [Thread debugging using libthread_db enabled]
        Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
      
        Program received signal SIGSEGV, Segmentation fault.
        __GI___fileno (fp=0x0) at fileno.c:35
        35      fileno.c: No such file or directory.
        (gdb) bt
        #0  __GI___fileno (fp=0x0) at fileno.c:35
        #1  0x00005621e48dd987 in perf_data__fd (data=0x7fff4c68bd08) at util/data.h:72
        #2  perf_data__fd (data=0x7fff4c68bd08) at util/data.h:69
        #3  cmd_inject (argc=<optimized out>, argv=0x7fff4c69c1f0) at builtin-inject.c:1017
        #4  0x00005621e4936783 in run_builtin (p=0x5621e4ee6878 <commands+600>, argc=4, argv=0x7fff4c69c1f0) at perf.c:313
        #5  0x00005621e4897d5c in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:365
        #6  run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:409
        #7  main (argc=4, argv=0x7fff4c69c1f0) at perf.c:539
        (gdb)
      
      Fixes: 0ae03893 ("perf tools: Pass a fd to perf_file_header__read_pipe()")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: stable@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20211213084829.114772-3-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf inject: Fix segfault due to close without open · 0c8e32fe
      Adrian Hunter committed
      The fixed commit attempts to close inject.output even if it was never
      opened, e.g.:
      
        $ perf record uname
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.002 MB perf.data (7 samples) ]
        $ perf inject -i perf.data --vm-time-correlation=dry-run
        Segmentation fault (core dumped)
        $ gdb --quiet perf
        Reading symbols from perf...
        (gdb) r inject -i perf.data --vm-time-correlation=dry-run
        Starting program: /home/ahunter/bin/perf inject -i perf.data --vm-time-correlation=dry-run
        [Thread debugging using libthread_db enabled]
        Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
      
        Program received signal SIGSEGV, Segmentation fault.
        0x00007eff8afeef5b in _IO_new_fclose (fp=0x0) at iofclose.c:48
        48      iofclose.c: No such file or directory.
        (gdb) bt
        #0  0x00007eff8afeef5b in _IO_new_fclose (fp=0x0) at iofclose.c:48
        #1  0x0000557fc7b74f92 in perf_data__close (data=data@entry=0x7ffcdafa6578) at util/data.c:376
        #2  0x0000557fc7a6b807 in cmd_inject (argc=<optimized out>, argv=<optimized out>) at builtin-inject.c:1085
        #3  0x0000557fc7ac4783 in run_builtin (p=0x557fc8074878 <commands+600>, argc=4, argv=0x7ffcdafb6a60) at perf.c:313
        #4  0x0000557fc7a25d5c in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:365
        #5  run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:409
        #6  main (argc=4, argv=0x7ffcdafb6a60) at perf.c:539
        (gdb)
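
Both this crash and the perf_data__fd() one in the previous entry come from passing a NULL FILE * to a stdio function (fclose() here, fileno() there). A minimal sketch of the defensive pattern follows. The `sketch_` names are hypothetical and this is not necessarily how the patches avoid the calls; it only shows the kind of guard that prevents the NULL dereference.

```c
#define _POSIX_C_SOURCE 200809L /* for fileno() */
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the output state. */
struct sketch_data {
	FILE *fptr; /* NULL until the output has actually been opened */
};

/* Return the descriptor, or -1 when the stream was never opened. */
static int sketch_data_fd(const struct sketch_data *data)
{
	return data->fptr ? fileno(data->fptr) : -1;
}

/* Close the stream if it is open; closing a never-opened one is a no-op. */
static int sketch_data_close(struct sketch_data *data)
{
	int ret = 0;

	if (data->fptr) {
		ret = fclose(data->fptr);
		data->fptr = NULL; /* make a second close harmless */
	}
	return ret;
}
```

With `fptr` left NULL, as in a dry run that never opens an output file, `sketch_data_fd()` returns -1 and `sketch_data_close()` returns 0 instead of crashing.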
      
      Fixes: 02e6246f ("perf inject: Close inject.output on exit")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: stable@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20211213084829.114772-2-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0c8e32fe
  15. 07 Dec, 2021 1 commit
  16. 07 Nov, 2021 1 commit
  17. 20 Oct, 2021 1 commit
  18. 02 Aug, 2021 4 commits
  19. 16 Jul, 2021 2 commits
  20. 26 May, 2021 2 commits
  21. 12 May, 2021 2 commits
  22. 05 Apr, 2021 1 commit
  23. 04 Feb, 2021 1 commit
    • Y
      perf inject jit: Add namespaces support · 67dec926
      Yonatan Goldschmidt committed
      This patch fixes "perf inject --jit" to properly operate on
      namespaced/containerized processes:
      
      * jitdump files are generated by the process, thus they should be
        looked up in its mount NS.
      
      * DSOs of injected MMAP events will later be looked up in the process
        mount NS, so write them into its NS.
      
      * PIDs & TIDs from jitdump events need to be translated to the PID as
        seen by "perf record" before being written into MMAP events.
      
      For a process in a different PID NS, the TID & PID given in the jitdump
      event are actually ignored; I use the TID & PID of the thread which
      mmap()ed the jitdump file. This is simplified and won't do for forks of
      the initial process, if they continue using the same jitdump file.
      Future patches might improve it.
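
The mismatch between the IDs inside a jitdump file and the IDs seen by "perf record" can be illustrated with the "NSpid:" line of /proc/<pid>/status, which lists a thread's ID in each nested PID namespace, outermost first. The sketch below is only an illustration of that mismatch, not the approach taken by the patch (which reuses the TID & PID of the thread that mmap()ed the jitdump file); `parse_nspid_line()` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: parse an "NSpid:" line as
 * found in /proc/<pid>/status.  The first value is the ID as seen from the
 * namespace of the procfs mount, the last one is the ID in the innermost
 * PID namespace (what the process itself sees).
 * Returns 0 on success, -1 if the line is not a usable NSpid line.
 */
static int parse_nspid_line(const char *line, long *outer, long *inner)
{
	const char *p;
	char *end;
	long val;
	int count = 0;

	assert(line && outer && inner);

	if (strncmp(line, "NSpid:", 6) != 0)
		return -1;

	p = line + 6;
	for (;;) {
		val = strtol(p, &end, 10);	/* skips leading tabs/spaces */
		if (end == p)			/* no further number on the line */
			break;
		if (count == 0)
			*outer = val;
		*inner = val;
		count++;
		p = end;
	}
	return count > 0 ? 0 : -1;
}
```

For a containerized process the two values differ, which is why IDs written by the process itself cannot be copied verbatim into MMAP events.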
      
      This was tested by recording a NodeJS process running with
      "--perf-prof", inside a Docker container, and by recording another
      NodeJS process running in the same namespaces as perf itself, to make
      sure it's not broken for non-containerized processes.
      Signed-off-by: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20201105015604.1726943-1-yonatan.goldschmidt@granulate.io
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      67dec926
  24. 17 Nov, 2020 2 commits
    • A
      perf inject: Fix file corruption due to event deletion · 1c756cd4
      Al Grant committed
      "perf inject" can create corrupt files when synthesizing sample events from AUX
      data. This happens when in the input file, the first event (for the AUX data)
      has a different sample_type from the second event (generally dummy).
      
      Specifically, they differ in the bits that indicate the standard fields
      appended to perf records in the mmap buffer. "perf inject" deletes the first
      event and moves up the second event to first position.
      
      The problem is with the synthetic PERF_RECORD_MMAP (etc.) events created
      by "perf record".
      
      Since these are synthetic versions of events which are normally produced
      by the kernel, they have to have the standard fields appended as
      described by sample_type.
      
      "perf record" fills these in with zeroes, including the IDENTIFIER
      field; perf readers interpret records with zero IDENTIFIER using the
      descriptor for the first event in the file.
      
      Since "perf inject" changes the first event, these synthetic records are
      then processed with the wrong value of sample_type, and the perf reader
      reads bad data, reports on incorrect length records etc.
      
      Mismatching sample_types are seen with "perf record -e cs_etm//", where the AUX
      event has TID|TIME|CPU|IDENTIFIER and the dummy event has TID|TIME|IDENTIFIER.
      
      Perhaps they could be the same, but it isn't normally a problem if they aren't
      - perf has no problems reading the file.
      
      The sample_types have to agree on the position of IDENTIFIER, because
      that's how perf finds the right event descriptor in the first place, but
      they don't normally have to agree on other fields, and perf doesn't
      check that they do.
      
      The problem is specific to the way "perf inject" reorganizes the events
      and the way synthetic MMAP events are recorded with a zero identifier. A
      simple solution is to stop "perf inject" deleting the tracing event.
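
The failure mode described above can be sketched in a few lines. This is a simplified model, not perf's actual reader: `struct event_desc`, `lookup_desc()` and the `ST_*` flag values are hypothetical names and placeholder bits chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits; the values are placeholders, not the kernel's. */
enum {
	ST_TID        = 1 << 0,
	ST_TIME       = 1 << 1,
	ST_CPU        = 1 << 2,
	ST_IDENTIFIER = 1 << 3,
};

/* Hypothetical, simplified stand-in for an event descriptor in a perf.data file. */
struct event_desc {
	uint64_t id;		/* identifier carried in each record */
	uint64_t sample_type;	/* which standard fields are appended to records */
};

/*
 * Pick the descriptor used to parse a record.  A record whose identifier is
 * zero (as in the synthetic MMAP records) falls back to the first descriptor
 * in the file.  Returns NULL for an unknown non-zero identifier.
 */
static const struct event_desc *
lookup_desc(const struct event_desc *descs, size_t nr, uint64_t id)
{
	size_t i;

	assert(descs && nr > 0);

	if (id == 0)
		return &descs[0];
	for (i = 0; i < nr; i++)
		if (descs[i].id == id)
			return &descs[i];
	return NULL;
}
```

If the first descriptor is deleted, a zero-identifier record that was written with TID|TIME|CPU|IDENTIFIER gets parsed with the TID|TIME|IDENTIFIER layout of the event that moved into first position, so the appended fields are read at the wrong offsets.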
      
      Committer testing
      
      Removed the now unused 'evsel' variable, updated the comment about the
      evsel removal not being performed anymore, and applied the patch manually
      as it failed with this warning:
      
        warning: Patch sent with format=flowed; space at the end of lines might be lost.
      
      Testing it with:
      
        $ perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.543 msec (+- 0.130 msec)
          Average time per event: 0.838 usec (+- 0.013 usec)
          Average memory usage: 12717 KB (+- 9 KB)
          Average build-id-all injection took: 5.710 msec (+- 0.058 msec)
          Average time per event: 0.560 usec (+- 0.006 usec)
          Average memory usage: 12079 KB (+- 7 KB)
        $
      Signed-off-by: Al Grant <al.grant@arm.com>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LPU-Reference: b9cf5611-daae-2390-3439-6617f8f0a34b@foss.arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1c756cd4
    • N
      perf data: Allow to use stdio functions for pipe mode · 60136667
      Namhyung Kim committed
      When perf data is in a pipe, perf reads each event separately with the
      read(2) syscall.  This is a huge performance bottleneck when
      processing large data, as in perf inject.  perf inject also has to
      use the write(2) syscall for its output.
      
      So convert it to use the buffered I/O functions of the stdio library for
      pipe data.  This makes the inject-build-id bench time drop from 20ms to 8ms.
      
        $ perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.074 msec (+- 0.013 msec)
          Average time per event: 0.792 usec (+- 0.001 usec)
          Average memory usage: 8328 KB (+- 0 KB)
          Average build-id-all injection took: 5.490 msec (+- 0.008 msec)
          Average time per event: 0.538 usec (+- 0.001 usec)
          Average memory usage: 7563 KB (+- 0 KB)
      
      This patch enables it only for perf inject when used with a pipe (which
      is the default behavior).  Maybe we could do it for perf record and/or
      report later.
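
The idea of trading one read(2) per event for buffered stdio reads can be sketched as follows. This is a minimal illustration rather than the patch's code: `struct rec_header` and `read_record()` are hypothetical names, and the header layout (a size field that counts the header itself) is a simplifying assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified record header for illustration; 'size' counts the header too. */
struct rec_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/*
 * Read one record from a buffered stream into buf (which must be suitably
 * aligned for struct rec_header).  stdio pulls data from the pipe in large
 * chunks, so most calls are served from the FILE buffer without a syscall.
 * Returns the record size, 0 if the stream ended before a full header was
 * read, or -1 on a malformed or truncated record.
 */
static int read_record(FILE *fp, void *buf, size_t buflen)
{
	struct rec_header *hdr = buf;
	size_t payload;

	assert(fp && buf && buflen >= sizeof(*hdr));

	if (fread(hdr, sizeof(*hdr), 1, fp) != 1)
		return feof(fp) ? 0 : -1;

	if (hdr->size < sizeof(*hdr) || hdr->size > buflen)
		return -1;

	payload = hdr->size - sizeof(*hdr);
	if (payload && fread((char *)buf + sizeof(*hdr), payload, 1, fp) != 1)
		return -1;

	return hdr->size;
}
```

A caller would wrap the pipe file descriptor once with fdopen(3) and then loop on `read_record()` until it returns 0; the output side can use fwrite(3) on a FILE in the same way.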
      
      Committer testing:
      
      Before:
      
        $ perf stat -r 5 perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.605 msec (+- 0.064 msec)
          Average time per event: 1.334 usec (+- 0.006 usec)
          Average memory usage: 12220 KB (+- 7 KB)
          Average build-id-all injection took: 11.458 msec (+- 0.058 msec)
          Average time per event: 1.123 usec (+- 0.006 usec)
          Average memory usage: 11546 KB (+- 8 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.673 msec (+- 0.057 msec)
          Average time per event: 1.341 usec (+- 0.006 usec)
          Average memory usage: 12508 KB (+- 8 KB)
          Average build-id-all injection took: 11.437 msec (+- 0.046 msec)
          Average time per event: 1.121 usec (+- 0.004 usec)
          Average memory usage: 11812 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.641 msec (+- 0.069 msec)
          Average time per event: 1.337 usec (+- 0.007 usec)
          Average memory usage: 12302 KB (+- 8 KB)
          Average build-id-all injection took: 10.820 msec (+- 0.106 msec)
          Average time per event: 1.061 usec (+- 0.010 usec)
          Average memory usage: 11616 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.379 msec (+- 0.074 msec)
          Average time per event: 1.312 usec (+- 0.007 usec)
          Average memory usage: 12334 KB (+- 8 KB)
          Average build-id-all injection took: 11.288 msec (+- 0.071 msec)
          Average time per event: 1.107 usec (+- 0.007 usec)
          Average memory usage: 11657 KB (+- 8 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.534 msec (+- 0.058 msec)
          Average time per event: 1.327 usec (+- 0.006 usec)
          Average memory usage: 12264 KB (+- 8 KB)
          Average build-id-all injection took: 11.557 msec (+- 0.076 msec)
          Average time per event: 1.133 usec (+- 0.007 usec)
          Average memory usage: 11593 KB (+- 8 KB)
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  4,060.05 msec task-clock:u              #    1.566 CPUs utilized            ( +-  0.65% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                   101,888      page-faults:u             #    0.025 M/sec                    ( +-  0.12% )
             3,745,833,163      cycles:u                  #    0.923 GHz                      ( +-  0.10% )  (83.22%)
               194,346,613      stalled-cycles-frontend:u #    5.19% frontend cycles idle     ( +-  0.57% )  (83.30%)
               708,495,034      stalled-cycles-backend:u  #   18.91% backend cycles idle      ( +-  0.48% )  (83.48%)
             5,629,328,628      instructions:u            #    1.50  insn per cycle
                                                          #    0.13  stalled cycles per insn  ( +-  0.21% )  (83.57%)
             1,236,697,927      branches:u                #  304.602 M/sec                    ( +-  0.16% )  (83.44%)
                17,564,877      branch-misses:u           #    1.42% of all branches          ( +-  0.23% )  (82.99%)
      
                    2.5934 +- 0.0128 seconds time elapsed  ( +-  0.49% )
      
        $
      
      After:
      
        $ perf stat -r 5 perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.560 msec (+- 0.125 msec)
          Average time per event: 0.839 usec (+- 0.012 usec)
          Average memory usage: 12520 KB (+- 8 KB)
          Average build-id-all injection took: 5.789 msec (+- 0.054 msec)
          Average time per event: 0.568 usec (+- 0.005 usec)
          Average memory usage: 11919 KB (+- 9 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.639 msec (+- 0.111 msec)
          Average time per event: 0.847 usec (+- 0.011 usec)
          Average memory usage: 12732 KB (+- 8 KB)
          Average build-id-all injection took: 5.647 msec (+- 0.069 msec)
          Average time per event: 0.554 usec (+- 0.007 usec)
          Average memory usage: 12093 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.551 msec (+- 0.096 msec)
          Average time per event: 0.838 usec (+- 0.009 usec)
          Average memory usage: 12739 KB (+- 8 KB)
          Average build-id-all injection took: 5.617 msec (+- 0.061 msec)
          Average time per event: 0.551 usec (+- 0.006 usec)
          Average memory usage: 12105 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.403 msec (+- 0.097 msec)
          Average time per event: 0.824 usec (+- 0.010 usec)
          Average memory usage: 12770 KB (+- 8 KB)
          Average build-id-all injection took: 5.611 msec (+- 0.085 msec)
          Average time per event: 0.550 usec (+- 0.008 usec)
          Average memory usage: 12134 KB (+- 8 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.518 msec (+- 0.102 msec)
          Average time per event: 0.835 usec (+- 0.010 usec)
          Average memory usage: 12518 KB (+- 10 KB)
          Average build-id-all injection took: 5.503 msec (+- 0.073 msec)
          Average time per event: 0.540 usec (+- 0.007 usec)
          Average memory usage: 11882 KB (+- 8 KB)
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  2,394.88 msec task-clock:u              #    1.577 CPUs utilized            ( +-  0.83% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                   103,181      page-faults:u             #    0.043 M/sec                    ( +-  0.11% )
             3,548,172,030      cycles:u                  #    1.482 GHz                      ( +-  0.30% )  (83.26%)
                81,537,700      stalled-cycles-frontend:u #    2.30% frontend cycles idle     ( +-  1.54% )  (83.24%)
               876,631,544      stalled-cycles-backend:u  #   24.71% backend cycles idle      ( +-  1.14% )  (83.45%)
             5,960,361,707      instructions:u            #    1.68  insn per cycle
                                                          #    0.15  stalled cycles per insn  ( +-  0.27% )  (83.26%)
             1,269,413,491      branches:u                #  530.054 M/sec                    ( +-  0.10% )  (83.48%)
                11,372,453      branch-misses:u           #    0.90% of all branches          ( +-  0.52% )  (83.31%)
      
                   1.51874 +- 0.00642 seconds time elapsed  ( +-  0.42% )
      
        $
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20201030054742.87740-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      60136667
  25. 14 Oct, 2020 2 commits
  26. 13 Oct, 2020 2 commits
    • N
      perf inject: Add --buildid-all option · 27c9c342
      Namhyung Kim committed
      Like 'perf record', we can speed up build-id processing even more by just
      using all DSOs.  Then we don't need to look at all the sample events
      anymore.  The following patch will update 'perf bench' to show the result
      of the --buildid-all option too.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Original-patch-by: Stephane Eranian <eranian@google.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20201012070214.2074921-6-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      27c9c342
    • N
      perf inject: Do not load map/dso when injecting build-id · e7b60c5a
      Namhyung Kim committed
      There is no need to load symbols in a DSO when injecting build-ids.  I guess
      the reason was to check whether the DSO is a special file, like an anon file.
      Use some helper functions in map.c to check for them before reading the
      build-id.  Also pass the sample event's cpumode to the new build-id event.
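
A name-based check of this kind might look roughly like the sketch below. It is an assumption-laden illustration: `is_special_mapping()` is a hypothetical name, and the list of names is illustrative rather than the set tested by the helpers in map.c, which the commit message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical filter, not perf's helper: return true when a mapping name
 * does not refer to a regular on-disk file, so there is no point in opening
 * it to read a build-id.  The name list is illustrative only.
 */
static bool is_special_mapping(const char *name)
{
	assert(name);

	return strcmp(name, "//anon") == 0 ||		/* anonymous memory */
	       strcmp(name, "[heap]") == 0 ||		/* process heap */
	       strncmp(name, "[stack", 6) == 0 ||	/* [stack] and variants */
	       strncmp(name, "/dev/zero", 9) == 0;	/* zero-filled mapping */
}
```

Because the decision needs only the mapping name, it can be made without loading the DSO's symbols, which is where the time and memory savings reported below would come from.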
      
      It brought a speedup in the benchmark of 25 -> 21 msec on my laptop.
      Also the memory usage (Max RSS) went down by ~200 KB.
      
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 21.389 msec (+- 0.138 msec)
          Average time per event: 2.097 usec (+- 0.014 usec)
          Average memory usage: 8225 KB (+- 0 KB)
      
      Committer notes:
      
      Before:
      
        $ perf stat -r5 perf bench internals inject-build-id > /dev/null
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  4,020.56 msec task-clock:u              #    1.271 CPUs utilized            ( +-  0.74% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                   123,354      page-faults:u             #    0.031 M/sec                    ( +-  0.81% )
             7,119,951,568      cycles:u                  #    1.771 GHz                      ( +-  1.74% )  (83.27%)
               230,086,969      stalled-cycles-frontend:u #    3.23% frontend cycles idle     ( +-  1.97% )  (83.41%)
             1,168,298,765      stalled-cycles-backend:u  #   16.41% backend cycles idle      ( +-  1.13% )  (83.44%)
            11,173,083,669      instructions:u            #    1.57  insn per cycle
                                                          #    0.10  stalled cycles per insn  ( +-  1.58% )  (83.31%)
             2,413,908,936      branches:u                #  600.392 M/sec                    ( +-  1.69% )  (83.26%)
                46,576,289      branch-misses:u           #    1.93% of all branches          ( +-  2.20% )  (83.31%)
      
                    3.1638 +- 0.0309 seconds time elapsed  ( +-  0.98% )
      
        $
      
      After:
      
        $ perf stat -r5 perf bench internals inject-build-id > /dev/null
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  2,379.94 msec task-clock:u              #    1.473 CPUs utilized            ( +-  0.18% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                    62,584      page-faults:u             #    0.026 M/sec                    ( +-  0.07% )
             2,372,389,668      cycles:u                  #    0.997 GHz                      ( +-  0.29% )  (83.14%)
               106,937,862      stalled-cycles-frontend:u #    4.51% frontend cycles idle     ( +-  4.89% )  (83.20%)
               581,697,915      stalled-cycles-backend:u  #   24.52% backend cycles idle      ( +-  0.71% )  (83.47%)
             3,659,692,199      instructions:u            #    1.54  insn per cycle
                                                          #    0.16  stalled cycles per insn  ( +-  0.10% )  (83.63%)
               791,372,961      branches:u                #  332.518 M/sec                    ( +-  0.27% )  (83.39%)
                10,648,083      branch-misses:u           #    1.35% of all branches          ( +-  0.22% )  (83.16%)
      
                   1.61570 +- 0.00172 seconds time elapsed  ( +-  0.11% )
      
        $
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Original-patch-by: Stephane Eranian <eranian@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20201012070214.2074921-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e7b60c5a