1. 09 Sep, 2022 1 commit
    • J
      perf tools: Don't install data files with x permissions · 0a9eaf61
      Jiri Slaby authored
      install(1), by default, installs with rwxr-xr-x permissions. Modify
      perf's Makefile to pass '-m 644' when installing:
      
        * Documentation/tips.txt
        * examples/bpf/*
        * perf-completion.sh
        * perf_dlfilter.h header
        * scripts/perl/Perf-Trace-Util/lib/Perf/Trace/*
        * scripts/perl/*.pl
        * tests/attr/*
        * tests/attr.py
        * tests/shell/lib/*.sh
        * trace/strace/groups/*
      
      All of those are supposed to be non-executable. Either they are not
      scripts at all, or they have no shebang line.
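The difference between the default mode and '-m 644' can be checked with a small sketch like the one below. It assumes a POSIX shell and an install(1) whose default mode is rwxr-xr-x, as the commit message states; the file name and the temporary directory layout are made up for illustration and are not taken from perf's Makefile.

```shell
#!/bin/sh
# Sketch: install(1) default mode versus an explicit '-m 644'.
# The file name and directories are hypothetical.
tmp=$(mktemp -d)
printf 'sample data, not a script\n' > "$tmp/tips.txt"
mkdir "$tmp/default" "$tmp/fixed"

# Without -m, install uses rwxr-xr-x even for plain data files.
install "$tmp/tips.txt" "$tmp/default/tips.txt"
# With -m 644, the installed file is rw-r--r--.
install -m 644 "$tmp/tips.txt" "$tmp/fixed/tips.txt"

# Keep only the ten permission characters from the 'ls -l' output.
default_mode=$(ls -l "$tmp/default/tips.txt" | cut -c1-10)
fixed_mode=$(ls -l "$tmp/fixed/tips.txt" | cut -c1-10)
echo "default: $default_mode"
echo "fixed:   $fixed_mode"
rm -rf "$tmp"
```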
      
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220908060426.9619-1-jslaby@suse.cz
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0a9eaf61
  2. 10 Aug, 2022 1 commit
    • C
      perf test: JSON format checking · 0c343af2
      Claire Jensen authored
      Add field checking tests for perf stat JSON output.
      
      Sanity check that the expected number of fields is present, that the
      expected keys are present, and that they have the correct values.
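As a rough illustration of this kind of check (not the test added by this commit), the sketch below counts the key/value pairs on one JSON line and verifies that a set of expected keys is present. The sample line, the key list and the expected count are hypothetical; real perf stat JSON lines may carry more fields.

```shell
#!/bin/sh
# Hypothetical sample line; real perf stat JSON output may contain more fields.
line='{"counter-value" : "1.00", "unit" : "msec", "event" : "cpu-clock"}'
expected_keys='counter-value unit event'
expected_count=3

# Count the "key" : pairs on the line.
count=$(printf '%s\n' "$line" | grep -o '"[^"]*" *:' | wc -l | tr -d ' ')

status=ok
[ "$count" -eq "$expected_count" ] || status="wrong field count: $count"
for key in $expected_keys; do
    case $line in
        *"\"$key\" :"*) ;;                  # key found, nothing to do
        *) status="missing key: $key" ;;
    esac
done
echo "$status"
```

A value that itself contains a quoted colon would confuse the simple pattern used here, so a real linter would more likely use a proper JSON parser.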
      
      Committer notes:
      
      Had to fix this:
      
        -               $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib' \
        +               $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \
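Why the missing ';' matters can be seen in plain shell: a backslash at the end of a line joins it with the next one, so without a separator the words of the following command become extra arguments to the first. In the sketch below, 'show' is a hypothetical stand-in for $(INSTALL) that prints each argument it receives.

```shell
#!/bin/sh
# 'show' is a hypothetical stand-in for $(INSTALL): it prints every argument.
show() { printf '<%s>' "$@"; printf '\n'; }

# Missing ';': the continuation joins both lines into ONE command.
broken=$(show first \
show second)

# With ';': two separate commands.
fixed=$(show first; \
show second)

echo "broken: $broken"
echo "fixed:"
echo "$fixed"
```

In the broken form the second 'show' is passed as a plain argument to the first call, which is what would happen to whatever command follows the install line in the recipe.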
      
      Committer testing:
      
        [root@quaco ~]# perf test json
         90: perf stat JSON output linter                                    : Ok
        [root@quaco ~]# set -o vi
        [root@quaco ~]# perf test -v json
         90: perf stat JSON output linter                                    :
        --- start ---
        test child forked, pid 560794
        Checking json output: no args [Success]
        Checking json output: system wide [Success]
        Checking json output: system wide Checking json output: system wide no aggregation [Success]
        Checking json output: interval [Success]
        Checking json output: event [Success]
        Checking json output: per core [Success]
        Checking json output: per thread [Success]
        Checking json output: per die [Success]
        Checking json output: per node [Success]
        Checking json output: per socket [Success]
        test child finished with 0
        ---- end ----
        perf stat JSON output linter: Ok
        [root@quaco ~]#
      Signed-off-by: Claire Jensen <cjense@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alyssa Ross <hi@alyssa.is>
      Cc: Claire Jensen <clairej735@gmail.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Like Xu <likexu@tencent.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220805200105.2020995-3-irogers@google.com
      Signed-off-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0c343af2
  3. 01 Aug, 2022 1 commit
    • N
      perf lock: Use BPF for lock contention analysis · 407b36f6
      Namhyung Kim authored
      Add -b/--use-bpf option to use BPF to collect lock contention stats.
      For simplicity it now runs system-wide and requires C-c to stop.
      Upcoming changes will add the usual filtering.
      
        $ sudo perf lock con -b
        ^C
         contended   total wait     max wait     avg wait         type   caller
      
                42    192.67 us     13.64 us      4.59 us     spinlock   queue_work_on+0x20
                23     85.54 us     10.28 us      3.72 us     spinlock   worker_thread+0x14a
                 6     13.92 us      6.51 us      2.32 us        mutex   kernfs_iop_permission+0x30
                 3     11.59 us     10.04 us      3.86 us        mutex   kernfs_dop_revalidate+0x3c
                 1      7.52 us      7.52 us      7.52 us     spinlock   kthread+0x115
                 1      7.24 us      7.24 us      7.24 us     rwlock:W   sys_epoll_wait+0x148
                 2      7.08 us      3.99 us      3.54 us     spinlock   delayed_work_timer_fn+0x1b
                 1      6.41 us      6.41 us      6.41 us     spinlock   idle_balance+0xa06
                 2      2.50 us      1.83 us      1.25 us        mutex   kernfs_iop_lookup+0x2f
                 1      1.71 us      1.71 us      1.71 us        mutex   kernfs_iop_getattr+0x2c
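The "avg wait" column appears to be "total wait" divided by "contended"; that relation is an inference from the numbers, not something the commit message states. A quick check with awk on the first three rows copied from the output above:

```shell
#!/bin/sh
# Columns: contended count, total wait in us (copied from the sample output).
rows='42 192.67
23 85.54
6 13.92'

# Assumed relation: avg wait = total wait / contended, rounded to two decimals.
avgs=$(printf '%s\n' "$rows" | awk '{ printf "%.2f\n", $2 / $1 }')
echo "$avgs"
```

The results match the 4.59 us, 3.72 us and 2.32 us shown for those rows.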
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20220729200756.666106-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      407b36f6
  4. 27 Jul, 2022 1 commit
    • Y
      perf kwork: Implement BPF trace · daf07d22
      Yang Jihong authored
      'perf record' writes perf.data, which generates extra disk
      interrupts, and the amount of data to be collected grows over time.
      
      Tracing with eBPF processes the data in the kernel, which avoids
      both of these problems.
      
      Add -b/--use-bpf option for latency and report to support
      tracing kwork events using eBPF:
      
      1. Create bpf prog and attach to tracepoints,
      2. Start tracing after command is entered,
      3. After the user hits "ctrl+c", stop tracing and report,
      4. Support CPU and name filtering.
      
      This commit implements the framework code and
      does not add specific event support.
      
      Test cases:
      
        # perf kwork rep -h
      
         Usage: perf kwork report [<options>]
      
            -b, --use-bpf         Use BPF to measure kwork runtime
            -C, --cpu <cpu>       list of cpus to profile
            -i, --input <file>    input file name
            -n, --name <name>     event name to profile
            -s, --sort <key[,key2...]>
                                  sort by key(s): runtime, max, count
            -S, --with-summary    Show summary with statistics
                --time <str>      Time span for analysis (start,stop)
      
        # perf kwork lat -h
      
         Usage: perf kwork latency [<options>]
      
            -b, --use-bpf         Use BPF to measure kwork latency
            -C, --cpu <cpu>       list of cpus to profile
            -i, --input <file>    input file name
            -n, --name <name>     event name to profile
            -s, --sort <key[,key2...]>
                                  sort by key(s): avg, max, count
                --time <str>      Time span for analysis (start,stop)
      
        # perf kwork lat -b
        Unsupported bpf trace class irq
      
        # perf kwork rep -b
        Unsupported bpf trace class irq
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220709015033.38326-15-yangjihong1@huawei.com
      [ Simplify work_findnew() ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      daf07d22
  5. 30 Jun, 2022 3 commits
    • I
      perf jevents: Remove jevents.c · 5a059790
      Ian Rogers authored
      Remove files and build rules.
      
      Remove test for comparing with jevents.py as there is no longer a binary
      to compare with.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: John Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Kilroy <andrew.kilroy@arm.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Felix Fietkau <nbd@nbd.name>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
      Cc: Like Xu <likexu@tencent.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Forrington <nick.forrington@arm.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Liu <liuqi115@huawei.com>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220629182505.406269-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5a059790
    • I
      perf jevents: Switch build to use jevents.py · 00facc76
      Ian Rogers authored
      Generate pmu-events.c using jevents.py rather than the binary built from
      jevents.c.
      
      Add a new config variable NO_JEVENTS that is set when there is no
      architecture json or an appropriate python interpreter isn't present.
      
      When NO_JEVENTS is defined the file pmu-events/empty-pmu-events.c is
      copied and used as the pmu-events.c file.
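The fallback described above can be pictured roughly as follows. This is only a sketch of the decision, not the Makefile logic from the commit: the function name, the directory layout and the interpreter lookup are hypothetical, and the check for an "appropriate" Python version is left out.

```shell
#!/bin/sh
# Hypothetical helper: decide which source produces pmu-events.c.
choose_pmu_events_source() {
    arch_json_dir=$1   # directory expected to hold the architecture's JSON files
    python_cmd=$2      # interpreter to look for on PATH
    if [ -d "$arch_json_dir" ] && command -v "$python_cmd" >/dev/null 2>&1; then
        echo "jevents.py"
    else
        echo "empty-pmu-events.c"   # corresponds to the NO_JEVENTS case
    fi
}

tmp=$(mktemp -d)
mkdir "$tmp/arch"
# 'sh' stands in for the interpreter so the sketch runs without Python installed.
with_json=$(choose_pmu_events_source "$tmp/arch" sh)
without_json=$(choose_pmu_events_source "$tmp/missing" sh)
without_python=$(choose_pmu_events_source "$tmp/arch" no-such-interpreter-xyz)
echo "$with_json $without_json $without_python"
rm -rf "$tmp"
```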
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: John Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Kilroy <andrew.kilroy@arm.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Felix Fietkau <nbd@nbd.name>
      Cc: Ian Rogers <rogers.email@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
      Cc: Like Xu <likexu@tencent.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Forrington <nick.forrington@arm.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Liu <liuqi115@huawei.com>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220629182505.406269-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      00facc76
    • I
      perf jevents: Add python converter script · ffc606ad
      Ian Rogers authored
      jevents.c is large, has a dependency on an old forked version of jsmn,
      and is challenging to work upon. A lot of jevents.c's complexity comes
      from needing to write json and csv parsing from first principles. In
      contrast python has this functionality in standard libraries and is
      already a build pre-requisite for tools like asciidoc (that builds all
      of the perf man pages).
      
      Introduce jevents.py that produces identical output to jevents.c. Add a
      test that runs both converter tools and validates there are no output
      differences. The test can be invoked with a phony build target like:
      
        $ make -C tools/perf jevents-py-test
      
      The python code deliberately tries to replicate the behavior of
      jevents.c so that the output matches and transitioning tools shouldn't
      introduce regressions. In some cases the code isn't as elegant as hoped,
      but fixing this can be done as follow up.
      
      Committer testing:
      
        $ make -C tools/perf jevents-py-test
        make: Entering directory '/var/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j32' parallel build
          HOSTCC  fixdep.o
          HOSTLD  fixdep-in.o
          LINK    fixdep
      
        Auto-detecting system features:
        ...                         dwarf: [ on  ]
        ...            dwarf_getlocations: [ on  ]
        ...                         glibc: [ on  ]
        ...                        libbfd: [ on  ]
        ...                libbfd-buildid: [ on  ]
        ...                        libcap: [ on  ]
        ...                        libelf: [ on  ]
        ...                       libnuma: [ on  ]
        ...        numa_num_possible_cpus: [ on  ]
        ...                       libperl: [ on  ]
        ...                     libpython: [ on  ]
        ...                     libcrypto: [ OFF ]
        ...                     libunwind: [ on  ]
        ...            libdw-dwarf-unwind: [ on  ]
        ...                          zlib: [ on  ]
        ...                          lzma: [ on  ]
        ...                     get_cpuid: [ on  ]
        ...                           bpf: [ on  ]
        ...                        libaio: [ on  ]
        ...                       libzstd: [ on  ]
        ...        disassembler-four-args: [ on  ]
      
          HOSTCC  pmu-events/json.o
          HOSTCC  pmu-events/jsmn.o
          HOSTCC  pmu-events/jevents.o
          HOSTLD  pmu-events/jevents-in.o
          LINK    pmu-events/jevents
        Checking architecture: arm64
        Generating using jevents.c
        Generating using jevents.py
        Diffing
        Checking architecture: nds32
        Generating using jevents.c
        Generating using jevents.py
        Diffing
        Checking architecture: powerpc
        Generating using jevents.c
        Generating using jevents.py
        Diffing
        Checking architecture: s390
        Generating using jevents.c
        Generating using jevents.py
        Diffing
        Checking architecture: x86
        Generating using jevents.c
        Generating using jevents.py
        Diffing
        make: Leaving directory '/var/home/acme/git/perf/tools/perf'
        $
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: John Garry <john.garry@huawei.com>
      Tested-by: Thomas Richter <tmricht@linux.ibm.com>
      Tested-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Kilroy <andrew.kilroy@arm.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Felix Fietkau <nbd@nbd.name>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
      Cc: Like Xu <likexu@tencent.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Forrington <nick.forrington@arm.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Liu <liuqi115@huawei.com>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20220629182505.406269-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ffc606ad
  6. 26 May, 2022 1 commit
    • N
      perf record: Enable off-cpu analysis with BPF · edc41a10
      Namhyung Kim authored
      Add --off-cpu option to enable the off-cpu profiling with BPF.  It'd
      use a bpf_output event and rename it to "offcpu-time".  Samples will
      be synthesized at the end of the record session using data from a BPF
      map which contains the aggregated off-cpu time at context switches.
      So it needs root privilege to get the off-cpu profiling.
      
      Each sample will have a separate user stacktrace so it will skip
      kernel threads.  The sample ip will be set from the stacktrace and
      other sample data will be updated accordingly.  Currently it only
      handles some basic sample types.
      
      The sample timestamp is set to a dummy value so that it does not
      interfere with other events during sorting.  It starts at a very big
      value and increases as each sample is processed.
      
      The good thing is that it can be used together with regular profiling
      like cpu cycles.  If you don't want that, you can use a dummy event to
      enable off-cpu profiling only.
      
      Example output:
        $ sudo perf record --off-cpu perf bench sched messaging -l 1000
      
        $ sudo perf report --stdio --call-graph=no
        # Total Lost Samples: 0
        #
        # Samples: 41K of event 'cycles'
        # Event count (approx.): 42137343851
        ...
      
        # Samples: 1K of event 'offcpu-time'
        # Event count (approx.): 587990831640
        #
        # Children      Self  Command          Shared Object       Symbol
        # ........  ........  ...............  ..................  .........................
        #
            81.66%     0.00%  sched-messaging  libc-2.33.so        [.] __libc_start_main
            81.66%     0.00%  sched-messaging  perf                [.] cmd_bench
            81.66%     0.00%  sched-messaging  perf                [.] main
            81.66%     0.00%  sched-messaging  perf                [.] run_builtin
            81.43%     0.00%  sched-messaging  perf                [.] bench_sched_messaging
            40.86%    40.86%  sched-messaging  libpthread-2.33.so  [.] __read
            37.66%    37.66%  sched-messaging  libpthread-2.33.so  [.] __write
             2.91%     2.91%  sched-messaging  libc-2.33.so        [.] __poll
        ...
      
      As you can see, it spent most of the off-cpu time in read and write in
      bench_sched_messaging().  The --call-graph=no was added just to make
      the output concise here.
      
      It uses the perf hooks facility to control the BPF program during the
      record session rather than adding new BPF/off-cpu specific calls.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20220518224725.742882-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      edc41a10
  7. 28 Apr, 2022 1 commit
  8. 02 Apr, 2022 1 commit
    • J
      perf tools: Stop depending on .git files for building PERF-VERSION-FILE · d4ff9265
      John Garry authored
      This essentially reverts commit c72e3f04 ("tools/perf/build:
      Speed up git-version test on re-make") and commit 4e666cdb
      ("perf tools: Fix dependency for version file creation").
      
      In commit c72e3f04 ("tools/perf/build: Speed up git-version test
      on re-make"), a makefile dependency on .git/HEAD was added. The
      background is that running PERF-VERSION-FILE is relatively slow, and
      commands like "git describe" are particularly slow.
      
      In commit 4e666cdb ("perf tools: Fix dependency for version file
      creation"), an additional dependency on .git/ORIG_HEAD was added, as
      .git/HEAD may not change for "git reset --hard HEAD^" command. However,
      depending on whether we're on a branch or not, a "git cherry-pick" may
      not lead to the version being updated.
      
      As discussed with the git community in [0], using git internal files for
      dependencies is not reliable. Commit 4e666cdb also breaks some build
      scenarios [1].
      
      As mentioned, c72e3f04 ("tools/perf/build: Speed up git-version
      test on re-make") was added to speed up the build. However in commit
      7572733b ("perf tools: Fix version kernel tag") we removed the
      call to "git describe", so just revert Makefile.perf back to same as pre
      c72e3f04 ("tools/perf/build: Speed up git-version test on
      re-make") and the build should not be so slow, as below:
      
      Pre 7572733b:
      
        $> time util/PERF-VERSION-GEN
          PERF_VERSION = 5.17.rc8.g4e666cdb
      
        real    0m0.110s
        user    0m0.091s
        sys     0m0.019s
      
      Post 7572733b:
      
        $> time util/PERF-VERSION-GEN
          PERF_VERSION = 5.17.rc8.g7572733b
      
        real    0m0.039s
        user    0m0.036s
        sys     0m0.007s
      
      [0] https://lore.kernel.org/git/87wngkpddp.fsf@igel.home/T/#m4a4dd6de52fdbe21179306cd57b3761eb07f45f8
      [1] https://lore.kernel.org/linux-perf-users/20220329093120.4173283-1-matthieu.baerts@tessares.net/T/#u
      
      Committer testing:
      
      After a fresh rebuild using 'make -C tools/perf O=/tmp/build/perf install-bin':
      
        $ perf -v
        perf version 5.17.g162f9db407b6
        $ git log --oneline -1
        162f9db407b6a6e5 (HEAD -> perf/core) perf tools: Stop depending on .git files for building PERF-VERSION-FILE
        $
      
      Now using a detached tarball, i.e. outside the kernel source tree:
      
        $ ls -la perf*tar
        ls: cannot access 'perf*tar': No such file or directory
        $ make perf-tar-src-pkg
          TAR
          PERF_VERSION = 5.17.g31d10b3ef133
        $ ls -la perf*tar
        -rw-r--r--. 1 acme acme 22241280 Mar 30 13:26 perf-5.17.0.tar
        $ mv perf-5.17.0.tar /tmp
        $ cd /tmp
        $ tar xf perf-5.17.0.tar
        $ cd perf-5.17.0/
        $ make -C tools/perf |& tail
          CC      util/pmu.o
          CC      util/pmu-flex.o
          CC      util/expr-flex.o
          CC      util/expr.o
          LD      util/scripting-engines/perf-in.o
          LD      util/intel-pt-decoder/perf-in.o
          LD      util/perf-in.o
          LD      perf-in.o
          LINK    perf
        make: Leaving directory '/tmp/perf-5.17.0/tools/perf'
        $ tools/perf/perf -v
        perf version 5.17.g31d10b3ef133
        $ pwd
        /tmp/perf-5.17.0
        $ cat PERF-VERSION-FILE
        #define PERF_VERSION "5.17.g31d10b3ef133"
        $
      
      Fixes: 4e666cdb ("perf tools: Fix dependency for version file creation")
      Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: John Garry <john.garry@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/1648635774-14581-1-git-send-email-john.garry@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d4ff9265
  9. 22 Mar, 2022 1 commit
    • J
      perf tools: Fix dependency for version file creation · 4e666cdb
      John Garry authored
      The version reported by perf may be stale after just changing the
      head commit, like this:
      
        $ git log --pretty=format:"%H" -n 1
        b5d9d4708a24ac1889a30e9aedf8af8d73102139
        $ perf -v
        perf version 5.16.gb5d9d4708a24
        $ git reset --hard HEAD^
        HEAD is now at 629f520b265f
        $ make
        ...
        $ ./perf -v
        perf version 5.16.gb5d9d4708a24
      
      The dependencies for building PERF-VERSION-FILE should also include
      ORIG_HEAD, as this changes when changing the head commit (while HEAD
      does not).
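The behavior described here can be reproduced in a throwaway repository. The sketch below assumes git is installed and that the checkout is on a branch (the default after 'git init'); the identity passed with '-c' and the commit messages are placeholders. It compares the content of .git/HEAD before and after the reset and reads the ORIG_HEAD that the reset writes.

```shell
#!/bin/sh
# Sketch: on a branch, 'git reset --hard HEAD^' leaves the .git/HEAD file
# alone but records the previous tip in .git/ORIG_HEAD.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
git init -q .

# Placeholder identity so the commits work without global git configuration.
commit() {
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
}
commit first
commit second

old_tip=$(git rev-parse HEAD)
head_before=$(cat .git/HEAD)     # a symbolic ref such as "ref: refs/heads/<branch>"

git reset -q --hard HEAD^

head_after=$(cat .git/HEAD)      # still the same symbolic ref
orig_head=$(cat .git/ORIG_HEAD)  # the commit that was the tip before the reset
new_tip=$(git rev-parse HEAD)

echo "HEAD file before: $head_before"
echo "HEAD file after:  $head_after"
echo "ORIG_HEAD:        $orig_head"
cd / && rm -rf "$tmp"
```

Because only the branch ref moves, a make rule that depends solely on .git/HEAD sees nothing new after such a reset.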
      Signed-off-by: John Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Link: https://lore.kernel.org/r/1645449409-158238-2-git-send-email-john.garry@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4e666cdb
  10. 15 Feb, 2022 1 commit
    • M
      kbuild: replace $(if A,A,B) with $(or A,B) · 5c816641
      Masahiro Yamada authored
      $(or ...) is available since GNU Make 3.81, and useful to shorten the
      code in some places.
      
      Convert as follows:
      
        $(if A,A,B)  -->  $(or A,B)
      
      This patch also converts:
      
        $(if A, A, B) --> $(or A, B)
      
      Strictly speaking, the latter is not an equivalent conversion because
      GNU Make keeps spaces after commas; if A is not empty, $(if A, A, B)
      expands to " A", while $(or A, B) expands to "A".
      
      Anyway, preceding spaces are not significant in the code hunks I touched.
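The whitespace difference can be observed directly with a throwaway Makefile. The sketch below assumes GNU Make 3.81 or later is available as 'make'; the variable name A, its value and the word 'fallback' are placeholders.

```shell
#!/bin/sh
# Sketch: $(if A, A, B) keeps the space after the comma, $(or A, B) does not.
tmp=$(mktemp -d)
cat > "$tmp/Makefile" <<'EOF'
A := value
$(info if=[$(if $(A), $(A), fallback)])
$(info or=[$(or $(A), fallback)])
all: ;
EOF

out=$(make -s -f "$tmp/Makefile")
printf '%s\n' "$out"
# prints:
#   if=[ value]
#   or=[value]
rm -rf "$tmp"
```

The brackets make the leading space in the $(if ...) result visible.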
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
      5c816641
  11. 16 Dec, 2021 1 commit
    • N
      perf ftrace: Add -b/--use-bpf option for latency subcommand · 177f4eac
      Namhyung Kim authored
      The -b/--use-bpf option uses BPF to get latency info of kernel
      functions.  It should have a lower performance impact, and I observed
      that the latency of the same function is smaller than before when
      using BPF.
      
      Committer testing:
      
        # strace -e bpf perf ftrace latency -b -T __handle_mm_fault -a sleep 1
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7fff51914e00, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 3
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 128) = 3
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 128) = 3
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\08\0\0\08\0\0\0\t\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=89, btf_log_size=0, btf_log_level=0}, 128) = 3
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\f\0\0\0\f\0\0\0\7\0\0\0\1\0\0\0\0\0\0\20"..., btf_log_buf=NULL, btf_size=43, btf_log_size=0, btf_log_level=0}, 128) = 3
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 128) = 3
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\5\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=77, btf_log_size=0, btf_log_level=0}, 128) = -1 EINVAL (Invalid argument)
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\350\2\0\0\350\2\0\0\353\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1515, btf_log_size=0, btf_log_level=0}, 128) = 3
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=32, max_entries=1, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=5, insns=0x7fff51914c30, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 5
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7fff51914a80, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="test", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 4
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=8, value_size=8, max_entries=10000, map_flags=0, inner_map_fd=0, map_name="functime", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="cpu_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 5
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="task_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 7
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=8, max_entries=22, map_flags=0, inner_map_fd=0, map_name="latency", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 8
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="func_lat.bss", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=30, btf_vmlinux_value_type_id=0}, 128) = 9
        bpf(BPF_MAP_UPDATE_ELEM, {map_fd=9, key=0x7fff51914c40, value=0x7f6e99be2000, flags=BPF_ANY}, 128) = 0
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=18, insns=0x11e4160, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 14, 16), prog_flags=0, prog_name="func_begin", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x11dfc50, func_info_cnt=1, line_info_rec_size=16, line_info=0x11e04c0, line_info_cnt=9, attach_btf_id=0, attach_prog_fd=0}, 128) = 10
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=99, insns=0x11ded70, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 14, 16), prog_flags=0, prog_name="func_end", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x11dfc70, func_info_cnt=1, line_info_rec_size=16, line_info=0x11f6e10, line_info_cnt=20, attach_btf_id=0, attach_prog_fd=0}, 128) = 11
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=2, insns=0x7fff51914a80, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 13
        bpf(BPF_LINK_CREATE, {link_create={prog_fd=13, target_fd=-1, attach_type=0x29 /* BPF_??? */, flags=0}}, 128) = -1 EINVAL (Invalid argument)
        --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1699992, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        #   DURATION     |      COUNT | GRAPH                                          |
             0 - 1    us |         52 | ###################                            |
             1 - 2    us |         36 | #############                                  |
             2 - 4    us |         24 | #########                                      |
             4 - 8    us |          7 | ##                                             |
             8 - 16   us |          1 |                                                |
            16 - 32   us |          0 |                                                |
            32 - 64   us |          0 |                                                |
            64 - 128  us |          0 |                                                |
           128 - 256  us |          0 |                                                |
           256 - 512  us |          0 |                                                |
           512 - 1024 us |          0 |                                                |
             1 - 2    ms |          0 |                                                |
             2 - 4    ms |          0 |                                                |
             4 - 8    ms |          0 |                                                |
             8 - 16   ms |          0 |                                                |
            16 - 32   ms |          0 |                                                |
            32 - 64   ms |          0 |                                                |
            64 - 128  ms |          0 |                                                |
           128 - 256  ms |          0 |                                                |
           256 - 512  ms |          0 |                                                |
           512 - 1024 ms |          0 |                                                |
             1 - ...   s |          0 |                                                |
        +++ exited with 0 +++
        #
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211215185154.360314-5-namhyung@kernel.org
      [ Add missing util/cpumap.h include and removed unused 'fd' variable ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      177f4eac
  12. 08 Dec, 2021 1 commit
  13. 12 Nov, 2021 3 commits
  14. 08 Nov, 2021 1 commit
  15. 31 Oct, 2021 1 commit
  16. 28 Oct, 2021 1 commit
    • A
      perf dlfilter: Add dlfilter-show-cycles · c3afd6e5
      Adrian Hunter committed
      Add a new dlfilter to show cycles.
      
      Cycle counts are accumulated per CPU (or per thread if CPU is not recorded)
      from IPC information, and printed together with the change since the last
      print, at the start of each line. Separate counts are kept for branches,
      instructions or other events.
      
      Note also, the itrace A option can be useful to provide higher granularity
      cycle information.
      
      Example:
      
        $ perf record -e intel_pt/cyc/u uname
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.044 MB perf.data ]
        $ perf script --itrace=A --call-trace --dlfilter dlfilter-show-cycles.so --deltatime | head
               0                   perf-exec  8509 [001]     0.000000000:  psb offs: 0
               0                   perf-exec  8509 [001]     0.000000000:  cbr: 42 freq: 4219 MHz (156%)
             833        833            uname  8509 [001]     0.000047689: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )        _start
             833                       uname  8509 [001]     0.000003261: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            2015       1182            uname  8509 [001]     0.000000282: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            2676        661            uname  8509 [001]     0.000002629: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            3612        936            uname  8509 [001]     0.000001232: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            4579        967            uname  8509 [001]     0.000002519: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            6145       1566            uname  8509 [001]     0.000001050: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )                _dl_setup_hash
            6239         94            uname  8509 [001]     0.000000023: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )                _dl_sysdep_start
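
      The two leading columns in the example above (running cycle total and change since the last print) can be modelled with a small sketch. This is an illustration only, not the dlfilter's actual C code: the `CycleTracker` class and its method names are hypothetical, and it keeps a single count per key instead of separate counts for branches, instructions and other events.

```python
from collections import defaultdict


class CycleTracker:
    """Toy model: accumulate cycles per key and report the change since the last print."""

    def __init__(self):
        self.total = defaultdict(int)         # running cycle count per CPU (or thread)
        self.last_printed = defaultdict(int)  # total at the time of the previous print

    def add(self, key, cycles):
        self.total[key] += cycles

    def format_prefix(self, key):
        total = self.total[key]
        delta = total - self.last_printed[key]
        self.last_printed[key] = total
        # The delta column is left blank when nothing changed since the last print.
        delta_col = f"{delta:10d}" if delta else " " * 10
        return f"{total:10d} {delta_col}"


tracker = CycleTracker()
prefixes = []
for cycles in (0, 833, 0, 1182, 661):  # cycle increments taken from the example output
    tracker.add(1, cycles)             # key 1 stands for CPU [001]
    prefixes.append(tracker.format_prefix(1))
print("\n".join(prefixes))
```

      The increments are chosen so that the running totals follow the example: 833, 833, 2015, 2676.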
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20211027080334.365596-5-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c3afd6e5
  17. 26 Oct, 2021 1 commit
  18. 06 Oct, 2021 1 commit
  19. 29 Sep, 2021 1 commit
  20. 01 Sep, 2021 1 commit
  21. 11 Aug, 2021 2 commits
    • A
      perf tests: Add dlfilter test · 9f9c9a8d
      Adrian Hunter committed
      Add a perf test to test the dlfilter C API.
      
      A perf.data file is synthesized and then processed by perf script with a
      dlfilter named dlfilter-test-api-v0.so. Also a C file is compiled to
      provide a dso to match the synthesized perf.data file.
      
      Committer testing:
      
        [root@five ~]# perf test dlfilter
        72: dlfilter C API                                                  : Ok
        [root@five ~]# perf test -v dlfilter
        72: dlfilter C API                                                  :
        --- start ---
        test child forked, pid 3387712
        Checking for gcc
        Command: gcc --version
        gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3)
        Copyright (C) 2021 Free Software Foundation, Inc.
        This is free software; see the source for copying conditions.  There is NO
        warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
      
        dlfilters path: /var/home/acme/libexec/perf-core/dlfilters
        Command: gcc -g -o /tmp/dlfilter-test-3387712-prog /tmp/dlfilter-test-3387712-prog.c
        Creating new host machine structure
        Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 0 --dlarg last
        start API
        filter_event_early API
        filter_event API
        stop API
        Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 1 --dlarg last
        start API
        filter_event_early API
        filter_event API
        stop API
        Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 2 --dlarg last
        start API
        filter_event_early API
        stop API
        test child finished with 0
        ---- end ----
        dlfilter C API: Ok
        [root@five ~]#
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20210811101036.17986-7-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9f9c9a8d
    • A
      perf build: Move perf_dlfilters.h in the source tree · 3af1dfdd
      Adrian Hunter committed
      Move perf_dlfilters.h in the source tree so that it will be found when
      building dlfilters as part of the perf build.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20210811101036.17986-6-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3af1dfdd
  22. 07 Jul, 2021 1 commit
  23. 06 Jul, 2021 1 commit
  24. 02 Jul, 2021 1 commit
  25. 29 Apr, 2021 1 commit
    • M
      perf tools: Enable libtraceevent dynamic linking · 56d32d4c
      Michael Petlan committed
      Currently we support only static linking with kernel's libtraceevent
      (tools/lib/traceevent). This patch adds libtraceevent package detection
      and support to link perf with it dynamically.
      
        The libtraceevent package status is displayed with:
        $ make VF=1 LIBTRACEEVENT_DYNAMIC=1
        ...
        ...                 libtraceevent: [ on  ]
      
      Default behavior remains the same (static linking).
      
      Committer testing:
      
        $ make LIBTRACEEVENT_DYNAMIC=1 VF=1 O=/tmp/build/perf -C tools/perf install-bin |& grep traceevent
        Makefile.config:1090: *** Error: No libtraceevent devel library found, please install libtraceevent-devel.  Stop.
        $
      Signed-off-by: Michael Petlan <mpetlan@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      LPU-Reference: 20210428092023.4009-1-mpetlan@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      56d32d4c
  26. 20 Apr, 2021 1 commit
    • A
      perf stat: Enable iostat mode for x86 platforms · f9ed693e
      Alexander Antonov committed
      This functionality is based on recently introduced sysfs attributes for
      Intel® Xeon® Scalable processor family (code name Skylake-SP):
      
      Commit bb42b3d3 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
      
      The mode is intended to provide four I/O performance metrics, in MB,
      for each PCIe root port:
      
       - Inbound Read: I/O devices below root port read from the host memory
       - Inbound Write: I/O devices below root port write to the host memory
       - Outbound Read: CPU reads from I/O devices below root port
       - Outbound Write: CPU writes to I/O devices below root port
      
      Each metric requires only one uncore event, which increments at every 4B
      transfer in the corresponding direction. The formula to compute each
      metric is generic:
          #EventCount * 4B / (1024 * 1024)
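
      As a worked instance of the formula above, here is a minimal sketch; the function name `iostat_mb` and the sample event count are made up for illustration and are not taken from perf.

```python
BYTES_PER_EVENT = 4          # each uncore event counts one 4-byte transfer
BYTES_PER_MB = 1024 * 1024


def iostat_mb(event_count: int) -> float:
    """Convert a raw uncore event count into MB, per the formula above."""
    return event_count * BYTES_PER_EVENT / BYTES_PER_MB


# 262144 events * 4 B = 1048576 B, i.e. exactly 1 MB
print(iostat_mb(262_144))
```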
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210419094147.15909-4-alexander.antonov@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f9ed693e
  27. 24 Mar, 2021 1 commit
    • S
      perf stat: Introduce 'bperf' to share hardware PMCs with BPF · 7fac83aa
      Song Liu committed
      The perf tool uses performance monitoring counters (PMCs) to monitor
      system performance. The PMCs are limited hardware resources. For
      example, Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
      
      Modern data center systems use these PMCs in many different ways: system
      level monitoring, (maybe nested) container level monitoring, per process
      monitoring, profiling (in sample mode), etc. In some cases, there are
      more active perf_events than available hardware PMCs. To allow all
      perf_events to have a chance to run, it is necessary to do expensive
      time multiplexing of events.
      
      On the other hand, many monitoring tools count the common metrics
      (cycles, instructions). It is a waste to have multiple tools create
      multiple perf_events of "cycles" and occupy multiple PMCs.
      
      bperf tries to reduce such waste by allowing multiple perf_events of
      "cycles" or "instructions" (at different scopes) to share PMUs. Instead
      of having each perf-stat session read its own perf_events, bperf uses
      BPF programs to read the perf_events and aggregate readings to BPF maps.
      Then, the perf-stat session(s) read the values from these BPF maps.
      
      Please refer to the comment before the definition of bperf_ops for the
      description of bperf architecture.
      
      bperf is off by default. To enable it, pass --bpf-counters option to
      perf-stat. bperf uses a BPF hashmap to share information about BPF
      programs and maps used by bperf. This map is pinned to bpffs. The
      default path is /sys/fs/bpf/perf_attr_map. The user could change the
      path with option --bpf-attr-map.
      
      Committer testing:
      
        # dmesg|grep "Performance Events" -A5
        [    0.225277] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
        [    0.225280] ... version:                0
        [    0.225280] ... bit width:              48
        [    0.225281] ... generic registers:      6
        [    0.225281] ... value mask:             0000ffffffffffff
        [    0.225281] ... max period:             00007fffffffffff
        #
        #  for a in $(seq 6) ; do perf stat -a -e cycles,instructions sleep 100000 & done
        [1] 2436231
        [2] 2436232
        [3] 2436233
        [4] 2436234
        [5] 2436235
        [6] 2436236
        # perf stat -a -e cycles,instructions sleep 0.1
      
         Performance counter stats for 'system wide':
      
               310,326,987      cycles                                                        (41.87%)
               236,143,290      instructions              #    0.76  insn per cycle           (41.87%)
      
               0.100800885 seconds time elapsed
      
        #
      
      We can see that the counters were enabled for this workload 41.87% of
      the time.
      
      Now with --bpf-counters:
      
        #  for a in $(seq 32) ; do perf stat --bpf-counters -a -e cycles,instructions sleep 100000 & done
        [1] 2436514
        [2] 2436515
        [3] 2436516
        [4] 2436517
        [5] 2436518
        [6] 2436519
        [7] 2436520
        [8] 2436521
        [9] 2436522
        [10] 2436523
        [11] 2436524
        [12] 2436525
        [13] 2436526
        [14] 2436527
        [15] 2436528
        [16] 2436529
        [17] 2436530
        [18] 2436531
        [19] 2436532
        [20] 2436533
        [21] 2436534
        [22] 2436535
        [23] 2436536
        [24] 2436537
        [25] 2436538
        [26] 2436539
        [27] 2436540
        [28] 2436541
        [29] 2436542
        [30] 2436543
        [31] 2436544
        [32] 2436545
        #
        # ls -la /sys/fs/bpf/perf_attr_map
        -rw-------. 1 root root 0 Mar 23 14:53 /sys/fs/bpf/perf_attr_map
        # bpftool map | grep bperf | wc -l
        64
        #
      
        # bpftool map | tail
        1265: percpu_array  name accum_readings  flags 0x0
        	key 4B  value 24B  max_entries 1  memlock 4096B
        1266: hash  name filter  flags 0x0
        	key 4B  value 4B  max_entries 1  memlock 4096B
        1267: array  name bperf_fo.bss  flags 0x400
        	key 4B  value 8B  max_entries 1  memlock 4096B
        	btf_id 996
        	pids perf(2436545)
        1268: percpu_array  name accum_readings  flags 0x0
        	key 4B  value 24B  max_entries 1  memlock 4096B
        1269: hash  name filter  flags 0x0
        	key 4B  value 4B  max_entries 1  memlock 4096B
        1270: array  name bperf_fo.bss  flags 0x400
        	key 4B  value 8B  max_entries 1  memlock 4096B
        	btf_id 997
        	pids perf(2436541)
        1285: array  name pid_iter.rodata  flags 0x480
        	key 4B  value 4B  max_entries 1  memlock 4096B
        	btf_id 1017  frozen
        	pids bpftool(2437504)
        1286: array  flags 0x0
        	key 4B  value 32B  max_entries 1  memlock 4096B
        #
        # bpftool map dump id 1268 | tail
        value (CPU 21):
        8f f3 bc ca 00 00 00 00  80 fd 2a d1 4d 00 00 00
        80 fd 2a d1 4d 00 00 00
        value (CPU 22):
        7e d5 64 4d 00 00 00 00  a4 8a 2e ee 4d 00 00 00
        a4 8a 2e ee 4d 00 00 00
        value (CPU 23):
        a7 78 3e 06 01 00 00 00  b2 34 94 f6 4d 00 00 00
        b2 34 94 f6 4d 00 00 00
        Found 1 element
        # bpftool map dump id 1268 | tail
        value (CPU 21):
        c6 8b d9 ca 00 00 00 00  20 c6 fc 83 4e 00 00 00
        20 c6 fc 83 4e 00 00 00
        value (CPU 22):
        9c b4 d2 4d 00 00 00 00  3e 0c df 89 4e 00 00 00
        3e 0c df 89 4e 00 00 00
        value (CPU 23):
        18 43 66 06 01 00 00 00  5b 69 ed 83 4e 00 00 00
        5b 69 ed 83 4e 00 00 00
        Found 1 element
        # bpftool map dump id 1268 | tail
        value (CPU 21):
        f2 6e db ca 00 00 00 00  92 67 4c ba 4e 00 00 00
        92 67 4c ba 4e 00 00 00
        value (CPU 22):
        dc 8e e1 4d 00 00 00 00  d9 32 7a c5 4e 00 00 00
        d9 32 7a c5 4e 00 00 00
        value (CPU 23):
        bd 2b 73 06 01 00 00 00  7c 73 87 bf 4e 00 00 00
        7c 73 87 bf 4e 00 00 00
        Found 1 element
        #
      
        # perf stat --bpf-counters -a -e cycles,instructions sleep 0.1
      
         Performance counter stats for 'system wide':
      
             119,410,122      cycles
             152,105,479      instructions              #    1.27  insn per cycle
      
             0.101395093 seconds time elapsed
      
        #
      
      See? We had the counters enabled all the time.
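
      The 24-byte per-CPU values in the `bpftool map dump` output above can be decoded with a short sketch. It assumes (this is not stated in the commit message) that each value is a `struct bpf_perf_event_value`, i.e. three little-endian 64-bit fields `counter`, `enabled` and `running`; the helper name `decode_reading` is hypothetical.

```python
import struct


def decode_reading(hex_dump: str):
    """Decode one 24-byte map value into (counter, enabled, running)."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    # "<3Q": little-endian, three unsigned 64-bit integers (assumed layout)
    counter, enabled, running = struct.unpack("<3Q", raw)
    return counter, enabled, running


# First "value (CPU 21)" entry from the dump above
cpu21 = """
8f f3 bc ca 00 00 00 00  80 fd 2a d1 4d 00 00 00
80 fd 2a d1 4d 00 00 00
"""
counter, enabled, running = decode_reading(cpu21)
print(counter, enabled, running, running / enabled)
```

      Under that assumption, `enabled` and `running` are equal for this entry, which is consistent with the counters not being multiplexed.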
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: kernel-team@fb.com
      Link: http://lore.kernel.org/lkml/20210316211837.910506-2-songliubraving@fb.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7fac83aa
  28. 07 Mar, 2021 3 commits
  29. 29 Jan, 2021 1 commit
    • S
      tools: Factor Clang, LLC and LLVM utils definitions · 211a741c
      Sedat Dilek committed
      When dealing with BPF/BTF/pahole and DWARF v5 I wanted to build bpftool.
      
      While looking into the source code I found duplicate assignments in misc tools
      for the LLVM ecosystem, e.g. clang and llvm-objcopy.
      
      Move the Clang, LLC and/or LLVM utils definitions to tools/scripts/Makefile.include
      file and add missing includes where needed. Honestly, I was inspired by the commit
      c8a950d0 ("tools: Factor HOSTCC, HOSTLD, HOSTAR definitions").
      
      I tested with bpftool and perf on Debian/testing AMD64 and LLVM/Clang v11.1.0-rc1.
      
      Build instructions:
      
      [ make and make-options ]
      MAKE="make V=1"
      MAKE_OPTS="HOSTCC=clang HOSTCXX=clang++ HOSTLD=ld.lld CC=clang LD=ld.lld LLVM=1 LLVM_IAS=1"
      MAKE_OPTS="$MAKE_OPTS PAHOLE=/opt/pahole/bin/pahole"
      
      [ clean-up ]
      $MAKE $MAKE_OPTS -C tools/ clean
      
      [ bpftool ]
      $MAKE $MAKE_OPTS -C tools/bpf/bpftool/
      
      [ perf ]
      PYTHON=python3 $MAKE $MAKE_OPTS -C tools/perf/
      
      I was careful with respecting the user's wish to override custom compiler, linker,
      GNU/binutils and/or LLVM utils settings.
      Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com> # tools/build and tools/perf
      Link: https://lore.kernel.org/bpf/20210128015117.20515-1-sedat.dilek@gmail.com
      211a741c
  30. 21 Jan, 2021 1 commit
    • S
      perf stat: Enable counting events for BPF programs · fa853c4b
      Song Liu committed
      Introduce 'perf stat -b' option, which counts events for BPF programs, like:
      
        [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
           1.487903822            115,200      ref-cycles
           1.487903822             86,012      cycles
           2.489147029             80,560      ref-cycles
           2.489147029             73,784      cycles
           3.490341825             60,720      ref-cycles
           3.490341825             37,797      cycles
           4.491540887             37,120      ref-cycles
           4.491540887             31,963      cycles
      
      The example above counts 'cycles' and 'ref-cycles' of BPF program of id
      254.  This is similar to bpftool-prog-profile command, but more
      flexible.
      
      'perf stat -b' creates per-cpu perf_event and loads fentry/fexit BPF
      programs (monitor-progs) to the target BPF program (target-prog). The
      monitor-progs read perf_event before and after the target-prog, and
      aggregate the difference in a BPF map. Then the user space reads data
      from these maps.
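
      The read-before, read-after and accumulate pattern described above can be illustrated with a toy user-space model. This is not BPF code: the `MonitorProgs` class, its method names and the sample readings are hypothetical stand-ins for the fentry/fexit programs and their map.

```python
class MonitorProgs:
    """Toy stand-in for the fentry/fexit pair attached to one target program."""

    def __init__(self, read_counter):
        self.read_counter = read_counter  # callable returning the current event count
        self.start = {}                   # per-CPU reading taken at fentry
        self.accum = {}                   # per-CPU accumulated difference (the "map")

    def fentry(self, cpu):
        self.start[cpu] = self.read_counter()

    def fexit(self, cpu):
        delta = self.read_counter() - self.start.pop(cpu)
        self.accum[cpu] = self.accum.get(cpu, 0) + delta

    def read_total(self):
        # What user space would do: sum the per-CPU values.
        return sum(self.accum.values())


readings = iter([100, 160, 400, 430])  # made-up counter readings
mon = MonitorProgs(lambda: next(readings))
for _ in range(2):                     # two runs of the target program on CPU 0
    mon.fentry(cpu=0)
    mon.fexit(cpu=0)
print(mon.read_total())                # (160 - 100) + (430 - 400) = 90
```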
      
      A new 'struct bpf_counter' is introduced to provide a common interface
      that uses BPF programs/maps to count perf events.
      
      Committer notes:
      
      Removed all but bpf_counter.h includes from evsel.h, not needed at all.
      
      Also, when doing BPF map lookups on PERCPU_ARRAYs, the value receive
      buffer passed to the kernel needs to have libbpf_num_possible_cpus()
      entries, not evsel__nr_cpus(evsel): the former uses
      /sys/devices/system/cpu/possible while the latter uses
      /sys/devices/system/cpu/online, which may be less than the 'possible'
      number, making the BPF map lookup overwrite memory and cause
      hard-to-debug memory corruption.
      
      We need to continue using evsel__nr_cpus(evsel) when accessing the
      perf_counts array, though, so as not to overwrite another area of memory :-)
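
      The sizing issue in the committer notes can be sketched as follows. The CPU lists are hypothetical sysfs contents, the helper names are made up, and the rounding of each per-CPU slot up to 8 bytes is an assumption about the kernel's per-CPU map layout rather than something stated in this commit.

```python
def count_cpus(cpulist: str) -> int:
    """Count CPUs in a sysfs-style list such as '0-3,8-11'."""
    total = 0
    for part in cpulist.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        total += int(hi or lo) - int(lo) + 1
    return total


def percpu_lookup_buffer_size(value_size: int, possible: str) -> int:
    """Bytes user space must provide for one per-CPU map lookup."""
    slot = (value_size + 7) // 8 * 8  # assumed: each per-CPU slot is 8-byte aligned
    return slot * count_cpus(possible)


possible, online = "0-63", "0-31"     # hypothetical sysfs contents
print(count_cpus(possible), count_cpus(online))
print(percpu_lookup_buffer_size(24, possible))
```

      With these made-up lists, a buffer sized from the online count would hold 32 entries while the kernel writes 64, which is the overwrite the notes describe.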
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: https://lore.kernel.org/lkml/20210120163031.GU12699@kernel.org/
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kernel-team@fb.com
      Link: http://lore.kernel.org/lkml/20201229214214.3413833-4-songliubraving@fb.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fa853c4b
  31. 16 Jan, 2021 1 commit
  32. 12 Nov, 2020 1 commit
  33. 01 Oct, 2020 1 commit
    • A
      perf trace: Use the autogenerated mmap 'prot' string/id table · 388968d8
      Arnaldo Carvalho de Melo committed
      No change in behaviour:
      
        # perf trace -e mmap sleep 1
             0.000 ( 0.009 ms): sleep/751870 mmap(len: 143317, prot: READ, flags: PRIVATE, fd: 3)                  = 0x7fa96d0f7000
             0.028 ( 0.004 ms): sleep/751870 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS)           = 0x7fa96d0f5000
             0.037 ( 0.005 ms): sleep/751870 mmap(len: 1872744, prot: READ, flags: PRIVATE|DENYWRITE, fd: 3)       = 0x7fa96cf2b000
             0.044 ( 0.011 ms): sleep/751870 mmap(addr: 0x7fa96cf50000, len: 1376256, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x25000) = 0x7fa96cf50000
             0.056 ( 0.007 ms): sleep/751870 mmap(addr: 0x7fa96d0a0000, len: 307200, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x175000) = 0x7fa96d0a0000
             0.064 ( 0.007 ms): sleep/751870 mmap(addr: 0x7fa96d0eb000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bf000) = 0x7fa96d0eb000
             0.075 ( 0.005 ms): sleep/751870 mmap(addr: 0x7fa96d0f1000, len: 13160, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) = 0x7fa96d0f1000
             0.253 ( 0.005 ms): sleep/751870 mmap(len: 218049136, prot: READ, flags: PRIVATE, fd: 3)               = 0x7fa95ff38000
        #
        #
        # set -o vi
        # strace -e mmap sleep 1
        mmap(NULL, 143317, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f333bd83000
        mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f333bd81000
        mmap(NULL, 1872744, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f333bbb7000
        mmap(0x7f333bbdc000, 1376256, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7f333bbdc000
        mmap(0x7f333bd2c000, 307200, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x175000) = 0x7f333bd2c000
        mmap(0x7f333bd77000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bf000) = 0x7f333bd77000
        mmap(0x7f333bd7d000, 13160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f333bd7d000
        mmap(NULL, 218049136, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f332ebc4000
        +++ exited with 0 +++
        #
      
      You can also tweak the output of 'perf trace' to match strace's more
      closely:
      
        # perf config trace.show_arg_names=no
        # perf config trace.show_duration=no
        # perf config trace.show_prefix=yes
        # perf config trace.show_timestamp=no
        # perf config trace.show_zeros=yes
        # perf config trace.no_inherit=yes
        # perf trace -e mmap sleep 1
        mmap(NULL, 143317, PROT_READ, MAP_PRIVATE, 3, 0)                      = 0x7f0d287ca000
        mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS)     = 0x7f0d287c8000
        mmap(NULL, 1872744, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0)       = 0x7f0d285fe000
        mmap(0x7f0d28623000, 1376256, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7f0d28623000
        mmap(0x7f0d28773000, 307200, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x175000) = 0x7f0d28773000
        mmap(0x7f0d287be000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bf000) = 0x7f0d287be000
        mmap(0x7f0d287c4000, 13160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS) = 0x7f0d287c4000
        mmap(NULL, 218049136, PROT_READ, MAP_PRIVATE, 3, 0)                   = 0x7f0d1b60b000
        #
      
        # perf config | grep ^trace
        trace.show_arg_names=no
        trace.show_duration=no
        trace.show_prefix=yes
        trace.show_timestamp=no
        trace.show_zeros=yes
        trace.no_inherit=yes
        #
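
      A string/id table of the kind the commit title refers to can be pictured with a small sketch. The table below is written by hand for illustration (perf generates its own from kernel headers), the function name `prot_to_str` is hypothetical, and the `show_prefix` argument only mimics the effect of `trace.show_prefix` seen above.

```python
PROT_TABLE = {0x1: "READ", 0x2: "WRITE", 0x4: "EXEC"}  # Linux PROT_* bit values


def prot_to_str(prot: int, show_prefix: bool = False) -> str:
    """Render an mmap 'prot' bitmask as a '|'-separated list of flag names."""
    prefix = "PROT_" if show_prefix else ""
    if prot == 0:
        return prefix + "NONE"
    names = [prefix + name for bit, name in PROT_TABLE.items() if prot & bit]
    leftover = prot & ~sum(PROT_TABLE)  # bits not covered by the table
    if leftover:
        names.append(hex(leftover))
    return "|".join(names)


print(prot_to_str(0x3))                    # READ|WRITE
print(prot_to_str(0x5, show_prefix=True))  # PROT_READ|PROT_EXEC
```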
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      388968d8