1. 01 Aug, 2018 2 commits
    • A
      perf bpf: Include uapi/linux/bpf.h from the 'perf trace' script's bpf.h · 822c2621
      Arnaldo Carvalho de Melo committed
      The next example scripts need the definition for the BPF functions, i.e.
      things like BPF_FUNC_probe_read, and in time will require lots of other
      definitions found in uapi/linux/bpf.h, so include it from the bpf.h file
      included from the eBPF scripts built with clang via '-e bpf_script.c'
      like in this example:
      
        $ tail -8 tools/perf/examples/bpf/5sec.c
        #include <bpf.h>
      
        int probe(hrtimer_nanosleep, rqtp->tv_sec)(void *ctx, int err, long sec)
        {
      	return sec == 5;
        }
      
        license(GPL);
        $
      
      That 'bpf.h' include in the 5sec.c eBPF example will come from a set of
      header files crafted for building eBPF objects, that in an end-user
      system will come from:
      
        /usr/lib/perf/include/bpf/bpf.h
      
      And will include <uapi/linux/bpf.h> either from the place where the
      kernel was built, or from a kernel-devel rpm package like:
      
        -working-directory /lib/modules/4.17.9-100.fc27.x86_64/build
      
      That is set up by tools/perf/util/llvm-utils.c, and can be overridden
      by setting the 'kbuild-dir' variable in the "llvm" ~/.perfconfig file,
      like:
      
        # cat ~/.perfconfig
        [llvm]
             kbuild-dir = /home/foo/git/build/linux
      
      This usually doesn't need any change; I am just documenting here my
      findings from working with this code.
      
      In the future we may want to instead just use what is in
      /usr/include/linux/bpf.h, which comes from the UAPI provided by the
      kernel sources. For now, to avoid getting the kernel's non-UAPI
      "linux/bpf.h" file, which will cause clang to fail and is not what we
      want anyway (no BPF function definitions, etc), do it explicitly by
      asking for "uapi/linux/bpf.h".
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-zd8zeyhr2sappevojdem9xxt@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      822c2621
    • C
      perf tools: Allow overriding MAX_NR_CPUS at compile time · 21b8732e
      Christophe Leroy committed
      After a kernel update, the perf tool no longer runs on my 32MB RAM
      powerpc board, but still runs on a 128MB RAM board:
      
        ~# strace perf
        execve("/usr/sbin/perf", ["perf"], [/* 12 vars */]) = -1 ENOMEM (Cannot allocate memory)
        --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
        +++ killed by SIGSEGV +++
        Segmentation fault
      
      objdump -x shows that the .bss section has a huge size of 24Mbytes:
      
       27 .bss          016baca8  101cebb8  101cebb8  001cd988  2**3
      
      In particular, the following objects are quite large:
      
        10205f80 l     O .bss	00140000     runtime_cycles_stats
        10345f80 l     O .bss	00140000     runtime_stalled_cycles_front_stats
        10485f80 l     O .bss	00140000     runtime_stalled_cycles_back_stats
        105c5f80 l     O .bss	00140000     runtime_branches_stats
        10705f80 l     O .bss	00140000     runtime_cacherefs_stats
        10845f80 l     O .bss	00140000     runtime_l1_dcache_stats
        10985f80 l     O .bss	00140000     runtime_l1_icache_stats
        10ac5f80 l     O .bss	00140000     runtime_ll_cache_stats
        10c05f80 l     O .bss	00140000     runtime_itlb_cache_stats
        10d45f80 l     O .bss	00140000     runtime_dtlb_cache_stats
        10e85f80 l     O .bss	00140000     runtime_cycles_in_tx_stats
        10fc5f80 l     O .bss	00140000     runtime_transaction_stats
        11105f80 l     O .bss	00140000     runtime_elision_stats
        11245f80 l     O .bss	00140000     runtime_topdown_total_slots
        11385f80 l     O .bss	00140000     runtime_topdown_slots_retired
        114c5f80 l     O .bss	00140000     runtime_topdown_slots_issued
        11605f80 l     O .bss	00140000     runtime_topdown_fetch_bubbles
        11745f80 l     O .bss	00140000     runtime_topdown_recovery_bubbles
      
      This is due to commit 4d255766 ("perf: Bump max number of cpus
      to 1024"), because many tables are sized with MAX_NR_CPUS.
      
      This patch gives the opportunity to redefine MAX_NR_CPUS via:
      
        $ make EXTRA_CFLAGS=-DMAX_NR_CPUS=1
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/20170922112043.8349468C57@po15668-vm-win7.idsi0.si.c-s.fr
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      21b8732e
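The compile-time override described above relies on the usual guarded-default idiom. The following is a minimal sketch, assuming the default sits behind an `#ifndef`; `struct stats_sketch`, `runtime_stats_sketch`, `table_bytes()` and `cpu_stats()` are hypothetical names for illustration and do not reproduce perf's actual tables.

```c
#include <stddef.h>

/* The default applies only when the build does not pass -DMAX_NR_CPUS=<n>.
 * 1024 is the value named in the commit message. */
#ifndef MAX_NR_CPUS
#define MAX_NR_CPUS 1024
#endif

/* Hypothetical per-CPU record; the real perf structures differ. */
struct stats_sketch {
	double n;
	double mean;
	double m2;
};

/* A statically sized table like this one lives in .bss, so its footprint
 * scales linearly with MAX_NR_CPUS. */
static struct stats_sketch runtime_stats_sketch[MAX_NR_CPUS];

/* Number of bytes of .bss consumed by the table above. */
size_t table_bytes(void)
{
	return sizeof(runtime_stats_sketch);
}

/* Bounds-checked accessor: returns NULL for a CPU outside the table. */
struct stats_sketch *cpu_stats(int cpu)
{
	if (cpu < 0 || cpu >= MAX_NR_CPUS)
		return NULL;
	return &runtime_stats_sketch[cpu];
}
```

Compiling this sketch with `-DMAX_NR_CPUS=1` shrinks the table to a single record, which is the effect the `make EXTRA_CFLAGS=-DMAX_NR_CPUS=1` invocation aims for on small-memory boards.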
  2. 31 Jul, 2018 19 commits
    • A
      perf bpf: Show better message when failing to load an object · 739e2edc
      Arnaldo Carvalho de Melo committed
      Before:
      
        libbpf: license of tools/perf/examples/bpf/etcsnoop.c is GPL
        libbpf: section(6) version, size 4, link 0, flags 3, type=1
        libbpf: kernel version of tools/perf/examples/bpf/etcsnoop.c is 41200
        libbpf: section(7) .symtab, size 120, link 1, flags 0, type=2
        bpf: config program 'syscalls:sys_enter_openat'
        libbpf: load bpf program failed: Operation not permitted
        libbpf: failed to load program 'syscalls:sys_enter_openat'
        libbpf: failed to load object 'tools/perf/examples/bpf/etcsnoop.c'
        bpf: load objects failed
      
      After: (just the last line changes)
      
        bpf: load objects failed: err=-4009: (Incorrect kernel version)
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-wi44iid0yjfht3lcvplc75fm@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      739e2edc
    • M
      perf list: Unify metric group description format with PMU event description · 95f04328
      Michael Petlan committed
      PMU event descriptions use 7 spaces + '[' or 8 spaces as indentation.
      Metric groups used a tab + '['. This patch unifies it to the way PMU
      event descriptions are indented.
      
      BEFORE:
      
        $ perf list
        [...]
        Metric Groups:
      
        DSB:
          DSB_Coverage
      	  [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
        [...]
      
      AFTER:
      
        $ perf list
        [...]
        Metric Groups:
      
        DSB:
          DSB_Coverage
               [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
        [...]
      Signed-off-by: Michael Petlan <mpetlan@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      LPU-Reference: 771439042.22924766.1532986504631.JavaMail.zimbra@redhat.com
      Link: https://lkml.kernel.org/n/tip-mlo850517m6u1rbjndvd1bwr@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      95f04328
    • G
      perf vendor events arm64: Update ThunderX2 implementation defined pmu core events · b9b77222
      Ganapatrao Kulkarni committed
      Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ganapatrao Kulkarni <gklkml16@gmail.com>
      Cc: Jan Glauber <jan.glauber@cavium.com>
      Cc: Jayachandran C <jnair@caviumnetworks.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@cavium.com>
      Cc: Vadim Lomovtsev <vadim.lomovtsev@cavium.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/20180731100251.23575-1-ganapatrao.kulkarni@cavium.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b9b77222
    • L
      perf cs-etm: Generate branch sample for CS_ETM_TRACE_ON packet · 14a85b1e
      Leo Yan committed
      The CS_ETM_TRACE_ON packet itself indicates that there is a
      discontinuity in the trace. This patch adds a branch sample for a
      CS_ETM_TRACE_ON packet if it is inserted in the middle of CS_ETM_RANGE
      packets; as a result we get a hint about the trace discontinuity.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1531295145-596-7-git-send-email-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      14a85b1e
    • L
      perf cs-etm: Generate branch sample when receiving a CS_ETM_TRACE_ON packet · d603b4e9
      Leo Yan committed
      If a CS_ETM_TRACE_ON packet is inserted, we fail to generate a branch
      sample for the previous CS_ETM_RANGE packet.
      
      This patch generates a branch sample when receiving a CS_ETM_TRACE_ON
      packet, which preserves complete info for the CS_ETM_RANGE packet
      just before the CS_ETM_TRACE_ON packet.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1531295145-596-6-git-send-email-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d603b4e9
    • L
      perf cs-etm: Support dummy address value for CS_ETM_TRACE_ON packet · 6035b680
      Leo Yan committed
      For a CS_ETM_TRACE_ON packet, the fields 'packet->start_addr' and
      'packet->end_addr' are equal to 0xdeadbeefdeadbeefUL, which the
      decoder layer emits as a dummy value. This dummy value is meaningless
      in a branch sample when we use the 'perf script' command to check
      program flow.
      
      This patch prepares support for CS_ETM_TRACE_ON packets in branch
      samples: it converts the dummy address value to zero for readability.
      This is accomplished by cs_etm__last_executed_instr() and
      cs_etm__first_executed_instr().  The latter is a new function
      introduced by this patch.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1531295145-596-5-git-send-email-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6035b680
    • L
      perf cs-etm: Fix start tracing packet handling · 3eb3e07b
      Leo Yan committed
      Usually the start tracing packet is a CS_ETM_TRACE_ON packet, and this
      packet is passed to cs_etm__flush().  cs_etm__flush() checks the
      condition 'prev_packet->sample_type == CS_ETM_RANGE', but 'prev_packet'
      is allocated by zalloc(), so 'prev_packet->sample_type' is initially
      zero and this condition is false.  So cs_etm__flush() bails out
      directly without handling the start tracing packet.
      
      This patch introduces a new sample type, CS_ETM_EMPTY, which indicates
      that the packet is empty.  cs_etm__flush() swaps packets when it finds
      that the previous packet is empty, which records the start tracing
      packet in 'etmq->prev_packet'.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1531295145-596-4-git-send-email-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3eb3e07b
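The start-packet handling described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the perf code: every name ending in `_sketch` and the `SKETCH_*` enumerators are hypothetical, `calloc()` stands in for the zero-filling allocation, and sample generation is omitted.

```c
#include <stdlib.h>

/* Hypothetical sample types; the empty type is 0 so that a zero-filled
 * packet is recognised as "no packet seen yet". */
enum sample_type_sketch {
	SKETCH_EMPTY = 0,
	SKETCH_RANGE,
	SKETCH_TRACE_ON,
};

struct packet_sketch {
	enum sample_type_sketch sample_type;
};

struct queue_sketch {
	struct packet_sketch *prev_packet;
	struct packet_sketch *packet;
};

/* Allocate both packets zero-filled. Returns 0 on success, -1 on failure. */
int queue_init_sketch(struct queue_sketch *q)
{
	q->prev_packet = calloc(1, sizeof(*q->prev_packet));
	q->packet = calloc(1, sizeof(*q->packet));
	if (!q->prev_packet || !q->packet) {
		free(q->prev_packet);
		free(q->packet);
		q->prev_packet = q->packet = NULL;
		return -1;
	}
	return 0;
}

void queue_exit_sketch(struct queue_sketch *q)
{
	free(q->prev_packet);
	free(q->packet);
	q->prev_packet = q->packet = NULL;
}

/* Swap the packets when the previous one is empty (the very first packet)
 * or a range packet. Returns 1 if a swap happened, 0 if the flush bailed out. */
int flush_sketch(struct queue_sketch *q)
{
	enum sample_type_sketch prev = q->prev_packet->sample_type;
	struct packet_sketch *tmp;

	if (prev != SKETCH_EMPTY && prev != SKETCH_RANGE)
		return 0;

	/* Sample generation for SKETCH_RANGE is omitted in this sketch. */
	tmp = q->packet;
	q->packet = q->prev_packet;
	q->prev_packet = tmp;
	return 1;
}
```

With only the range check, a freshly allocated queue would never record its first packet, because the zero-filled `prev_packet` matches no real type; giving the empty state an explicit name makes that first swap possible.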
    • T
      perf build: Fix installation directory for eBPF · 83868bf7
      Thomas Richter committed
      The perf tool build and install is controlled via a Makefile. The
      'install' rule creates directories and copies files. Among them are
      header files installed in /usr/lib/include/perf/bpf/.
      
      However, all listed examples install their header files in
      
        /usr/lib/<tool-name>/...[/include]/header.h
      
      and not in
      
        /usr/lib/include/<tool-name>/.../header.h.
      
      Background information:
      
      Building the Fedora 28 glibc RPM on s390x and s390 fails on s390 (gcc
      -m31) as gcc is not able to find header-files like stdbool.h.
      
      In the glibc.spec file, you can see that glibc is configured with
      "--with-headers". In this case, first -nostdinc is added to the CFLAGS
      and then further include paths are added via -isystem.  One of those
      paths should contain header files like stdbool.h.
      
      In order to get this path, gcc is invoked with:
      
      - on Fedora 28 (with 4.18 kernel):
      
        $ gcc -print-file-name=include
        /usr/lib/gcc/s390x-redhat-linux/8/include
        $ gcc -m31 -print-file-name=include
        /usr/lib/gcc/s390x-redhat-linux/8/../../../../lib/include
        => If perf is installed, this is: /usr/lib/include
        On my machine this directory is only containing the directory "perf".
        If perf is not installed gcc returns: /usr/lib/gcc/s390x-redhat-linux/8/include
      
      - on Ubuntu 18.04 (with 4.15 kernel):
      
        $ gcc  -print-file-name=include
        /usr/lib/gcc/s390x-linux-gnu/7/include
        $ gcc -m31 -print-file-name=include
        /usr/lib/gcc/s390x-linux-gnu/7/include
        => gcc returns the correct path even if perf is installed.
      
      In each case, the introduction of the subdirectory /usr/lib/include
      leads to the regression that one can not build the glibc RPM for s390
      anymore as gcc can not find headers like stdbool.h.
      
      To remedy this, install bpf.h to /usr/lib/perf/include/bpf/bpf.h
      
      Output before using the command 'perf test -Fv 40':
      
        echo '...[bpf-program-source]...' | /usr/bin/clang ... \
      		   -I/root/lib/include/perf/bpf ...
                                     ^^^^^^^^^^^^
      ...
        [root@p23lp27 perf]# perf test -F 40
        40: BPF filter                                            :
        40.1: Basic BPF filtering                                 : Ok
        40.2: BPF pinning                                         : Ok
        40.3: BPF prologue generation                             : Ok
        40.4: BPF relocation checker                              : Ok
        [root@p23lp27 perf]#
      
      Output after using command 'perf test -Fv 40':
      
        echo '...[bpf-program-source]...' | /usr/bin/clang ... \
      		 -I/root/lib/perf/include/bpf ...
                                   ^^^^^^^^^^^^
      ...
        [root@p23lp27 perf]# perf test -F 40
        40: BPF filter                                            :
        40.1: Basic BPF filtering                                 : Ok
        40.2: BPF pinning                                         : Ok
        40.3: BPF prologue generation                             : Ok
        40.4: BPF relocation checker                              : Ok
        [root@p23lp27 perf]#
      
      Committer testing:
      
      While the above 'perf test -F 40' (or 'perf test bpf') will allow us
      to see that the correct path is now added via -I, to actually test this
      we had better use a bpf script that includes files in the changed
      directory.
      
      We have the files that now reside in /root/lib/perf/examples/bpf/ to do
      just that:
      
        # tail -8 /root/lib/perf/examples/bpf/5sec.c
        #include <bpf.h>
      
        int probe(hrtimer_nanosleep, rqtp->tv_sec)(void *ctx, int err, long sec)
        {
      	  return sec == 5;
        }
      
        license(GPL);
        # perf trace -e *sleep -e /root/lib/perf/examples/bpf/5sec.c sleep 4
             0.333 (4000.086 ms): sleep/9248 nanosleep(rqtp: 0x7ffc155f3300) = 0
        # perf trace -e *sleep -e /root/lib/perf/examples/bpf/5sec.c sleep 5
             0.287 (         ): sleep/9659 nanosleep(rqtp: 0x7ffeafe38200) ...
             0.290 (         ): perf_bpf_probe:hrtimer_nanosleep:(ffffffff9911efe0) tv_sec=5
             0.287 (5000.059 ms): sleep/9659  ... [continued]: nanosleep()) = 0
        # perf trace -e *sleep -e /root/lib/perf/examples/bpf/5sec.c sleep 6
             0.247 (5999.951 ms): sleep/10068 nanosleep(rqtp: 0x7fff2086d900) = 0
        # perf trace -e *sleep -e /root/lib/perf/examples/bpf/5sec.c sleep 5.987
             0.293 (         ): sleep/10489 nanosleep(rqtp: 0x7ffdd4fc10e0) ...
             0.296 (         ): perf_bpf_probe:hrtimer_nanosleep:(ffffffff9911efe0) tv_sec=5
             0.293 (5986.912 ms): sleep/10489  ... [continued]: nanosleep()) = 0
        #
      Suggested-by: Stefan Liebler <stli@linux.ibm.com>
      Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Fixes: 1b16fffa ("perf llvm-utils: Add bpf include path to clang command line")
      Link: http://lkml.kernel.org/r/20180731073254.91090-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      83868bf7
    • J
      perf c2c report: Fix crash for empty browser · 73978332
      Jiri Olsa committed
      'perf c2c' scans read/write accesses and tries to find false sharing
      cases, so when the events it wants were not asked for or ended up not
      taking place, we get no histograms.
      
      So do not try to display entry details if there are none. Currently
      this ends up in a crash:
      
        $ perf c2c report # then press 'd'
        perf: Segmentation fault
        $
      
      Committer testing:
      
      Before:
      
      Record a perf.data file without events of interest to 'perf c2c report',
      then call it and press 'd':
      
        # perf record sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data (6 samples) ]
        # perf c2c report
        perf: Segmentation fault
        -------- backtrace --------
        perf[0x5b1d2a]
        /lib64/libc.so.6(+0x346df)[0x7fcb566e36df]
        perf[0x46fcae]
        perf[0x4a9f1e]
        perf[0x4aa220]
        perf(main+0x301)[0x42c561]
        /lib64/libc.so.6(__libc_start_main+0xe9)[0x7fcb566cff29]
        perf(_start+0x29)[0x42c999]
        #
      
      After the patch the segfault no longer occurs. A follow-up patch that
      tells the user why nothing changes when 'd' is pressed would be good.
      
      Reported-by: rodia@autistici.org
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: f1c5fd4d ("perf c2c report: Add TUI cacheline browser")
      Link: http://lkml.kernel.org/r/20180724062008.26126-1-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      73978332
    • S
      perf tests: Fix indexing when invoking subtests · aa90f9f9
      Sandipan Das committed
      Recently, the subtest numbering was changed to start from 1.  While it
      is fine for displaying results, this should not be the case when the
      subtests are actually invoked.
      
      Typically, the subtests are stored in zero-indexed arrays and invoked
      based on the index passed to the main test function.  Since the index
      now starts from 1, the second subtest in the array (index 1) gets
      invoked instead of the first (index 0).  This applies to all of the
      following subtests; for the last one, the subtest always fails
      because the index no longer meets the boundary condition of being
      less than the number of subtests.
      
      This can be observed on powerpc64 and x86_64 systems running Fedora 28
      as shown below.
      
      Before:
      
        # perf test "builtin clang support"
        55: builtin clang support                                 :
        55.1: builtin clang compile C source to IR                : Ok
        55.2: builtin clang compile C source to ELF object        : FAILED!
      
        # perf test "LLVM search and compile"
        38: LLVM search and compile                               :
        38.1: Basic BPF llvm compile                              : Ok
        38.2: kbuild searching                                    : Ok
        38.3: Compile source for BPF prologue generation          : Ok
        38.4: Compile source for BPF relocation                   : FAILED!
      
        # perf test "BPF filter"
        40: BPF filter                                            :
        40.1: Basic BPF filtering                                 : Ok
        40.2: BPF pinning                                         : Ok
        40.3: BPF prologue generation                             : Ok
        40.4: BPF relocation checker                              : FAILED!
      
      After:
      
        # perf test "builtin clang support"
        55: builtin clang support                                 :
        55.1: builtin clang compile C source to IR                : Ok
        55.2: builtin clang compile C source to ELF object        : Ok
      
        # perf test "LLVM search and compile"
        38: LLVM search and compile                               :
        38.1: Basic BPF llvm compile                              : Ok
        38.2: kbuild searching                                    : Ok
        38.3: Compile source for BPF prologue generation          : Ok
        38.4: Compile source for BPF relocation                   : Ok
      
        # perf test "BPF filter"
        40: BPF filter                                            :
        40.1: Basic BPF filtering                                 : Ok
        40.2: BPF pinning                                         : Ok
        40.3: BPF prologue generation                             : Ok
        40.4: BPF relocation checker                              : Ok
      Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Fixes: 9ef01124 ("perf test: Fix subtest number when showing results")
      Link: http://lkml.kernel.org/r/20180726171733.33208-1-sandipan@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      aa90f9f9
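The indexing mismatch described above comes down to keeping the displayed number (1-based) separate from the array index used for invocation (0-based). A minimal sketch, assuming a plain array of function pointers; `subtests`, `run_subtest()` and `run_all()` are hypothetical names and are not the perf test harness API.

```c
#include <stdio.h>

#define NR_SUBTESTS 3

/* Hypothetical subtests; each returns 0 on success. */
static int subtest_a(void) { return 0; }
static int subtest_b(void) { return 0; }
static int subtest_c(void) { return 0; }

typedef int (*subtest_fn)(void);

/* Subtests are stored in a zero-indexed array. */
static const subtest_fn subtests[NR_SUBTESTS] = {
	subtest_a, subtest_b, subtest_c,
};

/* Invoke a subtest by its zero-based array index.
 * An index outside [0, NR_SUBTESTS) fails the boundary check. */
int run_subtest(int index)
{
	if (index < 0 || index >= NR_SUBTESTS)
		return -1;
	return subtests[index]();
}

/* Run every subtest: print a 1-based number for the user, but pass the
 * 0-based index to the invocation. Returns the number of failures. */
int run_all(int test_nr)
{
	int failed = 0;

	for (int i = 0; i < NR_SUBTESTS; i++) {
		int err = run_subtest(i);

		printf("%2d.%d: %s\n", test_nr, i + 1, err ? "FAILED!" : "Ok");
		if (err)
			failed++;
	}
	return failed;
}
```

Passing the displayed number to `run_subtest()` instead would skip the first subtest and make the last call use index `NR_SUBTESTS`, which fails the boundary check. That matches the "last subtest FAILED!" pattern in the Before output.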
    • A
      perf trace: Beautify the AF_INET & AF_INET6 'socket' syscall 'protocol' args · 162d3edb
      Arnaldo Carvalho de Melo committed
      For instance:
      
        $ trace -e socket* ssh sandy
           0.000 ( 0.031 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK                   ) = 3
           0.052 ( 0.015 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK                   ) = 3
           1.568 ( 0.020 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK                   ) = 3
           1.603 ( 0.012 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK                   ) = 3
           1.699 ( 0.014 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK                   ) = 3
           1.724 ( 0.012 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK                   ) = 3
           1.804 ( 0.020 ms): ssh/19919 socket(family: INET, type: STREAM, protocol: TCP                      ) = 3
          17.549 ( 0.098 ms): ssh/19919 socket(family: LOCAL, type: STREAM                                    ) = 4
        acme@sandy's password:
      
      Just like with other syscall args, the common bits are suppressed so
      that the output is more compact, i.e. we use "TCP" instead of
      "IPPROTO_TCP", but we could show the original constant names, if
      desired, via some command line knob or a ~/.perfconfig "[trace]"
      section variable.
      
      Also needed is to make perf's event parser accept things like:
      
        $ perf trace -e socket*/protocol=TCP/
      
      By using both the tracefs event 'format' files and these tables built
      from the kernel sources.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-l39jz1vnyda0b6jsufuc8bz7@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      162d3edb
    • A
      perf trace beauty: Add beautifiers for 'socket''s 'protocol' arg · 03aeb6c8
      Arnaldo Carvalho de Melo committed
      It'll be wired to 'perf trace' in the next cset.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-2i9vkvm1ik8yu4hgjmxhsyjv@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      03aeb6c8
    • A
      perf trace beauty: Do not print NULL strarray entries · bc972ada
      Arnaldo Carvalho de Melo committed
      We may have string tables where not all slots have values; in those
      cases it's better to print the numeric value, for instance:
      
      In the table below we would show "protocol: (null)" for
      
            socket_ipproto[3]
      
      Where it would be better to show "protocol: 3".
      
            $ tools/perf/trace/beauty/socket_ipproto.sh
            static const char *socket_ipproto[] = {
                  [0] = "IP",
                  [103] = "PIM",
                  [108] = "COMP",
                  [12] = "PUP",
                  [132] = "SCTP",
                  [136] = "UDPLITE",
                  [137] = "MPLS",
                  [17] = "UDP",
                  [1] = "ICMP",
                  [22] = "IDP",
                  [255] = "RAW",
                  [29] = "TP",
                  [2] = "IGMP",
                  [33] = "DCCP",
                  [41] = "IPV6",
                  [46] = "RSVP",
                  [47] = "GRE",
                  [4] = "IPIP",
                  [50] = "ESP",
                  [51] = "AH",
                  [6] = "TCP",
                  [8] = "EGP",
                  [92] = "MTP",
                  [94] = "BEETPH",
                  [98] = "ENCAP",
            };
            $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-7djfak94eb3b9ltr79cpn3ti@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      bc972ada
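The fallback described above can be sketched as follows. The table is a subset of the generated `socket_ipproto[]` shown in the commit message; `format_protocol_sketch()` and `ARRAY_SIZE_SKETCH` are hypothetical names for illustration, not the perf beautifier API.

```c
#include <stddef.h>
#include <stdio.h>

/* Subset of the generated table; slots without an initializer are NULL. */
static const char *socket_ipproto[] = {
	[0] = "IP",
	[1] = "ICMP",
	[2] = "IGMP",
	[6] = "TCP",
	[17] = "UDP",
};

#define ARRAY_SIZE_SKETCH(a) (sizeof(a) / sizeof((a)[0]))

/* Write the protocol name when the slot is populated; otherwise fall back
 * to the numeric value instead of printing "(null)".
 * Returns the number of characters snprintf() would have written. */
int format_protocol_sketch(char *bf, size_t size, int val)
{
	if (val >= 0 && (size_t)val < ARRAY_SIZE_SKETCH(socket_ipproto) &&
	    socket_ipproto[val] != NULL)
		return snprintf(bf, size, "%s", socket_ipproto[val]);

	return snprintf(bf, size, "%d", val);
}
```

In this sketch, value 6 is formatted as `TCP`, while value 3, an empty slot, is formatted as `3`. Values beyond the end of the table take the same numeric path.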
    • A
      perf beauty: Add a generator for IPPROTO_ socket's protocol constants · 9849eec3
      Arnaldo Carvalho de Melo committed
      It'll use tools/include copy of linux/in.h to generate a table to be
      used by tools, initially by the 'socket' and 'socketpair' beautifiers in
      'perf trace', but that could also be used to translate from a string
      constant to the integer value to be used in an eBPF or tracefs tracepoint
      filter.
      
      When used without any args it produces:
      
        $ tools/perf/trace/beauty/socket_ipproto.sh
        static const char *socket_ipproto[] = {
      	[0] = "IP",
      	[103] = "PIM",
      	[108] = "COMP",
      	[12] = "PUP",
      	[132] = "SCTP",
      	[136] = "UDPLITE",
      	[137] = "MPLS",
      	[17] = "UDP",
      	[1] = "ICMP",
      	[22] = "IDP",
      	[255] = "RAW",
      	[29] = "TP",
      	[2] = "IGMP",
      	[33] = "DCCP",
      	[41] = "IPV6",
      	[46] = "RSVP",
      	[47] = "GRE",
      	[4] = "IPIP",
      	[50] = "ESP",
      	[51] = "AH",
      	[6] = "TCP",
      	[8] = "EGP",
      	[92] = "MTP",
      	[94] = "BEETPH",
      	[98] = "ENCAP",
        };
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-v9rafqh3qn6b9kp9vfvj9f8s@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9849eec3
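The reverse direction mentioned above, translating a string constant into its integer value for use in a filter, could be done with a linear scan over the same kind of table. A minimal sketch under that assumption; `ipproto_from_name_sketch()` is a hypothetical helper and the table is a subset of the generated output.

```c
#include <stddef.h>
#include <string.h>

/* Subset of the generated table; slots without an initializer are NULL. */
static const char *socket_ipproto[] = {
	[0] = "IP",
	[1] = "ICMP",
	[2] = "IGMP",
	[6] = "TCP",
	[17] = "UDP",
};

/* Return the protocol number for a name such as "TCP", or -1 if the
 * name is not present in the table. Empty (NULL) slots are skipped. */
int ipproto_from_name_sketch(const char *name)
{
	size_t nr = sizeof(socket_ipproto) / sizeof(socket_ipproto[0]);

	for (size_t i = 0; i < nr; i++) {
		if (socket_ipproto[i] != NULL &&
		    strcmp(socket_ipproto[i], name) == 0)
			return (int)i;
	}
	return -1;
}
```

A linear scan is adequate here because the table is small and sparse; a filter parser would call such a helper once per term.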
    • A
      tools include uapi: Grab a copy of linux/in.h · a4b20612
      Arnaldo Carvalho de Melo committed
      We'll use it to create tables for the 'protocol' argument to the
      socket syscall when the 'family' arg is one of AF_INET or AF_INET6.
      
      Add it to check_headers.sh so that when a new protocol gets added we get
      a notification during the build process.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-2amnveu1ns4emjn70xuavpje@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a4b20612
    • S
      perf tests: Fix complex event name parsing · a6f39cec
      Sandipan Das committed
      The 'umask' event parameter is unsupported on some architectures like
      powerpc64.
      
      This can be observed on a powerpc64le system running Fedora 27 as shown
      below.
      
        # perf test "Parse event definition strings" -v
         6: Parse event definition strings                        :
        --- start ---
        test child forked, pid 45915
        ...
        running test 3 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2,umask=0x3/ukp'Invalid event/parameter 'umask'
        Invalid event/parameter 'umask'
        failed to parse event 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2,umask=0x3/ukp', err 1, str 'unknown term'
        event syntax error: '..,event=0x2,umask=0x3/ukp'
                                          \___ unknown term
      
        valid terms: event,mark,pmc,cache_sel,pmcxsel,unit,thresh_stop,thresh_start,combine,thresh_sel,thresh_cmp,sample_mode,config,config1,config2,name,period,freq,branch_type,time,call-graph,stack-size,no-inherit,inherit,max-stack,no-overwrite,overwrite,driver-config
      
        mem_access -> cpu/event=0x10401e0/
        running test 0 'config=10,config1,config2=3,umask=1'
        test child finished with 1
        ---- end ----
        Parse event definition strings: FAILED!
      
      Committer testing:
      
      After applying the patch this test passes and in verbose mode we get:
      
        # perf test -v "event definition"
         6: Parse event definition strings:
        --- start ---
        test child forked, pid 11061
        running test 0 'syscalls:sys_enter_openat'Using CPUID GenuineIntel-6-9E
        <SNIP>
        running test 53 'cycles/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks'/Duk'
        running test 0 'cpu/config=10,config1,config2=3,period=1000/u'
        running test 1 'cpu/config=1,name=krava/u,cpu/config=2/u'
        running test 2 'cpu/config=1,call-graph=fp,time,period=100000/,cpu/config=2,call-graph=no,time=0,period=2000/'
        running test 3 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2/ukp'
        <SNIP>
        test child finished with 0
        ---- end ----
        Parse event definition strings: Ok
        #
      Suggested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Fixes: 06dc5bf2 ("perf tests: Check that complex event name is parsed correctly")
      Link: http://lkml.kernel.org/r/20180726105502.31670-1-sandipan@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a6f39cec
    • K
      perf evlist: Fix error out while applying initial delay and LBR · 95035c5e
      Kan Liang committed
      'perf record' will error out if both --delay and LBR are applied.
      
      For example:
      
        # perf record -D 1000 -a -e cycles -j any -- sleep 2
        Error:
        dummy:HG: PMU Hardware doesn't support sampling/overflow-interrupts.
        Try 'perf stat'
        #
      
      A dummy event is added implicitly for initial delay, which has the same
      configurations as real sampling events. The dummy event is a software
      event. If LBR is configured, perf must error out.
      
      The dummy event will only be used to track PERF_RECORD_MMAP while perf
      waits for the initial delay to enable the real events. The BRANCH_STACK
      bit can be safely cleared for the dummy event.
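      
      A minimal sketch of that idea, assuming a simplified event structure:
      the sketch_evsel type and sketch_config() function are made up for
      illustration, and the flag value mirrors PERF_SAMPLE_BRANCH_STACK only
      to keep the example self-contained. It is not the actual perf code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PERF_SAMPLE_BRANCH_STACK (bit 11 of sample_type). */
#define SKETCH_SAMPLE_BRANCH_STACK (1ULL << 11)

/* Hypothetical, heavily simplified view of an event selector. */
struct sketch_evsel {
	bool is_dummy;        /* software dummy event that only tracks mmaps */
	uint64_t sample_type; /* sample bits requested for this event */
};

/*
 * Apply the sample bits shared by all events, then drop the branch
 * stack bit for the dummy event, since it never needs branch records.
 */
static void sketch_config(struct sketch_evsel *evsel, uint64_t common_sample_type)
{
	evsel->sample_type = common_sample_type;
	if (evsel->is_dummy)
		evsel->sample_type &= ~SKETCH_SAMPLE_BRANCH_STACK;
}
```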
      
      After applying the patch:
      
        # perf record -D 1000 -a -e cycles -j any -- sleep 2
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.054 MB perf.data (828 samples) ]
        #
      Reported-by: Sunil K Pandey <sunil.k.pandey@intel.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1531145722-16404-1-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      95035c5e
    • A
      perf trace beauty: Default header_dir to cwd to work without parms · 61b229ce
      Arnaldo Carvalho de Melo committed
      Useful when checking the effects of header syncs for the files it uses
      as input to generate string tables. In retrospect this is how it
      should've been done from day 1, not requiring the header_dir to be set
      in the Makefile. Everything will be changed later, so that the only
      parms, common to all generators, will be $(srctree) and $(beauty_outdir).
      
      So, to see what it generates, just call it without any parameters:
      
        $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh
        static const char *vhost_virtio_ioctl_cmds[] = {
      	[0x00] = "SET_FEATURES",
      	[0x01] = "SET_OWNER",
      	[0x02] = "RESET_OWNER",
      	[0x03] = "SET_MEM_TABLE",
      	[0x04] = "SET_LOG_BASE",
      	[0x07] = "SET_LOG_FD",
      	[0x10] = "SET_VRING_NUM",
      	[0x11] = "SET_VRING_ADDR",
      	[0x12] = "SET_VRING_BASE",
      	[0x13] = "SET_VRING_ENDIAN",
      	[0x14] = "GET_VRING_ENDIAN",
      	[0x20] = "SET_VRING_KICK",
      	[0x21] = "SET_VRING_CALL",
      	[0x22] = "SET_VRING_ERR",
      	[0x23] = "SET_VRING_BUSYLOOP_TIMEOUT",
      	[0x24] = "GET_VRING_BUSYLOOP_TIMEOUT",
      	[0x30] = "NET_SET_BACKEND",
      	[0x40] = "SCSI_SET_ENDPOINT",
      	[0x41] = "SCSI_CLEAR_ENDPOINT",
      	[0x42] = "SCSI_GET_ABI_VERSION",
      	[0x43] = "SCSI_SET_EVENTS_MISSED",
      	[0x44] = "SCSI_GET_EVENTS_MISSED",
      	[0x60] = "VSOCK_SET_GUEST_CID",
      	[0x61] = "VSOCK_SET_RUNNING",
        };
        static const char *vhost_virtio_ioctl_read_cmds[] = {
      	[0x00] = "GET_FEATURES",
      	[0x12] = "GET_VRING_BASE",
        };
        $
      
      Or:
      
        $ tools/perf/trace/beauty/sndrv_pcm_ioctl.sh
        static const char *sndrv_pcm_ioctl_cmds[] = {
      	[0x00] = "PVERSION",
      	[0x01] = "INFO",
      	[0x02] = "TSTAMP",
      	[0x03] = "TTSTAMP",
      	[0x04] = "USER_PVERSION",
      	[0x10] = "HW_REFINE",
      	[0x11] = "HW_PARAMS",
      	[0x12] = "HW_FREE",
      	[0x13] = "SW_PARAMS",
      	[0x20] = "STATUS",
      	[0x21] = "DELAY",
      	[0x22] = "HWSYNC",
      	[0x23] = "SYNC_PTR",
      	[0x24] = "STATUS_EXT",
      	[0x32] = "CHANNEL_INFO",
      	[0x40] = "PREPARE",
      	[0x41] = "RESET",
      	[0x42] = "START",
      	[0x43] = "DROP",
      	[0x44] = "DRAIN",
      	[0x45] = "PAUSE",
      	[0x46] = "REWIND",
      	[0x47] = "RESUME",
      	[0x48] = "XRUN",
      	[0x49] = "FORWARD",
      	[0x50] = "WRITEI_FRAMES",
      	[0x51] = "READI_FRAMES",
      	[0x52] = "WRITEN_FRAMES",
      	[0x53] = "READN_FRAMES",
      	[0x60] = "LINK",
      	[0x61] = "UNLINK",
        };
        $
      
      Etc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-90am4vm8hh1osms894dp2otr@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      61b229ce
    • A
      perf tools: Fix the build on the alpine:edge distro · 44fe619b
      Arnaldo Carvalho de Melo committed
      The UAPI file byteorder/little_endian.h uses the __always_inline define
      without including the header where it is defined, linux/stddef.h. This
      ends up working in all the other distros because that file gets included,
      seemingly by luck, from one of the files included from little_endian.h.
      
      But not on Alpine:edge, where the build fails for all files where
      perf_event.h is included but linux/stddef.h isn't included before it.
      
      Add the missing linux/stddef.h include where it breaks on Alpine:edge to
      fix that. On all other distros this is harmless, as it is just a very
      small header anyway.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-9r1pifftxvuxms8l7ir73p5l@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      44fe619b
  3. 30 Jul, 2018 4 commits
    • A
      tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy' · 1f27a050
      Arnaldo Carvalho de Melo committed
      To cope with the changes in:
      
        12c89130 ("x86/asm/memcpy_mcsafe: Add write-protection-fault handling")
        60622d68 ("x86/asm/memcpy_mcsafe: Return bytes remaining")
        bd131544 ("x86/asm/memcpy_mcsafe: Add labels for __memcpy_mcsafe() write fault handling")
        da7bc9c5 ("x86/asm/memcpy_mcsafe: Remove loop unrolling")
      
      This needed introducing a file with a copy of the mcsafe_handle_tail()
      function, that is used in the new memcpy_64.S file, as well as a dummy
      mcsafe_test.h header.
      
      Testing it:
      
        $ nm ~/bin/perf | grep mcsafe
        0000000000484130 T mcsafe_handle_tail
        0000000000484300 T __memcpy_mcsafe
        $
        $ perf bench mem memcpy
        # Running 'mem/memcpy' benchmark:
        # function 'default' (Default memcpy() provided by glibc)
        # Copying 1MB bytes ...
      
            44.389205 GB/sec
        # function 'x86-64-unrolled' (unrolled memcpy() in arch/x86/lib/memcpy_64.S)
        # Copying 1MB bytes ...
      
            22.710756 GB/sec
        # function 'x86-64-movsq' (movsq-based memcpy() in arch/x86/lib/memcpy_64.S)
        # Copying 1MB bytes ...
      
            42.459239 GB/sec
        # function 'x86-64-movsb' (movsb-based memcpy() in arch/x86/lib/memcpy_64.S)
        # Copying 1MB bytes ...
      
            42.459239 GB/sec
        $
      
      This silences this perf tools build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mika Penttilä <mika.penttila@nextfour.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-igdpciheradk3gb3qqal52d0@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1f27a050
    • A
      tools headers uapi: Refresh linux/bpf.h copy · fc73bfd6
      Arnaldo Carvalho de Melo committed
      To get the changes in:
      
        4c79579b ("bpf: Change bpf_fib_lookup to return lookup status")
      
      That do not entail changes in tools/perf/ use of it, eliminating the
      following perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-yei494y6b3mn6bjzz9g0ws12@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fc73bfd6
    • A
      tools headers powerpc: Update asm/unistd.h copy to pick new · 7def16d1
      Arnaldo Carvalho de Melo committed
      The new 'io_pgetevents' syscall was wired up in PowerPC in the following
      cset:
      
        b2f82565 ("powerpc: Wire up io_pgetevents")
      
      Update tools/arch/powerpc/ copy of the asm/unistd.h file so that 'perf
      trace' on PowerPC gets it in its syscall table.
      
      This eliminates the following perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/powerpc/include/uapi/asm/unistd.h' differs from latest version at 'arch/powerpc/include/uapi/asm/unistd.h'
      
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Breno Leitao <leitao@debian.org>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Link: https://lkml.kernel.org/n/tip-9uvu7tz4ud3bxxfyxwryuz47@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7def16d1
    • A
      tools headers uapi: Update tools's copy of linux/perf_event.h · 2c3ee0e1
      Arnaldo Carvalho de Melo committed
      To get the changes in:
      
        6cbc304f ("perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)")
      
      That do not imply any changes in the tooling side, the (ab)use of
      sample_type is entirely done in kernel space, nothing for userspace to
      witness here.
      
      This cures the following warning during perf's build:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-o64mjoy35s9gd1gitunw1zg4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2c3ee0e1
  4. 25 Jul, 2018 15 commits
    • M
      selftests/ftrace: Add snapshot and tracing_on test case · 82f4f3e6
      Masami Hiramatsu committed
      Add a testcase for checking snapshot and tracing_on
      relationship. This ensures that the snapshotting doesn't
      affect current tracing on/off settings.
      
      Link: http://lkml.kernel.org/r/153149932412.11274.15289227592627901488.stgit@devbox
      
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Cc: Hiraku Toyooka <hiraku.toyooka@cybertrust.co.jp>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: linux-kselftest@vger.kernel.org
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      82f4f3e6
    • T
      perf test: Fix subtest number when showing results · 9ef01124
      Thomas Richter committed
      Perf test 40 for example has several subtests numbered 1-4 when
      displaying the start of the subtest. When the subtest results
      are displayed the subtests are numbered 0-3.
      
      Use this command to generate trace output:
      
        [root@s35lp76 perf]# ./perf test -Fv 40 2>/tmp/bpf1
      
      Fix this by adjusting the subtest number when showing the
      subtest result.
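      
      The adjustment amounts to printing a one-based number from a
      zero-based loop index. A small sketch, where format_subtest_result()
      is a hypothetical helper and not the function changed by this patch:

```c
#include <stdio.h>

/*
 * Hypothetical formatter: `i` is the zero-based index used to iterate
 * over subtests, while the number shown to the user is one-based so
 * that it matches the "40.1", "40.2", ... headers.
 */
static int format_subtest_result(char *buf, size_t len, const char *desc,
				 int i, int ok)
{
	return snprintf(buf, len, "%s subtest %d: %s",
			desc, i + 1, ok ? "Ok" : "FAILED!");
}
```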
      
      Output before:
      
        [root@s35lp76 perf]# egrep '(^40\.[0-4]| subtest [0-4]:)' /tmp/bpf1
        40.1: Basic BPF filtering                                 :
        BPF filter subtest 0: Ok
        40.2: BPF pinning                                         :
        BPF filter subtest 1: Ok
        40.3: BPF prologue generation                             :
        BPF filter subtest 2: Ok
        40.4: BPF relocation checker                              :
        BPF filter subtest 3: Ok
        [root@s35lp76 perf]#
      
      Output after:
      
        [root@s35lp76 ~]# egrep '(^40\.[0-4]| subtest [0-4]:)' /tmp/bpf1
        40.1: Basic BPF filtering                                 :
        BPF filter subtest 1: Ok
        40.2: BPF pinning                                         :
        BPF filter subtest 2: Ok
        40.3: BPF prologue generation                             :
        BPF filter subtest 3: Ok
        40.4: BPF relocation checker                              :
        BPF filter subtest 4: Ok
        [root@s35lp76 ~]#
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180724134858.100644-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9ef01124
    • J
      perf stat: Get rid of extra clock display function · 0aa802a7
      Jiri Olsa committed
      There's no reason to have a separate function to display clock events.
      Its only purpose was to convert the nanosecond value into milliseconds.
      We do that now in generic code, if the unit and scale values are
      properly set, which this patch does for clock events.
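      
      A sketch of what scaling in generic code could look like, assuming
      the counter carries a unit string and a scale factor. The
      sketch_counter type and sketch_print() function are illustrative
      names, not perf's own:

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical counter description with a unit and a scale factor. */
struct sketch_counter {
	const char *name;
	const char *unit; /* e.g. "msec" */
	double scale;     /* multiplier applied to the raw count */
};

/*
 * Format a raw count generically: scale it, round to two decimal
 * places, and print the unit in its own field before the event name.
 */
static int sketch_print(char *buf, size_t len,
			const struct sketch_counter *c, uint64_t raw)
{
	return snprintf(buf, len, "%.2f %s %s",
			(double)raw * c->scale, c->unit, c->name);
}
```

      With a scale of 1e-6 and the unit set to "msec", a raw nanosecond
      count needs no clock-specific display function.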
      
      The output differs in the unit field being displayed in its own column
      rather than being added as a suffix to the event name. Also, the
      value is rounded to 2 decimal places, as for any other event.
      
      Before:
      
        # perf stat  -e cpu-clock,task-clock -C 0 sleep 3
      
         Performance counter stats for 'CPU(s) 0':
      
             3001.123137      cpu-clock (msec)          #    1.000 CPUs utilized
             3001.133250      task-clock (msec)         #    1.000 CPUs utilized
      
             3.001159813 seconds time elapsed
      
      Now:
      
        # perf stat  -e cpu-clock,task-clock -C 0 sleep 3
      
         Performance counter stats for 'CPU(s) 0':
      
                3,001.05 msec cpu-clock                 #    1.000 CPUs utilized
                3,001.05 msec task-clock                #    1.000 CPUs utilized
      
             3.001077794 seconds time elapsed
      
      There's a small difference in csv output, as we now output the unit
      field, which was empty before. It's in the proper spot, so there's no
      compatibility issue.
      
      Before:
      
        # perf stat  -e cpu-clock,task-clock -C 0 -x, sleep 3
        3001.065177,,cpu-clock,3001064187,100.00,1.000,CPUs utilized
        3001.077085,,task-clock,3001077085,100.00,1.000,CPUs utilized
      
      Now:
      
        # perf stat  -e cpu-clock,task-clock -C 0 -x, sleep 3
        3000.80,msec,cpu-clock,3000799026,100.00,1.000,CPUs utilized
        3000.80,msec,task-clock,3000799550,100.00,1.000,CPUs utilized
      
      Add perf_evsel__is_clock to replace nsec_counter.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180720110036.32251-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0aa802a7
    • J
      perf tools: Use perf_evsel__match instead of open coded equivalent · 2d6cae13
      Jiri Olsa committed
      Use perf_evsel__match() helper in perf_evsel__is_bpf_output().
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180720110036.32251-1-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2d6cae13
    • J
      perf tools: Fix struct comm_str removal crash · 46b3722c
      Jiri Olsa committed
      We occasionally hit the following assert failure in 'perf top' when
      processing the /proc info in multiple threads.
      
        perf: ...include/linux/refcount.h:109: refcount_inc:
              Assertion `!(!refcount_inc_not_zero(r))' failed.
      
      The gdb backtrace looks like this:
      
        [Switching to Thread 0x7ffff11ba700 (LWP 13749)]
        0x00007ffff50839fb in raise () from /lib64/libc.so.6
        (gdb)
        #0  0x00007ffff50839fb in raise () from /lib64/libc.so.6
        #1  0x00007ffff5085800 in abort () from /lib64/libc.so.6
        #2  0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6
        #3  0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6
        #4  0x0000000000535373 in refcount_inc (r=0x7fffdc009be0)
            at ...include/linux/refcount.h:109
        #5  0x00000000005354f1 in comm_str__get (cs=0x7fffdc009bc0)
            at util/comm.c:24
        #6  0x00000000005356bd in __comm_str__findnew (str=0x7fffd000b260 ":2",
            root=0xbed5c0 <comm_str_root>) at util/comm.c:72
        #7  0x000000000053579e in comm_str__findnew (str=0x7fffd000b260 ":2",
            root=0xbed5c0 <comm_str_root>) at util/comm.c:95
        #8  0x000000000053582e in comm__new (str=0x7fffd000b260 ":2",
            timestamp=0, exec=false) at util/comm.c:111
        #9  0x00000000005363bc in thread__new (pid=2, tid=2) at util/thread.c:57
        #10 0x0000000000523da0 in ____machine__findnew_thread (machine=0xbfde38,
            threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:457
        #11 0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38,
        ...
      
      The failing assertion is this one:
      
        REFCOUNT_WARN(!refcount_inc_not_zero(r), ...
      
      The problem is that we keep a global comm_str_root list, which
      is accessed by multiple threads during the 'perf top' startup,
      and the following 2 paths can race:
      
        thread 1:
          ...
          thread__new
            comm__new
              comm_str__findnew
                down_write(&comm_str_lock);
                __comm_str__findnew
                  comm_str__get
      
        thread 2:
          ...
          comm__override or comm__free
            comm_str__put
              refcount_dec_and_test
                down_write(&comm_str_lock);
                rb_erase(&cs->rb_node, &comm_str_root);
      
      Because thread 2 first decrements the refcnt and only then removes the
      struct comm_str from the list, thread 1 can find this object on the list
      with a refcnt equal to 0 and hit the assert.
      
      This patch fixes the thread 1 __comm_str__findnew path, by ignoring objects
      that already dropped the refcnt to 0. For the rest of the objects we take the
      refcnt before comparing its name and release it afterwards with comm_str__put,
      which can also release the object completely.
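      
      The "take a reference only if the count is still non-zero" step can
      be sketched with C11 atomics as below. The sketch_ names are
      illustrative, and this is not the refcount implementation in
      tools/include/linux/refcount.h:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative reference counter. */
struct sketch_refcount {
	atomic_uint refs;
};

/*
 * Take a reference only if the count has not already dropped to zero.
 * Returns false for an object that is on its way to being freed, so a
 * lookup can skip it instead of asserting.
 */
static bool sketch_refcount_inc_not_zero(struct sketch_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	while (old != 0) {
		/* On failure `old` is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->refs, &old, old + 1))
			return true;
	}
	return false;
}
```

      A lookup that walks the list would call this on each candidate and
      simply move on when it returns false.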
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20180720101740.GA27176@krava
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      46b3722c
    • J
      perf machine: Use last_match threads cache only in single thread mode · b57334b9
      Jiri Olsa committed
      There's an issue with using threads::last_match in multithread mode,
      which is enabled during the perf top synthesize phase. It might crash
      with the following assertion:
      
        perf: ...include/linux/refcount.h:109: refcount_inc:
              Assertion `!(!refcount_inc_not_zero(r))' failed.
      
      The gdb backtrace looks like this:
      
        0x00007ffff50839fb in raise () from /lib64/libc.so.6
        (gdb)
        #0  0x00007ffff50839fb in raise () from /lib64/libc.so.6
        #1  0x00007ffff5085800 in abort () from /lib64/libc.so.6
        #2  0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6
        #3  0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6
        #4  0x0000000000535ff9 in refcount_inc (r=0x7fffe8009a70)
            at ...include/linux/refcount.h:109
        #5  0x0000000000536771 in thread__get (thread=0x7fffe8009a40)
            at util/thread.c:115
        #6  0x0000000000523cd0 in ____machine__findnew_thread (machine=0xbfde38,
            threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:432
        #7  0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38,
            pid=2, tid=2) at util/machine.c:489
        #8  0x0000000000523f24 in machine__findnew_thread (machine=0xbfde38,
            pid=2, tid=2) at util/machine.c:499
        #9  0x0000000000526fbe in machine__process_fork_event (machine=0xbfde38,
        ...
      
      The failing assertion is this one:
      
        REFCOUNT_WARN(!refcount_inc_not_zero(r), ...
      
      The problem is that we don't serialize access to threads::last_match.
      We serialize the access to the threads tree, but we don't care how
      threads::last_match is being accessed. Both locked/unlocked paths use
      that data and can set it. In multithreaded mode we can end up with an
      invalid object in the thread__get call, as in the following race:
      
        thread 1
          ...
          machine__findnew_thread
            down_write(&threads->lock);
            __machine__findnew_thread
              ____machine__findnew_thread
                th = threads->last_match;
                if (th->tid == tid) {
                  thread__get
      
        thread 2
          ...
          machine__find_thread
            down_read(&threads->lock);
            __machine__findnew_thread
              ____machine__findnew_thread
                th = threads->last_match;
                if (th->tid == tid) {
                  thread__get
      
        thread 3
          ...
          machine__process_fork_event
            machine__remove_thread
              __machine__remove_thread
                threads->last_match = NULL
                thread__put
            thread__put
      
      Threads 1 and 2 might get a stale last_match before thread 3 clears
      it. Threads 1 and 2 then race with thread 3's thread__put and
      might trigger the refcnt == 0 assertion above.
      
      The patch disables the last_match cache in multithreaded mode. It was
      originally meant for single-threaded scenarios, where it's common to
      have multiple sequential searches for the same thread.
      
      In multithreaded mode this does not make sense, because top's threads
      process different /proc entries and so the 'struct threads' object
      is queried for various threads. Moreover, we'd need to add more locks
      to make it work.
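      
      A minimal sketch of gating such a cache on a single-threaded flag.
      The types and the sketch_singlethreaded flag are assumptions made
      for illustration and are simplified compared to the perf code:

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_thread {
	int tid;
};

struct sketch_threads {
	struct sketch_thread *last_match; /* one-entry lookup cache */
};

/* Hypothetical flag: true while only one thread touches the cache. */
static bool sketch_singlethreaded = true;

/* Return the cached thread on a tid match, and never in threaded mode. */
static struct sketch_thread *
sketch_get_last_match(struct sketch_threads *threads, int tid)
{
	struct sketch_thread *th;

	if (!sketch_singlethreaded)
		return NULL;
	th = threads->last_match;
	return (th && th->tid == tid) ? th : NULL;
}

/* Update the cache only when it is safe to do so without locking. */
static void sketch_set_last_match(struct sketch_threads *threads,
				  struct sketch_thread *th)
{
	if (sketch_singlethreaded)
		threads->last_match = th;
}
```

      When the getter returns NULL the caller falls back to the normal,
      lock-protected tree search.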
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20180719143345.12963-4-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b57334b9
    • J
      perf machine: Add threads__set_last_match function · 67fda0f3
      Jiri Olsa committed
      Separate the setting of the threads::last_match cache into a new
      threads__set_last_match function. This will be useful in the following
      patch.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20180719143345.12963-3-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      67fda0f3
    • J
      perf machine: Add threads__get_last_match function · f8b2ebb5
      Jiri Olsa committed
      Separate the read/check of the threads::last_match cache into a new
      threads__get_last_match function. This will be useful in the following
      patch.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20180719143345.12963-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f8b2ebb5
    • J
      perf tools: Synthesize GROUP_DESC feature in pipe mode · e8fedff1
      Jiri Olsa committed
      Stephane reported that pipe mode does not carry the group information,
      so the piped report won't display the grouped output for the following
      command:
      
        # perf record -e '{cycles,instructions,branches}' -a sleep 4 | perf report
      
      It has no idea about the group setup, so it will display events
      separately:
      
        # Overhead  Command          Shared Object             ...
        # ........  ...............  .......................
        #
             6.71%  swapper          [kernel.kallsyms]
             2.28%  offlineimap      libpython2.7.so.1.0
             0.78%  perf             [kernel.kallsyms]
        ...
      
      Fix GROUP_DESC feature record to be synthesized in pipe mode, so the
      report output is grouped if there are groups defined in record:
      
        #                 Overhead  Command          Shared    ...
        # ........................  ...............  .......
        #
             7.57%   0.16%   0.30%  swapper          [kernel
             1.87%   3.15%   2.46%  offlineimap      libpyth
             1.33%   0.00%   0.00%  perf             [kernel
        ...
      Reported-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Stephane Eranian <eranian@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180712135202.14774-1-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e8fedff1
    • S
      perf script: Show correct offsets for DWARF-based unwinding · 2a9d5050
      Sandipan Das committed
      When perf.data is recorded with the dwarf call-graph option, the
      callchain shown by 'perf script' still shows the binary offsets of the
      userspace symbols instead of their virtual addresses. Since the symbol
      offset calculation is based on using the virtual address as the ip, we
      see incorrect offsets as well.
      
      The use of virtual addresses affects the ability to find the line
      number in the corresponding source file to which an address maps,
      as described in commit 67540759 ("perf unwind: Use
      addr_location::addr instead of ip for entries").
      
      This has also been addressed by temporarily converting the virtual
      address to the corresponding binary offset so that it can be mapped
      to the source line number correctly.
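      
      The relationship between a virtual address, its binary offset and its
      symbol offset can be sketched with plain shell arithmetic. This is only
      an illustration, not the perf code: the helper names are hypothetical and
      the mapping is assumed to start at file offset 0 unless a third argument
      is given.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Virtual address -> offset into the binary.
#   $1: virtual address
#   $2: virtual address where the mapping starts
#   $3: file offset the mapping starts at (assumed 0 if omitted)
addr_to_binary_offset() {
	printf '%x\n' $(( $1 - $2 + ${3:-0} ))
}

# Offset of an address within its symbol, as printed after the '+'.
#   $1: virtual address
#   $2: virtual address where the symbol starts
symbol_offset() {
	printf '0x%x\n' $(( $1 - $2 ))
}
```

      With a mapping start of 0x7fffb3750000 (not stated in this log, but
      inferred by subtracting the binary offset 15af28 from the virtual address
      7fffb38aaf28 in the outputs of this commit),
      'addr_to_binary_offset 0x7fffb38aaf28 0x7fffb3750000' prints 15af28, and
      'symbol_offset 0x7fffb38aaf28 0x7fffb38aaf20' prints 0x8.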
      
      This is a follow-up for commit 19610184 ("perf script: Show
      virtual addresses instead of offsets").
      
      This can be verified on a powerpc64le system running Fedora 27 as
      shown below:
      
        # perf probe -x /usr/lib64/libc-2.26.so -a inet_pton
        # perf record -e probe_libc:inet_pton --call-graph=dwarf ping -6 -c 1 ::1
      
      Before:
      
        # perf report --stdio --no-children -s sym,srcline -g address
      
        # Samples: 1  of event 'probe_libc:inet_pton'
        # Event count (approx.): 1
        #
        # Overhead  Symbol                Source:Line
        # ........  ....................  ...........
        #
           100.00%  [.] __GI___inet_pton  inet_pton.c
                    |
                    ---gaih_inet getaddrinfo.c:537 (inlined)
                       __GI_getaddrinfo getaddrinfo.c:2304 (inlined)
                       main ping.c:519
                       generic_start_main libc-start.c:308 (inlined)
                       __libc_start_main libc-start.c:102
        ...
      
        # perf script -F comm,ip,sym,symoff,srcline,dso
      
        ping
                          15af28 __GI___inet_pton+0xffff000099160008 (/usr/lib64/libc-2.26.so)
          libc-2.26.so[ffff80004ca0af28]
                          10fa53 gaih_inet+0xffff000099160f43
          libc-2.26.so[ffff80004c9bfa53] (inlined)
                          1105b3 __GI_getaddrinfo+0xffff000099160163
          libc-2.26.so[ffff80004c9c05b3] (inlined)
                            2d6f main+0xfffffffd9f1003df (/usr/bin/ping)
          ping[fffffffecf882d6f]
                           2369f generic_start_main+0xffff00009916013f
          libc-2.26.so[ffff80004c8d369f] (inlined)
                           23897 __libc_start_main+0xffff0000991600b7 (/usr/lib64/libc-2.26.so)
          libc-2.26.so[ffff80004c8d3897]
      
      After:
      
        # perf report --stdio --no-children -s sym,srcline -g address
      
        # Samples: 1  of event 'probe_libc:inet_pton'
        # Event count (approx.): 1
        #
        # Overhead  Symbol                Source:Line
        # ........  ....................  ...........
        #
           100.00%  [.] __GI___inet_pton  inet_pton.c
                    |
                    ---gaih_inet.constprop.7 getaddrinfo.c:537
                       getaddrinfo getaddrinfo.c:2304
                       main ping.c:519
                       generic_start_main.isra.0 libc-start.c:308
                       __libc_start_main libc-start.c:102
        ...
      
        # perf script -F comm,ip,sym,symoff,srcline,dso
      
        ping
                    7fffb38aaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so)
          inet_pton.c:68
                    7fffb385fa53 gaih_inet.constprop.7+0xf43 (/usr/lib64/libc-2.26.so)
          getaddrinfo.c:537
                    7fffb38605b3 getaddrinfo+0x163 (/usr/lib64/libc-2.26.so)
          getaddrinfo.c:2304
                       130782d6f main+0x3df (/usr/bin/ping)
          ping.c:519
                    7fffb377369f generic_start_main.isra.0+0x13f (/usr/lib64/libc-2.26.so)
          libc-start.c:308
                    7fffb3773897 __libc_start_main+0xb7 (/usr/lib64/libc-2.26.so)
          libc-start.c:102
      Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Fixes: 67540759 ("perf unwind: Use addr_location::addr instead of ip for entries")
      Link: http://lkml.kernel.org/r/20180703120555.32971-1-sandipan@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2a9d5050
    • K
      perf trace arm64: Use generated syscall table · a7f660d6
      Kim Phillips committed
      This should speed up accessing new system calls introduced with the
      kernel rather than waiting for libaudit updates to include them.
      
      It also enables users to specify wildcards, for example, perf trace -e
      'open*', just as was already possible on x86, s390, and powerpc, which
      means arm64 can now pass the "Check open filename arg using perf trace +
      vfs_getname" test.
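      
      The effect of a wildcard such as 'open*' on a table of syscall names can
      be illustrated with shell pattern matching. This is a sketch only: perf
      does its matching in C against the generated table, and the function name
      and the short list of names used below are hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration of glob matching against syscall names.
#   $1: glob pattern, remaining arguments: candidate syscall names
match_syscalls() {
	glob=$1
	shift
	for name in "$@"; do
		case $name in
		# $glob is left unquoted on purpose so that it acts as a pattern
		$glob) printf '%s\n' "$name" ;;
		esac
	done
}
```

      Calling 'match_syscalls "open*" read openat close open_by_handle_at'
      prints openat and open_by_handle_at, one per line.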
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20180706163454.f714b9ab49ecc8566a0b3565@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a7f660d6
    • K
      perf arm64: Generate system call table from asm/unistd.h · 2b588243
      Kim Phillips committed
      This should speed up accessing new system calls introduced with the
      kernel rather than waiting for libaudit updates to include them.
      
      Using the existing other arch scripts resulted in this error:
      
        tools/perf/arch/arm64/entry/syscalls//mksyscalltbl: 25: printf: __NR3264_ftruncate: expected numeric value
      
      because, unlike other arches, asm-generic's unistd.h does things like:
      
        #define __NR_ftruncate __NR3264_ftruncate
      
      Turning the script's printf %d into a %s resulted in this in the
      generated syscalls.c file:
      
          static const char *syscalltbl_arm64[] = {
                  [__NR3264_ftruncate] = "ftruncate",
      
      So we use the host C compiler to fold the macros, and print them out
      from within a temporary C program, in order to get the correct output:
      
          static const char *syscalltbl_arm64[] = {
                  [46] = "ftruncate",
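      
      The macro-folding idea can be sketched as a shell function that writes a
      small C program to the compiler's standard input, builds it and runs it.
      This is a hedged illustration, not the real mksyscalltbl: the function
      name, its argument order and the temporary directory layout are
      assumptions, and the header is expected to be given as an absolute path.

```shell
#!/bin/sh
# Hypothetical sketch of the macro-folding idea; not the real mksyscalltbl.
#   $1: host C compiler
#   $2: header defining the __NR_* macros (absolute path)
#   remaining arguments: syscall names to emit
fold_syscall_numbers() {
	hostcc=$1
	header=$2
	shift 2
	workdir=$(mktemp -d) || return 1
	{
		printf '#include <stdio.h>\n#include "%s"\n' "$header"
		printf 'int main(void)\n{\n'
		for sc in "$@"; do
			# The compiler resolves __NR_<name> to a number,
			# however many macros it has to go through.
			printf '\tprintf("\\t[%%d] = \\"%s\\",\\n", __NR_%s);\n' "$sc" "$sc"
		done
		printf '\treturn 0;\n}\n'
	} | "$hostcc" -x c -o "$workdir/create-table" - && "$workdir/create-table"
	err=$?
	rm -rf "$workdir"	# remove the temporary program on every path
	return $err
}
```

      Given a header containing the two definitions '#define __NR3264_ftruncate
      46' and '#define __NR_ftruncate __NR3264_ftruncate', the generated
      program prints the number rather than the intermediate macro name.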
      
      Committer notes:
      
      Testing this with a container with an old toolchain breaks because it
      ends up using the system's /usr/include/asm-generic/unistd.h, included
      from tools/arch/arm64/include/uapi/asm/unistd.h when what is desired is
      for it to include tools/include/uapi/asm-generic/unistd.h.
      
      Since all that tools/arch/arm64/include/uapi/asm/unistd.h does is set a
      define and then include asm-generic/unistd.h, do that directly and use
      tools/include/uapi/asm-generic/unistd.h as the file to get the syscall
      definitions to expand.
      
      Testing it:
      
         tools/perf/arch/arm64/entry/syscalls/mksyscalltbl /gcc-linaro-5.4.1-2017.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc gcc tools/include/uapi/asm-generic/unistd.h
      
      This now works and generates the syscall string table.
      
      Before it ended up as:
      
        $ tools/perf/arch/arm64/entry/syscalls/mksyscalltbl /gcc-linaro-5.4.1-2017.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc gcc tools/arch/arm64/include/uapi/asm/unistd.h
        static const char *syscalltbl_arm64[] = {
        <stdin>: In function 'main':
        <stdin>:257:38: error: '__NR_getrandom' undeclared (first use in this function)
        <stdin>:257:38: note: each undeclared identifier is reported only once for each function it appears in
        <stdin>:258:41: error: '__NR_memfd_create' undeclared (first use in this function)
        <stdin>:259:32: error: '__NR_bpf' undeclared (first use in this function)
        <stdin>:260:37: error: '__NR_execveat' undeclared (first use in this function)
        tools/perf/arch/arm64/entry/syscalls/mksyscalltbl: 47: tools/perf/arch/arm64/entry/syscalls/mksyscalltbl: /tmp/create-table-60liya: Permission denied
        };
        $
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20180706163443.22626f5e9e10e5bab5e5c662@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2b588243
    • K
      tools include: Grab copies of arm64 dependent unistd.h files · 34b009cf
      Kim Phillips committed
      Will be used for generating the syscall id/string translation table.
      
      The arm64 unistd.h file simply #includes asm-generic/unistd.h, so,
      since we will want to know when either of them changes, we grab both:
      
        arch/arm64/include/uapi/asm/unistd.h
      
      and
      
        include/uapi/asm-generic/unistd.h
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20180706163434.1b64ffbcc0284fb79982f53b@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      34b009cf
    • S
      perf tests: Fix record+probe_libc_inet_pton.sh when event exists · 60089e42
      Sandipan Das committed
      If the event 'probe_libc:inet_pton' already exists, this test fails and
      deletes the existing event before exiting. The test then passes on any
      subsequent execution.
      
      Instead of skipping ahead to delete the existing event when adding a new
      event fails, the script now creates a duplicate event and continues with
      the usual checks. Only the duplicate event created at the beginning of
      the test is deleted as part of the cleanup at the end. All existing
      events remain as they are.
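      
      One way to delete only the event that the test itself created is to
      capture its name from the 'perf probe' output when it is added. A minimal
      sketch, assuming the output has the 'Added new event:' layout shown in
      this log and that a duplicate gets a numeric suffix such as the
      probe_libc:inet_pton_1 seen in the 'After' trace. The function name is
      hypothetical and this is not the script's actual code.

```shell
#!/bin/sh
# Read 'perf probe' output on stdin and print the name of the event that
# was created: probe_libc:inet_pton, or probe_libc:inet_pton_<n> when the
# base name was already taken. Hypothetical helper, for illustration only.
created_event_name() {
	sed -n 's/^[[:space:]]*\(probe_libc:inet_pton\(_[0-9]\{1,\}\)\{0,1\}\)[[:space:]](on .*$/\1/p' |
		head -n 1
}

# Possible use, shown as a comment because it changes the system's probe
# list. It assumes perf is installed, $libc holds the libc path, and that
# --force (-f) is how the duplicate is added:
#   event_name=$(perf probe -f -x "$libc" -a inet_pton 2>&1 | created_event_name)
#   ... run the usual checks ...
#   perf probe -d "$event_name"
```

      Deleting by the captured name leaves a pre-existing probe_libc:inet_pton
      untouched, because the duplicate carries a different name.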
      
      This can be observed on a powerpc64 system running Fedora 27 as shown
      below.
      
        # perf probe -x /usr/lib64/power8/libc-2.26.so -a inet_pton
      
        Added new event:
          probe_libc:inet_pton (on inet_pton in /usr/lib64/power8/libc-2.26.so)
      
      Before:
      
        # perf test -v "probe libc's inet_pton & backtrace it with ping"
      
        62: probe libc's inet_pton & backtrace it with ping       :
        --- start ---
        test child forked, pid 21302
        test child finished with -1
        ---- end ----
        probe libc's inet_pton & backtrace it with ping: FAILED!
      
        # perf probe --list
      
      After:
      
        # perf test -v "probe libc's inet_pton & backtrace it with ping"
      
        62: probe libc's inet_pton & backtrace it with ping       :
        --- start ---
        test child forked, pid 21490
        ping 21513 [035] 39357.565561: probe_libc:inet_pton_1: (7fffa4c623b0)
        7fffa4c623b0 __GI___inet_pton+0x0 (/usr/lib64/power8/libc-2.26.so)
        7fffa4c190dc gaih_inet.constprop.7+0xf4c (/usr/lib64/power8/libc-2.26.so)
        7fffa4c19c4c getaddrinfo+0x15c (/usr/lib64/power8/libc-2.26.so)
        111d93c20 main+0x3e0 (/usr/bin/ping)
        test child finished with 0
        ---- end ----
        probe libc's inet_pton & backtrace it with ping: Ok
      
        # perf probe --list
      
          probe_libc:inet_pton (on __inet_pton@resolv/inet_pton.c in /usr/lib64/power8/libc-2.26.so)
      Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/e11fecff96e6cf4c65cdbd9012463513d7b8356c.1530724939.git.sandipan@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      60089e42
    • S
      perf tests: Fix record+probe_libc_inet_pton.sh to ensure cleanups · 83e3b6d7
      Sandipan Das committed
      If there is a mismatch in the perf script output, this test fails and
      exits before the event and temporary files created during its execution
      are cleaned up.
      
      This can be observed on a powerpc64 system running Fedora 27 as shown
      below.
      
        # perf test -v "probe libc's inet_pton & backtrace it with ping"
      
        62: probe libc's inet_pton & backtrace it with ping       :
        --- start ---
        test child forked, pid 18655
        ping 18674 [013] 24511.496995: probe_libc:inet_pton: (7fffa6b423b0)
        7fffa6b423b0 __GI___inet_pton+0x0 (/usr/lib64/power8/libc-2.26.so)
        7fffa6af90dc gaih_inet.constprop.7+0xf4c (/usr/lib64/power8/libc-2.26.so)
        FAIL: expected backtrace entry "getaddrinfo\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/power8/libc-2.26.so\)$" got "7fffa6af90dc gaih_inet.constprop.7+0xf4c (/usr/lib64/power8/libc-2.26.so)"
        test child finished with -1
        ---- end ----
        probe libc's inet_pton & backtrace it with ping: FAILED!
      
        # ls /tmp/expected.* /tmp/perf.data.* /tmp/perf.script.*
      
        /tmp/expected.u31  /tmp/perf.data.Pki  /tmp/perf.script.Bhs
      
        # perf probe --list
      
          probe_libc:inet_pton (on __inet_pton@resolv/inet_pton.c in /usr/lib64/power8/libc-2.26.so)
      
      Cleanup of the event and the temporary files is now ensured by allowing
      the cleanup code to run even if the lines from the backtrace do not
      match their expected patterns, instead of simply exiting at the point
      of failure.
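      
      The pattern can be sketched as follows: the comparison returns a status
      instead of exiting, and the caller removes its temporary files before
      passing that status on. The function names, file name templates and
      arguments are hypothetical, chosen for illustration, and do not
      reproduce the test script.

```shell
#!/bin/sh
# Compare each line of the output file ($1) against the extended regular
# expression on the same line of the expectations file ($2). Returns 1 on
# the first mismatch instead of exiting, so the caller can still clean up.
compare_lines() {
	while read -r line <&3 && read -r pattern <&4; do
		[ -z "$pattern" ] && break
		if ! printf '%s\n' "$line" | grep -Eq "$pattern"; then
			printf 'FAIL: expected backtrace entry "%s" got "%s"\n' "$pattern" "$line"
			return 1
		fi
	done 3<"$1" 4<"$2"
	return 0
}

# Hypothetical driver: $1 is the output text, $2 the expected patterns.
run_check() {
	out_file=$(mktemp "${TMPDIR:-/tmp}/sketch.script.XXXXXX")
	exp_file=$(mktemp "${TMPDIR:-/tmp}/sketch.expected.XXXXXX")
	printf '%s\n' "$1" > "$out_file"
	printf '%s\n' "$2" > "$exp_file"
	compare_lines "$out_file" "$exp_file"
	err=$?
	rm -f "$out_file" "$exp_file"	# runs whether or not the check passed
	return $err
}
```

      Keeping the failure in a status variable means a mismatch can no longer
      leave files such as /tmp/expected.* or a stray probe event behind.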
      Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/ce9fb091dd3028fba8749a1a267cfbcb264bbfb1.1530724939.git.sandipan@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      83e3b6d7