1. 18 Dec, 2018 9 commits
  2. 22 Nov, 2018 12 commits
    • Merge tag 'perf-core-for-mingo-4.21-20181122' of... · e8e94fce
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo-4.21-20181122' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      - Start using BPF maps in 'perf trace' for filters in the augmented syscalls
        code, keeping the existing code for tracepoint filters so that we can switch
        back and forth while getting everything BPFied (Arnaldo Carvalho de Melo)
      
      - Suppress potential format-truncation warning in the PMU code (Ben Hutchings)
      
      - Introduce 'perf bench epoll', with "wait" and "ctl" benchmarks (Davidlohr Bueso)
      
      - Fix slowness due to -ffunction-sections by sorting the maps by name, which
        avoids using rb_first/next to traverse all entries looking for a map name;
        with -ffunction-sections there can be thousands of maps (Eric Saint-Etienne)
      
      - Separate jvmti cmlr check (Jiri Olsa)
      
      - Allow using the stepping when figuring out which JSON files to use for an x86
        processor, so that Cascadelake server can be supported; it has the same
        cpuid as some other processor, differing only in the stepping (Kan Liang)
      
      - Share code and output format for uregs and iregs 'perf script' output (Milian Wolff)
      
      - Use perf_evsel__is_clocki() for clock events in 'perf stat' (Ravi Bangoria)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e8e94fce
    • perf pmu: Move *_cpuid_str() weak functions to header.c · f4a0742b
      Kan Liang committed
      The weak functions, strcmp_cpuid_str() and get_cpuid_str(), are defined
      in pmu.c.
      
      Most of the cpuid related functions, including *_cpuid_str()'s
      declaration and platform specific definition, are in header.c/h.
      
      To make the declaration and definition of all cpuid related functions in
      a consistent place, move the weak functions to header.c.
      
      There is no functional change.
      Suggested-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Link: http://lkml.kernel.org/r/20181121164939.13482-1-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f4a0742b
    • perf symbols: Fix slowness due to -ffunction-section · 1e628569
      Eric Saint-Etienne committed
      Perf can take minutes to parse an image when -ffunction-sections is used.
      This is especially true with the kernel image when it is compiled this
      way, which is the arm64 default since the patchset "Enable deadcode
      elimination at link time".
      
      Perf organizes maps using an rbtree. Whenever perf finds a new symbol, it
      first searches this rbtree for the map it belongs to, by strcmp()'ing
      section names.  When it finds the map with the right name, it uses it to
      add the symbol. With a usual image there aren't so many maps but when
      using -ffunction-sections there's basically one map per function.  With
      the kernel image that's north of 40,000 maps. For most symbols perf has
      to parse the entire rbtree to eventually create a new map and add it.
      Consequently perf spends most of the time browsing a rbtree that keeps
      getting larger.
      
      This performance fix introduces a secondary rbtree that indexes maps
      based on the section name.
      Signed-off-by: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
      Reviewed-by: Dave Kleikamp <dave.kleikamp@oracle.com>
      Reviewed-by: David Aldridge <david.aldridge@oracle.com>
      Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1542822679-25591-1-git-send-email-eric.saint.etienne@oracle.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1e628569
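The name-indexed lookup described in commit 1e628569 can be illustrated with a minimal sketch. This is not perf's code: a sorted array searched with bsearch() stands in for the secondary rbtree, and `struct named_map`, `named_maps_sort()` and `named_maps_find()` are hypothetical names chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a perf map: only the section name matters here. */
struct named_map {
	const char *name;
};

/* Order two maps by section name, as a name-keyed tree would. */
static int named_map_cmp(const void *a, const void *b)
{
	const struct named_map *ma = a, *mb = b;

	return strcmp(ma->name, mb->name);
}

/* Sort once so that later lookups take O(log n) instead of a full walk. */
static void named_maps_sort(struct named_map *maps, size_t nr)
{
	qsort(maps, nr, sizeof(*maps), named_map_cmp);
}

/* Find the map for a section name; returns NULL if there is none. */
static struct named_map *named_maps_find(struct named_map *maps, size_t nr,
					 const char *name)
{
	struct named_map key = { .name = name };

	return bsearch(&key, maps, nr, sizeof(*maps), named_map_cmp);
}
```

With roughly 40,000 maps, a binary search needs about 16 string comparisons per lookup, whereas a linear walk may need tens of thousands.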
    • perf jvmti: Separate jvmti cmlr check · dd1d0044
      Jiri Olsa committed
      The Compiled Method Load Record (cmlr) is a JDK-specific interface to
      access JVM stack info. This makes the jvmti agent code fail to compile
      under a JDK that does not support it.
      
      Separate the jvmti cmlr check into its own feature check, and add the
      HAVE_JVMTI_CMLR macro to indicate that.
      
      Mark cmlr code in jvmti/libjvmti.c with HAVE_JVMTI_CMLR, so we can
      compile it on system without cmlr support.
      
      This change makes the jvmti compile with java-1.8.0-ibm package. It's
      without the line numbers support, but the rest works.
      
      Adding NO_JVMTI_CMLR compile variable for testing.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ben Gainey <ben.gainey@arm.com>
      Cc: Gustavo Luiz Duarte <gduarte@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20181121154341.21521-1-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      dd1d0044
    • perf vendor events: Add JSON metrics for Cascadelake server · ecd94f1b
      Kan Liang committed
      Add JSON metrics (based on event list v1) for Cascadelake server
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/3ab97c73-c197-8555-1a35-b54636e667e6@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ecd94f1b
    • perf vendor events: Add stepping in CPUID string for x86 · 3b54411a
      Kan Liang committed
      The perf tools cannot find the proper event list for the Cascadelake
      server, because the Cascadelake server and the Skylake server have the
      same CPU model number, which the perf tools use to find the event
      list.
      
      The stepping for Skylake server is up to 4.
      
      The stepping for Cascadelake server starts from 5.
      
      The stepping can be used to distinguish between them.
      
      The stepping is added in get_cpuid_str().
      
      The stepping information for Skylake server is updated in mapfile.csv.
      
      An x86 specific strcmp_cpuid_str() function is added to handle two CPUID
      formats in mapfile.csv, "vendor-family-model-stepping" and
      "vendor-family-model":
      
      - If a cpuid-regular-expression from mapfile.csv uses the new
        stepping format, a cpuid-string generated on the machine must include
        the stepping. Otherwise, it is a mismatch.
      
      - If the cpuid-regular-expression uses the old non-stepping format,
        the stepping in the cpuid-string is ignored.
      
      Scripts that use the environment string "PERF_CPUID" without a stepping
      on Skylake server will break. Users must fix such scripts.
      
      Committer notes:
      
      Fixed this build error on centos:6 and debian:7:
      
        arch/x86/util/header.c: In function 'is_full_cpuid':
        arch/x86/util/header.c:82:39: error: declaration of 'cpuid' shadows a global declaration [-Werror=shadow]
        arch/x86/util/header.c:12:1: error: shadowed declaration is here [-Werror=shadow]
        arch/x86/util/header.c: In function 'strcmp_cpuid_str':
        arch/x86/util/header.c:98:56: error: declaration of 'cpuid' shadows a global declaration [-Werror=shadow]
        arch/x86/util/header.c:12:1: error: shadowed declaration is here [-Werror=shadow]
        cc1: all warnings being treated as errors
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20181114212416.15665-1-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3b54411a
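The two matching rules described in commit 3b54411a can be sketched with POSIX regular expressions. This is an illustration under stated assumptions, not the function from arch/x86/util/header.c: `has_stepping()` and `cpuid_matches()` are hypothetical names, and the sketch assumes mapfile patterns contain no '-' inside bracket expressions (for example `[01234]` rather than `[0-4]`), since dashes are counted to detect the format.

```c
#include <regex.h>
#include <stdbool.h>
#include <string.h>

/* "vendor-family-model-stepping" has three dashes; the old format has two.
 * Assumption: no '-' appears inside a bracket expression in the pattern. */
static bool has_stepping(const char *s)
{
	int dashes = 0;

	for (; *s; s++)
		if (*s == '-')
			dashes++;
	return dashes == 3;
}

/* Returns true when the machine cpuid string satisfies the mapfile pattern. */
static bool cpuid_matches(const char *pattern, const char *cpuid)
{
	regex_t re;
	regmatch_t m;
	size_t want;
	bool ok;

	/* A new-format pattern requires a stepping in the machine string. */
	if (has_stepping(pattern) && !has_stepping(cpuid))
		return false;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return false;
	ok = regexec(&re, cpuid, 1, &m, 0) == 0;
	regfree(&re);
	if (!ok)
		return false;

	/* An old-format pattern is compared only up to the stepping, if any. */
	want = strlen(cpuid);
	if (!has_stepping(pattern) && has_stepping(cpuid))
		want = (size_t)(strrchr(cpuid, '-') - cpuid);

	return m.rm_so == 0 && (size_t)m.rm_eo == want;
}
```

Under these assumptions, a machine string such as "GenuineIntel-6-55-5" matches both a stepping-aware pattern that lists stepping 5 and the old pattern "GenuineIntel-6-55", while a machine string without a stepping never matches a stepping-aware pattern.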
    • perf stat: Use perf_evsel__is_clocki() for clock events · eb08d006
      Ravi Bangoria committed
      We already have a function to check whether a given event is either
      SW_CPU_CLOCK or SW_TASK_CLOCK. Use it.
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: yuzhoujian@didichuxing.com
      Link: http://lkml.kernel.org/r/20181115095533.16930-1-ravi.bangoria@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eb08d006
    • perf pmu: Suppress potential format-truncation warning · 11a64a05
      Ben Hutchings committed
      Depending on which functions are inlined in util/pmu.c, the snprintf()
      calls in perf_pmu__parse_{scale,unit,per_pkg,snapshot}() might trigger a
      warning:
      
        util/pmu.c: In function 'pmu_aliases':
        util/pmu.c:178:31: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size between 0 and 4095 [-Werror=format-truncation=]
          snprintf(path, PATH_MAX, "%s/%s.unit", dir, name);
                                     ^~
      
      I found this when trying to build perf from Linux 3.16 with gcc 8.
      However I can reproduce the problem in mainline if I force
      __perf_pmu__new_alias() to be inlined.
      
      Suppress this by using scnprintf() as has been done elsewhere in perf.
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20181111184524.fux4taownc6ndbx6@decadent.org.uk
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      11a64a05
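For readers unfamiliar with scnprintf(), mentioned in commit 11a64a05, the sketch below shows how it differs from snprintf(): it reports the number of characters actually stored rather than the number that would have been written. `sketch_scnprintf()` is a hypothetical re-implementation for illustration, not the helper shipped in the perf tree, and returning 0 on a formatting error is an assumption of this sketch.

```c
#include <stdarg.h>
#include <stdio.h>

/* Like snprintf(), but returns the number of characters actually stored
 * in buf (excluding the terminating NUL), never more than size - 1. */
static int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0)		/* formatting error: report nothing written */
		return 0;
	return (size_t)n >= size ? (int)(size - 1) : n;
}
```

Because the return value can never exceed the buffer size, callers that accumulate offsets from it cannot run past the end of the buffer when output is truncated.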
    • perf tools: Add Hygon Dhyana support · 4787eff3
      Pu Wen committed
      The tool perf is useful for the performance analysis on the Hygon Dhyana
      platform. But right now there is no Hygon support for it to analyze the
      KVM guest os data. So add Hygon Dhyana support to it by checking vendor
      string to share the code path of AMD.
      Signed-off-by: Pu Wen <puwen@hygon.cn>
      Acked-by: Borislav Petkov <bp@suse.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1542008451-31735-1-git-send-email-puwen@hygon.cn
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4787eff3
    • perf bench: Add epoll_ctl(2) benchmark · 231457ec
      Davidlohr Bueso committed
      Benchmark the various operations allowed for epoll_ctl(2).  The idea is
      to concurrently stress a single epoll instance doing add/mod/del
      operations.
      
      Committer testing:
      
        # perf bench epoll ctl
        # Running 'epoll/ctl' benchmark:
        Run summary [PID 20344]: 4 threads doing epoll_ctl ops 64 file-descriptors for 8 secs.
      
        [thread  0] fdmap: 0x21a46b0 ... 0x21a47ac [ add: 1680960 ops; mod: 1680960 ops; del: 1680960 ops ]
        [thread  1] fdmap: 0x21a4960 ... 0x21a4a5c [ add: 1685440 ops; mod: 1685440 ops; del: 1685440 ops ]
        [thread  2] fdmap: 0x21a4c10 ... 0x21a4d0c [ add: 1674368 ops; mod: 1674368 ops; del: 1674368 ops ]
        [thread  3] fdmap: 0x21a4ec0 ... 0x21a4fbc [ add: 1677568 ops; mod: 1677568 ops; del: 1677568 ops ]
      
        Averaged 1679584 ADD operations (+- 0.14%)
        Averaged 1679584 MOD operations (+- 0.14%)
        Averaged 1679584 DEL operations (+- 0.14%)
        #
      
      Let's measure those calls with 'perf trace' to get a glimpse of what this
      benchmark is doing in terms of syscalls:
      
        # perf trace -m32768 -s perf bench epoll ctl
        # Running 'epoll/ctl' benchmark:
        Run summary [PID 20405]: 4 threads doing epoll_ctl ops 64 file-descriptors for 8 secs.
      
        [thread  0] fdmap: 0x21764e0 ... 0x21765dc [ add: 1100480 ops; mod: 1100480 ops; del: 1100480 ops ]
        [thread  1] fdmap: 0x2176790 ... 0x217688c [ add: 1250176 ops; mod: 1250176 ops; del: 1250176 ops ]
        [thread  2] fdmap: 0x2176a40 ... 0x2176b3c [ add: 1022464 ops; mod: 1022464 ops; del: 1022464 ops ]
        [thread  3] fdmap: 0x2176cf0 ... 0x2176dec [ add: 705472 ops; mod: 705472 ops; del: 705472 ops ]
      
        Averaged 1019648 ADD operations (+- 11.27%)
        Averaged 1019648 MOD operations (+- 11.27%)
        Averaged 1019648 DEL operations (+- 11.27%)
      
        Summary of events:
      
        epoll-ctl (20405), 1264 events, 0.0%
      
         syscall            calls    total       min       avg       max      stddev
                                     (msec)    (msec)    (msec)    (msec)        (%)
         --------------- -------- --------- --------- --------- ---------     ------
         eventfd2             256     9.514     0.001     0.037     5.243     68.00%
         clone                  4     1.245     0.204     0.311     0.531     24.13%
         mprotect              66     0.345     0.002     0.005     0.021      7.43%
         openat                45     0.313     0.004     0.007     0.073     21.93%
         mmap                  88     0.302     0.002     0.003     0.013      5.02%
         futex                  4     0.160     0.002     0.040     0.140     83.43%
         sched_setaffinity      4     0.124     0.005     0.031     0.070     49.39%
         read                  44     0.103     0.001     0.002     0.013     15.54%
         fstat                 40     0.052     0.001     0.001     0.003      5.43%
         close                 39     0.039     0.001     0.001     0.001      1.48%
         stat                   9     0.034     0.003     0.004     0.006      7.30%
         access                 3     0.023     0.007     0.008     0.008      4.25%
         open                   2     0.021     0.008     0.011     0.013     22.60%
         getdents               4     0.019     0.001     0.005     0.009     37.15%
         write                  2     0.013     0.004     0.007     0.009     38.48%
         munmap                 1     0.010     0.010     0.010     0.010      0.00%
         brk                    3     0.006     0.001     0.002     0.003     26.34%
         rt_sigprocmask         2     0.004     0.001     0.002     0.003     43.95%
         rt_sigaction           3     0.004     0.001     0.001     0.002     16.07%
         prlimit64              3     0.004     0.001     0.001     0.001      5.39%
         prctl                  1     0.003     0.003     0.003     0.003      0.00%
         epoll_create           1     0.003     0.003     0.003     0.003      0.00%
         lseek                  2     0.002     0.001     0.001     0.001     11.42%
         sched_getaffinity      1     0.002     0.002     0.002     0.002      0.00%
         arch_prctl             1     0.002     0.002     0.002     0.002      0.00%
         set_tid_address        1     0.001     0.001     0.001     0.001      0.00%
         getpid                 1     0.001     0.001     0.001     0.001      0.00%
         set_robust_list        1     0.001     0.001     0.001     0.001      0.00%
         execve                 1     0.000     0.000     0.000     0.000      0.00%
      
       epoll-ctl (20406), 1245480 events, 14.6%
      
         syscall            calls    total       min       avg       max      stddev
                                     (msec)    (msec)    (msec)    (msec)        (%)
         --------------- -------- --------- --------- --------- ---------     ------
         epoll_ctl         619511  1034.927     0.001     0.002     6.691      0.67%
         nanosleep           3226   616.114     0.006     0.191    10.376      7.57%
         futex                  2    11.336     0.002     5.668    11.334     99.97%
         set_robust_list        1     0.001     0.001     0.001     0.001      0.00%
         clone                  1     0.000     0.000     0.000     0.000      0.00%
      
       epoll-ctl (20407), 1243151 events, 14.5%
      
         syscall            calls    total       min       avg       max      stddev
                                     (msec)    (msec)    (msec)    (msec)        (%)
         --------------- -------- --------- --------- --------- ---------     ------
         epoll_ctl         618350  1042.181     0.001     0.002     2.512      0.40%
         nanosleep           3220   366.261     0.012     0.114    18.162      9.59%
         futex                  4     5.463     0.001     1.366     5.427     99.12%
         set_robust_list        1     0.002     0.002     0.002     0.002      0.00%
      
       epoll-ctl (20408), 1801690 events, 21.1%
      
         syscall            calls    total       min       avg       max      stddev
                                     (msec)    (msec)    (msec)    (msec)        (%)
         --------------- -------- --------- --------- --------- ---------     ------
         epoll_ctl         896174  1540.581     0.001     0.002     6.987      0.74%
         nanosleep           4667   783.393     0.006     0.168    10.419      7.10%
         futex                  2     4.682     0.002     2.341     4.681     99.93%
         set_robust_list        1     0.002     0.002     0.002     0.002      0.00%
         clone                  1     0.000     0.000     0.000     0.000      0.00%
      
       epoll-ctl (20409), 4254890 events, 49.8%
      
         syscall            calls    total       min       avg       max      stddev
                                     (msec)    (msec)    (msec)    (msec)        (%)
         --------------- -------- --------- --------- --------- ---------     ------
         epoll_ctl        2116416  3768.097     0.001     0.002     9.956      0.41%
         nanosleep          11023  1141.778     0.006     0.104     9.447      4.95%
         futex                  3     0.037     0.002     0.012     0.029     70.50%
         set_robust_list        1     0.008     0.008     0.008     0.008      0.00%
         madvise                1     0.005     0.005     0.005     0.005      0.00%
         clone                  1     0.000     0.000     0.000     0.000      0.00%
        #
      
      Committer notes:
      
      Fix build on fedora:24-x-ARC-uClibc, debian:experimental-x-mips,
      debian:experimental-x-mipsel, ubuntu:16.04-x-arm and ubuntu:16.04-x-powerpc
      
          CC       /tmp/build/perf/bench/epoll-ctl.o
        bench/epoll-ctl.c: In function 'init_fdmaps':
        bench/epoll-ctl.c:214:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
          for (i = 0; i < nfds; i+=inc) {
                        ^
        bench/epoll-ctl.c: In function 'bench_epoll_ctl':
        bench/epoll-ctl.c:377:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
          for (i = 0; i < nthreads; i++) {
                        ^
        bench/epoll-ctl.c:388:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
          for (i = 0; i < nthreads; i++) {
                        ^
        cc1: all warnings being treated as errors
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Jason Baron <jbaron@akamai.com>
      Link: http://lkml.kernel.org/r/20181106152226.20883-3-dave@stgolabs.net
      [ Use inttypes.h to print rlim_t fields, fixing the build on Alpine Linux / musl libc ]
      [ Check if eventfd() is available, i.e. if HAVE_EVENTFD is defined ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      231457ec
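The add/mod/del operations that commit 231457ec stresses can be sketched in a single thread as follows. This is a simplified illustration, not the benchmark's code: `epoll_ctl_cycles()` is a hypothetical helper, the 64-descriptor cap simply mirrors the run summary above, and the real benchmark runs several threads concurrently against one epoll instance.

```c
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Run `cycles` rounds of ADD, MOD and DEL on `nfds` eventfd descriptors
 * against one epoll instance. Returns the number of successful
 * epoll_ctl() operations, or -1 on any failure. */
static long epoll_ctl_cycles(int nfds, int cycles)
{
	struct epoll_event ev = { .events = EPOLLIN };
	int fds[64];
	long ops = -1;
	int epfd, i, c, opened = 0;

	if (nfds < 1 || nfds > 64)
		return -1;

	epfd = epoll_create1(0);
	if (epfd < 0)
		return -1;

	for (i = 0; i < nfds; i++) {
		fds[i] = eventfd(0, EFD_NONBLOCK);
		if (fds[i] < 0)
			goto out;
		opened++;
	}

	ops = 0;
	for (c = 0; c < cycles; c++) {
		for (i = 0; i < nfds; i++) {
			ev.data.fd = fds[i];
			if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) ||
			    epoll_ctl(epfd, EPOLL_CTL_MOD, fds[i], &ev) ||
			    epoll_ctl(epfd, EPOLL_CTL_DEL, fds[i], NULL)) {
				ops = -1;
				goto out;
			}
			ops += 3;
		}
	}
out:
	for (i = 0; i < opened; i++)
		close(fds[i]);
	close(epfd);
	return ops;
}
```

Each descriptor is removed again within the same iteration, so the interest list stays small and the loop mostly measures the cost of the epoll_ctl() calls themselves.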
    • perf bench: Add epoll parallel epoll_wait benchmark · 121dd9ea
      Davidlohr Bueso committed
      This program benchmarks concurrent epoll_wait(2) for file descriptors
      that are monitored with EPOLLIN along various semantics, by a
      single epoll instance. Such conditions can be found when using
      single/combined or multiple queuing when load balancing.
      
      Each thread has a number of private, nonblocking file descriptors,
      referred to as fdmap. A writer thread will constantly be writing to the
      fdmaps of all threads, minimizing each thread's chances of epoll_wait
      not finding any ready read events and blocking, as this is not what we
      want to stress. Full details are at the start of the C file.
      
      Committer testing:
      
        # perf bench
        Usage:
      	perf bench [<common options>] <collection> <benchmark> [<options>]
      
              # List of all available benchmark collections:
      
               sched: Scheduler and IPC benchmarks
                 mem: Memory access benchmarks
                numa: NUMA scheduling and MM benchmarks
               futex: Futex stressing benchmarks
               epoll: Epoll stressing benchmarks
                 all: All benchmarks
      
        # perf bench epoll
      
              # List of available benchmarks for collection 'epoll':
      
                wait: Benchmark epoll concurrent epoll_waits
                 all: Run all futex benchmarks
      
        # perf bench epoll wait
        # Running 'epoll/wait' benchmark:
        Run summary [PID 19295]: 3 threads monitoring on 64 file-descriptors for 8 secs.
      
        [thread  0] fdmap: 0xdaa650 ... 0xdaa74c [ 328241 ops/sec ]
        [thread  1] fdmap: 0xdaa900 ... 0xdaa9fc [ 351695 ops/sec ]
        [thread  2] fdmap: 0xdaabb0 ... 0xdaacac [ 381423 ops/sec ]
      
        Averaged 353786 operations/sec (+- 4.35%), total secs = 8
        #
      
      Committer notes:
      
      Fix the build on debian:experimental-x-mips, debian:experimental-x-mipsel
      and others:
      
          CC       /tmp/build/perf/bench/epoll-wait.o
        bench/epoll-wait.c: In function 'writerfn':
        bench/epoll-wait.c:399:12: error: format '%ld' expects argument of type 'long int', but argument 2 has type 'size_t' {aka 'unsigned int'} [-Werror=format=]
          printinfo("exiting writer-thread (total full-loops: %ld)\n", iter);
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~
        bench/epoll-wait.c:86:31: note: in definition of macro 'printinfo'
          do { if (__verbose) { printf(fmt, ## arg); fflush(stdout); } } while (0)
                                       ^~~
        cc1: all warnings being treated as errors
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Jason Baron <jbaron@akamai.com>
      Link: http://lkml.kernel.org/r/20181106152226.20883-2-dave@stgolabs.net
      Link: http://lkml.kernel.org/r/20181106182349.thdkpvshkna5vd7o@linux-r8p5
      [ Applied above fixup as per Davidlohr's request ]
      [ Use inttypes.h to print rlim_t fields, fixing the build on Alpine Linux / musl libc ]
      [ Check if eventfd() is available, i.e. if HAVE_EVENTFD is defined ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      121dd9ea
    • tools build feature: Check if eventfd() is available · 11c6cbe7
      Arnaldo Carvalho de Melo committed
      A new 'perf bench epoll' will use this, and to disable it for older
      systems, add a feature test for this API.
      
      This is just a simple program that, if successfully compiled, means that
      the feature is present, at least at the library level. In a build that
      sets the output directory to /tmp/build/perf (using O=/tmp/build/perf),
      we end up with:
        $ ls -la /tmp/build/perf/feature/test-eventfd*
        -rwxrwxr-x. 1 acme acme 8176 Nov 21 15:58 /tmp/build/perf/feature/test-eventfd.bin
        -rw-rw-r--. 1 acme acme  588 Nov 21 15:58 /tmp/build/perf/feature/test-eventfd.d
        -rw-rw-r--. 1 acme acme    0 Nov 21 15:58 /tmp/build/perf/feature/test-eventfd.make.output
        $ ldd /tmp/build/perf/feature/test-eventfd.bin
      	  linux-vdso.so.1 (0x00007fff3bf3f000)
      	  libc.so.6 => /lib64/libc.so.6 (0x00007fa984061000)
      	  /lib64/ld-linux-x86-64.so.2 (0x00007fa984417000)
        $ grep eventfd -A 2 -B 2 /tmp/build/perf/FEATURE-DUMP
        feature-dwarf=1
        feature-dwarf_getlocations=1
        feature-eventfd=1
        feature-fortify-source=1
        feature-sync-compare-and-swap=1
        $
      
      The main thing here is that in the end we'll have -DHAVE_EVENTFD in
      CFLAGS, and then the 'perf bench' entry needing that API can be
      selectively pruned.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-wkeldwob7dpx6jvtuzl8164k@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      11c6cbe7
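How code might be gated on the -DHAVE_EVENTFD flag from commit 11c6cbe7 can be sketched as below. This is an assumption-laden illustration, not the feature test or the perf bench source: `epoll_bench_usable()` is a hypothetical helper, and HAVE_EVENTFD is defined inside the sketch only so that it compiles standalone, whereas in a real build the macro would come from CFLAGS.

```c
/* Normally supplied by the build as -DHAVE_EVENTFD; defined here only so
 * that the sketch compiles standalone on a system that has eventfd(). */
#ifndef HAVE_EVENTFD
#define HAVE_EVENTFD 1
#endif

#ifdef HAVE_EVENTFD
#include <sys/eventfd.h>
#include <unistd.h>
#endif

/* Returns 1 when eventfd support was compiled in and the call works at
 * run time, 0 otherwise. */
static int epoll_bench_usable(void)
{
#ifdef HAVE_EVENTFD
	int fd = eventfd(0, EFD_NONBLOCK);

	if (fd < 0)
		return 0;
	close(fd);
	return 1;
#else
	/* Built without eventfd(): the benchmark entry is pruned. */
	return 0;
#endif
}
```

The build-time feature test only has to compile and link; a run-time probe like this one is an extra step added here for illustration.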
  3. 21 Nov, 2018 19 commits
    • perf bench: Move HAVE_PTHREAD_ATTR_SETAFFINITY_NP into bench.h · d47d77c3
      Davidlohr Bueso committed
      Both futex and epoll need this call, and it can cause build failures on
      systems that don't have pthread_attr_setaffinity_np().
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Jason Baron <jbaron@akamai.com>
      Link: http://lkml.kernel.org/r/20181109210719.pr7ohayuwqmfp2wl@linux-r8p5
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d47d77c3
    • perf script: Share code and output format for uregs and iregs output · 9add8fe8
      Milian Wolff committed
      The iregs output was missing the newline at end as well as the leading
      ABI output. This made it hard to compare the iregs and uregs values.
      Instead, use a single function to output the register values and use it
      for both, iregs and uregs, to ensure the output is consistent.
      
      Before:
      
        perf  7049 [-01]  1343.354347:          1 cycles:ppp:
              ffffffffa7bc21ce perf_event_exec+0x18e (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7ead3 setup_new_exec+0xf3 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7cd7be5 load_elf_binary+0x395 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7e540 search_binary_handler+0x80 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7f1aa __do_execve_file.isra.13+0x58a (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7f561 do_execve+0x21 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7f596 __x64_sys_execve+0x26 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7a041cb do_syscall_64+0x5b (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa840008c entry_SYSCALL_64+0x7c (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
          AX:0x80000000    BX:0x0    CX:0x0    DX:0x7    SI:0xf    DI:0x286    BP:0xffff95bc8213a460    SP:0xffffacbf0ba97d18    IP:0xffffffffa7bc21cd FLAGS:0x28e    CS:0x10    SS:0x18    R8:0x2    R9:0x21440   R10:0x33816fb3b8c   R11:0x1   R12:0xffff95bc8213a460   R13:0xffff95bc8213a400   R14:0xffff95bc8213a400   R15:0x1  ABI:2    AX:0xffffffffffffffda    BX:0xffffffffffffffff    CX:0x7f84ad85798b    DX:0x560209699d50    SI:0x7ffe2c7a6820    DI:0x7ffe2c7a8c9b    BP:0x7ffe2c7a20d0    SP:0x7ffe2c7a2058    IP:0x7f84ad85798b FLAGS:0x206    CS:0x33    SS:0x2b    R8:0x7ffe2c7a2030    R9:0x7f84ae55f010   R10:0x8   R11:0x206   R12:0xffffffffffffffff   R13:0xffffffffffffffff   R14:0xffffffffffffffff   R15:0xffffffffffffffff
      
        perf  7049 [-01]  1343.354363:          1 cycles:ppp:
              ...
      
      After:
      
        perf  7049 [-01]  1343.354347:          1 cycles:ppp:
              ffffffffa7bc21ce perf_event_exec+0x18e (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7ead3 setup_new_exec+0xf3 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7cd7be5 load_elf_binary+0x395 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7e540 search_binary_handler+0x80 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7f1aa __do_execve_file.isra.13+0x58a (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7f561 do_execve+0x21 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7f596 __x64_sys_execve+0x26 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7a041cb do_syscall_64+0x5b (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa840008c entry_SYSCALL_64+0x7c (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
          ABI:2    AX:0x80000000    BX:0x0    CX:0x0    DX:0x7    SI:0xf    DI:0x286    BP:0xffff95bc8213a460    SP:0xffffacbf0ba97d18    IP:0xffffffffa7bc21cd FLAGS:0x28e    CS:0x10    SS:0x18    R8:0x2    R9:0x21440   R10:0x33816fb3b8c   R11:0x1   R12:0xffff95bc8213a460   R13:0xffff95bc8213a400   R14:0xffff95bc8213a400   R15:0x1
          ABI:2    AX:0xffffffffffffffda    BX:0xffffffffffffffff    CX:0x7f84ad85798b    DX:0x560209699d50    SI:0x7ffe2c7a6820    DI:0x7ffe2c7a8c9b    BP:0x7ffe2c7a20d0    SP:0x7ffe2c7a2058    IP:0x7f84ad85798b FLAGS:0x206    CS:0x33    SS:0x2b    R8:0x7ffe2c7a2030    R9:0x7f84ae55f010   R10:0x8   R11:0x206   R12:0xffffffffffffffff   R13:0xffffffffffffffff   R14:0xffffffffffffffff   R15:0xffffffffffffffff
      
        perf  7049 [-01]  1343.354363:          1 cycles:ppp:
              ...
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20181107223437.9071-1-milian.wolff@kdab.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9add8fe8
    • perf bpf: Reduce the hardcoded .max_entries for pid_maps · 0f7c2de5
      Arnaldo Carvalho de Melo committed
      While working on augmented syscalls I ran into this error:
      
        # trace -vv --filter-pids 2469,1663 -e tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
        <SNIP>
        libbpf: map 0 is "__augmented_syscalls__"
        libbpf: map 1 is "__bpf_stdout__"
        libbpf: map 2 is "pids_filtered"
        libbpf: map 3 is "syscalls"
        libbpf: collecting relocating info for: '.text'
        libbpf: relo for 13 value 84 name 133
        libbpf: relocation: insn_idx=3
        libbpf: relocation: find map 3 (pids_filtered) for insn 3
        libbpf: collecting relocating info for: 'raw_syscalls:sys_enter'
        libbpf: relo for 8 value 0 name 0
        libbpf: relocation: insn_idx=1
        libbpf: relo for 8 value 0 name 0
        libbpf: relocation: insn_idx=3
        libbpf: relo for 9 value 28 name 178
        libbpf: relocation: insn_idx=36
        libbpf: relocation: find map 1 (__augmented_syscalls__) for insn 36
        libbpf: collecting relocating info for: 'raw_syscalls:sys_exit'
        libbpf: relo for 8 value 0 name 0
        libbpf: relocation: insn_idx=0
        libbpf: relo for 8 value 0 name 0
        libbpf: relocation: insn_idx=2
        bpf: config program 'raw_syscalls:sys_enter'
        bpf: config program 'raw_syscalls:sys_exit'
        libbpf: create map __bpf_stdout__: fd=3
        libbpf: create map __augmented_syscalls__: fd=4
        libbpf: create map syscalls: fd=5
        libbpf: create map pids_filtered: fd=6
        libbpf: added 13 insn from .text to prog raw_syscalls:sys_enter
        libbpf: added 13 insn from .text to prog raw_syscalls:sys_exit
        libbpf: load bpf program failed: Operation not permitted
        libbpf: failed to load program 'raw_syscalls:sys_exit'
        libbpf: failed to load object 'tools/perf/examples/bpf/augmented_raw_syscalls.c'
        bpf: load objects failed: err=-4009: (Incorrect kernel version)
        event syntax error: 'tools/perf/examples/bpf/augmented_raw_syscalls.c'
                             \___ Failed to load program for unknown reason
      
        (add -v to see detail)
        Run 'perf list' for a list of valid events
      
         Usage: perf trace [<options>] [<command>]
            or: perf trace [<options>] -- <command> [<options>]
            or: perf trace record [<options>] [<command>]
            or: perf trace record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event/syscall selector. use 'perf list' to list available events
      
      If I then try to use strace (perf trace'ing 'perf trace' needs some more work
      before it's possible) to get a bit more info I get:
      
        # strace -e bpf trace --filter-pids 2469,1663 -e tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=4, map_flags=0, inner_map_fd=0, map_name="__bpf_stdout__", map_ifindex=0}, 72) = 3
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=4, map_flags=0, inner_map_fd=0, map_name="__augmented_sys", map_ifindex=0}, 72) = 4
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=1, max_entries=500, map_flags=0, inner_map_fd=0, map_name="syscalls", map_ifindex=0}, 72) = 5
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=512, map_flags=0, inner_map_fd=0, map_name="pids_filtered", map_ifindex=0}, 72) = 6
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=57, insns=0x1223f50, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(4, 18, 10), prog_flags=0, prog_name="sys_enter", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = 7
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=18, insns=0x1224120, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(4, 18, 10), prog_flags=0, prog_name="sys_exit", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = -1 EPERM (Operation not permitted)
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=18, insns=0x1224120, license="GPL", log_level=1, log_size=262144, log_buf="", kern_version=KERNEL_VERSION(4, 18, 10), prog_flags=0, prog_name="sys_exit", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = -1 EPERM (Operation not permitted)
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=18, insns=0x1224120, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(4, 18, 10), prog_flags=0, prog_name="sys_exit", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = -1 EPERM (Operation not permitted)
        event syntax error: 'tools/perf/examples/bpf/augmented_raw_syscalls.c'
                             \___ Failed to load program for unknown reason
        <SNIP similar output as without 'strace'>
        #
      
      I managed to create the maps, etc, but then installing the "sys_exit" hook into
      the "raw_syscalls:sys_exit" tracepoint somehow gets -EPERMed...
      
      I then go and try reducing the size of this new table:
      
        +++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
        @@ -47,6 +47,17 @@ struct augmented_filename {
         #define SYS_OPEN 2
         #define SYS_OPENAT 257
      
        +struct syscall {
        +       bool    filtered;
        +};
        +
        +struct bpf_map SEC("maps") syscalls = {
        +       .type        = BPF_MAP_TYPE_ARRAY,
        +       .key_size    = sizeof(int),
        +       .value_size  = sizeof(struct syscall),
        +       .max_entries = 500,
        +};
      
      And after reducing that .max_entries a tad, it works. So the "unknown
      reason" is probably related to the number of bytes all these maps take.
      Reduce the default .max_entries for pid_map()s so that we can have a
      "syscalls" map with enough slots for all syscalls in most arches. Also
      note that this error message needs improving :-)
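
      A rough way to reason about the number of bytes these maps take is to
      add up (key_size + value_size) * max_entries for each map seen in the
      strace output above. The sketch below does that and also reads the
      locked-memory limit. That RLIMIT_MEMLOCK is the budget being exceeded
      here is an assumption, not something this commit establishes, and
      struct map_shape and the function names are made up for illustration.
      The sums are only a lower bound, since the kernel adds per-map and
      per-element overhead.

```c
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical description of one map, mirroring the fields visible in
 * the BPF_MAP_CREATE lines of the strace output. */
struct map_shape {
	size_t key_size;
	size_t value_size;
	size_t max_entries;
};

/* Lower bound on the bytes one map needs: keys plus values only. */
size_t map_payload_bytes(const struct map_shape *m)
{
	return (m->key_size + m->value_size) * m->max_entries;
}

/* Sum the payload lower bound over a set of maps. */
size_t maps_payload_bytes(const struct map_shape *maps, size_t nr)
{
	size_t total = 0;

	for (size_t i = 0; i < nr; i++)
		total += map_payload_bytes(&maps[i]);
	return total;
}

/* Assumption: the locked-memory limit is the relevant budget.
 * Returns the soft limit in bytes, or 0 if it cannot be read. */
unsigned long long memlock_soft_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return 0;
	return (unsigned long long)rl.rlim_cur;
}
```

      Comparing maps_payload_bytes() for the four maps against
      memlock_soft_limit() only gives a first hint; a real check would need
      the kernel's own accounting.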
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Edward Cree <ecree@solarflare.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: https://lkml.kernel.org/n/tip-yjzhak8asumz9e9hts2dgplp@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0f7c2de5
    • M
      perf script: Add newline after uregs output · b07d16f7
      Milian Wolff committed
      This change makes it much easier to distinguish between consecutive
      samples by keeping the empty line between them, like we see when we do
      not enable uregs output.
      
      Before:
      
        cpp-inlining 28298 [-01] 54837.342780:    3068085 cycles:pp:
                    7ffff7c96709 __hypot_finite+0xa9 (/usr/lib/libm-2.28.so)
                    ...
         ABI:2    AX:0x0    BX:0x40f56cf6    CX:0x294a3ae7    ...
        cpp-inlining 28298 [-01] 54837.344493:    2881929 cycles:pp:
                    7ffff7c96696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so)
                    ...
         ABI:2    AX:0x40d440c7    BX:0x40d440c7    CX:0x4d45e5da    ...
      
      After:
      
        cpp-inlining 28298 [-01] 54837.342780:    3068085 cycles:pp:
                    7ffff7c96709 __hypot_finite+0xa9 (/usr/lib/libm-2.28.so)
                    ...
         ABI:2    AX:0x0    BX:0x40f56cf6    CX:0x294a3ae7    ...
      
        cpp-inlining 28298 [-01] 54837.344493:    2881929 cycles:pp:
                    7ffff7c96696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so)
                    ...
         ABI:2    AX:0x40d440c7    BX:0x40d440c7    CX:0x4d45e5da    ...
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20181107093705.16346-1-milian.wolff@kdab.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b07d16f7
    • A
      Revert "perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid filter" · 4aa792de
      Arnaldo Carvalho de Melo committed
      Now that we have the "filtered_pids" logic in place, no need to do this
      rough filter to avoid the feedback loop from 'perf trace's own syscalls,
      revert it.
      
      This reverts commit 7ed71f124284359676b6496ae7db724fee9da753.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-88vh02cnkam0vv5f9vp02o3h@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4aa792de
    • A
      perf augmented_syscalls: Remove example hardcoded set of filtered pids · e312747b
      Arnaldo Carvalho de Melo committed
      Now that 'perf trace' fills in that "filtered_pids" BPF map, remove the
      set of filtered pids used as an example to test that feature.
      
      That feature works like this:
      
      Starting a system wide, 'strace'-like 'perf trace' augmented session, we
      noticed lots of events for one pid, which turns out to be the feedback
      loop of 'perf trace's own output being processed by the 'gnome-terminal'
      process:
      
        # perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c
           0.391 ( 0.002 ms): gnome-terminal/2469 read(fd: 17</dev/ptmx>, buf: 0x564b79f750bc, count: 8176) = 453
           0.394 ( 0.001 ms): gnome-terminal/2469 read(fd: 17</dev/ptmx>, buf: 0x564b79f75280, count: 7724) = -1 EAGAIN Resource temporarily unavailable
           0.438 ( 0.001 ms): gnome-terminal/2469 read(fd: 4<anon_inode:[eventfd]>, buf: 0x7fffc696aeb0, count: 16) = 8
           0.519 ( 0.001 ms): gnome-terminal/2469 read(fd: 17</dev/ptmx>, buf: 0x564b79f75280, count: 7724) = 114
           0.522 ( 0.001 ms): gnome-terminal/2469 read(fd: 17</dev/ptmx>, buf: 0x564b79f752f1, count: 7611) = -1 EAGAIN Resource temporarily unavailable
        ^C
      
      So we can use --filter-pids to get rid of that one. In this case the
      functionality is implemented with the "filtered_pids" BPF map that
      tools/perf/examples/bpf/augmented_raw_syscalls.c creates: the 'perf trace'
      bpf loader notices it and creates an associated "struct bpf_map", which
      'perf trace' then populates:
      
        # perf trace --filter-pids 2469 -e tools/perf/examples/bpf/augmented_raw_syscalls.c
           0.020 ( 0.002 ms): gnome-shell/1663 epoll_pwait(epfd: 12<anon_inode:[eventpoll]>, events: 0x7ffd8f3ef960, maxevents: 32, sigsetsize: 8) = 1
           0.025 ( 0.002 ms): gnome-shell/1663 read(fd: 24</dev/input/event4>, buf: 0x560c01bb8240, count: 8112) = 48
           0.029 ( 0.001 ms): gnome-shell/1663 read(fd: 24</dev/input/event4>, buf: 0x560c01bb8258, count: 8088) = -1 EAGAIN Resource temporarily unavailable
           0.032 ( 0.001 ms): gnome-shell/1663 read(fd: 24</dev/input/event4>, buf: 0x560c01bb8240, count: 8112) = -1 EAGAIN Resource temporarily unavailable
           0.040 ( 0.003 ms): gnome-shell/1663 recvmsg(fd: 46<socket:[35893]>, msg: 0x7ffd8f3ef950) = -1 EAGAIN Resource temporarily unavailable
          21.529 ( 0.002 ms): gnome-shell/1663 epoll_pwait(epfd: 5<anon_inode:[eventpoll]>, events: 0x7ffd8f3ef960, maxevents: 32, sigsetsize: 8) = 1
          21.533 ( 0.004 ms): gnome-shell/1663 recvmsg(fd: 82<socket:[42826]>, msg: 0x7ffd8f3ef7b0, flags: DONTWAIT|CMSG_CLOEXEC) = 236
          21.581 ( 0.006 ms): gnome-shell/1663 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_BUSY, arg: 0x7ffd8f3ef060) = 0
          21.605 ( 0.020 ms): gnome-shell/1663 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_CREATE, arg: 0x7ffd8f3eeea0) = 0
          21.626 ( 0.119 ms): gnome-shell/1663 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_SET_DOMAIN, arg: 0x7ffd8f3eee94) = 0
          21.746 ( 0.081 ms): gnome-shell/1663 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_PWRITE, arg: 0x7ffd8f3eeea0) = 0
        ^C
      
      Oops, yet another gnome process that is involved with the output that
      'perf trace' generates, let's filter that out too:
      
        # perf trace --filter-pids 2469,1663 -e tools/perf/examples/bpf/augmented_raw_syscalls.c
               ? (         ): wpa_supplicant/1366  ... [continued]: select()) = 0 Timeout
           0.006 ( 0.002 ms): wpa_supplicant/1366 clock_gettime(which_clock: BOOTTIME, tp: 0x7fffe5b1e430) = 0
           0.011 ( 0.001 ms): wpa_supplicant/1366 clock_gettime(which_clock: BOOTTIME, tp: 0x7fffe5b1e3e0) = 0
           0.014 ( 0.001 ms): wpa_supplicant/1366 clock_gettime(which_clock: BOOTTIME, tp: 0x7fffe5b1e430) = 0
               ? (         ): gmain/1791  ... [continued]: poll()) = 0 Timeout
           0.017 (         ): wpa_supplicant/1366 select(n: 6, inp: 0x55646fed3ad0, outp: 0x55646fed3b60, exp: 0x55646fed3bf0, tvp: 0x7fffe5b1e4a0) ...
         157.879 ( 0.019 ms): gmain/1791 inotify_add_watch(fd: 8<anon_inode:inotify>, pathname: , mask: 16789454) = -1 ENOENT No such file or directory
               ? (         ): cupsd/1001  ... [continued]: epoll_pwait()) = 0
               ? (         ): gsd-color/1908  ... [continued]: poll()) = 0 Timeout
         499.615 (         ): cupsd/1001 epoll_pwait(epfd: 4<anon_inode:[eventpoll]>, events: 0x557a21166500, maxevents: 4096, timeout: 1000, sigsetsize: 8) ...
         586.593 ( 0.004 ms): gsd-color/1908 recvmsg(fd: 3<socket:[38074]>, msg: 0x7ffdef34e800) = -1 EAGAIN Resource temporarily unavailable
               ? (         ): fwupd/2230  ... [continued]: poll()) = 0 Timeout
               ? (         ): rtkit-daemon/906  ... [continued]: poll()) = 0 Timeout
               ? (         ): rtkit-daemon/907  ... [continued]: poll()) = 1
         724.603 ( 0.007 ms): rtkit-daemon/907 read(fd: 6<anon_inode:[eventfd]>, buf: 0x7f05ff768d08, count: 8) = 8
               ? (         ): ssh/5461  ... [continued]: select()) = 1
         810.431 ( 0.002 ms): ssh/5461 clock_gettime(which_clock: BOOTTIME, tp: 0x7ffd7f39f870) = 0
         ^C
      
      Several syscall exit events for syscalls in flight when 'perf trace' started, etc. Saner :-)
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-c3tu5yg204p5mvr9kvwew07n@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e312747b
    • A
      perf trace: Fill in BPF "filtered_pids" map when present · a9964c43
      Arnaldo Carvalho de Melo committed
      This makes the augmented_syscalls support the --filter-pids and
      auto-filtered feedback loop pids just like when working without BPF,
      i.e. with just raw_syscalls:sys_{enter,exit} and tracepoint filters.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-zc5n453sxxm0tz1zfwwelyti@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a9964c43
    • A
      perf trace: See if there is a map named "filtered_pids" · 744fafc7
      Arnaldo Carvalho de Melo committed
      Look up the first map named "filtered_pids" and, if augmenting
      syscalls, i.e. if a BPF event is present and the
      "__augmented_syscalls__" map is present, then fill in that map with the
      pids to filter, be it feedback loop ones (perf trace's pid, its father if
      it is "sshd", more auto-filtered in the future) or the ones explicitly
      stated in the tool command line via --filter-pids.
      
      The code to actually fill in the map comes next.
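
      As an illustration of the user space side, the sketch below turns a
      --filter-pids style argument such as "2469,1663" into an array of pids
      that could then be written into the map. It is not the 'perf trace'
      code: parse_pid_list() is a hypothetical helper, and the feedback loop
      pids (perf trace's own pid, an "sshd" parent) would have to be appended
      by the caller.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

/* Parse a comma-separated pid list such as "2469,1663" into out[].
 * Returns the number of pids stored, or -1 on malformed input or when
 * the list does not fit in 'max' entries. */
int parse_pid_list(const char *csv, pid_t *out, int max)
{
	int n = 0;

	while (*csv != '\0') {
		char *end;
		long v;

		errno = 0;
		v = strtol(csv, &end, 10);
		if (end == csv || errno != 0 || v <= 0)
			return -1;	/* not a number, out of range, or not a valid pid */
		if (n == max)
			return -1;	/* caller's array is full */
		out[n++] = (pid_t)v;

		if (*end == ',')
			end++;		/* skip the separator */
		else if (*end != '\0')
			return -1;	/* unexpected character after the number */
		csv = end;
	}
	return n;
}
```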
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-rhzytmw7qpe6lqyjxi1ded9t@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      744fafc7
    • A
      perf trace: Add "_from_option" suffix to trace__set_filter() · 6a0b3aba
      Arnaldo Carvalho de Melo committed
      As we'll need that name for a new function to set filters for both
      tracepoints and BPF maps for filtering pids.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-mdkck6hf3fnd21rz2766280q@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6a0b3aba
    • A
      perf evlist: Rename perf_evlist__set_filter* to perf_evlist__set_tp_filter* · 7ad92a33
      Arnaldo Carvalho de Melo committed
      To better reflect that this is a tracepoint filter, as opposed, for
      instance, to map based BPF filters.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-9138svli6ddcphrr3ymy9oy3@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7ad92a33
    • A
      perf augmented_syscalls: Use pid_filter · ed9a77ba
      Arnaldo Carvalho de Melo committed
      Just to test filtering a bunch of pids. Now it's time to go and get that
      hooked up in 'perf trace': right after we load the bpf program, if we
      find a "pids_filtered" map defined, we'll populate it with the filtered
      pids.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-1i9s27wqqdhafk3fappow84x@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ed9a77ba
    • A
      perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid filter · 77ecb640
      Arnaldo Carvalho de Melo committed
      When testing system wide tracing without filtering the syscalls called
      by 'perf trace' itself we get into a feedback loop. For now, drop those
      two syscalls, which are the ones 'perf trace' makes in its loop for
      writing the syscalls it intercepts, to help with testing until we get
      that filtering in place.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-rkbu536af66dbsfx51sr8yof@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      77ecb640
    • A
      perf bpf: Add simple pid_filter class accessible to BPF proggies · 8008aab0
      Arnaldo Carvalho de Melo committed
      Will be used in the augmented_raw_syscalls.c to implement 'perf trace
      --filter-pids'.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-9sybmz4vchlbpqwx2am13h9e@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8008aab0
    • A
      perf bpf: Add defines for map insertion/lookup · 382b55db
      Arnaldo Carvalho de Melo committed
      Starting with a helper for a basic pid_map(), a hash using a pid as a
      key.
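
      To make "a hash using a pid as a key" concrete, here is a plain user
      space analogue: a small fixed-size table keyed by pid with insert and
      lookup operations. It only illustrates the data structure; it is not
      the pid_map() define added by this commit, and struct pid_set, its
      functions and the capacity below are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define PID_SET_BITS  6				/* hypothetical capacity: 64 slots */
#define PID_SET_SLOTS (1u << PID_SET_BITS)

struct pid_set {
	pid_t keys[PID_SET_SLOTS];	/* 0 marks an empty slot; pid 0 is never stored */
	bool  vals[PID_SET_SLOTS];
};

void pid_set_init(struct pid_set *s)
{
	memset(s, 0, sizeof(*s));
}

/* Multiplicative hash, keeping the top PID_SET_BITS bits. */
static uint32_t pid_set_slot(pid_t pid)
{
	return ((uint32_t)pid * 2654435761u) >> (32 - PID_SET_BITS);
}

/* Insert or update 'pid'. Returns 0 on success, -1 if the pid is invalid
 * or the table is full. */
int pid_set_insert(struct pid_set *s, pid_t pid, bool val)
{
	uint32_t i;

	if (pid <= 0)
		return -1;
	i = pid_set_slot(pid);
	for (uint32_t probes = 0; probes < PID_SET_SLOTS; probes++) {
		if (s->keys[i] == 0 || s->keys[i] == pid) {
			s->keys[i] = pid;
			s->vals[i] = val;
			return 0;
		}
		i = (i + 1) & (PID_SET_SLOTS - 1);	/* linear probing */
	}
	return -1;
}

/* Returns a pointer to the stored value, or NULL if 'pid' is absent. */
bool *pid_set_lookup(struct pid_set *s, pid_t pid)
{
	uint32_t i;

	if (pid <= 0)
		return NULL;
	i = pid_set_slot(pid);
	for (uint32_t probes = 0; probes < PID_SET_SLOTS; probes++) {
		if (s->keys[i] == pid)
			return &s->vals[i];
		if (s->keys[i] == 0)
			return NULL;	/* hit an empty slot: not present */
		i = (i + 1) & (PID_SET_SLOTS - 1);
	}
	return NULL;
}
```

      A filter would then skip an event when pid_set_lookup() returns a
      non-NULL pointer for the current pid.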
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-gdwvq53wltvq6b3g5tdmh0cw@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      382b55db
    • A
      perf augmented_syscalls: Remove needless linux/socket.h include · 66067538
      Arnaldo Carvalho de Melo committed
      Leftover from when we started augmented_raw_syscalls.c from
      tools/perf/examples/bpf/augmented_syscalls.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: e58a0322dbac ("perf examples bpf: Start augmenting raw_syscalls:sys_{start,exit}")
      Link: https://lkml.kernel.org/n/tip-pmts9ls2skh8n3zisb4txudd@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      66067538
    • A
      perf augmented_syscalls: Filter on a hard coded pid · 55f127b4
      Arnaldo Carvalho de Melo committed
      Just to show where we'll hook pid based filters, and what we use to
      obtain the current pid, using a BPF getpid() equivalent.
      
      Now we need to replace that hardcoded PID with a BPF hash map, so that we
      start by filtering 'perf trace's own PID, implement the --filter-pids
      functionality, etc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-oshrcgcekiyhd0whwisxfvtv@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      55f127b4
    • A
      perf bpf: Add unistd.h to the headers accessible to bpf proggies · 1475d35c
      Arnaldo Carvalho de Melo committed
      Start with a getpid() function wrapping BPF_FUNC_get_current_pid_tgid;
      the idea is to mimic the system headers.
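
      For reference, the BPF helper behind BPF_FUNC_get_current_pid_tgid
      returns a single 64-bit value with the thread group id in the upper 32
      bits and the kernel pid (the thread id, in user space terms) in the
      lower 32 bits. The sketch below just splits such a value in plain C;
      the function names are made up, and which half the getpid() wrapper in
      this commit returns is not stated here.

```c
#include <stdint.h>

/* Upper 32 bits: thread group id, what user space calls the process id. */
uint32_t tgid_from_pid_tgid(uint64_t pid_tgid)
{
	return (uint32_t)(pid_tgid >> 32);
}

/* Lower 32 bits: kernel pid, what user space calls the thread id. */
uint32_t pid_from_pid_tgid(uint64_t pid_tgid)
{
	return (uint32_t)pid_tgid;
}
```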
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-zo8hv22onidep7tm785dzxfk@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1475d35c
    • I
      Merge tag 'perf-urgent-for-mingo-4.20-20181121' of... · b1a9d7b0
      Ingo Molnar committed
      Merge tag 'perf-urgent-for-mingo-4.20-20181121' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes:
      
      - Update kernel ABI headers, one of them led to a small change in
        the ioctl 'cmd' beautifier in 'perf trace' to support the new ISO7816
        commands. (Arnaldo Carvalho de Melo)
      
      - Restore proper cwd on return from mnt namespace (Jiri Olsa)
      
      - Add feature check for the get_current_dir_name() function used in the
        namespace fix from Jiri, which is not available in systems such as
        Alpine Linux, which uses the musl libc (Arnaldo Carvalho de Melo)
      
      - Fix crash in 'perf record' when synthesizing the unit for events such
        as 'cpu-clock' (Jiri Olsa)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b1a9d7b0
    • P
      perf/x86/intel: Fix regression by default disabling perfmon v4 interrupt handling · 2a5bf23d
      Peter Zijlstra committed
      Kyle Huey reported that 'rr', a replay debugger, broke due to the following commit:
      
        af3bdb99 ("perf/x86/intel: Add a separate Arch Perfmon v4 PMI handler")
      
      Rework the 'disable_counter_freezing' __setup() parameter such that we
      can explicitly enable/disable it and switch to default disabled.
      
      To this purpose, rename the parameter to "perf_v4_pmi=" which is a much
      better description and allows requiring a bool argument.
      
      [ mingo: Improved the changelog some more. ]
      Reported-by: Kyle Huey <me@kylehuey.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert O'Callahan <robert@ocallahan.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/20181120170842.GZ2131@hirez.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2a5bf23d