1. 09 Jul, 2019 9 commits
    • perf tools: Use zfree() where applicable · d8f9da24
      Committed by Arnaldo Carvalho de Melo
      In places where the equivalent was already being done, i.e.:
      
         free(a);
         a = NULL;
      
      And in places where struct members are being freed so that if we have
      some erroneous reference to its struct, then accesses to freed members
      will result in segfaults, which we can detect faster than use after free
      to areas that may still have something seemingly valid.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-jatyoofo5boc1bsvoig6bb6i@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
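      A minimal sketch of the free-then-NULL idiom this commit describes, assuming a
      macro-based helper. ZFREE_SKETCH, struct sample and zfree_demo() are hypothetical
      names used only for illustration; they are not the tools/lib code itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: free what *ptr points to, then reset *ptr to NULL
 * so a stale access faults instead of silently reading freed memory.
 */
#define ZFREE_SKETCH(ptr) do { free(*(ptr)); *(ptr) = NULL; } while (0)

struct sample {
	char *name;
};

/* Returns 1 when the member was released and reset to NULL, -1 on OOM. */
static int zfree_demo(void)
{
	struct sample s = { .name = malloc(8) };

	if (s.name == NULL)
		return -1;
	strcpy(s.name, "perf");

	/* Replaces the open-coded pair: free(s.name); s.name = NULL; */
	ZFREE_SKETCH(&s.name);
	assert(s.name == NULL);

	/* A second call is harmless because free(NULL) is a no-op. */
	ZFREE_SKETCH(&s.name);
	return s.name == NULL;
}
```

      Resetting the member means a later erroneous access goes through a NULL pointer,
      which tends to fault immediately rather than read memory that still looks valid.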
    • tools lib: Adopt zalloc()/zfree() from tools/perf · 7f7c536f
      Committed by Arnaldo Carvalho de Melo
      Eroding a bit more the tools/perf/util/util.h hodgepodge header.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Move get_current_dir_name() cond prototype out of util.h · e5653eb8
      Committed by Arnaldo Carvalho de Melo
      And in a separate header, so that we erode util.h a bit more.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-xpzvuu9d0gei9jl9bkzgobln@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf namespaces: Move the conditional setns() prototype to namespaces.h · 245aec7f
      Committed by Arnaldo Carvalho de Melo
      Out of util.h, to reduce its scope, and since we have a namespaces.h
      header, much better to have it there, where it is related to.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-zlu81bbtccuzygh7m8nmgybc@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add missing headers, mostly stdlib.h · 215a0d30
      Committed by Arnaldo Carvalho de Melo
      Part of the erosion of util/util.h, which will lose its stdlib.h include,
      so we need to add that include to the places that need it but were
      getting it indirectly.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-1imnqezw99ahc07fjeb51qby@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: perf_evsel__name(NULL) is valid, no need to check evsel · fc50e0ba
      Committed by Arnaldo Carvalho de Melo
      It'll return "unknown", no need to open code it.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-4okvjmm18arjrcyfhuahgfxm@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf session: Fix potential NULL pointer dereference found by the smatch tool · f3c8d907
      Committed by Leo Yan
      Based on the following report from Smatch, fix the potential
      NULL pointer dereference check.
      
        tools/perf/util/session.c:1252
        dump_read() error: we previously assumed 'evsel' could be null
        (see line 1249)
      
        tools/perf/util/session.c
        1240 static void dump_read(struct perf_evsel *evsel, union perf_event *event)
        1241 {
        1242         struct read_event *read_event = &event->read;
        1243         u64 read_format;
        1244
        1245         if (!dump_trace)
        1246                 return;
        1247
        1248         printf(": %d %d %s %" PRIu64 "\n", event->read.pid, event->read.tid,
        1249                evsel ? perf_evsel__name(evsel) : "FAIL",
        1250                event->read.value);
        1251
        1252         read_format = evsel->attr.read_format;
                                   ^^^^^^^
      
      'evsel' could be a NULL pointer; in that case this patch bails out
      directly without dumping read_event.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Alexios Zavras <alexios.zavras@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190702103420.27540-9-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
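      A rough sketch of the bail-out pattern described in this commit, using simplified,
      hypothetical types (struct evsel_sketch, dump_read_sketch()) rather than the real
      perf structures. Where exactly the real patch places the check is not shown here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-in for the event selector. */
struct evsel_sketch {
	const char *name;
	unsigned long long read_format;
};

/* NULL-tolerant name lookup: callers need not check the pointer first. */
static const char *evsel_sketch_name(const struct evsel_sketch *evsel)
{
	return evsel ? evsel->name : "unknown";
}

/*
 * Writes a one-line summary into buf and returns 0, or returns -1
 * without touching any member when evsel is NULL.
 */
static int dump_read_sketch(const struct evsel_sketch *evsel,
			    unsigned long long value,
			    char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	if (evsel == NULL)
		return -1;	/* bail out before any dereference */

	snprintf(buf, len, "%s %llu format=%llu",
		 evsel_sketch_name(evsel), value, evsel->read_format);
	return 0;
}
```

      The point is that a pointer tested for NULL in one statement must not be
      dereferenced unconditionally in a later one, which is what Smatch flagged.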
    • perf map: Fix potential NULL pointer dereference found by smatch tool · 363bbaef
      Committed by Leo Yan
      Based on the following report from Smatch, fix the potential NULL
      pointer dereference check.
      
        tools/perf/util/map.c:479
        map__fprintf_srccode() error: we previously assumed 'state' could be
        null (see line 466)
      
        tools/perf/util/map.c
        465         /* Avoid redundant printing */
        466         if (state &&
        467             state->srcfile &&
        468             !strcmp(state->srcfile, srcfile) &&
        469             state->line == line) {
        470                 free(srcfile);
        471                 return 0;
        472         }
        473
        474         srccode = find_sourceline(srcfile, line, &len);
        475         if (!srccode)
        476                 goto out_free_line;
        477
        478         ret = fprintf(fp, "|%-8d %.*s", line, len, srccode);
        479         state->srcfile = srcfile;
                    ^^^^^^^
        480         state->line = line;
                    ^^^^^^^
      
      This patch validates the 'state' pointer before accessing its members.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Alexios Zavras <alexios.zavras@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Fixes: dd2e18e9 ("perf tools: Support 'srccode' output")
      Link: http://lkml.kernel.org/r/20190702103420.27540-8-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
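      A small sketch of guarding an optional state pointer, as this commit describes.
      struct srccode_state_sketch and remember_location() are hypothetical, and the
      ownership rule in the comment is an assumption of this sketch, not something
      the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the printing state kept between calls. */
struct srccode_state_sketch {
	char *srcfile;
	unsigned int line;
};

/*
 * Remember the last printed location. Returns true when state took
 * ownership of srcfile, false when state is NULL and the caller still
 * owns (and must release) srcfile.
 */
static bool remember_location(struct srccode_state_sketch *state,
			      char *srcfile, unsigned int line)
{
	assert(srcfile != NULL);

	if (state == NULL)
		return false;	/* optional state: nothing to update */

	state->srcfile = srcfile;
	state->line = line;
	return true;
}
```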
    • perf annotate: Fix dereferencing freed memory found by the smatch tool · 600c787d
      Committed by Leo Yan
      Based on the following report from Smatch, fix the potential
      dereferencing freed memory check.
      
        tools/perf/util/annotate.c:1125
        disasm_line__parse() error: dereferencing freed memory 'namep'
      
        tools/perf/util/annotate.c
        1100 static int disasm_line__parse(char *line, const char **namep, char **rawp)
        1101 {
        1102         char tmp, *name = ltrim(line);
      
        [...]
      
        1114         *namep = strdup(name);
        1115
        1116         if (*namep == NULL)
        1117                 goto out_free_name;
      
        [...]
      
        1124 out_free_name:
        1125         free((void *)namep);
                                  ^^^^^
        1126         *namep = NULL;
                     ^^^^^^
        1127         return -1;
        1128 }
      
      If strdup() fails to allocate memory space for *namep, we don't need to
      free memory with pointer 'namep', which is resident in data structure
      disasm_line::ins::name; and *namep is NULL pointer for this failure, so
      it's pointless to assign NULL to *namep again.
      
      Committer note:
      
      Freeing namep, which is the address of the first entry of the 'struct
      ins' that is the first member of struct disasm_line, would in fact free
      that disasm_line instance if it was allocated via malloc/calloc, which
      would later lead to a dereference of freed memory.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Alexios Zavras <alexios.zavras@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190702103420.27540-5-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
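      The committer note can be illustrated with a sketch under simplified, hypothetical
      layouts (struct ins_sketch, struct disasm_line_sketch). It shows why the address of
      the name slot equals the address of the enclosing object, so passing that address
      to free() would release the whole object rather than a string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layouts: 'name' is the first member of the first member. */
struct ins_sketch {
	const char *name;
};

struct disasm_line_sketch {
	struct ins_sketch ins;
	int other;
};

/* Copy 'name' into *namep; on failure there is nothing to free here. */
static int parse_name_sketch(const char *name, const char **namep)
{
	size_t len;
	char *copy;

	assert(name != NULL && namep != NULL);
	len = strlen(name) + 1;
	copy = malloc(len);
	if (copy == NULL) {
		*namep = NULL;
		return -1;	/* never free(namep): it points into the caller's struct */
	}
	memcpy(copy, name, len);
	*namep = copy;
	return 0;
}

/* Returns 1 if &dl->ins.name and dl are the same address, -1 on OOM. */
static int namep_aliases_line(void)
{
	struct disasm_line_sketch *dl = calloc(1, sizeof(*dl));
	int same;

	if (dl == NULL)
		return -1;
	same = (void *)&dl->ins.name == (void *)dl;

	if (parse_name_sketch("mov", &dl->ins.name) == 0)
		free((void *)dl->ins.name);	/* free the copy, not the slot */
	free(dl);
	return same;
}
```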
  2. 07 Jul, 2019 4 commits
    • perf python: Remove -fstack-protector-strong if clang doesn't have it · c18ae632
      Committed by Arnaldo Carvalho de Melo
      Some distros put -fstack-protector-strong in the compiler flags to be
      used to build python extensions, but then, the clang version in that
      distro doesn't know about that, only gcc does.
      
      Check if that is the case and remove it from the set of options used to
      build the python binding with clang.
      
      Case at hand:
      
      oraclelinux:7
      
        $ head -2 /etc/os-release
        NAME="Oracle Linux Server"
        VERSION="7.6"
        $ grep stack-protector /usr/lib64/python2.7/_sysconfigdata.py | head -1 | cut -c-120
       'CFLAGS': '-fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --para
        $
        gcc version 4.8.5 20150623 (Red Hat 4.8.5-36.0.1) (GCC)
        clang version 3.4.2 (tags/RELEASE_34/dot2-final)
      
        clang: error: unknown argument: '-fstack-protector-strong'
        clang: error: unknown argument: '-fstack-protector-strong'
        error: command 'clang' failed with exit status 1
        cp: cannot stat '/tmp/build/perf/python_ext_build/lib/perf*.so': No such file or directory
        make[2]: *** [/tmp/build/perf/python/perf.so] Error 1
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-brmp2415zxpbhz45etkgjoma@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Do not rely on errno values for precise_ip fallback · cd136189
      Committed by Jiri Olsa
      Konstantin reported a problem with the default perf record command, which
      fails on some AMD servers because of the default maximum precise
      config.

      The current fallback mechanism counts on getting the ENOTSUP errno when
      precise_ip fails, but that's not the case on some AMD servers.

      We can fix this by removing the errno check completely, because the
      precise_ip fallback is separated. We can just try (if requested by
      evsel->precise_max) all possible precise_ip values, and if one succeeds
      we win; if not, we continue with the standard fallback.
      Reported-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Monnet <quentin.monnet@netronome.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Link: http://lkml.kernel.org/r/20190703080949.10356-1-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
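      The fallback idea above can be sketched as a loop that tries each precise_ip level
      and ignores which errno a failure produced. The open_fn callback, pick_precise_ip()
      and fake_open() are hypothetical stand-ins for the real event-opening path, and the
      EINVAL in the fake opener is just an arbitrary non-ENOTSUP value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical opener: returns >= 0 on success, -1 on failure (errno set). */
typedef int (*open_fn)(int precise_ip, void *ctx);

/*
 * Try every level from max_precise_ip down to 0 and return the first one
 * that opens. Returns -1 when none does, so the caller can continue with
 * its standard fallback.
 */
static int pick_precise_ip(int max_precise_ip, open_fn try_open, void *ctx)
{
	assert(try_open != NULL);

	for (int level = max_precise_ip; level >= 0; level--) {
		if (try_open(level, ctx) >= 0)
			return level;
		/* errno is deliberately not inspected here */
	}
	return -1;
}

/* Fake opener for illustration: *ctx is the highest level that works. */
static int fake_open(int precise_ip, void *ctx)
{
	int supported = *(int *)ctx;

	if (precise_ip <= supported)
		return 3;	/* pretend file descriptor */
	errno = EINVAL;		/* arbitrary errno other than ENOTSUP */
	return -1;
}
```

      Because the loop never looks at errno, it behaves the same on machines that report
      a different error code for an unsupported precise level.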
    • perf thread: Allow references to thread objects after machine__exit() · 4c00af0e
      Committed by Arnaldo Carvalho de Melo
      Threads are created when we either synthesize PERF_RECORD_FORK events
      for pre-existing threads or when we receive PERF_RECORD_FORK events from
      the kernel as new threads get created.
      
      We then keep them in machine->threads[].entries rb trees till when we
      receive a PERF_RECORD_EXIT, i.e. that thread terminated.
      
      The thread object has a reference count that is grabbed when, for
      instance, we keep that thread referenced in struct hist_entry, in 'perf
      report' and 'perf top'.
      
      When we receive a PERF_RECORD_EXIT we remove the thread object from the
      rb tree and move it to the corresponding machine->threads[].dead list,
      then we do a thread__put(), dropping the reference we had for keeping it
      in the rb tree.
      
      In thread__put() we were assuming that when the reference count hit zero
      we should remove it from the dead list by simply doing a
      list_del_init(&thread->node).
      
      That works well when the thread's whole lifetime falls within the
      lifetime of the machine that owns the list heads, since we know that we
      can do the list_del_init() and it will update the 'dead' list_head.
      
      But in 'perf sched lat' we were doing:
      
          machine__new() (via perf_session__new)
      
          process events, grabbing refcounts to keep those thread objects
          in 'perf sched' local data structures.
      
          machine__exit() (via perf_session__delete) which would delete the
          'dead' list heads.
      
          And then doing the final thread__put() for the refcounts 'perf sched'
          rightfully obtained for keeping those thread object references.
      
          b00m, since thread__put() would do the list_del_init() touching
          a dead dead list head.
      
      Fix it by removing all the dead threads from machine->threads[].dead at
      machine__exit(), since whatever is there should have refcounts taken by
      things like 'perf sched lat', and make thread__put() check if the thread
      is in a linked list before removing it from that list.
      Reported-by: Wei Li <liwei391@huawei.com>
      Link: https://lkml.kernel.org/r/20190508143648.8153-1-liwei391@huawei.com
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhipeng Xie <xiezhipeng1@huawei.com>
      Link: https://lkml.kernel.org/r/20190704194355.GI10740@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
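      A self-contained sketch of the two-part fix described above, using a minimal list
      written out for the example. All names here (list_node, thread_sketch,
      dead_list_drain(), thread_sketch_put()) are hypothetical analogues, not the perf
      implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, written out for this sketch. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n) { n->next = n->prev = n; }
static int list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_add_after(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_unlink_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

struct thread_sketch {
	struct list_node node;
	int refcnt;
};

/* Exit-time analogue: detach every thread before the list head goes away. */
static void dead_list_drain(struct list_node *dead)
{
	while (!list_is_empty(dead))
		list_unlink_init(dead->next);
}

/* Put analogue: only unlink when the thread is still on some list. */
static void thread_sketch_put(struct thread_sketch *t)
{
	if (t != NULL && --t->refcnt == 0) {
		if (!list_is_empty(&t->node))
			list_unlink_init(&t->node);
		free(t);
	}
}

/* Returns 1 when the final put happens safely after the head is freed. */
static int put_after_exit_demo(void)
{
	struct list_node *dead = malloc(sizeof(*dead));
	struct thread_sketch *t = malloc(sizeof(*t));
	int detached;

	if (dead == NULL || t == NULL) {
		free(dead);
		free(t);
		return -1;
	}
	list_init(dead);
	list_init(&t->node);
	t->refcnt = 1;		/* reference held by a tool-local structure */
	list_add_after(&t->node, dead);

	dead_list_drain(dead);
	free(dead);		/* the list head no longer exists */

	detached = list_is_empty(&t->node);
	assert(detached);
	thread_sketch_put(t);	/* does not touch the freed head */
	return detached;
}
```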
    • perf header: Assign proper ff->ph in perf_event__synthesize_features() · c952b35f
      Committed by Song Liu
      bpf/btf write_* functions need ff->ph->env.
      
      With this missing, pipe-mode (perf record -o -) would crash like:

      Program terminated with signal SIGSEGV, Segmentation fault.

      This patch assigns the proper ph value to ff.
      
      Committer testing:
      
        (gdb) run record -o -
        Starting program: /root/bin/perf record -o -
        PERFILE2
        <SNIP start of perf.data headers>
        Thread 1 "perf" received signal SIGSEGV, Segmentation fault.
        __do_write_buf (size=4, buf=0x160, ff=0x7fffffff8f80) at util/header.c:126
        126		memcpy(ff->buf + ff->offset, buf, size);
        (gdb) bt
        #0  __do_write_buf (size=4, buf=0x160, ff=0x7fffffff8f80) at util/header.c:126
        #1  do_write (ff=ff@entry=0x7fffffff8f80, buf=buf@entry=0x160, size=4) at util/header.c:137
        #2  0x00000000004eddba in write_bpf_prog_info (ff=0x7fffffff8f80, evlist=<optimized out>) at util/header.c:912
        #3  0x00000000004f69d7 in perf_event__synthesize_features (tool=tool@entry=0x97cc00 <record>, session=session@entry=0x7fffe9c6d010,
            evlist=0x7fffe9cae010, process=process@entry=0x4435d0 <process_synthesized_event>) at util/header.c:3695
        #4  0x0000000000443c79 in record__synthesize (tail=tail@entry=false, rec=0x97cc00 <record>) at builtin-record.c:1214
        #5  0x0000000000444ec9 in __cmd_record (rec=0x97cc00 <record>, argv=<optimized out>, argc=0) at builtin-record.c:1435
        #6  cmd_record (argc=0, argv=<optimized out>) at builtin-record.c:2450
        #7  0x00000000004ae3e9 in run_builtin (p=p@entry=0x98e058 <commands+216>, argc=argc@entry=3, argv=0x7fffffffd670) at perf.c:304
        #8  0x000000000042eded in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:356
        #9  run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:400
        #10 main (argc=3, argv=<optimized out>) at perf.c:522
        (gdb)
      
      After the patch the SIGSEGV is gone.
      Reported-by: David Carrillo Cisneros <davidca@fb.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: kernel-team@fb.com
      Cc: stable@vger.kernel.org # v5.1+
      Fixes: 606f972b ("perf bpf: Save bpf_prog_info information as headers to perf.data")
      Link: http://lkml.kernel.org/r/20190620010453.4118689-1-songliubraving@fb.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 03 Jul, 2019 4 commits
    • perf tools metric: Don't include duration_time in group · 488c3bf7
      Committed by Andi Kleen
      The Memory_BW metric generates groups including duration_time, which
      maps to a software event.
      
      For some reason this makes the group always not count.
      
      Always put duration_time outside a group when generating metrics.  It's
      always the same time, so no need to group it.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20190628220737.13259-3-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf list: Avoid extra : for --raw metrics · 9c344d15
      Committed by Andi Kleen
      When printing the metrics raw, don't print : after the metricgroups.
      This helps the command line completion to complete those too.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20190628220737.13259-2-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf pmu: Support more complex PMU event aliasing · 730670b1
      Committed by John Garry
      The jevent "Unit" field is used for uncore PMU alias definition.
      
      The form uncore_pmu_example_X is supported, where "X" is a wildcard, to
      support multiple instances of the same PMU in a system.
      
      Unfortunately this format is not suitable for all uncore PMUs; take the
      Hisi DDRC uncore PMU for example, where the name is in the form
      hisi_scclX_ddrcY.

      With the current jevent parsing, we would be required to hardcode an
      uncore alias translation for each possible value of X. This is not
      scalable.

      Instead, add support for a "Unit" field in the form "hisi_sccl,ddrc",
      where we can match by hisi_scclX and ddrcY. Tokens in the Unit field are
      delimited by ','.
      Signed-off-by: John Garry <john.garry@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linuxarm@huawei.com
      Link: http://lkml.kernel.org/r/1561732552-143038-2-git-send-email-john.garry@huawei.com
      [ Shut up older gcc complaining about the last arg to strtok_r() being uninitialized, set that tmp to NULL ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
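      One way to picture the comma-delimited matching described above is the sketch
      below. It assumes the first token must start the PMU name and later tokens must
      appear after it, in order. unit_matches() and find_token() are hypothetical
      helpers written for this example and do not use strtok_r().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Find the first occurrence of the len-byte token in hay, or NULL. */
static const char *find_token(const char *hay, const char *tok, size_t len)
{
	for (; *hay != '\0'; hay++)
		if (strncmp(hay, tok, len) == 0)
			return hay;
	return NULL;
}

/*
 * Match a "Unit" string such as "hisi_sccl,ddrc" against a PMU name such
 * as "hisi_sccl1_ddrc0": the first token must be a prefix of the name and
 * every later token must occur somewhere after the previous match.
 */
static bool unit_matches(const char *unit, const char *pmu_name)
{
	const char *pos = pmu_name;
	bool first = true;

	assert(unit != NULL && pmu_name != NULL);

	while (*unit != '\0') {
		const char *comma = strchr(unit, ',');
		size_t len = comma ? (size_t)(comma - unit) : strlen(unit);
		const char *hit;

		if (len == 0)
			return false;	/* reject empty tokens */

		if (first)
			hit = strncmp(pos, unit, len) == 0 ? pos : NULL;
		else
			hit = find_token(pos, unit, len);
		if (hit == NULL)
			return false;

		pos = hit + len;	/* continue after this match */
		first = false;
		unit += len;
		if (*unit == ',')
			unit++;
	}
	return !first;			/* an empty Unit matches nothing */
}
```

      With this shape, a single-token Unit still behaves like a plain prefix match, so
      names of the uncore_pmu_example_X form keep working.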
    • perf diff: Print the basic block cycles diff · b10c78c5
      Committed by Jin Yao
       $ perf record -b ./div
       $ perf record -b ./div
      
      Following is the default perf diff output
      
       $ perf diff
      
       # Event 'cycles'
       #
       # Baseline  Delta Abs  Shared Object     Symbol
       # ........  .........  ................  ..................................
       #
           48.75%     +0.33%  div               [.] main
            8.21%     -0.20%  div               [.] compute_flag
           19.02%     -0.12%  libc-2.23.so      [.] __random_r
           16.17%     -0.09%  libc-2.23.so      [.] __random
            2.27%     -0.03%  div               [.] rand@plt
                      +0.02%  [i915]            [k] gen8_irq_handler
            5.52%     +0.02%  libc-2.23.so      [.] rand
      
      This patch creates a new computation selection 'cycles'.
      
       $ perf diff -c cycles
      
       # Event 'cycles'
       #
       # Baseline       [Program Block Range] Cycles Diff Shared Object Symbol
       # ........ ....................................... .........................................
       #
           48.75%             [div.c:42 -> div.c:45]  147 div           [.] main
           48.75%             [div.c:31 -> div.c:40]    4 div           [.] main
           48.75%             [div.c:40 -> div.c:40]    0 div           [.] main
           48.75%             [div.c:42 -> div.c:42]    0 div           [.] main
           48.75%             [div.c:42 -> div.c:44]    0 div           [.] main
           19.02% [random_r.c:357 -> random_r.c:360]    0 libc-2.23.so  [.] __random_r
           19.02% [random_r.c:357 -> random_r.c:373]    0 libc-2.23.so  [.] __random_r
           19.02% [random_r.c:357 -> random_r.c:376]    0 libc-2.23.so  [.] __random_r
           19.02% [random_r.c:357 -> random_r.c:380]    0 libc-2.23.so  [.] __random_r
           19.02% [random_r.c:357 -> random_r.c:392]    0 libc-2.23.so  [.] __random_r
           16.17%     [random.c:288 -> random.c:291]    0 libc-2.23.so  [.] __random
           16.17%     [random.c:288 -> random.c:291]    0 libc-2.23.so  [.] __random
           16.17%     [random.c:288 -> random.c:295]    0 libc-2.23.so  [.] __random
           16.17%     [random.c:288 -> random.c:297]    0 libc-2.23.so  [.] __random
           16.17%     [random.c:291 -> random.c:291]    0 libc-2.23.so  [.] __random
           16.17%     [random.c:293 -> random.c:293]    0 libc-2.23.so  [.] __random
            8.21%             [div.c:22 -> div.c:22]  148 div           [.] compute_flag
            8.21%             [div.c:22 -> div.c:25]    0 div           [.] compute_flag
            8.21%             [div.c:27 -> div.c:28]    0 div           [.] compute_flag
            5.52%           [rand.c:26 -> rand.c:27]    0 libc-2.23.so  [.] rand
            5.52%           [rand.c:26 -> rand.c:28]    0 libc-2.23.so  [.] rand
            2.27%         [rand@plt+0 -> rand@plt+0]    0 div           [.] rand@plt
            0.01% [entry_64.S:694 -> entry_64.S:694]   16 [vmlinux]     [k] native_irq_return_iret
            0.00%       [fair.c:7676 -> fair.c:7665]  162 [vmlinux]     [k] update_blocked_averages
      
      "[Program Block Range]" indicates the range of program basic block
      (start -> end). If we can find the source line it prints the source line
      otherwise it prints the symbol+offset instead.
      
       v4:
       ---
       Use source lines or symbol+offset to indicate the basic block. It should
       be easier to understand.
      
       v3:
       ---
       Cast 'struct hist_entry' to 'struct block_hist' in hist_entry__block_fprintf.
       Use symbol_conf.report_block to check if executing hist_entry__block_fprintf.
      
       v2:
       ---
       Keep standard perf diff format and display the 'Baseline' and
       'Shared Object'.
      
      The output is sorted by "Baseline" and the basic blocks in the same
      function are sorted by cycles diff.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1561713784-30533-7-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 02 Jul, 2019 11 commits
  5. 26 Jun, 2019 12 commits