1. 16 8月, 2019 2 次提交
    • A
      perf db-export: Fix thread__exec_comm() · f1f66289
      Adrian Hunter 提交于
      commit 3de7ae0b2a1d86dbb23d0cb135150534fdb2e836 upstream.
      
      Threads synthesized from /proc have comms with a start time of zero, and
      not marked as "exec". Currently, there can be 2 such comms. The first is
      created by processing a synthesized fork event and is set to the
      parent's comm string, and the second by processing a synthesized comm
      event set to the thread's current comm string.
      
      In the absence of an "exec" comm, thread__exec_comm() picks the last
      (oldest) comm, which, in the case above, is the parent's comm string.
      For a main thread, that is very probably wrong. Use the second-to-last
      in that case.
      
      This affects only db-export because it is the only user of
      thread__exec_comm().
      
      Example:
      
        $ sudo perf record -a -o pt-a-sleep-1 -e intel_pt//u -- sleep 1
        $ sudo chown ahunter pt-a-sleep-1
      
      Before:
      
        $ perf script -i pt-a-sleep-1 --itrace=bep -s tools/perf/scripts/python/export-to-sqlite.py pt-a-sleep-1.db branches calls
        $ sqlite3 -header -column pt-a-sleep-1.db 'select * from comm_threads_view'
        comm_id     command     thread_id   pid         tid
        ----------  ----------  ----------  ----------  ----------
        1           swapper     1           0           0
        2           rcu_sched   2           10          10
        3           kthreadd    3           78          78
        5           sudo        4           15180       15180
        5           sudo        5           15180       15182
        7           kworker/4:  6           10335       10335
        8           kthreadd    7           55          55
        10          systemd     8           865         865
        10          systemd     9           865         875
        13          perf        10          15181       15181
        15          sleep       10          15181       15181
        16          kworker/3:  11          14179       14179
        17          kthreadd    12          29376       29376
        19          systemd     13          746         746
        21          systemd     14          401         401
        23          systemd     15          879         879
        23          systemd     16          879         945
        25          kthreadd    17          556         556
        27          kworker/u1  18          14136       14136
        28          kworker/u1  19          15021       15021
        29          kthreadd    20          509         509
        31          systemd     21          836         836
        31          systemd     22          836         967
        33          systemd     23          1148        1148
        33          systemd     24          1148        1163
        35          kworker/2:  25          17988       17988
        36          kworker/0:  26          13478       13478
      
      After:
      
        $ perf script -i pt-a-sleep-1 --itrace=bep -s tools/perf/scripts/python/export-to-sqlite.py pt-a-sleep-1b.db branches calls
        $ sqlite3 -header -column pt-a-sleep-1b.db 'select * from comm_threads_view'
        comm_id     command     thread_id   pid         tid
        ----------  ----------  ----------  ----------  ----------
        1           swapper     1           0           0
        2           rcu_sched   2           10          10
        3           kswapd0     3           78          78
        4           perf        4           15180       15180
        4           perf        5           15180       15182
        6           kworker/4:  6           10335       10335
        7           kcompactd0  7           55          55
        8           accounts-d  8           865         865
        8           accounts-d  9           865         875
        10          perf        10          15181       15181
        12          sleep       10          15181       15181
        13          kworker/3:  11          14179       14179
        14          kworker/1:  12          29376       29376
        15          haveged     13          746         746
        16          systemd-jo  14          401         401
        17          NetworkMan  15          879         879
        17          NetworkMan  16          879         945
        19          irq/131-iw  17          556         556
        20          kworker/u1  18          14136       14136
        21          kworker/u1  19          15021       15021
        22          kworker/u1  20          509         509
        23          thermald    21          836         836
        23          thermald    22          836         967
        25          unity-sett  23          1148        1148
        25          unity-sett  24          1148        1163
        27          kworker/2:  25          17988       17988
        28          kworker/0:  26          13478       13478
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: 65de51f9 ("perf tools: Identify which comms are from exec")
      Link: http://lkml.kernel.org/r/20190808064823.14846-1-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f1f66289
    • T
      perf annotate: Fix s390 gap between kernel end and module start · 532db2b9
      Thomas Richter 提交于
      commit b9c0a64901d5bdec6eafd38d1dc8fa0e2974fccb upstream.
      
      During execution of command 'perf top' the error message:
      
         Not enough memory for annotating '__irf_end' symbol!)
      
      is emitted from this call sequence:
        __cmd_top
          perf_top__mmap_read
            perf_top__mmap_read_idx
              perf_event__process_sample
                hist_entry_iter__add
                  hist_iter__top_callback
                    perf_top__record_precise_ip
                      hist_entry__inc_addr_samples
                        symbol__inc_addr_samples
                          symbol__get_annotation
                            symbol__alloc_hist
      
      In this function the size of symbol __irf_end is calculated. The size of
      a symbol is the difference between its start and end address.
      
      When the symbol was read the first time, its start and end was set to:
      
         symbol__new: __irf_end 0xe954d0-0xe954d0
      
      which is correct and maps with /proc/kallsyms:
      
         root@s8360046:~/linux-4.15.0/tools/perf# fgrep _irf_end /proc/kallsyms
         0000000000e954d0 t __irf_end
         root@s8360046:~/linux-4.15.0/tools/perf#
      
      In function symbol__alloc_hist() the end of symbol __irf_end is
      
        symbol__alloc_hist sym:__irf_end start:0xe954d0 end:0x3ff80045a8
      
      which is identical with the first module entry in /proc/kallsyms
      
      This results in a symbol size of __irf_req for histogram analyses of
      70334140059072 bytes and a malloc() for this requested size fails.
      
      The root cause of this is function
        __dso__load_kallsyms()
        +-> symbols__fixup_end()
      
      Function symbols__fixup_end() enlarges the last symbol in the kallsyms
      map:
      
         # fgrep __irf_end /proc/kallsyms
         0000000000e954d0 t __irf_end
         #
      
      to the start address of the first module:
         # cat /proc/kallsyms | sort  | egrep ' [tT] '
         ....
         0000000000e952d0 T __security_initcall_end
         0000000000e954d0 T __initramfs_size
         0000000000e954d0 t __irf_end
         000003ff800045a8 T fc_get_event_number       [scsi_transport_fc]
         000003ff800045d0 t store_fc_vport_disable    [scsi_transport_fc]
         000003ff800046a8 T scsi_is_fc_rport  [scsi_transport_fc]
         000003ff800046d0 t fc_target_setup   [scsi_transport_fc]
      
      On s390 the kernel is located around memory address 0x200, 0x10000 or
      0x100000, depending on linux version. Modules however start some- where
      around 0x3ff xxxx xxxx.
      
      This is different than x86 and produces a large gap for which histogram
      allocation fails.
      
      Fix this by detecting the kernel's last symbol and do no adjustment for
      it. Introduce a weak function and handle s390 specifics.
      Reported-by: NKlaus Theurich <klaus.theurich@de.ibm.com>
      Signed-off-by: NThomas Richter <tmricht@linux.ibm.com>
      Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20190724122703.3996-2-tmricht@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      532db2b9
  2. 31 7月, 2019 2 次提交
    • L
      perf annotate: Fix dereferencing freed memory found by the smatch tool · 915945f3
      Leo Yan 提交于
      [ Upstream commit 600c787dbf6521d8d07ee717ab7606d5070103ea ]
      
      Based on the following report from Smatch, fix the potential
      dereferencing freed memory check.
      
        tools/perf/util/annotate.c:1125
        disasm_line__parse() error: dereferencing freed memory 'namep'
      
        tools/perf/util/annotate.c
        1100 static int disasm_line__parse(char *line, const char **namep, char **rawp)
        1101 {
        1102         char tmp, *name = ltrim(line);
      
        [...]
      
        1114         *namep = strdup(name);
        1115
        1116         if (*namep == NULL)
        1117                 goto out_free_name;
      
        [...]
      
        1124 out_free_name:
        1125         free((void *)namep);
                                  ^^^^^
        1126         *namep = NULL;
                     ^^^^^^
        1127         return -1;
        1128 }
      
      If strdup() fails to allocate memory space for *namep, we don't need to
      free memory with pointer 'namep', which is resident in data structure
      disasm_line::ins::name; and *namep is NULL pointer for this failure, so
      it's pointless to assign NULL to *namep again.
      
      Committer note:
      
      Freeing namep, which is the address of the first entry of the 'struct
      ins' that is the first member of struct disasm_line would in fact free
      that disasm_line instance, if it was allocated via malloc/calloc, which,
      later, would a dereference of freed memory.
      Signed-off-by: NLeo Yan <leo.yan@linaro.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Alexios Zavras <alexios.zavras@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190702103420.27540-5-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      915945f3
    • L
      perf session: Fix potential NULL pointer dereference found by the smatch tool · b305dcff
      Leo Yan 提交于
      [ Upstream commit f3c8d90757724982e5f07cd77d315eb64ca145ac ]
      
      Based on the following report from Smatch, fix the potential
      NULL pointer dereference check.
      
        tools/perf/util/session.c:1252
        dump_read() error: we previously assumed 'evsel' could be null
        (see line 1249)
      
        tools/perf/util/session.c
        1240 static void dump_read(struct perf_evsel *evsel, union perf_event *event)
        1241 {
        1242         struct read_event *read_event = &event->read;
        1243         u64 read_format;
        1244
        1245         if (!dump_trace)
        1246                 return;
        1247
        1248         printf(": %d %d %s %" PRIu64 "\n", event->read.pid, event->read.tid,
        1249                evsel ? perf_evsel__name(evsel) : "FAIL",
        1250                event->read.value);
        1251
        1252         read_format = evsel->attr.read_format;
                                   ^^^^^^^
      
      'evsel' could be NULL pointer, for this case this patch directly bails
      out without dumping read_event.
      Signed-off-by: NLeo Yan <leo.yan@linaro.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Alexios Zavras <alexios.zavras@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190702103420.27540-9-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      b305dcff
  3. 26 7月, 2019 5 次提交
    • A
      perf stat: Fix group lookup for metric group · e358d2ab
      Andi Kleen 提交于
      [ Upstream commit 2f87f33f4226523df9c9cc28f9874ea02fcc3d3f ]
      
      The metric group code tries to find a group it added earlier in the
      evlist. Fix the lookup to handle groups with partially overlaps
      correctly. When a sub string match fails and we reset the match, we have
      to compare the first element again.
      
      I also renamed the find_evsel function to find_evsel_group to make its
      purpose clearer.
      
      With the earlier changes this fixes:
      
      Before:
      
        % perf stat -M UPI,IPC sleep 1
        ...
               1,032,922      uops_retired.retire_slots #      1.1 UPI
               1,896,096      inst_retired.any
               1,896,096      inst_retired.any
               1,177,254      cpu_clk_unhalted.thread
      
      After:
      
        % perf stat -M UPI,IPC sleep 1
        ...
              1,013,193      uops_retired.retire_slots #      1.1 UPI
                 932,033      inst_retired.any
                 932,033      inst_retired.any          #      0.9 IPC
               1,091,245      cpu_clk_unhalted.thread
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Fixes: b18f3e36 ("perf stat: Support JSON metrics in perf stat")
      Link: http://lkml.kernel.org/r/20190624193711.35241-4-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      e358d2ab
    • A
      perf stat: Make metric event lookup more robust · a64e018b
      Andi Kleen 提交于
      [ Upstream commit 145c407c808352acd625be793396fd4f33c794f8 ]
      
      After setting up metric groups through the event parser, the metricgroup
      code looks them up again in the event list.
      
      Make sure we only look up events that haven't been used by some other
      metric. The data structures currently cannot handle more than one metric
      per event. This avoids problems with multiple events partially
      overlapping.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Link: http://lkml.kernel.org/r/20190624193711.35241-2-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      a64e018b
    • K
      perf tools: Increase MAX_NR_CPUS and MAX_CACHES · 1182ff22
      Kyle Meyer 提交于
      [ Upstream commit 9f94c7f947e919c343b30f080285af53d0fa9902 ]
      
      Attempting to profile 1024 or more CPUs with perf causes two errors:
      
        perf record -a
        [ perf record: Woken up X times to write data ]
        way too many cpu caches..
        [ perf record: Captured and wrote X MB perf.data (X samples) ]
      
        perf report -C 1024
        Error: failed to set  cpu bitmap
        Requested CPU 1024 too large. Consider raising MAX_NR_CPUS
      
        Increasing MAX_NR_CPUS from 1024 to 2048 and redefining MAX_CACHES as
        MAX_NR_CPUS * 4 returns normal functionality to perf:
      
        perf record -a
        [ perf record: Woken up X times to write data ]
        [ perf record: Captured and wrote X MB perf.data (X samples) ]
      
        perf report -C 1024
        ...
      Signed-off-by: NKyle Meyer <kyle.meyer@hpe.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190620193630.154025-1-meyerk@stormcage.eag.rdlabs.hpecorp.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      1182ff22
    • A
      perf evsel: Make perf_evsel__name() accept a NULL argument · 4c57957e
      Arnaldo Carvalho de Melo 提交于
      [ Upstream commit fdbdd7e8580eac9bdafa532746c865644d125e34 ]
      
      In which case it simply returns "unknown", like when it can't figure out
      the evsel->name value.
      
      This makes this code more robust and fixes a problem in 'perf trace'
      where a NULL evsel was being passed to a routine that only used the
      evsel for printing its name when a invalid syscall id was passed.
      Reported-by: NLeo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-f30ztaasku3z935cn3ak3h53@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      4c57957e
    • T
      perf report: Fix OOM error in TUI mode on s390 · eac8b39d
      Thomas Richter 提交于
      [ Upstream commit 8a07aa4e9b7b0222129c07afff81634a884b2866 ]
      
      Debugging a OOM error using the TUI interface revealed this issue
      on s390:
      
      [tmricht@m83lp54 perf]$ cat /proc/kallsyms |sort
      ....
      00000001119b7158 B radix_tree_node_cachep
      00000001119b8000 B __bss_stop
      00000001119b8000 B _end
      000003ff80002850 t autofs_mount	[autofs4]
      000003ff80002868 t autofs_show_options	[autofs4]
      000003ff80002a98 t autofs_evict_inode	[autofs4]
      ....
      
      There is a huge gap between the last kernel symbol
      __bss_stop/_end and the first kernel module symbol
      autofs_mount (from autofs4 module).
      
      After reading the kernel symbol table via functions:
      
       dso__load()
       +--> dso__load_kernel_sym()
            +--> dso__load_kallsyms()
      	   +--> __dso_load_kallsyms()
      	        +--> symbols__fixup_end()
      
      the symbol __bss_stop has a start address of 1119b8000 and
      an end address of 3ff80002850, as can be seen by this debug statement:
      
        symbols__fixup_end __bss_stop start:0x1119b8000 end:0x3ff80002850
      
      The size of symbol __bss_stop is 0x3fe6e64a850 bytes!
      It is the last kernel symbol and fills up the space until
      the first kernel module symbol.
      
      This size kills the TUI interface when executing the following
      code:
      
        process_sample_event()
          hist_entry_iter__add()
            hist_iter__report_callback()
              hist_entry__inc_addr_samples()
                symbol__inc_addr_samples(symbol = __bss_stop)
                  symbol__cycles_hist()
                     annotated_source__alloc_histograms(...,
      				                symbol__size(sym),
      		                                ...)
      
      This function allocates memory to save sample histograms.
      The symbol_size() marco is defined as sym->end - sym->start, which
      results in above value of 0x3fe6e64a850 bytes and
      the call to calloc() in annotated_source__alloc_histograms() fails.
      
      The histgram memory allocation might fail, make this failure
      no-fatal and continue processing.
      
      Output before:
      [tmricht@m83lp54 perf]$ ./perf --debug stderr=1 report -vvvvv \
      					      -i ~/slow.data 2>/tmp/2
      [tmricht@m83lp54 perf]$ tail -5 /tmp/2
        __symbol__inc_addr_samples(875): ENOMEM! sym->name=__bss_stop,
      		start=0x1119b8000, addr=0x2aa0005eb08, end=0x3ff80002850,
      		func: 0
      problem adding hist entry, skipping event
      0x938b8 [0x8]: failed to process type: 68 [Cannot allocate memory]
      [tmricht@m83lp54 perf]$
      
      Output after:
      [tmricht@m83lp54 perf]$ ./perf --debug stderr=1 report -vvvvv \
      					      -i ~/slow.data 2>/tmp/2
      [tmricht@m83lp54 perf]$ tail -5 /tmp/2
         symbol__inc_addr_samples map:0x1597830 start:0x110730000 end:0x3ff80002850
         symbol__hists notes->src:0x2aa2a70 nr_hists:1
         symbol__inc_addr_samples sym:unlink_anon_vmas src:0x2aa2a70
         __symbol__inc_addr_samples: addr=0x11094c69e
         0x11094c670 unlink_anon_vmas: period++ [addr: 0x11094c69e, 0x2e, evidx=0]
         	=> nr_samples: 1, period: 526008
      [tmricht@m83lp54 perf]$
      
      There is no error about failed memory allocation and the TUI interface
      shows all entries.
      Signed-off-by: NThomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: NHendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/90cb5607-3e12-5167-682d-978eba7dafa8@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      eac8b39d
  4. 14 7月, 2019 1 次提交
    • J
      perf pmu: Fix uncore PMU alias list for ARM64 · d8e26651
      John Garry 提交于
      commit 599ee18f0740d7661b8711249096db94c09bc508 upstream.
      
      In commit 292c34c1 ("perf pmu: Fix core PMU alias list for X86
      platform"), we fixed the issue of CPU events being aliased to uncore
      events.
      
      Fix this same issue for ARM64, since the said commit left the (broken)
      behaviour untouched for ARM64.
      Signed-off-by: NJohn Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linuxarm@huawei.com
      Cc: stable@vger.kernel.org
      Fixes: 292c34c1 ("perf pmu: Fix core PMU alias list for X86 platform")
      Link: http://lkml.kernel.org/r/1560521283-73314-2-git-send-email-john.garry@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d8e26651
  5. 03 7月, 2019 1 次提交
    • A
      perf header: Fix unchecked usage of strncpy() · 6461a454
      Arnaldo Carvalho de Melo 提交于
      commit 5192bde7d98c99f2cd80225649e3c2e7493722f7 upstream.
      
      The strncpy() function may leave the destination string buffer
      unterminated, better use strlcpy() that we have a __weak fallback
      implementation for systems without it.
      
      This fixes this warning on an Alpine Linux Edge system with gcc 8.2:
      
        util/header.c: In function 'perf_event__synthesize_event_update_name':
        util/header.c:3625:2: error: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
          strncpy(ev->data, evsel->name, len);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        util/header.c:3618:15: note: length computed here
          size_t len = strlen(evsel->name);
                       ^~~~~~~~~~~~~~~~~~~
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Fixes: a6e52817 ("perf tools: Add event_update event unit type")
      Link: https://lkml.kernel.org/n/tip-wycz66iy8dl2z3yifgqf894p@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6461a454
  6. 22 6月, 2019 2 次提交
  7. 26 5月, 2019 3 次提交
    • A
      perf intel-pt: Fix sample timestamp wrt non-taken branches · 05fab345
      Adrian Hunter 提交于
      commit 1b6599a9d8e6c9f7e9b0476012383b1777f7fc93 upstream.
      
      The sample timestamp is updated to ensure that the timestamp represents
      the time of the sample and not a branch that the decoder is still
      walking towards. The sample timestamp is updated when the decoder
      returns, but the decoder does not return for non-taken branches. Update
      the sample timestamp then also.
      
      Note that commit 3f04d98e ("perf intel-pt: Improve sample
      timestamp") was also a stable fix and appears, for example, in v4.4
      stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample
      timestamp").
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org # v4.4+
      Fixes: 3f04d98e ("perf intel-pt: Improve sample timestamp")
      Link: http://lkml.kernel.org/r/20190510124143.27054-4-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      05fab345
    • A
      perf intel-pt: Fix improved sample timestamp · ba86f8f8
      Adrian Hunter 提交于
      commit 61b6e08dc8e3ea80b7485c9b3f875ddd45c8466b upstream.
      
      The decoder uses its current timestamp in samples. Usually that is a
      timestamp that has already passed, but in some cases it is a timestamp
      for a branch that the decoder is walking towards, and consequently
      hasn't reached.
      
      The intel_pt_sample_time() function decides which is which, but was not
      handling TNT packets exactly correctly.
      
      In the case of TNT, the timestamp applies to the first branch, so the
      decoder must first walk to that branch.
      
      That means intel_pt_sample_time() should return true for TNT, and this
      patch makes that change. However, if the first branch is a non-taken
      branch (i.e. a 'N'), then intel_pt_sample_time() needs to return false
      for subsequent taken branches in the same TNT packet.
      
      To handle that, introduce a new state INTEL_PT_STATE_TNT_CONT to
      distinguish the cases.
      
      Note that commit 3f04d98e ("perf intel-pt: Improve sample
      timestamp") was also a stable fix and appears, for example, in v4.4
      stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample
      timestamp").
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org # v4.4+
      Fixes: 3f04d98e ("perf intel-pt: Improve sample timestamp")
      Link: http://lkml.kernel.org/r/20190510124143.27054-3-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ba86f8f8
    • A
      perf intel-pt: Fix instructions sampling rate · 3ed850ab
      Adrian Hunter 提交于
      commit 7ba8fa20e26eb3c0c04d747f7fd2223694eac4d5 upstream.
      
      The timestamp used to determine if an instruction sample is made, is an
      estimate based on the number of instructions since the last known
      timestamp. A consequence is that it might go backwards, which results in
      extra samples. Change it so that a sample is only made when the
      timestamp goes forwards.
      
      Note this does not affect a sampling period of 0 or sampling periods
      specified as a count of instructions.
      
      Example:
      
       Before:
      
       $ perf script --itrace=i10us
       ls 13812 [003] 2167315.222583:       3270 instructions:u:      7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:      30902 instructions:u:      7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:         10 instructions:u:      7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:          8 instructions:u:      7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:         14 instructions:u:      7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:          6 instructions:u:      7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:         14 instructions:u:      7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:          4 instructions:u:      7fac71e2dab2 _dl_cache_libcmp+0xd2 (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222728:      16423 instructions:u:      7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222734:      12731 instructions:u:      7fac71e27938 _dl_name_match_p+0x68 (/lib/x86_64-linux-gnu/ld-2.28.so)
       ...
      
       After:
       $ perf script --itrace=i10us
       ls 13812 [003] 2167315.222583:       3270 instructions:u:      7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:      30902 instructions:u:      7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222728:      16479 instructions:u:      7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so)
       ...
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: f4aa0819 ("perf tools: Add Intel PT decoder")
      Link: http://lkml.kernel.org/r/20190510124143.27054-2-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3ed850ab
  8. 04 5月, 2019 1 次提交
    • W
      perf machine: Update kernel map address and re-order properly · 458a65c7
      Wei Li 提交于
      [ Upstream commit 977c7a6d1e263ff1d755f28595b99e4bc0c48a9f ]
      
      Since commit 1fb87b8e ("perf machine: Don't search for active kernel
      start in __machine__create_kernel_maps"), the __machine__create_kernel_maps()
      just create a map what start and end are both zero. Though the address will be
      updated later, the order of map in the rbtree may be incorrect.
      
      The commit ee05d217 ("perf machine: Set main kernel end address properly")
      fixed the logic in machine__create_kernel_maps(), but it's still wrong in
      function machine__process_kernel_mmap_event().
      
      To reproduce this issue, we need an environment which the module address
      is before the kernel text segment. I tested it on an aarch64 machine with
      kernel 4.19.25:
      
        [root@localhost hulk]# grep _stext /proc/kallsyms
        ffff000008081000 T _stext
        [root@localhost hulk]# grep _etext /proc/kallsyms
        ffff000009780000 R _etext
        [root@localhost hulk]# tail /proc/modules
        hisi_sas_v2_hw 77824 0 - Live 0xffff00000191d000
        nvme_core 126976 7 nvme, Live 0xffff0000018b6000
        mdio 20480 1 ixgbe, Live 0xffff0000018ab000
        hisi_sas_main 106496 1 hisi_sas_v2_hw, Live 0xffff000001861000
        hns_mdio 20480 2 - Live 0xffff000001822000
        hnae 28672 3 hns_dsaf,hns_enet_drv, Live 0xffff000001815000
        dm_mirror 40960 0 - Live 0xffff000001804000
        dm_region_hash 32768 1 dm_mirror, Live 0xffff0000017f5000
        dm_log 32768 2 dm_mirror,dm_region_hash, Live 0xffff0000017e7000
        dm_mod 315392 17 dm_mirror,dm_log, Live 0xffff000001780000
        [root@localhost hulk]#
      
      Before fix:
      
        [root@localhost bin]# perf record sleep 3
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ]
        [root@localhost bin]# perf buildid-list -i perf.data
        4c4e46c971ca935f781e603a09b52a92e8bdfee8 [vdso]
        [root@localhost bin]# perf buildid-list -i perf.data -H
        0000000000000000000000000000000000000000 /proc/kcore
        [root@localhost bin]#
      
      After fix:
      
        [root@localhost tools]# ./perf/perf record sleep 3
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ]
        [root@localhost tools]# ./perf/perf buildid-list -i perf.data
        28a6c690262896dbd1b5e1011ed81623e6db0610 [kernel.kallsyms]
        106c14ce6e4acea3453e484dc604d66666f08a2f [vdso]
        [root@localhost tools]# ./perf/perf buildid-list -i perf.data -H
        28a6c690262896dbd1b5e1011ed81623e6db0610 /proc/kcore
      Signed-off-by: NWei Li <liwei391@huawei.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Li Bin <huawei.libin@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190228092003.34071-1-liwei391@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin (Microsoft) <sashal@kernel.org>
      458a65c7
  9. 20 4月, 2019 5 次提交
    • A
      perf evsel: Free evsel->counts in perf_evsel__exit() · cf050670
      Arnaldo Carvalho de Melo 提交于
      [ Upstream commit 42dfa451d825a2ad15793c476f73e7bbc0f9d312 ]
      
      Using gcc's ASan, Changbin reports:
      
        =================================================================
        ==7494==ERROR: LeakSanitizer: detected memory leaks
      
        Direct leak of 48 byte(s) in 1 object(s) allocated from:
            #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
            #1 0x5625e5330a5e in zalloc util/util.h:23
            #2 0x5625e5330a9b in perf_counts__new util/counts.c:10
            #3 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47
            #4 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505
            #5 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347
            #6 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47
            #7 0x5625e51528e6 in run_test tests/builtin-test.c:358
            #8 0x5625e5152baf in test_and_print tests/builtin-test.c:388
            #9 0x5625e51543fe in __cmd_test tests/builtin-test.c:583
            #10 0x5625e515572f in cmd_test tests/builtin-test.c:722
            #11 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
            #12 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
            #13 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
            #14 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520
            #15 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
      
        Indirect leak of 72 byte(s) in 1 object(s) allocated from:
            #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
            #1 0x5625e532560d in zalloc util/util.h:23
            #2 0x5625e532566b in xyarray__new util/xyarray.c:10
            #3 0x5625e5330aba in perf_counts__new util/counts.c:15
            #4 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47
            #5 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505
            #6 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347
            #7 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47
            #8 0x5625e51528e6 in run_test tests/builtin-test.c:358
            #9 0x5625e5152baf in test_and_print tests/builtin-test.c:388
            #10 0x5625e51543fe in __cmd_test tests/builtin-test.c:583
            #11 0x5625e515572f in cmd_test tests/builtin-test.c:722
            #12 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
            #13 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
            #14 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
            #15 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520
            #16 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
      
      His patch took care of evsel->prev_raw_counts, but the above backtraces
      are about evsel->counts, so fix that instead.
      Reported-by: NChangbin Du <changbin.du@gmail.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Link: https://lkml.kernel.org/n/tip-hd1x13g59f0nuhe4anxhsmfp@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      cf050670
    • C
      perf hist: Add missing map__put() in error case · 28848061
      Changbin Du 提交于
      [ Upstream commit cb6186aeffda4d27e56066c79e9579e7831541d3 ]
      
      We need to map__put() before returning from failure of
      sample__resolve_callchain().
      
      Detected with gcc's ASan.
      Signed-off-by: NChangbin Du <changbin.du@gmail.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Krister Johansen <kjlx@templeofstupid.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Fixes: 9c68ae98 ("perf callchain: Reference count maps")
      Link: http://lkml.kernel.org/r/20190316080556.3075-10-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      28848061
    • C
      perf build-id: Fix memory leak in print_sdt_events() · df894a04
      Changbin Du 提交于
      [ Upstream commit 8bde8516893da5a5fdf06121f74d11b52ab92df5 ]
      
      Detected with gcc's ASan:
      
        Direct leak of 4356 byte(s) in 120 object(s) allocated from:
            #0 0x7ff1a2b5a070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070)
            #1 0x55719aef4814 in build_id_cache__origname util/build-id.c:215
            #2 0x55719af649b6 in print_sdt_events util/parse-events.c:2339
            #3 0x55719af66272 in print_events util/parse-events.c:2542
            #4 0x55719ad1ecaa in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58
            #5 0x55719aec745d in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
            #6 0x55719aec7d1a in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
            #7 0x55719aec8184 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
            #8 0x55719aeca41a in main /home/changbin/work/linux/tools/perf/perf.c:520
            #9 0x7ff1a07ae09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
      Signed-off-by: NChangbin Du <changbin.du@gmail.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Fixes: 40218dae ("perf list: Show SDT and pre-cached events")
      Link: http://lkml.kernel.org/r/20190316080556.3075-7-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      df894a04
    • C
      perf config: Fix a memory leak in collect_config() · 871aa38e
      Changbin Du 提交于
      [ Upstream commit 54569ba4b06d5baedae4614bde33a25a191473ba ]
      
      Detected with gcc's ASan:
      
        Direct leak of 66 byte(s) in 5 object(s) allocated from:
            #0 0x7ff3b1f32070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070)
            #1 0x560c8761034d in collect_config util/config.c:597
            #2 0x560c8760d9cb in get_value util/config.c:169
            #3 0x560c8760dfd7 in perf_parse_file util/config.c:285
            #4 0x560c8760e0d2 in perf_config_from_file util/config.c:476
            #5 0x560c876108fd in perf_config_set__init util/config.c:661
            #6 0x560c87610c72 in perf_config_set__new util/config.c:709
            #7 0x560c87610d2f in perf_config__init util/config.c:718
            #8 0x560c87610e5d in perf_config util/config.c:730
            #9 0x560c875ddea0 in main /home/changbin/work/linux/tools/perf/perf.c:442
            #10 0x7ff3afb8609a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
      Signed-off-by: NChangbin Du <changbin.du@gmail.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Fixes: 20105ca1 ("perf config: Introduce perf_config_set class")
      Link: http://lkml.kernel.org/r/20190316080556.3075-6-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      871aa38e
    • C
      perf list: Don't forget to drop the reference to the allocated thread_map · 93d449bd
      Changbin Du 提交于
      [ Upstream commit 39df730b09774bd860e39ea208a48d15078236cb ]
      
      Detected via gcc's ASan:
      
        Direct leak of 2048 byte(s) in 64 object(s) allocated from:
          6     #0 0x7f606512e370 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee370)
          7     #1 0x556b0f1d7ddd in thread_map__realloc util/thread_map.c:43
          8     #2 0x556b0f1d84c7 in thread_map__new_by_tid util/thread_map.c:85
          9     #3 0x556b0f0e045e in is_event_supported util/parse-events.c:2250
         10     #4 0x556b0f0e1aa1 in print_hwcache_events util/parse-events.c:2382
         11     #5 0x556b0f0e3231 in print_events util/parse-events.c:2514
         12     #6 0x556b0ee0a66e in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58
         13     #7 0x556b0f01e0ae in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
         14     #8 0x556b0f01e859 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
         15     #9 0x556b0f01edc8 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
         16     #10 0x556b0f01f71f in main /home/changbin/work/linux/tools/perf/perf.c:520
         17     #11 0x7f6062ccf09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
      Signed-off-by: NChangbin Du <changbin.du@gmail.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Fixes: 89896051 ("perf tools: Do not put a variable sized type not at the end of a struct")
      Link: http://lkml.kernel.org/r/20190316080556.3075-3-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      93d449bd
  10. 06 4月, 2019 5 次提交
    • T
      perf script python: Add trace_context extension module to sys.modules · fd400e96
      Tony Jones 提交于
      [ Upstream commit cc437642255224e4140fed1f3e3156fc8ad91903 ]
      
      In Python3, the result of PyModule_Create (called from
      scripts/python/Perf-Trace-Util/Context.c) is not automatically added to
      sys.modules.  See: https://bugs.python.org/issue4592
      
      Below is the observed behavior without the fix:
      
        # ldd /usr/bin/perf | grep -i python
      	libpython3.6m.so.1.0 => /usr/lib64/libpython3.6m.so.1.0 (0x00007f8e1dfb2000)
      
        # perf record /bin/false
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.015 MB perf.data (17 samples) ]
      
        # perf script -g python | cat
        generated Python script: perf-script.py
      
        # perf script -s ./perf-script.py
        Traceback (most recent call last):
          File "./perf-script.py", line 18, in <module>
            from perf_trace_context import *
        ModuleNotFoundError: No module named 'perf_trace_context'
        Error running python script ./perf-script.py
        #
      
      Committer notes:
      
      To build with python3 use:
      
        $ make -C tools/perf PYTHON=python3
      
      Use a non-const variable to pass the 'name' arg to
      PyImport_AppendInittab(), as python2.6 has that as 'char *', which ends
      up trowing this in some environments:
      
         CC       /tmp/build/perf/util/parse-branch-options.o
        util/scripting-engines/trace-event-python.c: In function 'python_start_script':
        util/scripting-engines/trace-event-python.c:1520:2: error: passing argument 1 of 'PyImport_AppendInittab' discards 'const' qualifier from pointer target type [-Werror]
          PyImport_AppendInittab("perf_trace_context", initfunc);
          ^
        In file included from /usr/include/python2.6/Python.h:130:0,
                         from util/scripting-engines/trace-event-python.c:22:
        /usr/include/python2.6/import.h:54:17: note: expected 'char *' but argument is of type 'const char *'
         PyAPI_FUNC(int) PyImport_AppendInittab(char *name, void (*initfunc)(void));
                         ^
        cc1: all warnings being treated as errors
      Signed-off-by: NTony Jones <tonyj@suse.de>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jaroslav Škarvada <jskarvad@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
      Fixes: 66dfdff0 ("perf tools: Add Python 3 support")
      Link: http://lkml.kernel.org/r/20190124005229.16146-2-tonyj@suse.deSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      fd400e96
    • T
      perf script python: Use PyBytes for attr in trace-event-python · d90a375b
      Tony Jones 提交于
      [ Upstream commit 72e0b15cb24a497d7d0d4707cf51ff40c185ae8c ]
      
      With Python3.  PyUnicode_FromStringAndSize is unsafe to call on attr and will
      return NULL.  Use _PyBytes_FromStringAndSize (as with raw_buf).
      
      Below is the observed behavior without the fix.  Note it is first necessary
      to apply the prior fix (Add trace_context extension module to sys,modules):
      
        # ldd /usr/bin/perf | grep -i python
                libpython3.6m.so.1.0 => /usr/lib64/libpython3.6m.so.1.0 (0x00007f8e1dfb2000)
      
        # perf record -e raw_syscalls:sys_enter /bin/false
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.018 MB perf.data (21 samples) ]
      
        # perf script -g python | cat
        generated Python script: perf-script.py
      
        # perf script -s ./perf-script.py
        in trace_begin
        Segmentation fault (core dumped)
      Signed-off-by: NTony Jones <tonyj@suse.de>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jaroslav Škarvada <jskarvad@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
      Fixes: 66dfdff0 ("perf tools: Add Python 3 support")
      Link: http://lkml.kernel.org/r/20190124005229.16146-3-tonyj@suse.deSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      d90a375b
    • T
      perf report: Add s390 diagnosic sampling descriptor size · 5cdd0259
      Thomas Richter 提交于
      [ Upstream commit 2187d87eacd46f6214ce3dc9cfd7a558375a4153 ]
      
      On IBM z13 machine types 2964 and 2965 the descriptor
      sizes for sampling and diagnostic sampling entries
      might be missing in the trailer entry and are set to zero.
      
      This leads to a perf report failure when processing diagnostic
      sampling entries.
      
      This patch adds missing descriptor sizes when the trailer entry
      contains zero for these fields.
      
      Output before:
        [root@s38lp82 perf]#  ./perf report --stdio | fgrep Samples
        0xabbf0 [0x8]: failed to process type: 68
        Error:
        failed to process sample
        [root@s38lp82 perf]#
      
      Output after:
        [root@s38lp82 perf]#  ./perf report --stdio | fgrep Samples
        # Total Lost Samples: 0
        # Samples: 3K of event 'SF_CYCLES_BASIC_DIAG'
        # Samples: 162  of event 'CF_DIAG'
        [root@s38lp82 perf]#
      
      Fixes: 2b1444f2 ("perf report: Add raw report support for s390 auxiliary trace")
      Signed-off-by: NThomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: NHendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20190211100627.85714-1-tmricht@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      5cdd0259
    • H
      perf report: Don't shadow inlined symbol with different addr range · d41687c8
      He Kuang 提交于
      [ Upstream commit 7346195e8643482968f547483e0d823ec1982fab ]
      
      We can't assume inlined symbols with the same name are equal, because
      their address range may be different. This will cause the symbols with
      different addresses be shadowed when adding to the hist entry, and lead
      to ERANGE error when checking the symbol address during sample parse,
      the addr should be within the range of [sym.start, sym.end].
      
      The error message is like: "0x36aea60 [0x8]: failed to process type: 68".
      
      The second parameter of symbol__new() is the length of the fake symbol
      for the inline frame, which is the subtraction of the end and start
      address of base_sym.
      Signed-off-by: NHe Kuang <hekuang@huawei.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: aa441895 ("perf report: Compare symbol name for inlined frames when sorting")
      Link: http://lkml.kernel.org/r/20190219130531.15692-1-hekuang@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      d41687c8
    • W
      perf annotate: Fix getting source line failure · e6786f86
      Wei Li 提交于
      [ Upstream commit 11db1ad4513d6205d2519e1a30ff4cef746e3243 ]
      
      The output of "perf annotate -l --stdio xxx" changed since commit 425859ff
      ("perf annotate: No need to calculate notes->start twice") removed notes->start
      assignment in symbol__calc_lines(). It will get failed in
      find_address_in_section() from symbol__tty_annotate() subroutine as the
      a2l->addr is wrong. So the annotate summary doesn't report the line number of
      source code correctly.
      
      Before fix:
      
        liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ cat common_while_1.c
        void hotspot_1(void)
        {
      	volatile int i;
      
      	for (i = 0; i < 0x10000000; i++);
      	for (i = 0; i < 0x10000000; i++);
      	for (i = 0; i < 0x10000000; i++);
        }
      
        int main(void)
        {
      	hotspot_1();
      
      	return 0;
        }
        liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ gcc common_while_1.c -g -o common_while_1
      
        liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.488 MB perf.data (12498 samples) ]
        liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio
      
        Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
        ----------------------------------------------
      
         19.30 common_while_1[32]
         19.03 common_while_1[4e]
         19.01 common_while_1[16]
          5.04 common_while_1[13]
          4.99 common_while_1[4b]
          4.78 common_while_1[2c]
          4.77 common_while_1[10]
          4.66 common_while_1[2f]
          4.59 common_while_1[51]
          4.59 common_while_1[35]
          4.52 common_while_1[19]
          4.20 common_while_1[56]
          0.51 common_while_1[48]
         Percent |      Source code & Disassembly of common_while_1 for cycles:ppp (12480 samples, percent: local period)
        -----------------------------------------------------------------------------------------------------------------
               :
               :
               :
               :         Disassembly of section .text:
               :
               :         00000000000005fa <hotspot_1>:
               :         hotspot_1():
               :         void hotspot_1(void)
               :         {
          0.00 :   5fa:   push   %rbp
          0.00 :   5fb:   mov    %rsp,%rbp
               :                 volatile int i;
               :
               :                 for (i = 0; i < 0x10000000; i++);
          0.00 :   5fe:   movl   $0x0,-0x4(%rbp)
          0.00 :   605:   jmp    610 <hotspot_1+0x16>
          0.00 :   607:   mov    -0x4(%rbp),%eax
         common_while_1[10]    4.77 :   60a:   add    $0x1,%eax
         common_while_1[13]    5.04 :   60d:   mov    %eax,-0x4(%rbp)
         common_while_1[16]   19.01 :   610:   mov    -0x4(%rbp),%eax
         common_while_1[19]    4.52 :   613:   cmp    $0xfffffff,%eax
            0.00 :   618:   jle    607 <hotspot_1+0xd>
                 :                 for (i = 0; i < 0x10000000; i++);
        ...
      
      After fix:
      
        liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.488 MB perf.data (12500 samples) ]
        liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio
      
        Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
        ----------------------------------------------
      
         33.34 common_while_1.c:5
         33.34 common_while_1.c:6
         33.32 common_while_1.c:7
         Percent |      Source code & Disassembly of common_while_1 for cycles:ppp (12482 samples, percent: local period)
        -----------------------------------------------------------------------------------------------------------------
               :
               :
               :
               :         Disassembly of section .text:
               :
               :         00000000000005fa <hotspot_1>:
               :         hotspot_1():
               :         void hotspot_1(void)
               :         {
          0.00 :   5fa:   push   %rbp
          0.00 :   5fb:   mov    %rsp,%rbp
               :                 volatile int i;
               :
               :                 for (i = 0; i < 0x10000000; i++);
          0.00 :   5fe:   movl   $0x0,-0x4(%rbp)
          0.00 :   605:   jmp    610 <hotspot_1+0x16>
          0.00 :   607:   mov    -0x4(%rbp),%eax
         common_while_1.c:5    4.70 :   60a:   add    $0x1,%eax
          4.89 :   60d:   mov    %eax,-0x4(%rbp)
         common_while_1.c:5   19.03 :   610:   mov    -0x4(%rbp),%eax
         common_while_1.c:5    4.72 :   613:   cmp    $0xfffffff,%eax
          0.00 :   618:   jle    607 <hotspot_1+0xd>
               :                 for (i = 0; i < 0x10000000; i++);
          0.00 :   61a:   movl   $0x0,-0x4(%rbp)
          0.00 :   621:   jmp    62c <hotspot_1+0x32>
          0.00 :   623:   mov    -0x4(%rbp),%eax
         common_while_1.c:6    4.54 :   626:   add    $0x1,%eax
          4.73 :   629:   mov    %eax,-0x4(%rbp)
         common_while_1.c:6   19.54 :   62c:   mov    -0x4(%rbp),%eax
         common_while_1.c:6    4.54 :   62f:   cmp    $0xfffffff,%eax
        ...
      Signed-off-by: NWei Li <liwei391@huawei.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 425859ff ("perf annotate: No need to calculate notes->start twice")
      Link: http://lkml.kernel.org/r/20190221095716.39529-1-liwei391@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      e6786f86
  11. 03 4月, 2019 2 次提交
  12. 27 3月, 2019 1 次提交
  13. 24 3月, 2019 5 次提交
  14. 14 3月, 2019 2 次提交
    • J
      perf symbols: Filter out hidden symbols from labels · 47e3f3c0
      Jiri Olsa 提交于
      [ Upstream commit 59a17706915fe5ea6f711e1f92d4fb706bce07fe ]
      
      When perf is built with the annobin plugin (RHEL8 build) extra symbols
      are added to its binary:
      
        # nm perf | grep annobin | head -10
        0000000000241100 t .annobin_annotate.c
        0000000000326490 t .annobin_annotate.c
        0000000000249255 t .annobin_annotate.c_end
        00000000003283a8 t .annobin_annotate.c_end
        00000000001bce18 t .annobin_annotate.c_end.hot
        00000000001bce18 t .annobin_annotate.c_end.hot
        00000000001bc3e2 t .annobin_annotate.c_end.unlikely
        00000000001bc400 t .annobin_annotate.c_end.unlikely
        00000000001bce18 t .annobin_annotate.c.hot
        00000000001bce18 t .annobin_annotate.c.hot
        ...
      
      Those symbols have no use for report or annotation and should be
      skipped.  Moreover they interfere with the DWARF unwind test on the PPC
      arch, where they are mixed with checked symbols and then the test fails:
      
        # perf test dwarf -v
        59: Test dwarf unwind                                     :
        --- start ---
        test child forked, pid 8515
        unwind: .annobin_dwarf_unwind.c:ip = 0x10dba40dc (0x2740dc)
        ...
        got: .annobin_dwarf_unwind.c 0x10dba40dc, expecting test__arch_unwind_sample
        unwind: failed with 'no error'
      
      The annobin symbols are defined as NOTYPE/LOCAL/HIDDEN:
      
        # readelf -s ./perf | grep annobin | head -1
          40: 00000000001bce4f     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_init.c
      
      They can still pass the check for the label symbol. Adding check for
      HIDDEN and INTERNAL (as suggested by Nick below) visibility and filter
      out such symbols.
      
      >   Just to be awkward, if you are going to ignore STV_HIDDEN
      >   symbols then you should probably also ignore STV_INTERNAL ones
      >   as well...  Annobin does not generate them, but you never know,
      >   one day some other tool might create some.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Clifton <nickc@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190128133526.GD15461@kravaSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      47e3f3c0
    • S
      perf tools: Handle TOPOLOGY headers with no CPU · 1e4b7541
      Stephane Eranian 提交于
      [ Upstream commit 1497e804d1a6e2bd9107ddf64b0310449f4673eb ]
      
      This patch fixes an issue in cpumap.c when used with the TOPOLOGY
      header. In some configurations, some NUMA nodes may have no CPU (empty
      cpulist). Yet a cpumap map must be created otherwise perf abort with an
      error. This patch handles this case by creating a dummy map.
      
        Before:
      
        $ perf record -o - -e cycles noploop 2 | perf script -i -
        0x6e8 [0x6c]: failed to process type: 80
      
        After:
      
        $ perf record -o - -e cycles noploop 2 | perf script -i -
        noploop for 2 seconds
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1547885559-1657-1-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      1e4b7541
  15. 20 2月, 2019 1 次提交
    • J
      perf report: Fix wrong iteration count in --branch-history · d20bfcb5
      Jin Yao 提交于
      [ Upstream commit a3366db06bb656cef2e03f30f780d93059bcc594 ]
      
      By calculating the removed loops, we can get the iteration count.
      
      But the iteration count could be reported incorrectly, reporting
      impossibly high counts.
      
      That's because previous code uses the number of removed LBR entries for
      the iteration count. That's not good. Fix this by increasing the
      iteration count when a loop is detected.
      
      When matching the chain, the iteration count would be added up, finally we need
      to compute the average value when printing out.
      
      For example,
      
        $ perf report --branch-history --stdio --no-children
      
      Before:
      
        ---f2 +0
           |
           |--33.62%--f1 +9 (cycles:1)
           |          f1 +0
           |          main +22 (cycles:1)
           |          main +17
           |          main +38 (cycles:1)
           |          main +27
           |          f1 +26 (cycles:1)
           |          f1 +24
           |          f2 +27 (cycles:7)
           |          f2 +0
           |          f1 +19 (cycles:1)
           |          f1 +14
           |          f2 +27 (cycles:11)
           |          f2 +0
           |          f1 +9 (cycles:1 iter:2968 avg_cycles:3)
           |          f1 +0
           |          main +22 (cycles:1 iter:2968 avg_cycles:3)
           |          main +17
           |          main +38 (cycles:1 iter:2968 avg_cycles:3)
      
      2968 is an impossible high iteration count and avg_cycles is too small.
      
      After:
      
        ---f2 +0
           |
           |--33.62%--f1 +9 (cycles:1)
           |          f1 +0
           |          main +22 (cycles:1)
           |          main +17
           |          main +38 (cycles:1)
           |          main +27
           |          f1 +26 (cycles:1)
           |          f1 +24
           |          f2 +27 (cycles:7)
           |          f2 +0
           |          f1 +19 (cycles:1)
           |          f1 +14
           |          f2 +27 (cycles:11)
           |          f2 +0
           |          f1 +9 (cycles:1 iter:1 avg_cycles:23)
           |          f1 +0
           |          main +22 (cycles:1 iter:1 avg_cycles:23)
           |          main +17
           |          main +38 (cycles:1 iter:1 avg_cycles:23)
      
      avg_cycles:23 is the average cycles of this iteration.
      
      Fixes: c4ee0625 ("perf report: Calculate the average cycles of iterations")
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1546582230-17507-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      d20bfcb5
  16. 13 2月, 2019 2 次提交