1. 14 Aug, 2019 1 commit
    • I
      perf tools: Add helpers to use capabilities if present · c22e150e
      Igor Lubashev committed
      Add utilities to help check the capabilities of the running process.  Make
      perf link with libcap, if it is available. If libcap-dev[el] is not
      installed, fall back to the geteuid() == 0 test used before.
      
      Committer notes:
      
        $ perf test python
        18: 'import perf' in python                               : FAILED!
        $ perf test -v python
        Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
        18: 'import perf' in python                               :
        --- start ---
        test child forked, pid 23288
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        ImportError: /tmp/build/perf/python/perf.so: undefined symbol: cap_get_flag
        test child finished with -1
        ---- end ----
        'import perf' in python: FAILED!
        $
      
      This happens because, unlike the perf binary generated with this patch
      applied:
      
        $ ldd /tmp/build/perf/perf | grep libcap
        	libcap.so.2 => /lib64/libcap.so.2 (0x00007f724a4ef000)
        $
      
      The python binding isn't linking with libcap:
      
        $ ldd /tmp/build/perf/python/perf.so | grep libcap
        $
      
      So add 'cap' to the 'extra_libraries' variable in
      tools/perf/util/setup.py, and rebuild:
      
        $ perf test python
        18: 'import perf' in python                               : Ok
        $
      
      If we explicitly disable libcap it also continues to work:
      
        $ make NO_LIBCAP=1 -C tools/perf O=/tmp/build/perf install-bin
        $ ldd /tmp/build/perf/perf | grep libcap
        $ ldd /tmp/build/perf/python/perf.so | grep libcap
        $ perf test python
        18: 'import perf' in python                               : Ok
        $
      Signed-off-by: Igor Lubashev <ilubashe@akamai.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      [ split from a larger patch ]
      Link: http://lkml.kernel.org/r/8a1e76cf5c7c9796d0d4d240fbaa85305298aafa.1565188228.git.ilubashe@akamai.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c22e150e
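The fallback described in commit c22e150e can be sketched in a few lines. This is a minimal illustration, not the perf implementation: `has_capability`, its parameters, and the idea of passing the effective capability set as an integer mask are assumptions made for the example. The only fixed value is `CAP_SYS_ADMIN = 21`, the Linux capability number.

```python
import os

CAP_SYS_ADMIN = 21  # Linux capability number for CAP_SYS_ADMIN


def has_capability(cap, effective_mask=None, euid=None):
    """Hypothetical helper: capability test with a root-only fallback.

    effective_mask stands in for what a capability library would report;
    None models a build without libcap.
    """
    if effective_mask is not None:
        # Capability support present: test the requested bit.
        return bool((effective_mask >> cap) & 1)
    if euid is None:
        euid = os.geteuid()
    # Without capability support, fall back to the old "is root" test.
    return euid == 0
```

With a mask available, an unprivileged user holding the capability passes the check; without one, only an effective UID of 0 does.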
  2. 13 Aug, 2019 3 commits
  3. 09 Aug, 2019 5 commits
    • T
      perf annotate: Fix s390 gap between kernel end and module start · b9c0a649
      Thomas Richter committed
      During execution of command 'perf top' the error message:
      
         Not enough memory for annotating '__irf_end' symbol!)
      
      is emitted from this call sequence:
        __cmd_top
          perf_top__mmap_read
            perf_top__mmap_read_idx
              perf_event__process_sample
                hist_entry_iter__add
                  hist_iter__top_callback
                    perf_top__record_precise_ip
                      hist_entry__inc_addr_samples
                        symbol__inc_addr_samples
                          symbol__get_annotation
                            symbol__alloc_hist
      
      In this function the size of symbol __irf_end is calculated. The size of
      a symbol is the difference between its start and end address.
      
      When the symbol was read the first time, its start and end were set to:
      
         symbol__new: __irf_end 0xe954d0-0xe954d0
      
      which is correct and matches /proc/kallsyms:
      
         root@s8360046:~/linux-4.15.0/tools/perf# fgrep _irf_end /proc/kallsyms
         0000000000e954d0 t __irf_end
         root@s8360046:~/linux-4.15.0/tools/perf#
      
      In function symbol__alloc_hist() the end of symbol __irf_end is
      
        symbol__alloc_hist sym:__irf_end start:0xe954d0 end:0x3ff80045a8
      
      which is identical to the first module entry in /proc/kallsyms.
      
      This results in a size for symbol __irf_end, as used for histogram
      analysis, of 70334140059072 bytes, and the malloc() of this size fails.
      
      The root cause of this is function
        __dso__load_kallsyms()
        +-> symbols__fixup_end()
      
      Function symbols__fixup_end() enlarges the last symbol in the kallsyms
      map:
      
         # fgrep __irf_end /proc/kallsyms
         0000000000e954d0 t __irf_end
         #
      
      to the start address of the first module:
         # cat /proc/kallsyms | sort  | egrep ' [tT] '
         ....
         0000000000e952d0 T __security_initcall_end
         0000000000e954d0 T __initramfs_size
         0000000000e954d0 t __irf_end
         000003ff800045a8 T fc_get_event_number       [scsi_transport_fc]
         000003ff800045d0 t store_fc_vport_disable    [scsi_transport_fc]
         000003ff800046a8 T scsi_is_fc_rport  [scsi_transport_fc]
         000003ff800046d0 t fc_target_setup   [scsi_transport_fc]
      
      On s390 the kernel is located around memory address 0x200, 0x10000 or
      0x100000, depending on the Linux version. Modules, however, start
      somewhere around 0x3ff xxxx xxxx.
      
      This is different from x86 and produces a large gap for which histogram
      allocation fails.
      
      Fix this by detecting the kernel's last symbol and not adjusting it.
      Introduce a weak function and handle s390 specifics.
      Reported-by: Klaus Theurich <klaus.theurich@de.ibm.com>
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20190724122703.3996-2-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b9c0a649
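The effect of extending the last kernel symbol across the gap can be reproduced with the addresses from the /proc/kallsyms listing in commit b9c0a649. This is only a sketch: `fixup_end` and its flag are hypothetical, leaving the symbol at zero size is a simplified stand-in for "no adjustment", and the 16 bytes per address plus 64-byte header are assumed values, chosen because they reproduce the byte count quoted in the commit message rather than taken from the source.

```python
# (address, name, module) taken from the kallsyms listing; None = kernel symbol.
SYMBOLS = [
    (0xE952D0, "__security_initcall_end", None),
    (0xE954D0, "__irf_end", None),
    (0x3FF800045A8, "fc_get_event_number", "scsi_transport_fc"),
    (0x3FF800045D0, "store_fc_vport_disable", "scsi_transport_fc"),
]


def fixup_end(symbols, skip_last_kernel_symbol):
    """Return {name: size}, extending each symbol to the next symbol's start."""
    sizes = {}
    for (start, name, module), (next_start, _, next_module) in zip(symbols, symbols[1:]):
        crosses_gap = module is None and next_module is not None
        if crosses_gap and skip_last_kernel_symbol:
            end = start  # leave the last kernel symbol unadjusted
        else:
            end = next_start
        sizes[name] = end - start
    return sizes


before = fixup_end(SYMBOLS, skip_last_kernel_symbol=False)
after = fixup_end(SYMBOLS, skip_last_kernel_symbol=True)

# Assumed histogram layout: 16 bytes per address plus a 64-byte header.
hist_bytes = before["__irf_end"] * 16 + 64
print(hex(before["__irf_end"]), hist_bytes, after["__irf_end"])
```

Under these assumptions the unadjusted symbol needs no oversized histogram, while the extended one asks for roughly 70 TB.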
    • T
      perf record: Fix module size on s390 · 12a6d294
      Thomas Richter committed
      On s390 the modules loaded in memory have the text segment located after
      the GOT and Relocation table. This can be seen with this output:
      
        [root@m35lp76 perf]# fgrep qeth /proc/modules
        qeth 151552 1 qeth_l2, Live 0x000003ff800b2000
        ...
        [root@m35lp76 perf]# cat /sys/module/qeth/sections/.text
        0x000003ff800b3990
        [root@m35lp76 perf]#
      
      There is an offset of 0x1990 bytes. The size of the qeth module is
      151552 bytes (0x25000 in hex).
      
      The location of the GOT/relocation table at the beginning of a module is
      unique to s390.
      
      commit 203d8a4a ("perf s390: Fix 'start' address of module's map")
      adjusts the start address of a module in the map structures, but does
      not adjust the size of the modules. This leads to overlapping of module
      maps as this example shows:
      
      [root@m35lp76 perf] # ./perf report -D
           0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x25000)
                @ 0]:  x /lib/modules/.../qeth.ko.xz
           0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x8000)
                @ 0]:  x /lib/modules/.../ip6_tables.ko.xz
      
      The module qeth.ko has its start address adjusted to 0x3ff800b3990, but
      its size is unchanged and the module ends at 0x3ff800d8990.  This end
      address overlaps with the next module's start address of 0x3ff800d85a0.
      
      When the size of the leading GOT/Relocation table stored in the
      beginning of the text segment (0x1990 bytes) is subtracted from module
      qeth end address, there are no overlaps anymore:
      
         0x3ff800d8990 - 0x1990 = 0x3ff800d7000
      
      which is the same as
      
         0x3ff800b2000 + 0x25000 = 0x3ff800d7000.
      
      To fix this issue, also adjust the module's size in function
      arch__fix_module_text_start(). Add another function parameter named size
      and reduce the size of the module when the text segment start address is
      changed.
      
      Output after:
           0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x23670)
                @ 0]:  x /lib/modules/.../qeth.ko.xz
           0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x7a60)
                @ 0]:  x /lib/modules/.../ip6_tables.ko.xz
      Reported-by: Stefan Liebler <stli@linux.ibm.com>
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: stable@vger.kernel.org
      Fixes: 203d8a4a ("perf s390: Fix 'start' address of module's map")
      Link: http://lkml.kernel.org/r/20190724122703.3996-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      12a6d294
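The arithmetic in commit 12a6d294 can be checked with the qeth values quoted in its message. The helper below is a hypothetical stand-in that only mirrors the idea of adjusting start and size together; it is not the signature or body of arch__fix_module_text_start().

```python
def fix_module_text_start(start, size, text_start):
    """Hypothetical helper: move a module map to its .text address and shrink it."""
    offset = text_start - start  # size of the leading GOT/relocation table
    return text_start, size - offset


QETH_START = 0x3FF800B2000         # from /proc/modules
QETH_SIZE = 0x25000                # 151552 bytes
QETH_TEXT = 0x3FF800B3990          # from /sys/module/qeth/sections/.text
NEXT_MODULE_START = 0x3FF800D85A0  # start of the ip6_tables map

new_start, new_size = fix_module_text_start(QETH_START, QETH_SIZE, QETH_TEXT)
old_end = QETH_TEXT + QETH_SIZE  # start adjusted, size left unchanged
new_end = new_start + new_size   # start and size both adjusted
print(hex(new_size), hex(old_end), hex(new_end))
```

The old end address lies beyond the next module's start, while the adjusted end stays below it, which matches the 0x23670 size shown in the "Output after" listing.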
    • H
      perf cpumap: Fix writing to illegal memory in handling cpumap mask · 5f5e25f1
      He Zhe committed
      cpu_map__snprint_mask() would write to illegal memory pointed to by the
      result of zalloc(0) when there is only one CPU.
      
      This patch fixes the calculation and adds a sanity check on the input
      parameters.
      Signed-off-by: He Zhe <zhe.he@windriver.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Fixes: 4400ac8a ("perf cpumap: Introduce cpu_map__snprint_mask()")
      Link: http://lkml.kernel.org/r/1564734592-15624-2-git-send-email-zhe.he@windriver.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5f5e25f1
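The message of commit 5f5e25f1 does not show the faulty calculation. As an illustration only, the sketch below assumes the bitmap was sized by rounding up the highest CPU number, `(last_cpu + 7) // 8`, which yields zero bytes when CPU 0 is the only CPU. Both function names are hypothetical.

```python
def mask_bytes_rounded(last_cpu):
    """Assumed faulty sizing: rounds the CPU number, not the number of bits."""
    return (last_cpu + 7) // 8


def mask_bytes_needed(last_cpu):
    """Bytes required so that bit number last_cpu fits in the bitmap."""
    return last_cpu // 8 + 1


for last_cpu in (0, 7, 8):
    print(last_cpu, mask_bytes_rounded(last_cpu), mask_bytes_needed(last_cpu))
```

CPU numbers are zero-based, so the highest CPU number is one less than the number of bits the mask must hold; a sizing that ignores this comes up short for a single CPU and at every multiple of eight.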
    • A
      perf db-export: Fix thread__exec_comm() · 3de7ae0b
      Adrian Hunter committed
      Threads synthesized from /proc have comms with a start time of zero, and
      not marked as "exec". Currently, there can be 2 such comms. The first is
      created by processing a synthesized fork event and is set to the
      parent's comm string, and the second by processing a synthesized comm
      event set to the thread's current comm string.
      
      In the absence of an "exec" comm, thread__exec_comm() picks the last
      (oldest) comm, which, in the case above, is the parent's comm string.
      For a main thread, that is very probably wrong. Use the second-to-last
      in that case.
      
      This affects only db-export because it is the only user of
      thread__exec_comm().
      
      Example:
      
        $ sudo perf record -a -o pt-a-sleep-1 -e intel_pt//u -- sleep 1
        $ sudo chown ahunter pt-a-sleep-1
      
      Before:
      
        $ perf script -i pt-a-sleep-1 --itrace=bep -s tools/perf/scripts/python/export-to-sqlite.py pt-a-sleep-1.db branches calls
        $ sqlite3 -header -column pt-a-sleep-1.db 'select * from comm_threads_view'
        comm_id     command     thread_id   pid         tid
        ----------  ----------  ----------  ----------  ----------
        1           swapper     1           0           0
        2           rcu_sched   2           10          10
        3           kthreadd    3           78          78
        5           sudo        4           15180       15180
        5           sudo        5           15180       15182
        7           kworker/4:  6           10335       10335
        8           kthreadd    7           55          55
        10          systemd     8           865         865
        10          systemd     9           865         875
        13          perf        10          15181       15181
        15          sleep       10          15181       15181
        16          kworker/3:  11          14179       14179
        17          kthreadd    12          29376       29376
        19          systemd     13          746         746
        21          systemd     14          401         401
        23          systemd     15          879         879
        23          systemd     16          879         945
        25          kthreadd    17          556         556
        27          kworker/u1  18          14136       14136
        28          kworker/u1  19          15021       15021
        29          kthreadd    20          509         509
        31          systemd     21          836         836
        31          systemd     22          836         967
        33          systemd     23          1148        1148
        33          systemd     24          1148        1163
        35          kworker/2:  25          17988       17988
        36          kworker/0:  26          13478       13478
      
      After:
      
        $ perf script -i pt-a-sleep-1 --itrace=bep -s tools/perf/scripts/python/export-to-sqlite.py pt-a-sleep-1b.db branches calls
        $ sqlite3 -header -column pt-a-sleep-1b.db 'select * from comm_threads_view'
        comm_id     command     thread_id   pid         tid
        ----------  ----------  ----------  ----------  ----------
        1           swapper     1           0           0
        2           rcu_sched   2           10          10
        3           kswapd0     3           78          78
        4           perf        4           15180       15180
        4           perf        5           15180       15182
        6           kworker/4:  6           10335       10335
        7           kcompactd0  7           55          55
        8           accounts-d  8           865         865
        8           accounts-d  9           865         875
        10          perf        10          15181       15181
        12          sleep       10          15181       15181
        13          kworker/3:  11          14179       14179
        14          kworker/1:  12          29376       29376
        15          haveged     13          746         746
        16          systemd-jo  14          401         401
        17          NetworkMan  15          879         879
        17          NetworkMan  16          879         945
        19          irq/131-iw  17          556         556
        20          kworker/u1  18          14136       14136
        21          kworker/u1  19          15021       15021
        22          kworker/u1  20          509         509
        23          thermald    21          836         836
        23          thermald    22          836         967
        25          unity-sett  23          1148        1148
        25          unity-sett  24          1148        1163
        27          kworker/2:  25          17988       17988
        28          kworker/0:  26          13478       13478
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: 65de51f9 ("perf tools: Identify which comms are from exec")
      Link: http://lkml.kernel.org/r/20190808064823.14846-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3de7ae0b
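The selection rule described in commit 3de7ae0b can be sketched as follows. The `Comm` class, the `exec_comm` function and the newest-first list ordering are assumptions made for the illustration; the sample names come from the before/after tables for pid 865.

```python
from dataclasses import dataclass


@dataclass
class Comm:
    name: str
    start: int = 0        # 0 for comms synthesized from /proc
    is_exec: bool = False


def exec_comm(comms, is_main_thread):
    """Pick the exec comm; comms is ordered newest first (last = oldest)."""
    last = second_last = None
    for comm in comms:
        if comm.is_exec:
            return comm
        second_last, last = last, comm
    # The oldest comm with no start time may be the parent's comm from a
    # synthesized fork event; for a main thread prefer the one after it.
    if second_last is not None and last.start == 0 and is_main_thread:
        return second_last
    return last


synthesized = [Comm("accounts-d"), Comm("systemd")]
print(exec_comm(synthesized, is_main_thread=True).name)
```

In this sketch a comm marked as exec always wins, and the second-to-last comm is used only when no exec comm exists, the oldest comm has a zero start time, and the thread is a main thread.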
    • A
      perf annotate: Fix printing of unaugmented disassembled instructions from BPF · 85127775
      Arnaldo Carvalho de Melo committed
      The code to disassemble BPF programs uses binutils' disassembling
      routines, which in turn use fprintf to print to a memstream FILE,
      adding a newline at the end of each line, which ends up confusing the
      TUI routines called from:
      
        annotate_browser__write()
          annotate_line__write()
            annotate_browser__printf()
              ui_browser__vprintf()
                SLsmg_vprintf()
      
      The SLsmg_vprintf() function in the slang library gets confused by the
      terminating newline. The disasm_line__parse() function parses both the
      lines produced by the BPF-specific disassembler (which uses binutils'
      libopcodes) and the lines produced by the objdump-based disassembler
      used for everything else (which doesn't add this terminating newline),
      so make it trim the end of the line in addition to the beginning.
      
      This way, when disasm_line->ops.raw is used, i.e. for instructions
      without a special scnprintf() method, that \n no longer gets in the way
      of filling the screen right after the instruction with spaces. The
      padding avoids leaving behind what was on the screen before; with the
      \n in place the annotation screen was garbled, scrolling broke, etc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Fixes: 6987561c ("perf annotate: Enable annotation of BPF programs")
      Link: https://lkml.kernel.org/n/tip-unbr5a5efakobfr6rhxq99ta@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      85127775
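The trimming described in commit 85127775 can be illustrated with a small sketch. Both functions and the sample instruction are hypothetical; the sketch only shows why a trailing newline has to be removed before a line is padded with spaces to the screen width.

```python
def parse_disasm_line(line):
    """Hypothetical parser: split a disassembly line into (mnemonic, operands)."""
    text = line.strip()  # trim the end as well as the beginning
    name, _, raw = text.partition(" ")
    return name, raw.strip()


def render(line, width):
    """Pad the instruction with spaces so stale screen content is overwritten."""
    name, raw = parse_disasm_line(line)
    return f"{name} {raw}".ljust(width)


sample = "  mov    %rsp,%rbp\n"  # made-up line with a trailing newline
print(repr(render(sample, 30)))
```

If the newline survived parsing, the padding would land after it rather than directly after the instruction on the same row.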
  4. 30 Jul, 2019 31 commits