1. 22 8月, 2017 2 次提交
    • A
      perf evsel: Fix buffer overflow while freeing events · 475fb533
      Andi Kleen 提交于
      Fix buffer overflow for:
      
        % perf stat -e msr/tsc/,cstate_core/c7-residency/ true
      
      that causes glibc free list corruption. For some reason it doesn't
      trigger in valgrind, but it is visible in AS:
      
        =================================================================
        ==32681==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000003f5c at pc 0x0000005671ef bp 0x7ffdaaac9ac0 sp 0x7ffdaaac9ab0
        READ of size 4 at 0x603000003f5c thread T0
          #0 0x5671ee in perf_evsel__close_fd util/evsel.c:1196
          #1 0x56c57a in perf_evsel__close util/evsel.c:1717
          #2 0x55ed5f in perf_evlist__close util/evlist.c:1631
          #3 0x4647e1 in __run_perf_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:749
          #4 0x4648e3 in run_perf_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:767
          #5 0x46e1bc in cmd_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:2785
          #6 0x52f83d in run_builtin /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:296
          #7 0x52fd49 in handle_internal_command /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:348
          #8 0x5300de in run_argv /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:392
          #9 0x5308f3 in main /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:530
          #10 0x7f0672d13400 in __libc_start_main (/lib64/libc.so.6+0x20400)
          #11 0x428419 in _start (/home/ak/hle/obj-perf/perf+0x428419)
      
        0x603000003f5c is located 0 bytes to the right of 28-byte region [0x603000003f40,0x603000003f5c)
        allocated by thread T0 here:
          #0 0x7f0675139020 in calloc (/lib64/libasan.so.3+0xc7020)
          #1 0x648a2d in zalloc util/util.h:23
          #2 0x648a88 in xyarray__new util/xyarray.c:9
          #3 0x566419 in perf_evsel__alloc_fd util/evsel.c:1039
          #4 0x56b427 in perf_evsel__open util/evsel.c:1529
          #5 0x56c620 in perf_evsel__open_per_thread util/evsel.c:1730
          #6 0x461dea in create_perf_stat_counter /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:263
          #7 0x4637d7 in __run_perf_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:600
          #8 0x4648e3 in run_perf_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:767
          #9 0x46e1bc in cmd_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:2785
          #10 0x52f83d in run_builtin /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:296
          #11 0x52fd49 in handle_internal_command /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:348
          #12 0x5300de in run_argv /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:392
          #13 0x5308f3 in main /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:530
          #14 0x7f0672d13400 in __libc_start_main (/lib64/libc.so.6+0x20400)
      
      The event is allocated with cpus == 1, but freed with cpus == real number
      When the evsel close function walks the file descriptors it exceeds the
      fd xyarray boundaries and reads random memory.
      
      v2:
      
      Now that xyarrays save their original dimensions we can use these to
      iterate the two dimensional fd arrays. Fix some users (close, ioctl) in
      evsel.c to use these fields directly. This allows simplifying the code
      and dropping quite a few function arguments. Adjust all callers by
      removing the unneeded arguments.
      
      The actual perf event reading still uses the original values from the
      evsel list.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170811232634.30465-2-andi@firstfloor.org
      [ Fix up xy_max_[xy]() -> xyarray__max_[xy]() ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      475fb533
    • A
      perf xyarray: Save max_x, max_y · d74be476
      Andi Kleen 提交于
      Save the original array dimensions in xyarrays, so that users can
      retrieve them later. Add some inline functions to access these fields.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170811232634.30465-1-andi@firstfloor.org
      [ As noticed by Jiri, fix up namespacing: xy__method() -> xyarray__method() ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d74be476
  2. 18 8月, 2017 6 次提交
  3. 16 8月, 2017 1 次提交
    • W
      perf bpf: Fix endianness problem when loading parameters in prologue · db26984a
      Wang Nan 提交于
      Perf's BPF prologue generator unconditionally fetches 8 bytes for
      function parameters, which causes problems on big endian machines. Thomas
      gives a detailed analysis for this problem:
      
       http://lkml.kernel.org/r/968ebda5-abe4-8830-8d69-49f62529d151@linux.vnet.ibm.com
      
       ---- 8< ----
        I investigated perf test BPF for s390x and have a question regarding
        the 38.3 subtest (bpf-prologue test) which fails on s390x.
      
        When I turn on trace_printk in tests/bpf-script-test-prologue.c
        I see this output in /sys/kernel/debug/tracing/trace:
      
        [root@s8360047 perf]# cat /sys/kernel/debug/tracing/trace
        perf-30229 [000] d..2 170161.535791: : f_mode 2001d00000000 offset:0 orig:0
        perf-30229 [000] d..2 170161.535809: : f_mode 6001f00000000 offset:0 orig:0
        perf-30229 [000] d..2 170161.535815: : f_mode 6001f00000000 offset:1 orig:0
        perf-30229 [000] d..2 170161.535819: : f_mode 2001d00000000 offset:1 orig:0
        perf-30229 [000] d..2 170161.535822: : f_mode 2001d00000000 offset:2 orig:1
        perf-30229 [000] d..2 170161.535825: : f_mode 6001f00000000 offset:2 orig:1
        perf-30229 [000] d..2 170161.535828: : f_mode 6001f00000000 offset:3 orig:1
        perf-30229 [000] d..2 170161.535832: : f_mode 2001d00000000 offset:3 orig:1
        perf-30229 [000] d..2 170161.535835: : f_mode 2001d00000000 offset:4 orig:0
        perf-30229 [000] d..2 170161.535841: : f_mode 6001f00000000 offset:4 orig:0
      
        [...]
      
        There are 3 parameters the eBPF program tests/bpf-script-test-prologue.c
        accesses: f_mode (member of struct file at offset 140) offset and orig.  They
        are parameters of the lseek() system call triggered in this test case in
        function llseek_loop().
      
        What is really strange is the value of f_mode. It is an 8 byte value, whereas
        in the probe event it is defined as a 4 byte value.  The lower 4 bytes are all
        zero and do not belong to member f_mode.  The correct value should be 2001d for
        read-only and 6001f for read-write open mode.
      
        Here is the output of the 'perf test -vv bpf' trace:
        Try to find probe point from debuginfo.
        Matched function: null_lseek [2d9310d]
         Probe point found: null_lseek+0
        Searching 'file' variable in context.
        Converting variable file into trace event.
        converting f_mode in file
        f_mode type is unsigned int.
        Opening /sys/kernel/debug/tracing//README write=0
        Searching 'offset' variable in context.
        Converting variable offset into trace event.
        offset type is long long int.
        Searching 'orig' variable in context.
        Converting variable orig into trace event.
        orig type is int.
        Found 1 probe_trace_events.
        Opening /sys/kernel/debug/tracing//kprobe_events write=1
        Writing event: p:perf_bpf_probe/func _text+8794224 f_mode=+140(%r2):x32
       ---- 8< ----
      
      This patch parses the type of each argument and converts data from memory to
      expected type.
      
      Now the test runs successfully on 4.13.0-rc5:
      
        [root@s8360046 perf]# ./perf test  bpf
        38: BPF filter                                 :
        38.1: Basic BPF filtering                      : Ok
        38.2: BPF pinning                              : Ok
        38.3: BPF prologue generation                  : Ok
        38.4: BPF relocation checker                   : Ok
        [root@s8360046 perf]#
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20170815092159.31912-1-tmricht@linux.vnet.ibm.comSigned-off-by: NThomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      db26984a
  4. 12 8月, 2017 4 次提交
    • T
      perf report: Fix module symbol adjustment for s390x · 4a084ecf
      Thomas Richter 提交于
      The 'perf report' tool does not display the addresses of kernel module
      symbols correctly.
      
      For example symbol qeth_send_ipa_cmd in kernel module qeth.ko has this
      relative address for function qeth_send_ipa_cmd():
      
        [root@s8360047 linux]# nm -g drivers/s390/net/qeth.ko | fgrep send_ipa_cmd
        0000000000013088 T qeth_send_ipa_cmd
      
      The module is loaded at address:
      
        [root@s8360047 linux]# cat /sys/module/qeth/sections/.text
        0x000003ff80296d20
        [root@s8360047 linux]#
      
      This should result in a start address of:
      
        0x13088 + 0x3ff80296d20 = 0x3ff802a9da8
      
      Using crash to verify the address on a live system:
      
        [root@s8360046 linux]# crash vmlinux
      
        crash 7.1.9++
        Copyright (C) 2002-2016  Red Hat, Inc.
        Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
      
        [...]
      
        crash> mod -s qeth drivers/s390/net/qeth.ko
             MODULE       NAME        SIZE  OBJECT FILE
             3ff8028d700  qeth      151552  drivers/s390/net/qeth.ko
        crash> sym qeth_send_ipa_cmd
        3ff802a9da8 (T) qeth_send_ipa_cmd [qeth] /root/linux/drivers/s390/net/qeth_core_main.c: 2944
        crash>
      
      Now perf report displays the address of symbol qeth_send_ipa_cmd:
      symbol__new:
      
        qeth_send_ipa_cmd 0x130f0-0x132ce
      
      There is a difference of 0x68 between the entry in the symbol table (see
      nm command above) and perf. The difference is from the offset the .text
      segment of qeth.ko:
      
        [root@s8360047 perf]# readelf -a drivers/s390/net/qeth.ko
        Section Headers:
        [Nr] Name              Type             Address           Offset
             Size              EntSize          Flags  Link  Info  Align
        [ 0]                   NULL             0000000000000000  00000000
             0000000000000000  0000000000000000           0     0     0
        [ 1] .note.gnu.build-i NOTE             0000000000000000  00000040
             0000000000000024  0000000000000000   A       0     0     4
        [ 2] .text             PROGBITS         0000000000000000  00000068
             000000000001c8a0  0000000000000000  AX       0     0     8
      
      As seen the .text segment has an offset of 0x68 with start address 0x0.
      Therefore 0x68 is added to the address of qeth_send_ipa_cmd and thus
      0x13088 + 0x68 = 0x130f0 is displayed.
      
      This is wrong, perf report needs to display the start address of symbol
      qeth_send_ipa_cmd at 0x13088 + qeth.ko.text section start address.
      
      The qeth.ko module .text start address is available in the qeth.ko DSO
      map. Just identify the kernel module symbols and correct the addresses.
      
      With the fix I see this correct address for symbol: symbol__new:
      qeth_send_ipa_cmd 0x3ff802a9da8-0x3ff802a9f86
      Signed-off-by: NThomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Reviewed-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
      LPU-Reference: 20170803134902.47207-1-tmricht@linux.vnet.ibm.com
      Link: http://lkml.kernel.org/n/tip-q8lktlpoxb5e3dj52u1s1rw4@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4a084ecf
    • T
      perf record: Fix wrong size in perf_record_mmap for last kernel module · 9ad4652b
      Thomas Richter 提交于
      During work on perf report for s390 I ran into the following issue:
      
      0 0x318 [0x78]: PERF_RECORD_MMAP -1/0:
              [0x3ff804d6990(0xfffffc007fb2966f) @ 0]:
              x /lib/modules/4.12.0perf1+/kernel/drivers/s390/net/qeth_l2.ko
      
      This is a PERF_RECORD_MMAP entry of the perf.data file with an invalid
      module size for qeth_l2.ko (the s390 ethernet device driver).
      
      Even a mainframe does not have 0xfffffc007fb2966f bytes of main memory.
      
      It turned out that this wrong size is created by the perf record
      command.  What happens is this function call sequence from
      __cmd_record():
      
        perf_session__new():
          perf_session__create_kernel_maps():
            machine__create_kernel_maps():
              machine__create_modules():   Creates map for all loaded kernel modules.
                modules__parse():   Reads /proc/modules and extracts module name and
                                    load address (1st and last column)
                  machine__create_module():   Called for every module found in /proc/modules.
                                    Creates a new map for every module found and enters
                                    module name and start address into the map. Since the
                                    module end address is unknown it is set to zero.
      
      This ends up with a kernel module map list sorted by module start
      addresses.  All module end addresses are zero.
      
      Last machine__create_kernel_maps() calls function map_groups__fixup_end().
      This function iterates through the maps and assigns each map entry's
      end address the successor map entry start address. The last entry of the
      map group has no successor, so ~0 is used as end to consume the remaining
      memory.
      
      Later __cmd_record calls function record__synthesize() which in turn calls
      perf_event__synthesize_kernel_mmap() and perf_event__synthesize_modules()
      to create PERF_REPORT_MMAP entries into the perf.data file.
      
      On s390 this results in the last module qeth_l2.ko
      (which has highest start address, see module table:
              [root@s8360047 perf]# cat /proc/modules
              qeth_l2 86016 1 - Live 0x000003ff804d6000
              qeth 266240 1 qeth_l2, Live 0x000003ff80296000
              ccwgroup 24576 1 qeth, Live 0x000003ff80218000
              vmur 36864 0 - Live 0x000003ff80182000
              qdio 143360 2 qeth_l2,qeth, Live 0x000003ff80002000
              [root@s8360047 perf]# )
      to be the last entry and its map has an end address of ~0.
      
      When the PERF_RECORD_MMAP entry is created for kernel module qeth_l2.ko
      its start address and length is written. The length is calculated in line:
          event->mmap.len   = pos->end - pos->start;
      and results in 0xffffffffffffffff - 0x3ff804d6990(*) = 0xfffffc007fb2966f
      
      (*) On s390 the module start address is actually determined by a __weak function
      named arch__fix_module_text_start() in machine__create_module().
      
      I think this improvable. We can use the module size (2nd column of /proc/modules)
      to get each loaded kernel module size and calculate its end address.
      Only for map entries which do not have a valid end address (end is still zero)
      we can use the heuristic we have now, that is use successor start address or ~0.
      Signed-off-by: NThomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Reviewed-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
      LPU-Reference: 20170803134902.47207-2-tmricht@linux.vnet.ibm.com
      Link: http://lkml.kernel.org/n/tip-nmoqij5b5vxx7rq2ckwu8iaj@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9ad4652b
    • M
      perf srcline: Do not consider empty files as valid srclines · d964b1cd
      Milian Wolff 提交于
      Sometimes we get a non-null, but empty, string for the filename from
      bfd. This then results in srclines of the form ":0", which is different
      from the canonical SRCLINE_UNKNOWN in the form "??:0".  Set the file to
      NULL if it is empty to fix this.
      Signed-off-by: NMilian Wolff <milian.wolff@kdab.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170806212446.24925-14-milian.wolff@kdab.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d964b1cd
    • M
      perf util: Take elf_name as const string in dso__demangle_sym · 80c345b2
      Milian Wolff 提交于
      The input string is not modified and thus can be passed in as a pointer
      to const data.
      Signed-off-by: NMilian Wolff <milian.wolff@kdab.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170806212446.24925-3-milian.wolff@kdab.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      80c345b2
  5. 11 8月, 2017 2 次提交
  6. 29 7月, 2017 2 次提交
    • G
      perf data: Add mmap[2] events to CTF conversion · f9f6f2a9
      Geneviève Bastien 提交于
      This adds the mmap and mmap2 events to the CTF trace obtained from perf
      data.
      
      These events will allow CTF trace visualization tools like Trace Compass
      to automatically resolve the symbols of the callchain to the
      corresponding function or origin library.
      
      To include those events, one needs to convert with the --all option.
      Here follows an output of babeltrace:
      
        $ sudo perf data convert --all --to-ctf myctftrace
        $ babeltrace ./myctftrace
        [19:00:00.000000000] (+0.000000000) perf_mmap2: { cpu_id = 0 },
       { pid = 638, tid = 638, start = 0x7F54AE39E000, filename =
       "/usr/lib/ld-2.25.so" }
        [19:00:00.000000000] (+0.000000000) perf_mmap2: { cpu_id = 0 }, { pid =
       638, tid = 638, start = 0x7F54AE565000, filename =
       "/usr/lib/libudev.so.1.6.6" }
        [19:00:00.000000000] (+0.000000000) perf_mmap2: { cpu_id = 0 }, { pid =
       638, tid = 638, start = 0x7FFC093EA000, filename = "[vdso]" }
      Signed-off-by: NGeneviève Bastien <gbastien@versatic.net>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
      Cc: Julien Desfossez <jdesfossez@efficios.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170727181205.24843-2-gbastien@versatic.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f9f6f2a9
    • G
      perf data: Add callchain to CTF conversion · a3073c8e
      Geneviève Bastien 提交于
      The field perf_callchain, if available, is added to the sampling events
      during the CTF conversion. It is an array of u64 values.  The
      perf_callchain_size field contains the size of the array.
      
      It will allow the analysis of sampling data in trace visualization tools
      like Trace Compass. Possible analyses with those data: dynamic
      flamegraphs, correlation with other tracing data like a userspace trace.
      
      Here follows a babeltrace CTF output of a trace with callchain:
      
        $ babeltrace ./myctftrace
        [17:38:45.672760285] (+?.?????????) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81063EE4, perf_tid = 25841, perf_pid = 25774, perf_period = 1, perf_callchain_size = 7, perf_callchain = [ [0] = 0xFFFFFFFFFFFFFF80, [1] = 0xFFFFFFFF81063EE4, [2] = 0xFFFFFFFF8100C770, [3] = 0xFFFFFFFF81006EC6, [4] = 0xFFFFFFFF8118245E, [5] = 0xFFFFFFFF810A9224, [6] = 0xFFFFFFFF8164A4C6 ] }
        [17:38:45.672777672] (+0.000017387) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81063EE4, perf_tid = 25841, perf_pid = 25774, perf_period = 1, perf_callchain_size = 8, perf_callchain = [ [0] = 0xFFFFFFFFFFFFFF80, [1] = 0xFFFFFFFF81063EE4, [2] = 0xFFFFFFFF8100C770, [3] = 0xFFFFFFFF81006EC6, [4] = 0xFFFFFFFF8118245E, [5] = 0xFFFFFFFF810A9224, [6] = 0xFFFFFFFF8164A4C6, [7] = 0xFFFFFFFF8164ABAD ] }
        [17:38:45.672786700] (+0.000009028) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81063EE4, perf_tid = 25841, perf_pid = 25774, perf_period = 70, perf_callchain_size = 3, perf_callchain = [ [0] = 0xFFFFFFFFFFFFFF80, [1] = 0xFFFFFFFF81063EE4, [2] = 0xFFFFFFFF8100C770 ] }
      Signed-off-by: NGeneviève Bastien <gbastien@versatic.net>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
      Cc: Julien Desfossez <jdesfossez@efficios.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170727181205.24843-1-gbastien@versatic.net
      [ Removed PERF_SAMPLE_CALLCHAIN from the TODO list, jolsa ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a3073c8e
  7. 28 7月, 2017 1 次提交
  8. 27 7月, 2017 5 次提交
  9. 26 7月, 2017 8 次提交
  10. 25 7月, 2017 3 次提交
    • J
      perf evsel: Add verbose output for sys_perf_event_open fallback · 2b04e0f8
      Jiri Olsa 提交于
      Adding info about what is being switched off in the sys_perf_event_open
      fallback.
      
      New output (notice the 'switching off' lines):
      
        $ perf stat -e '{cycles,instructions}' -vvv ls
        Using CPUID GenuineIntel-6-3D
        intel_pt default config: tsc
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
          disabled                         1
          inherit                          1
          enable_on_exec                   1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0x8
        sys_perf_event_open failed, error -22
        switching off cloexec flag
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
          disabled                         1
          inherit                          1
          enable_on_exec                   1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0
        sys_perf_event_open failed, error -22
        switching off exclude_guest, exclude_host
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
          disabled                         1
          inherit                          1
          enable_on_exec                   1
        ------------------------------------------------------------
        sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0
        sys_perf_event_open failed, error -22
        switching off sample_id_all
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
          disabled                         1
          inherit                          1
          enable_on_exec                   1
        ...
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170721121212.21414-2-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2b04e0f8
    • A
      perf cgroup: Fix refcount usage · cd8dd032
      Arnaldo Carvalho de Melo 提交于
      When converting from atomic_t to refcount_t we didn't follow the usual
      step of initializing it to one before taking any new reference, which
      trips over checking if taking a reference for a freed refcount_t, fix
      it.
      
      Brendan's report:
      
       ---
      It's 4.12-rc7, with node v4.4.1. I'm building 4.13-rc1 now, as I hit
      what I think is another unrelated perf bug and I'm starting to wonder
      what else is broken on that version:
      
      (root) /mnt/src/linux-4.12-rc7/tools/perf # ./perf record -F 99 -a -e
      cpu-clock --cgroup=docker/f9e9d5df065b14646e8a11edc837a13877fd90c171137b2ba3feb67a0201cb65
      -g
      perf: /mnt/src/linux-4.12-rc7/tools/include/linux/refcount.h:108:
      refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed.
      Aborted
      
      that used to work...
       ---
      
      Testing it:
      
      Before:
      
        # perf stat -e cycles -C 0 --cgroup /
        perf: /home/acme/git/linux/tools/include/linux/refcount.h:108: refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed.
        Aborted (core dumped)
        #
      
      After:
      
        # perf stat -e cycles -C 0 --cgroup /
      ^C
        Performance counter stats for 'CPU(s) 0':
      
             132,081,393      cycles                    /
      
             2.492942763 seconds time elapsed
      
        #
      Reported-by: NBrendan Gregg <brendan.d.gregg@gmail.com>
      Acked-by: NElena Reshetova <elena.reshetova@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: Kees Kook <keescook@chromium.org>
      Cc: Krister Johansen <kjlx@templeofstupid.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sudeep Holla <Sudeep.Holla@arm.com>
      Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 79c5fe6d ("perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t")
      Link: http://lkml.kernel.org/n/tip-l7ovfblq14ip2i08m1g0fkhv@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cd8dd032
    • T
      perf annotate stdio: Fix --show-total-period · 585d93c5
      Taeung Song 提交于
      We were showing the total number of samples, not the total period as
      asked by the user, fix it.
      Reported-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Link: http://lkml.kernel.org/n/tip-lh2nh89rtqn5x5vbfthw6qml@git.kernel.org
      Fixes: 0c4a5bce ("perf annotate: Display total number of samples with --show-total-period")
      [ split from a larger patch ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      585d93c5
  11. 21 7月, 2017 5 次提交
  12. 19 7月, 2017 1 次提交
    • J
      perf report: Show branch type in callchain entry · b851dd49
      Jin Yao 提交于
      Show branch type in callchain entry. The branch type is printed
      with other LBR information (such as cycles/abort/...).
      
      For example:
      
        perf record -g -j any,save_type
        perf report --branch-history --stdio --no-children
      
        38.50%  div.c:45                [.] main                    div
                |
                ---main div.c:42 (RET CROSS_2M cycles:2)
                   compute_flag div.c:28 (cycles:2)
                   compute_flag div.c:27 (RET CROSS_2M cycles:1)
                   rand rand.c:28 (cycles:1)
                   rand rand.c:28 (RET CROSS_2M cycles:1)
                   __random random.c:298 (cycles:1)
                   __random random.c:297 (COND_BWD CROSS_2M cycles:1)
                   __random random.c:295 (cycles:1)
                   __random random.c:295 (COND_BWD CROSS_2M cycles:1)
                   __random random.c:295 (cycles:1)
                   __random random.c:295 (RET CROSS_2M cycles:9)
      
      Change log
      
      v6: Remove the branch_type_str() since it's moved to branch.c.
      
      v5: Rewrite the branch info print code in util/callchain.c.
      
      v4: Comparing to previous version, the major changes are:
      Signed-off-by: NYao Jin <yao.jin@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1500379995-6449-8-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b851dd49