1. 12 Apr, 2018 11 commits
    • perf version: Print status for syscall_table · 8a812bf5
      Authored by Jin Yao
      This patch stops printing the "libaudit" line when
      HAVE_SYSCALL_TABLE_SUPPORT is available and adds a line for
      HAVE_SYSCALL_TABLE_SUPPORT instead.
      
      For example,
      
      $ ./perf -vv
      perf version 4.13.rc5.gc2f8af9
                       dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
          dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                       glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                        gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
               syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                      libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                      libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
                     libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
      numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
                     libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
                   libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
                    libslang: [ on  ]  # HAVE_SLANG_SUPPORT
                   libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
                   libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
          libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                        zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                        lzma: [ on  ]  # HAVE_LZMA_SUPPORT
                   get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                         bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
      
      The line "syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT" is
      newly added.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1523269609-28824-4-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
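The compile-time switch this commit describes can be sketched in plain C. This is a minimal illustration, not perf's actual code: `feature_status_line()` and `trace_backend()` are hypothetical helpers, and the "OFF" spelling for a missing feature is an assumption. Only the `HAVE_SYSCALL_TABLE_SUPPORT` macro and the line layout come from the commit message.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: the build passes -DHAVE_SYSCALL_TABLE_SUPPORT when the
 * syscall table is available; otherwise the macro is undefined. */
#ifdef HAVE_SYSCALL_TABLE_SUPPORT
#define SYSCALL_TABLE_BUILTIN 1
#else
#define SYSCALL_TABLE_BUILTIN 0
#endif

/* Hypothetical helper: format one status line in the layout shown in the
 * example output, with the feature name right-aligned in 22 columns. */
static int feature_status_line(char *buf, size_t len, const char *name,
                               const char *macro, int builtin)
{
    return snprintf(buf, len, "%22s: [ %-3s ]  # %s",
                    name, builtin ? "on" : "OFF", macro);
}

/* Hypothetical helper: report syscall_table when it is built in and fall
 * back to the libaudit line only when it is not. */
static const char *trace_backend(void)
{
    return SYSCALL_TABLE_BUILTIN ? "syscall_table" : "libaudit";
}
```

Calling `feature_status_line(buf, sizeof(buf), trace_backend(), "HAVE_SYSCALL_TABLE_SUPPORT", SYSCALL_TABLE_BUILTIN)` would then produce a single line for whichever backend the build selected.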
    • perf tools: Rename HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT · 22e9af4e
      Authored by Jin Yao
      To be consistent with other HAVE_XXX_SUPPORT uses in Makefile.config,
      this patch renames HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT and
      updates the C code accordingly.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1523269609-28824-3-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Use HAVE_LIBXXX_SUPPORT to replace NO_LIBXXX · 90ce61b9
      Authored by Jin Yao
      In Makefile.config, we define the conditional compilation variables
      HAVE_LIBPERL_SUPPORT and HAVE_LIBPYTHON_SUPPORT.
      
      To make the C code more consistent, this patch replaces
      NO_LIBPERL/NO_LIBPYTHON in C code with HAVE_LIBPERL_SUPPORT/
      HAVE_LIBPYTHON_SUPPORT.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Suggested-by: Ingo Molnar <mingo@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1523269609-28824-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
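The move from a negative NO_LIBPERL test to a positive HAVE_LIBPERL_SUPPORT test can be illustrated with a small guard-plus-stub pattern. This is only a sketch: `run_perl_script()` and `perl_supported()` are hypothetical names, and the assumption is that the build defines HAVE_LIBPERL_SUPPORT when libperl is found.

```c
#include <stdio.h>

/* Assumption: Makefile.config adds -DHAVE_LIBPERL_SUPPORT when libperl
 * is found. The positive-sense macro replaces the older NO_LIBPERL test. */
#ifdef HAVE_LIBPERL_SUPPORT
/* Hypothetical stand-in for a real scripting backend. */
static int run_perl_script(const char *path)
{
    printf("running %s\n", path);
    return 0;
}
#else
/* Stub used when the tool is built without libperl. */
static int run_perl_script(const char *path)
{
    fprintf(stderr, "%s: Perl scripting not supported in this build\n", path);
    return -1;
}
#endif

/* Lets callers ask at run time which variant was compiled in. */
static int perl_supported(void)
{
#ifdef HAVE_LIBPERL_SUPPORT
    return 1;
#else
    return 0;
#endif
}
```

The same shape applies to HAVE_LIBPYTHON_SUPPORT; using one naming scheme in both the Makefile and the C code avoids double negatives such as `#ifndef NO_LIBPERL`.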
    • Revert "x86/asm: Allow again using asm.h when building for the 'bpf' clang target" · fd97d39b
      Authored by Arnaldo Carvalho de Melo
      This reverts commit ca26cffa.
      
      Newer clang versions accept the asm(_ASM_SP) construct, and the
      bpf-script-test-kbuild.c script, used in one of the 'perf test LLVM'
      subtests, no longer includes ptrace.h (which ended up including
      arch/x86/include/asm/asm.h), so we can revert this patch.
      Suggested-by: Yonghong Song <yhs@fb.com>
      Link: https://lkml.kernel.org/r/613f0a0d-c433-8f4d-dcc1-c9889deae39e@fb.com
      Acked-by: Yonghong Song <yhs@fb.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-nqozcv8loq40tkqpfw997993@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests bpf: Remove unused ptrace.h include from LLVM test · c13009c1
      Authored by Arnaldo Carvalho de Melo
      The bpf-script-test-kbuild.c script, used in one of the LLVM subtests,
      includes ptrace.h unnecessarily, and that ends up making it include a
      header that uses asm(_ASM_SP), a feature that is not supported by clang
      <= 4.0, breaking that 'perf test' entry.
      
      This ended up leading to commit ca26cffa ("x86/asm: Allow again using
      asm.h when building for the 'bpf' clang target"), adding an ifndef
      __BPF__ to the arch/x86/include/asm/asm.h file.
      
      Newer clang versions accept that asm(_ASM_SP) construct, so just remove
      the ptrace.h include, which paves the way for reverting ca26cffa
      ("x86/asm: Allow again using asm.h when building for the 'bpf' clang
      target").
      Suggested-by: Yonghong Song <yhs@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Link: https://lkml.kernel.org/r/613f0a0d-c433-8f4d-dcc1-c9889deae39e@fb.com
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-clbcnzbakdp18ibme4wt43ib@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf jvmti: Give hints about package names needed to build · e14b733c
      Authored by Arnaldo Carvalho de Melo
      Give examples of package names to install on Fedora and Debian so that
      this gets built, to help the user a bit.
      
      The part from 'e.g.:' onwards:
      
        No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel
      
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-edbi4r2pvzn7no6ebxbtczng@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate browser: Allow showing offsets in more than just jump targets · 51f39603
      Authored by Arnaldo Carvalho de Melo
      Jesper wanted to see offsets at callq sites when doing some performance
      investigation related to retpolines, so save him some time by providing
      an 'O' hotkey that shows offsets from the function start at call
      instructions or at all instructions: keep pressing 'O' until the
      offsets you need appear.
      
      Example:
      
      Starts with:
      
      Samples: 64  of event 'cycles:ppp', 100000 Hz, Event count (approx.): 318963
      ixgbe_read_reg  /proc/kcore
      Percent│    ↑ je     2a
             │   ┌──cmp    $0xffffffff,%r13d
             │   ├──je     d0
             │   │  mov    $0x53e3,%edi
             │   │→ callq  __const_udelay
             │   │  sub    $0x1,%r15d
             │   │↑ jne    83
             │   │  mov    0x8(%rbp),%rax
             │   │  testb  $0x20,0x1799(%rax)
             │   │↑ je     2a
             │   │  mov    0x200(%rax),%rdi
             │   │  mov    %r13d,%edx
             │   │  mov    $0xffffffffc02595d8,%rsi
             │   │→ callq  netdev_warn
             │   │↑ jmpq   2a
             │d0:└─→mov    0x8(%rbp),%rsi
             │      mov    %rbp,%rdi
             │      mov    %eax,0x4(%rsp)
             │    → callq  ixgbe_remove_adapter.isra.77
             │      mov    0x4(%rsp),%eax
      Press 'h' for help on key bindings
      ============================================================================
      
      Press 'O':
      
      Samples: 64  of event 'cycles:ppp', 100000 Hz, Event count (approx.): 318963
      ixgbe_read_reg  /proc/kcore
      Percent│    ↑ je     2a
             │   ┌──cmp    $0xffffffff,%r13d
             │   ├──je     d0
             │   │  mov    $0x53e3,%edi
             │99:│→ callq  __const_udelay
             │   │  sub    $0x1,%r15d
             │   │↑ jne    83
             │   │  mov    0x8(%rbp),%rax
             │   │  testb  $0x20,0x1799(%rax)
             │   │↑ je     2a
             │   │  mov    0x200(%rax),%rdi
             │   │  mov    %r13d,%edx
             │   │  mov    $0xffffffffc02595d8,%rsi
             │c6:│→ callq  netdev_warn
             │   │↑ jmpq   2a
             │d0:└─→mov    0x8(%rbp),%rsi
             │      mov    %rbp,%rdi
             │      mov    %eax,0x4(%rsp)
             │db: → callq  ixgbe_remove_adapter.isra.77
             │      mov    0x4(%rsp),%eax
      Press 'h' for help on key bindings
      ============================================================================
      
      Press 'O' again:
      
      Samples: 64  of event 'cycles:ppp', 100000 Hz, Event count (approx.): 318963
      ixgbe_read_reg  /proc/kcore
      Percent│8c: ↑ je     2a
             │8e:┌──cmp    $0xffffffff,%r13d
             │92:├──je     d0
             │94:│  mov    $0x53e3,%edi
             │99:│→ callq  __const_udelay
             │9e:│  sub    $0x1,%r15d
             │a2:│↑ jne    83
             │a4:│  mov    0x8(%rbp),%rax
             │a8:│  testb  $0x20,0x1799(%rax)
             │af:│↑ je     2a
             │b5:│  mov    0x200(%rax),%rdi
             │bc:│  mov    %r13d,%edx
             │bf:│  mov    $0xffffffffc02595d8,%rsi
             │c6:│→ callq  netdev_warn
             │cb:│↑ jmpq   2a
             │d0:└─→mov    0x8(%rbp),%rsi
             │d4:   mov    %rbp,%rdi
             │d7:   mov    %eax,0x4(%rsp)
             │db: → callq  ixgbe_remove_adapter.isra.77
             │e0:   mov    0x4(%rsp),%eax
      Press 'h' for help on key bindings
      ============================================================================
      
      Press 'O' again and it will show just jump target offsets.
      Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-upp6pfdetwlsx18ec2uf1od4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
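The three screens in this commit message correspond to three offset levels that the 'O' key cycles through. A minimal sketch of that behaviour follows; the enum, struct and function names are hypothetical and not taken from the perf sources, which the commit message does not quote.

```c
#include <stdbool.h>

/* Hypothetical names; the commit only says offsets can be shown on jump
 * targets, on jump targets plus calls, or on every instruction. */
enum offset_level {
    OFFSET_JUMP_TARGETS,   /* default view */
    OFFSET_CALLS,          /* jump targets + call instructions */
    OFFSET_ALL,            /* every instruction */
    OFFSET_LEVEL_COUNT
};

struct insn {
    bool is_jump_target;
    bool is_call;
};

/* Decide whether the offset column is filled in for one instruction. */
static bool show_offset(enum offset_level level, const struct insn *in)
{
    if (level == OFFSET_ALL)
        return true;
    if (level == OFFSET_CALLS && in->is_call)
        return true;
    return in->is_jump_target;
}

/* What pressing 'O' does in this sketch: advance one level, wrapping
 * back to jump targets only after the "all instructions" view. */
static enum offset_level next_offset_level(enum offset_level level)
{
    return (enum offset_level)((level + 1) % OFFSET_LEVEL_COUNT);
}
```

Keeping the level in one options value means the drawing code only asks `show_offset()` per line, and the hotkey handler only calls `next_offset_level()`.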
    • perf annotate: Allow showing offsets in more than just jump targets · 592c10e2
      Authored by Arnaldo Carvalho de Melo
      Jesper wanted to see offsets at callq sites when doing some performance
      investigation related to retpolines, so save him some time by providing
      a 'struct annotation_options' to control where offsets should appear:
      just on jump targets? That + call instructions? All?
      
      This puts in place the logic to show the offsets; now we need to wire
      this up in the TUI browser (next patch) and in the 'perf annotate --stdio2'
      interface, where we need a more general mechanism to set up the
      'annotation_options' struct from the command line.
      Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-m3jc9c3swobye9tj08gnh5i7@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Run dwarf unwind test on arm32 · af72cfb8
      Authored by Kim Phillips
      Enable the unwind test on arm32:
      
        $ perf test unwind
        58: DWARF unwind                                          : Ok
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Brian Robbins <brianrob@microsoft.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180410191624.a3a468670dd4548c66d3d094@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools headers: Restore READ_ONCE() C++ compatibility · 4d3b57da
      Authored by Mark Rutland
      Our userspace <linux/compiler.h> defines READ_ONCE() in a way that clang
      doesn't like, as we have an anonymous union in which neither field is
      initialized.
      
      WRITE_ONCE() is fine since it initializes the __val field. For
      READ_ONCE() we can keep clang and GCC happy with a dummy initialization
      of the __c field, so let's do that.
      
      At the same time, let's split READ_ONCE() and WRITE_ONCE() over several
      lines for legibility, as we do in the in-kernel <linux/compiler.h>.
      Reported-by: Li Zhijian <lizhijian@cn.fujitsu.com>
      Reported-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Tested-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Fixes: 6aa7de05 ("locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()")
      Link: http://lkml.kernel.org/r/20180404163445.16492-1-mark.rutland@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
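The idea of a union with a dummy-initialized byte field, split over several lines, can be sketched as below. This is an illustration under assumptions, not the header's actual text: `READ_ONCE_SKETCH` and `read_once_size()` are hypothetical names, the byte array is sized with `sizeof(x)` for simplicity, and the macro relies on GNU C extensions (statement expressions and `__typeof__`) accepted by GCC and clang.

```c
#include <stdint.h>
#include <string.h>

/* Load `size` bytes through a volatile pointer so the compiler performs
 * the access; sizes other than 1, 2, 4 and 8 fall back to memcpy(). */
static inline void read_once_size(const volatile void *p, void *res, size_t size)
{
    switch (size) {
    case 1: { uint8_t v = *(const volatile uint8_t *)p; memcpy(res, &v, 1); break; }
    case 2: { uint16_t v = *(const volatile uint16_t *)p; memcpy(res, &v, 2); break; }
    case 4: { uint32_t v = *(const volatile uint32_t *)p; memcpy(res, &v, 4); break; }
    case 8: { uint64_t v = *(const volatile uint64_t *)p; memcpy(res, &v, 8); break; }
    default: memcpy(res, (const void *)p, size); break;
    }
}

/* The union's byte field gets a dummy initializer, so the union is never
 * left with both members uninitialized before the value is copied in. */
#define READ_ONCE_SKETCH(x)                                   \
({                                                            \
    union { __typeof__(x) val; char c[sizeof(x)]; } u_ =      \
        { .c = { 0 } };                                       \
    read_once_size(&(x), u_.c, sizeof(x));                    \
    u_.val;                                                   \
})
```

A call such as `READ_ONCE_SKETCH(counter)` evaluates to a copy of `counter` with the same type as the variable.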
    • perf stat: Enable 1ms interval for printing event counters values · 9dc9a95f
      Authored by Alexey Budankov
      Currently the print interval for performance counter values is
      limited to a minimum of 10ms, so the tool cannot read the values at
      frequencies higher than 100Hz.
      
      This change allows perf stat -I to run at frequencies up to 1KHz
      and, to some extent, puts perf stat -I on par with perf record
      sampling profiling.
      
      When running perf stat -I to monitor e.g. PCIe uncore counters while
      profiling some I/O workload with perf record, e.g. for cpu-cycles
      and context switches, it is then possible to observe a consolidated
      CPU/OS/IO (uncore) performance picture for that workload.
      
      The tool overhead warning printed with the -v option can be missed
      due to screen scrolling when output goes to the console, so the
      message is moved into the help text shown by perf stat -h.
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/b842ad6a-d606-32e4-afe5-974071b5198e@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
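The arithmetic behind the 10ms and 1ms limits is simple enough to sketch. The helpers below are hypothetical and only illustrate how a minimum interval in milliseconds maps to a maximum print rate and to a sleep time; they are not perf stat's option-parsing code.

```c
#include <time.h>

/* Maximum print rate implied by a minimum interval in milliseconds. */
static unsigned int max_print_rate_hz(unsigned int min_interval_ms)
{
    return 1000u / min_interval_ms;
}

/* Hypothetical check: 0 means "no interval printing"; any other value
 * must not be below the tool's minimum. */
static int interval_is_valid(unsigned int ms, unsigned int min_interval_ms)
{
    return ms == 0 || ms >= min_interval_ms;
}

/* Convert an interval in milliseconds to a timespec suitable for
 * nanosleep()-style waiting between two prints. */
static struct timespec interval_to_timespec(unsigned int ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    return ts;
}
```

With the old 10ms floor the maximum rate is 100Hz and a 5ms request is rejected; with the 1ms floor the rate rises to 1KHz and the same request is accepted.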
  2. 11 Apr, 2018 1 commit
  3. 10 Apr, 2018 4 commits
    • perf/core: Fix perf_uprobe_init() · 0eadcc7a
      Authored by Song Liu
      Similarly to the uprobe PMU fix in perf_kprobe_init(), fix error
      handling in perf_uprobe_init() as well.
      Reported-by: 范龙飞 <long7573@126.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: e12f03d7 ("perf/core: Implement the 'perf_kprobe' PMU")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/core: Fix perf_kprobe_init() · 5da13ab8
      Authored by Masami Hiramatsu
      Fix error handling in perf_kprobe_init():
      
      	==================================================================
      	BUG: KASAN: slab-out-of-bounds in strlen+0x8e/0xa0 lib/string.c:482
      	Read of size 1 at addr ffff88003f9cc5c0 by task syz-executor2/23095
      
      	CPU: 0 PID: 23095 Comm: syz-executor2 Not tainted 4.16.0+ #24
      	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      	Call Trace:
      	 __dump_stack lib/dump_stack.c:77 [inline]
      	 dump_stack+0xca/0x13e lib/dump_stack.c:113
      	 print_address_description+0x6e/0x2c0 mm/kasan/report.c:256
      	 kasan_report_error mm/kasan/report.c:354 [inline]
      	 kasan_report+0x256/0x380 mm/kasan/report.c:412
      	 strlen+0x8e/0xa0 lib/string.c:482
      	 kstrdup+0x21/0x70 mm/util.c:55
      	 alloc_trace_kprobe+0xc8/0x930 kernel/trace/trace_kprobe.c:325
      	 create_local_trace_kprobe+0x4f/0x3a0 kernel/trace/trace_kprobe.c:1438
      	 perf_kprobe_init+0x149/0x1f0 kernel/trace/trace_event_perf.c:264
      	 perf_kprobe_event_init+0xa8/0x120 kernel/events/core.c:8407
      	 perf_try_init_event+0xcb/0x2a0 kernel/events/core.c:9719
      	 perf_init_event kernel/events/core.c:9750 [inline]
      	 perf_event_alloc+0x1367/0x1e20 kernel/events/core.c:10022
      	 SYSC_perf_event_open+0x242/0x2330 kernel/events/core.c:10477
      	 do_syscall_64+0x198/0x640 arch/x86/entry/common.c:287
      	 entry_SYSCALL_64_after_hwframe+0x42/0xb7
      Reported-by: 范龙飞 <long7573@126.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: e12f03d7 ("perf/core: Implement the 'perf_kprobe' PMU")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
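The commit message shows the KASAN report but not the change itself. An out-of-bounds read in strlen() under kstrdup() is the usual signature of a string that fills its buffer without a terminating NUL. Assuming that is the failure here, the defensive pattern looks roughly like the userspace sketch below. `copy_name()` and `NAME_MAX_LEN` are hypothetical and are not kernel APIs.

```c
#include <errno.h>
#include <string.h>

#define NAME_MAX_LEN 16   /* hypothetical buffer size for this sketch */

/* Copy a caller-supplied name into a fixed buffer. `src` may lack a NUL
 * within `src_len` bytes. Returns the string length on success, or
 * -E2BIG when no NUL is found within the destination size, so callers
 * never hand an unterminated buffer to strlen() or strdup(). */
static int copy_name(char dst[NAME_MAX_LEN], const char *src, size_t src_len)
{
    size_t n = src_len < NAME_MAX_LEN ? src_len : NAME_MAX_LEN;
    const char *nul = memchr(src, '\0', n);

    if (!nul)
        return -E2BIG;   /* copying would leave dst without a terminator */
    memcpy(dst, src, (size_t)(nul - src) + 1);
    return (int)(nul - src);
}
```

The point of the sketch is that the error is detected at copy time and propagated, instead of surfacing later as an out-of-bounds read in a string function.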
    • Merge tag 'perf-urgent-for-mingo-4.17-20180409' of... · e31193a9
      Authored by Ingo Molnar
      Merge tag 'perf-urgent-for-mingo-4.17-20180409' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
      . Fix the --stdio2/TUI annotate output to include group details,
        be it for a recorded '{a,b,f}' explicit event group or when
        forcing group display using 'perf report --group' for a set of
        events not recorded as a group (Arnaldo Carvalho de Melo)
      
      . Fix display artifacts in the ui browser (base class for the
        annotate and main report/top TUI browser) related to the extra
        title lines work (Arnaldo Carvalho de Melo)
      
      . perf auxtrace refactorings, leftovers from a previously partially
        processed patchset (Adrian Hunter)
      
      . Fix the builtin clang build (Sandipan Das, Arnaldo Carvalho de Melo)
      
      . Synchronize i915_drm.h, silencing a perf build warning and
        in the process automagically adding support for a new ioctl
        command (Arnaldo Carvalho de Melo)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/core: Fix use-after-free in uprobe_perf_close() · 621b6d2e
      Authored by Prashant Bhole
      A use-after-free bug was caught by KASAN while running USDT-related
      code from the BCC project (bcc/tests/python/test_usdt2.py):
      
      	==================================================================
      	BUG: KASAN: use-after-free in uprobe_perf_close+0x222/0x3b0
      	Read of size 4 at addr ffff880384f9b4a4 by task test_usdt2.py/870
      
      	CPU: 4 PID: 870 Comm: test_usdt2.py Tainted: G        W         4.16.0-next-20180409 #215
      	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      	Call Trace:
      	 dump_stack+0xc7/0x15b
      	 ? show_regs_print_info+0x5/0x5
      	 ? printk+0x9c/0xc3
      	 ? kmsg_dump_rewind_nolock+0x6e/0x6e
      	 ? uprobe_perf_close+0x222/0x3b0
      	 print_address_description+0x83/0x3a0
      	 ? uprobe_perf_close+0x222/0x3b0
      	 kasan_report+0x1dd/0x460
      	 ? uprobe_perf_close+0x222/0x3b0
      	 uprobe_perf_close+0x222/0x3b0
      	 ? probes_open+0x180/0x180
      	 ? free_filters_list+0x290/0x290
      	 trace_uprobe_register+0x1bb/0x500
      	 ? perf_event_attach_bpf_prog+0x310/0x310
      	 ? probe_event_disable+0x4e0/0x4e0
      	 perf_uprobe_destroy+0x63/0xd0
      	 _free_event+0x2bc/0xbd0
      	 ? lockdep_rcu_suspicious+0x100/0x100
      	 ? ring_buffer_attach+0x550/0x550
      	 ? kvm_sched_clock_read+0x1a/0x30
      	 ? perf_event_release_kernel+0x3e4/0xc00
      	 ? __mutex_unlock_slowpath+0x12e/0x540
      	 ? wait_for_completion+0x430/0x430
      	 ? lock_downgrade+0x3c0/0x3c0
      	 ? lock_release+0x980/0x980
      	 ? do_raw_spin_trylock+0x118/0x150
      	 ? do_raw_spin_unlock+0x121/0x210
      	 ? do_raw_spin_trylock+0x150/0x150
      	 perf_event_release_kernel+0x5d4/0xc00
      	 ? put_event+0x30/0x30
      	 ? fsnotify+0xd2d/0xea0
      	 ? sched_clock_cpu+0x18/0x1a0
      	 ? __fsnotify_update_child_dentry_flags.part.0+0x1b0/0x1b0
      	 ? pvclock_clocksource_read+0x152/0x2b0
      	 ? pvclock_read_flags+0x80/0x80
      	 ? kvm_sched_clock_read+0x1a/0x30
      	 ? sched_clock_cpu+0x18/0x1a0
      	 ? pvclock_clocksource_read+0x152/0x2b0
      	 ? locks_remove_file+0xec/0x470
      	 ? pvclock_read_flags+0x80/0x80
      	 ? fcntl_setlk+0x880/0x880
      	 ? ima_file_free+0x8d/0x390
      	 ? lockdep_rcu_suspicious+0x100/0x100
      	 ? ima_file_check+0x110/0x110
      	 ? fsnotify+0xea0/0xea0
      	 ? kvm_sched_clock_read+0x1a/0x30
      	 ? rcu_note_context_switch+0x600/0x600
      	 perf_release+0x21/0x40
      	 __fput+0x264/0x620
      	 ? fput+0xf0/0xf0
      	 ? do_raw_spin_unlock+0x121/0x210
      	 ? do_raw_spin_trylock+0x150/0x150
      	 ? SyS_fchdir+0x100/0x100
      	 ? fsnotify+0xea0/0xea0
      	 task_work_run+0x14b/0x1e0
      	 ? task_work_cancel+0x1c0/0x1c0
      	 ? copy_fd_bitmaps+0x150/0x150
      	 ? vfs_read+0xe5/0x260
      	 exit_to_usermode_loop+0x17b/0x1b0
      	 ? trace_event_raw_event_sys_exit+0x1a0/0x1a0
      	 do_syscall_64+0x3f6/0x490
      	 ? syscall_return_slowpath+0x2c0/0x2c0
      	 ? lockdep_sys_exit+0x1f/0xaa
      	 ? syscall_return_slowpath+0x1a3/0x2c0
      	 ? lockdep_sys_exit+0x1f/0xaa
      	 ? prepare_exit_to_usermode+0x11c/0x1e0
      	 ? enter_from_user_mode+0x30/0x30
      	random: crng init done
      	 ? __put_user_4+0x1c/0x30
      	 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      	RIP: 0033:0x7f41d95f9340
      	RSP: 002b:00007fffe71e4268 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
      	RAX: 0000000000000000 RBX: 000000000000000d RCX: 00007f41d95f9340
      	RDX: 0000000000000000 RSI: 0000000000002401 RDI: 000000000000000d
      	RBP: 0000000000000000 R08: 00007f41ca8ff700 R09: 00007f41d996dd1f
      	R10: 00007fffe71e41e0 R11: 0000000000000246 R12: 00007fffe71e4330
      	R13: 0000000000000000 R14: fffffffffffffffc R15: 00007fffe71e4290
      
      	Allocated by task 870:
      	 kasan_kmalloc+0xa0/0xd0
      	 kmem_cache_alloc_node+0x11a/0x430
      	 copy_process.part.19+0x11a0/0x41c0
      	 _do_fork+0x1be/0xa20
      	 do_syscall_64+0x198/0x490
      	 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      	Freed by task 0:
      	 __kasan_slab_free+0x12e/0x180
      	 kmem_cache_free+0x102/0x4d0
      	 free_task+0xfe/0x160
      	 __put_task_struct+0x189/0x290
      	 delayed_put_task_struct+0x119/0x250
      	 rcu_process_callbacks+0xa6c/0x1b60
      	 __do_softirq+0x238/0x7ae
      
      	The buggy address belongs to the object at ffff880384f9b480
      	 which belongs to the cache task_struct of size 12928
      
      It occurs because the task_struct is freed before the perf_event that
      refers to the task, and task flags are checked during teardown of the
      event. perf_event_alloc() assigns the task_struct to hw.target of the
      perf_event, but takes no reference on it.
      
      As a fix, call get_task_struct() in perf_event_alloc() at the
      above-mentioned assignment and put_task_struct() in _free_event().
      Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Reviewed-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 63b6da39 ("perf: Fix perf_event_exit_task() race")
      Link: http://lkml.kernel.org/r/20180409100346.6416-1-bhole_prashant_q7@lab.ntt.co.jp
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
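The fix described above is a standard reference-counting pairing: take a reference where the pointer is stored, drop it where the holder is torn down. A userspace sketch of that pairing follows. All names are hypothetical stand-ins for get_task_struct()/put_task_struct(), and the counter is a plain int rather than an atomic, which is only adequate for a single-threaded illustration.

```c
#include <stdlib.h>

static int live_tasks;   /* tasks not yet freed, tracked for the demo */

struct task  { int refs; };
struct event { struct task *target; };

static struct task *task_new(void)
{
    struct task *t = malloc(sizeof(*t));

    if (!t)
        return NULL;
    t->refs = 1;          /* reference held by the creator */
    live_tasks++;
    return t;
}

static void task_get(struct task *t)
{
    t->refs++;
}

static void task_put(struct task *t)
{
    if (--t->refs == 0) {
        live_tasks--;
        free(t);
    }
}

/* Storing the pointer takes a reference, so the task outlives the event. */
static struct event *event_alloc(struct task *target)
{
    struct event *e = malloc(sizeof(*e));

    if (!e)
        return NULL;
    if (target)
        task_get(target);
    e->target = target;
    return e;
}

/* Teardown may still inspect e->target safely, then drops the reference. */
static void event_free(struct event *e)
{
    if (e->target)
        task_put(e->target);
    free(e);
}
```

In this sketch the task can exit first (its creator calls `task_put()`), and the memory stays valid until `event_free()` drops the last reference.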
  4. 09 Apr, 2018 4 commits
    • perf tests clang: Fix function name for clang IR test · fcbd8fa4
      Authored by Sandipan Das
      As stated in tests/llvm-src-base.c, the name of the bpf function should
      be "bpf_func__SyS_epoll_pwait" but this clang test fails as it tries to
      look up "bpf_func__SyS_epoll_wait".
      
      Before applying patch:
      
      55: builtin clang support                                 :
      55.1: builtin clang compile C source to IR                : FAILED!
      55.2: builtin clang compile C source to ELF object        : Skip
      
      After applying patch:
      
      55: builtin clang support                                 :
      55.1: builtin clang compile C source to IR                : Ok
      55.2: builtin clang compile C source to ELF object        : Ok
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Fixes: e67d52d4 ("perf clang: Update test case to use real BPF script")
      Link: http://lkml.kernel.org/r/20180404180419.19056-3-sandipan@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf clang: Add support for recent clang versions · 7854e499
      Authored by Sandipan Das
      The clang API calls used by perf have changed in recent releases and
      builds succeed with libclang-3.9 only. This introduces compatibility
      with libclang-4.0 and above.
      
      Without this patch, we will see the following compilation errors with
      libclang-4.0+:
      
       util/c++/clang.cpp: In function ‘clang::CompilerInvocation* perf::createCompilerInvocation(llvm::opt::ArgStringList, llvm::StringRef&, clang::DiagnosticsEngine&)’:
       util/c++/clang.cpp:62:33: error: ‘IK_C’ was not declared in this scope
         Opts.Inputs.emplace_back(Path, IK_C);
                                        ^~~~
       util/c++/clang.cpp: In function ‘std::unique_ptr<llvm::Module> perf::getModuleFromSource(llvm::opt::ArgStringList, llvm::StringRef, llvm::IntrusiveRefCntPtr<clang::vfs::FileSystem>)’:
       util/c++/clang.cpp:75:26: error: no matching function for call to ‘clang::CompilerInstance::setInvocation(clang::CompilerInvocation*)’
         Clang.setInvocation(&*CI);
                                 ^
       In file included from util/c++/clang.cpp:14:0:
       /usr/include/clang/Frontend/CompilerInstance.h:231:8: note: candidate: void clang::CompilerInstance::setInvocation(std::shared_ptr<clang::CompilerInvocation>)
          void setInvocation(std::shared_ptr<CompilerInvocation> Value);
               ^~~~~~~~~~~~~
      
      Committer testing:
      
      Tested on Fedora 27 after installing the clang-devel and llvm-devel
      packages, versions:
      
        # rpm -qa | egrep llvm\|clang
        llvm-5.0.1-6.fc27.x86_64
        clang-libs-5.0.1-5.fc27.x86_64
        clang-5.0.1-5.fc27.x86_64
        clang-tools-extra-5.0.1-5.fc27.x86_64
        llvm-libs-5.0.1-6.fc27.x86_64
        llvm-devel-5.0.1-6.fc27.x86_64
        clang-devel-5.0.1-5.fc27.x86_64
        #
      
      Make sure you don't have some older version lying around in /usr/local,
      etc, then:
      
        $ make LIBCLANGLLVM=1 -C tools/perf install-bin
      
      And in the end perf will be linked against these libraries:
      
        # ldd ~/bin/perf | egrep -i llvm\|clang
      	libclangAST.so.5 => /lib64/libclangAST.so.5 (0x00007f8bb2eb4000)
      	libclangBasic.so.5 => /lib64/libclangBasic.so.5 (0x00007f8bb29e3000)
      	libclangCodeGen.so.5 => /lib64/libclangCodeGen.so.5 (0x00007f8bb23f7000)
      	libclangDriver.so.5 => /lib64/libclangDriver.so.5 (0x00007f8bb2060000)
      	libclangFrontend.so.5 => /lib64/libclangFrontend.so.5 (0x00007f8bb1d06000)
      	libclangLex.so.5 => /lib64/libclangLex.so.5 (0x00007f8bb1a3e000)
      	libclangTooling.so.5 => /lib64/libclangTooling.so.5 (0x00007f8bb17d4000)
      	libclangEdit.so.5 => /lib64/libclangEdit.so.5 (0x00007f8bb15c5000)
      	libclangSema.so.5 => /lib64/libclangSema.so.5 (0x00007f8bb0cc9000)
      	libclangAnalysis.so.5 => /lib64/libclangAnalysis.so.5 (0x00007f8bb0a23000)
      	libclangParse.so.5 => /lib64/libclangParse.so.5 (0x00007f8bb0725000)
      	libclangSerialization.so.5 => /lib64/libclangSerialization.so.5 (0x00007f8bb039a000)
      	libLLVM-5.0.so => /lib64/libLLVM-5.0.so (0x00007f8bace98000)
      	libclangASTMatchers.so.5 => /lib64/../lib64/libclangASTMatchers.so.5 (0x00007f8bab735000)
      	libclangFormat.so.5 => /lib64/../lib64/libclangFormat.so.5 (0x00007f8bab4b2000)
      	libclangRewrite.so.5 => /lib64/../lib64/libclangRewrite.so.5 (0x00007f8bab2a1000)
      	libclangToolingCore.so.5 => /lib64/../lib64/libclangToolingCore.so.5 (0x00007f8bab08e000)
        #
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Fixes: 00b86691 ("perf clang: Add builtin clang support ant test case")
      Link: http://lkml.kernel.org/r/20180404180419.19056-2-sandipan@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7854e499
    • S
      perf tools: Fix perf builds with clang support · c2fb54a1
      Sandipan Das committed
      For libclang, some distro packages provide static libraries (.a) while
      some provide shared libraries (.so). Currently, perf code can only be
      linked with static libraries. This makes perf build possible for both
      cases.
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Fixes: d58ac0bf ("perf build: Add clang and llvm compile and linking support")
      Link: http://lkml.kernel.org/r/20180404180419.19056-1-sandipan@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c2fb54a1
    • A
      perf tools: No need to include namespaces.h in util.h · ad0902e0
      Arnaldo Carvalho de Melo committed
      The only thing that is needed there is a forward declaration for 'struct
      nsinfo', so disentangle this, which in turn allows built-in clang
      builds, i.e. 'make LIBCLANGLLVM=1 -C tools/perf'.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-vq26rsuwq1cqylpcyvq89c84@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ad0902e0
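A minimal sketch of the forward-declaration idea described in this commit. Only `struct nsinfo` comes from the commit message; the helper `sketch_nsinfo_pid()` and the `pid` field are hypothetical and exist only so that the example compiles on its own.

```c
#include <stddef.h>

/*
 * Forward declaration: sufficient for a header that only passes
 * pointers to the type around, so the full namespaces.h is not needed.
 */
struct nsinfo;

/* Hypothetical prototype standing in for a util.h helper. */
int sketch_nsinfo_pid(const struct nsinfo *nsi);

/*
 * The complete definition lives elsewhere in the real tree; this
 * made-up layout is only here to make the sketch self-contained.
 */
struct nsinfo {
	int pid;
};

int sketch_nsinfo_pid(const struct nsinfo *nsi)
{
	return nsi ? nsi->pid : -1;
}
```

Any file that only includes the prototype sees an incomplete type, which is enough as long as it never dereferences the pointer.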
  5. 06 Apr, 2018 6 commits
    • A
      perf hists browser: Remove leftover from row returned from refresh · 94e87a8b
      Arnaldo Carvalho de Melo committed
      The per-browser screen refresh routine (ui_browser->refresh()) should
      return the first row that should be cleaned after the rows just printed,
      in case not all rows available on the screen get filled.
      
      When moving the extra title lines logic from the hists browser to the
      generic ui_browser class, one piece of that logic remained in the hists
      browser and then when going back from the annotate browser to the hists
      browser in a case where fewer lines were displayed in the hists browser,
      for instance when filtering the entries per substring, one line of the
      annotate browser would remain on the screen, fix that.
      
      Example of the screen artifact:
      
      ================================================================================
      Samples: 73K of event 'cycles:ppp', 4000 Hz, Event count (approx.): 45172901394
      Overhead  Shared O  Symbol
         0.30%  [kernel]  [k] __indirect_thunk_start
         0.09%  [kernel]  [k] __x86_indirect_thunk_r10
             │      lfence
      ================================================================================
      
      Here from 'perf top' the view was zoomed with '/thunk' to functions
      having that substring, then the first was annotated and from the
      annotate browser ESC was pressed, then the first lines were overwritten,
      but the 'lfence' line remained due to the off by one bug fixed in this
      cset.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: ef9ff601 ("perf ui browser: Move the extra title lines from the hists browser")
      Link: https://lkml.kernel.org/n/tip-odryfso74eaarm0z3e4v9owx@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      94e87a8b
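The refresh contract described in this commit can be sketched roughly as follows. The struct and both functions are hypothetical simplifications, not perf's actual `struct ui_browser` or its hists refresh routine; they only illustrate why the returned row must be exactly the first unprinted one.

```c
/* Hypothetical, simplified stand-in for a browser's geometry. */
struct sketch_browser {
	unsigned int rows;       /* rows available for entries */
	unsigned int nr_entries; /* entries left after filtering */
};

/* Prints as many entries as fit and returns the first row NOT printed. */
unsigned int sketch_refresh(const struct sketch_browser *b)
{
	unsigned int row = 0;

	while (row < b->rows && row < b->nr_entries)
		row++; /* a real routine would print entry 'row' here */

	return row;
}

/* Number of trailing rows the generic layer has to blank out. */
unsigned int sketch_rows_to_clear(const struct sketch_browser *b)
{
	return b->rows - sketch_refresh(b);
}
```

If a refresh routine of this shape returned one row too many, the generic layer would clear one row too few, which is consistent with the stale 'lfence' line in the screen artifact shown in the commit message.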
    • A
      perf hists browser: Show extra_title_lines in the 'D' debug hotkey · fdae6400
      Arnaldo Carvalho de Melo committed
      To help in fixing problems in the browser.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-uj0n76yqh5bf98i0edckd47t@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fdae6400
    • A
      perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering · b238db65
      Adrian Hunter committed
      In preparation for supporting AUX area sampling buffers,
      auxtrace_queues__add_buffer() needs to be more generic. To that end, move
      CPU filtering into it.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1520327598-1317-8-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b238db65
    • I
      Merge tag 'perf-urgent-for-mingo-4.17-20180406' of... · ce9f85c3
      Ingo Molnar committed
      Merge tag 'perf-urgent-for-mingo-4.17-20180406' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
      - Show group details on the title line in the annotate browser
        and 'perf annotate --stdio2' output, so that the per-event
        columns can have headers (Arnaldo Carvalho de Melo)
      
      - Fixup vertical line separating metrics from instructions and
        cleaning unused lines at the bottom, both in the annotate TUI
        browser (Arnaldo Carvalho de Melo)
      
      - Remove duplicated 'samples' in lost samples warning in
        'perf report' (Arnaldo Carvalho de Melo)
      
      - Synchronize i915_drm.h, silencing the perf build process,
        automagically adding support for the new DRM_I915_QUERY
        ioctl (Arnaldo Carvalho de Melo)
      
      - Make auxtrace_queues__add_buffer() allocate struct buffer,
        from a patchkit already applied (Adrian Hunter)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ce9f85c3
    • A
      tools headers uapi: Synchronize i915_drm.h · 01f97511
      Arnaldo Carvalho de Melo committed
      To pick up the changes in:
      
        c822e059 drm/i915: expose rcs topology through query uAPI
        a446ae2c drm/i915: add query uAPI
      
      This affects 'perf trace', that automagically gets the definition of the
      new I915_QUERY DRM ioctl:
      
        --- /tmp/build/perf/trace/beauty/generated/ioctl/drm_ioctl_array.c.old 2018-04-05 14:38:33.660111995 -0300
        +++ /tmp/build/perf/trace/beauty/generated/ioctl/drm_ioctl_array.c 2018-04-05 14:40:17.923283914 -0300
        @@ -158,4 +158,5 @@
                [DRM_COMMAND_BASE + 0x36] = "I915_PERF_OPEN",
                [DRM_COMMAND_BASE + 0x37] = "I915_PERF_ADD_CONFIG",
                [DRM_COMMAND_BASE + 0x38] = "I915_PERF_REMOVE_CONFIG",
        +       [DRM_COMMAND_BASE + 0x39] = "I915_QUERY",
         };
      
      I.e. on systems where this is used it will appear when, for instance,
      one does a system wide 'perf trace' session looking for ioctl calls,
      just like it does with the previously implemented DRM_I915 ioctls:
      
        # perf trace -e ioctl --filter-pids 2190
      <SNIP>
        4346.232 ( 0.012 ms): gnome-shell/1455 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_BUSY, arg: 0x7fff3b0cd910) = 0
        4346.246 ( 0.002 ms): gnome-shell/1455 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_MADVISE, arg: 0x7fff3b0cd980) = 0
        4346.252 ( 0.002 ms): gnome-shell/1455 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_BUSY, arg: 0x7fff3b0cdb00) = 0
      <SNIP>
      
      This silences this perf tools build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-5kxuvruuzdbojvf90f8j2wat@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      01f97511
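The generated table in the diff above is a designated-initializer string array indexed by ioctl number. A minimal sketch of how such a table can be looked up follows; the lookup function and the array name are hypothetical, and `DRM_COMMAND_BASE` is assumed to be 0x40 here rather than taken from the real uapi header.

```c
#include <stddef.h>

/* Assumption for this sketch; normally provided by the DRM uapi headers. */
#define DRM_COMMAND_BASE 0x40

/* Sparse table: unlisted slots are implicitly NULL. */
static const char *sketch_drm_ioctl_cmds[] = {
	[DRM_COMMAND_BASE + 0x36] = "I915_PERF_OPEN",
	[DRM_COMMAND_BASE + 0x37] = "I915_PERF_ADD_CONFIG",
	[DRM_COMMAND_BASE + 0x38] = "I915_PERF_REMOVE_CONFIG",
	[DRM_COMMAND_BASE + 0x39] = "I915_QUERY",
};

/* Returns the symbolic name for an ioctl number, or NULL if unknown. */
const char *sketch_drm_ioctl_name(unsigned int nr)
{
	size_t n = sizeof(sketch_drm_ioctl_cmds) /
		   sizeof(sketch_drm_ioctl_cmds[0]);

	if (nr >= n)
		return NULL;

	return sketch_drm_ioctl_cmds[nr];
}
```

Because the table is regenerated from the header, adding one `#define` to i915_drm.h is enough for a new entry such as "I915_QUERY" to appear without touching the lookup code.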
    • A
      perf report: Remove duplicated 'samples' in lost samples warning · 41a43dac
      Arnaldo Carvalho de Melo committed
      The following message, emitted when samples are lost due to system
      overload, had one 'samples' too many, ditch it:
      
         Processed 25333 samples and lost 20.88% samples!
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Link: https://lkml.kernel.org/n/tip-oev1469y02hmfere6r2kkxp6@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      41a43dac
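A small sketch of how the warning could be formatted once the duplicated word is dropped. The function name, its arguments, and the way the percentage is derived (lost divided by processed plus lost) are assumptions made for illustration, not taken from the perf sources.

```c
#include <stdio.h>

/*
 * Hypothetical helper: formats the lost-samples warning with a single
 * 'samples'. Assumption: rate = lost / (processed + lost).
 */
int sketch_format_lost_warning(char *buf, size_t size,
			       unsigned long processed, unsigned long lost)
{
	unsigned long total = processed + lost;
	double pct = total ? 100.0 * (double)lost / (double)total : 0.0;

	return snprintf(buf, size, "Processed %lu samples and lost %3.2f%%!",
			processed, pct);
}
```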
  6. 05 Apr, 2018 5 commits
    • A
      perf ui browser: Fixup cleaning unused lines at the bottom · caf61de3
      Arnaldo Carvalho de Melo committed
      Now that we can have extra title lines we should use ui_browser->rows
      and not ->height when drawing lines, as well as adding
      ui_browser->extra_title_lines to browser->y when cleaning unused lines
      at the bottom, otherwise we end up clobbering with spaces the last line
      just shown by ui_browser->refresh() routine.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: ef9ff601 ("perf ui browser: Move the extra title lines from the hists browser")
      Link: https://lkml.kernel.org/n/tip-dfcpokt1pm5ixm8n9pxwtstz@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      caf61de3
    • A
      perf annotate browser: Fixup vertical line separating metrics from instructions · e726c851
      Arnaldo Carvalho de Melo committed
      Now that we can have extra title lines we should use ui_browser->rows
      and not ->height when drawing lines, as it will use ui_browser__gotorc()
      and that will take the extra title lines into account, which was causing
      an off by one at the end of the vertical line drawn by
      __ui_browser__vline(), fix it.
      
      The visual effect was that the last line, with status messages, was
      being overwritten by the vertical line, looking like:
      
      Press 'h' for help on│key bindings
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: ef9ff601 ("perf ui browser: Move the extra title lines from the hists browser")
      Link: https://lkml.kernel.org/n/tip-08y1ln3xjn76zvizz1i1dsvn@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e726c851
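The off-by-one behind this commit and the previous one can be illustrated with a small sketch. The struct below is a hypothetical simplification of a browser's geometry, and it assumes that `rows` equals `height` minus `extra_title_lines`; the row-mapping helper only imitates what a gotorc-style function is described as doing.

```c
/* Hypothetical, simplified browser geometry. */
struct sketch_geom {
	unsigned int y;                 /* first screen row of the browser */
	unsigned int height;            /* all rows, title lines included */
	unsigned int extra_title_lines; /* title rows above the entries */
	unsigned int rows;              /* rows left for entries (assumed:
					 * height - extra_title_lines) */
};

/* Screen row for entry row 'r', already offset by the title lines. */
unsigned int sketch_screen_row(const struct sketch_geom *g, unsigned int r)
{
	return g->y + g->extra_title_lines + r;
}

/* Last screen row touched when drawing 'count' consecutive entry rows. */
unsigned int sketch_last_row_drawn(const struct sketch_geom *g,
				   unsigned int count)
{
	return count ? sketch_screen_row(g, count - 1) : g->y;
}
```

With one extra title line, drawing `height` rows ends one screen row lower than drawing `rows` rows, which is where the status line sits in the artifact shown above.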
    • A
      perf annotate: Show group details on the title line · c0459a09
      Arnaldo Carvalho de Melo committed
      To match what is shown in the main 'perf report/top' title lines, i.e.
      if a group is being shown, either a real group (recorded with "-e
      '{a,b,c}'") or a forced group (using 'perf report --group' for a
      perf.data file recorded without {}) we will show multiple columns,
      one per event, but we were failing to show the group details, so, for:
      
       # perf report --header-only | grep cmdline
       # cmdline : /home/acme/bin/perf record -e {cycles,instructions,cache-misses}
       # perf report --group
      
      The first line was showing just "cycles", now it shows the correct line,
      which is:
      
        Samples: 578  of events 'anon group { cycles, instructions, cache-misses }', 4000 Hz, Event count (approx.): 487421794
        syscall_return_via_sysret  /lib/modules/4.16.0-rc7/build/vmlinux
          0.22   2.97   0.00 │    ↓ jmp    6c
                             │      mov    %cr3,%rdi
          1.33  10.89   4.00 │    ↓ jmp    62
                             │      mov    %rdi,%rax
      <SNIP>
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 6920e285 ("perf annotate browser: Show extra title line with event information")
      Link: https://lkml.kernel.org/n/tip-i41tqh17c2dabnyzjh99r1oz@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c0459a09
    • A
      perf auxtrace: Make auxtrace_queues__add_buffer() allocate struct buffer · 0d75f123
      Adrian Hunter committed
      In preparation for supporting AUX area sampling buffers,
      auxtrace_queues__add_buffer() needs to be more generic. To that end,
      move memory allocation for struct buffer into it.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1520327598-1317-7-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0d75f123
    • S
      perf/x86/intel: Move regs->flags EXACT bit init · d1e7e602
      Stephane Eranian committed
      This patch removes a redundant store on regs->flags introduced
      by commit:
      
        71eb9ee9 ("perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUs")
      
      We were clearing the PERF_EFLAGS_EXACT but it was overwritten by
      regs->flags = pebs->flags later on.
      
      The PERF_EFLAGS_EXACT is a software flag using bit 3 of regs->flags.
      X86 marks this bit as Reserved. To make sure this bit is zero before
      we do any IP processing, we clear it explicitly.
      
      Patch also removes the following assignment:
      
      	regs->flags = pebs->flags | (regs->flags & PERF_EFLAGS_VM);
      
      There is no regs->flags to preserve anymore, because set_linear_ip()
      is not called until later.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1522909791-32498-1-git-send-email-eranian@google.com
      [ Improve capitalization, punctuation and clarity of comments. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d1e7e602
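A minimal sketch of the bit handling this commit describes. The macro mirrors the description of PERF_EFLAGS_EXACT as bit 3 of the flags word; the two helper functions are hypothetical and do not reproduce the kernel's PEBS code.

```c
/* Software flag on bit 3, as described in the commit message. */
#define SKETCH_PERF_EFLAGS_EXACT (1UL << 3)

/*
 * Take the hardware-provided flags and make sure the software EXACT
 * bit starts out as zero before any IP processing happens.
 */
unsigned long sketch_init_flags(unsigned long pebs_flags)
{
	return pebs_flags & ~SKETCH_PERF_EFLAGS_EXACT;
}

/* Mark the sample as exact once the precise IP has been established. */
unsigned long sketch_mark_exact(unsigned long flags)
{
	return flags | SKETCH_PERF_EFLAGS_EXACT;
}
```

Clearing the bit in the same expression that copies the hardware flags avoids the pattern the commit removes, where an earlier clear was overwritten by a later plain assignment.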
  7. 04 Apr, 2018 5 commits
    • I
      Merge tag 'perf-core-for-mingo-4.17-20180403' of... · b89e7914
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo-4.17-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      - Show only failing syscalls with 'perf trace --failure' (Arnaldo Carvalho de Melo)
      
      	e.g: See what 'openat' syscalls are failing:
      
        # perf trace --failure -e openat
         762.323 ( 0.007 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video2) = -1 ENOENT No such file or directory
         <SNIP N /dev/videoN open attempts... sigh, where is that improvised camera lid?!? >
         790.228 ( 0.008 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video63) = -1 ENOENT No such file or directory
        ^C#
      
      - Show information about the event (freq, nr_samples, total period/nr_events) in
        the annotate --tui and --stdio2 'perf annotate' output, similar to the
        first line in the 'perf report --tui', but just for the samples for
        the annotated symbol (Arnaldo Carvalho de Melo)
      
      - Introduce 'perf version --build-options' to show what features were
        linked, aliased as well as a shorter 'perf -vv' (Jin Yao)
      
      - Add a "dso_size" sort order (Kim Phillips)
      
      - Remove redundant ')' in the tracepoint output in 'perf trace' (Changbin Du)
      
      - Synchronize x86's cpufeatures.h, no effect on tools (Arnaldo Carvalho de Melo)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b89e7914
    • C
      perf trace: Remove redundant ')' · 51125a29
      Changbin Du committed
      There is a redundant ')' at the tail of each event. So remove it.
      
      $ sudo perf trace --no-syscalls -e 'kmem:*' -a
         899.342 kmem:kfree:(vfs_writev+0xb9) call_site=ffffffff9c453979 ptr=(nil))
         899.344 kmem:kfree:(___sys_recvmsg+0x188) call_site=ffffffff9c9b8b88 ptr=(nil))
      Signed-off-by: Changbin Du <changbin.du@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1520937601-24952-1-git-send-email-changbin.du@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      51125a29
    • A
      perf annotate stdio2: Print more descriptive event information header · 520d3f01
      Arnaldo Carvalho de Melo committed
      To match the recently added event header information to --tui, e.g.:
      
        # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
        Samples: 128  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 48617682
        _raw_spin_lock_irqsave() /proc/kcore
          0.78        nop
          7.03        push   %rbx
          3.12        pushfq
          6.25        pop    %rax
                      nop
                      mov    %rax,%rbx
          3.12        cli
                      nop
                      xor    %eax,%eax
                      mov    $0x1,%edx
         79.69        lock   cmpxchg %edx,(%rdi)
                      test   %eax,%eax
                    ↓ jne    2b
                      mov    %rbx,%rax
                      pop    %rbx
                    ← retq
                2b:   mov    %eax,%esi
                    → callq  *ffffffffb30eaed0
                      mov    %rbx,%rax
                      pop    %rbx
                    ← retq
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-ujy46x7cldyhyxelyf2b9quy@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      520d3f01
    • A
      perf annotate browser: Show extra title line with event information · 6920e285
      Arnaldo Carvalho de Melo committed
      So at the top we'll have two lines, like this, from 'perf report':
      
        # perf report --group --ignore-vmlinux
      =====================================================================================================
      Samples: 46  of events 'cycles', 4000 Hz, Event count (approx.): 5154895
      _raw_spin_lock_irqsave  /proc/kcore
      Percent              │      nop
                           │      push   %rbx
        0.00  14.29   0.00 │      pushfq
        9.09   0.00   0.00 │      pop    %rax
        9.09   0.00  20.00 │      nop
                           │      mov    %rax,%rbx
                           │      cli
        4.55   7.14   0.00 │      nop
                           │      xor    %eax,%eax
                           │      mov    $0x1,%edx
                           │      lock   cmpxchg %edx,(%rdi)
       77.27  78.57  70.00 │      test   %eax,%eax
                           │    ↓ jne    2b
                           │      mov    %rbx,%rax
        0.00   0.00  10.00 │      pop    %rbx
                           │    ← retq
                           │2b:   mov    %eax,%esi
                           │    → callq  queued_spin_lock_slowpath
                           │      mov    %rbx,%rax
                           │      pop    %rbx
      Press 'h' for help on│key bindings
      =====================================================================================================
      
       9.09 + 9.09 + 4.55 + 77.27 = 100
      14.29 + 7.14 + 78.57 = 100
      20 + 70 + 10 = 100
      
      We can do the math by using 't' to toggle from 'percent' to period:
      
      =====================================================================================================
      Samples: 46  of events 'cycles', 4000 Hz, Event count (approx.): 5154895
      _raw_spin_lock_irqsave  /proc/kcore
      Period                              │      nop
                                          │      push   %rbx
                0       79273           0 │      pushfq
           190455           0           0 │      pop    %rax
           198038           0        3045 │      nop
                                          │      mov    %rax,%rbx
                                          │      cli
           217233       32562           0 │      nop
                                          │      xor    %eax,%eax
                                          │      mov    $0x1,%edx
                                          │      lock   cmpxchg %edx,(%rdi)
          3421649      979174       28273 │      test   %eax,%eax
                                          │    ↓ jne    2b
                                          │      mov    %rbx,%rax
                0           0        5193 │      pop    %rbx
                                          │    ← retq
                                          │2b:   mov    %eax,%esi
                                          │    → callq  queued_spin_lock_slowpath
                                          │      mov    %rbx,%rax
                                          │      pop    %rbx
      Press 'h' for help on│key bindings
      =====================================================================================================
      
      79273 + 190455 + 198038 + 3045 + 217233 + 32562 + 3421649 + 979174 + 28273 + 5193 = 5154895
      
      Or number of samples:
      
      =====================================================================================================
      Samples: 46  of events 'cycles', 4000 Hz, Event count (approx.): 5154895
      _raw_spin_lock_irqsave  /proc/kcore
      Samples              │      nop
                           │      push   %rbx
           0      2      0 │      pushfq
           2      0      0 │      pop    %rax
           2      0      2 │      nop
                           │      mov    %rax,%rbx
                           │      cli
           1      1      0 │      nop
                           │      xor    %eax,%eax
                           │      mov    $0x1,%edx
                           │      lock   cmpxchg %edx,(%rdi)
          17     11      7 │      test   %eax,%eax
                           │    ↓ jne    2b
                           │      mov    %rbx,%rax
           0      0      1 │      pop    %rbx
                           │    ← retq
                           │2b:   mov    %eax,%esi
                           │    → callq  queued_spin_lock_slowpath
                           │      mov    %rbx,%rax
                           │      pop    %rbx
      Press 'h' for help on key bindings
      =====================================================================================================
      
      2 + 2 + 2 + 2 + 1 + 1 + 17 + 11 + 7 + 1 = 46
      Suggested-by: Martin Liška <mliska@suse.cz>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935
      Link: https://lkml.kernel.org/n/tip-ezccyxld50wtwyt66np6aomo@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6920e285
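The arithmetic in this commit message can be checked with a short sketch. The percentages in the first view are consistent with each column's share of its own sample counts (for example 17 of 22 samples in the first column), and the periods add up to the event count in the title line. Both function names are hypothetical.

```c
#include <stddef.h>

/* Share of row 'i' within one column of sample counts, in percent. */
double sketch_column_percent(const unsigned int *col, size_t n, size_t i)
{
	unsigned int total = 0;

	for (size_t k = 0; k < n; k++)
		total += col[k];

	return total ? 100.0 * col[i] / total : 0.0;
}

/* Sum of all per-line periods, to compare with the title's event count. */
unsigned long long sketch_sum_periods(const unsigned long long *v, size_t n)
{
	unsigned long long sum = 0;

	for (size_t k = 0; k < n; k++)
		sum += v[k];

	return sum;
}
```

Feeding the first column's non-zero sample counts (2, 2, 1, 17) into `sketch_column_percent()` gives roughly 9.09, 9.09, 4.55 and 77.27, which agrees with the percent view.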
    • A
      perf annotate: Introduce annotation__scnprintf_samples_period() method · b213eac2
      Arnaldo Carvalho de Melo committed
      To print a string using the total period (nr_events) and the number of
      samples for a given annotation, i.e. for a given symbol, the counterpart
      to hists__scnprintf_samples_period(), that is for all the samples in a
      session (be it a live session, think 'perf top' or a perf.data file,
      think 'perf report').
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935
      Link: https://lkml.kernel.org/n/tip-goj2wu4fxutc8vd46mw3yg14@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b213eac2
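A rough sketch of what a title-line formatter of this kind could look like. The function name, parameter list and exact spacing are assumptions and do not reproduce the real annotation__scnprintf_samples_period(); the return value follows the scnprintf convention of reporting the characters actually stored.

```c
#include <stdio.h>

/*
 * Hypothetical formatter for a per-symbol title line. Returns the number
 * of characters written into 'bf', excluding the terminating NUL, even
 * when the output had to be truncated.
 */
int sketch_scnprintf_samples_period(char *bf, size_t size,
				    unsigned long nr_samples,
				    const char *ev_name, unsigned int freq,
				    unsigned long long period)
{
	int n;

	if (size == 0)
		return 0;

	n = snprintf(bf, size,
		     "Samples: %lu of event '%s', %u Hz, Event count (approx.): %llu",
		     nr_samples, ev_name, freq, period);
	if (n < 0)
		return 0;
	if ((size_t)n >= size)
		return (int)(size - 1); /* truncated: only size - 1 chars stored */

	return n;
}
```

Returning the stored length rather than the would-be length makes it safe to chain several calls that append to the same buffer.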
  8. 03 Apr, 2018 4 commits