1. 30 Jun, 2017 1 commit
    • D
      bpf: prevent leaking pointer via xadd on unpriviledged · 6bdf6abc
      Daniel Borkmann committed
      Leaking kernel addresses to unprivileged programs is generally
      disallowed; for example, the verifier rejects the following:
      
        0: (b7) r0 = 0
        1: (18) r2 = 0xffff897e82304400
        3: (7b) *(u64 *)(r1 +48) = r2
        R2 leaks addr into ctx
      
      Doing pointer arithmetic on them is also forbidden, so that they
      don't turn into an unknown value and then get leaked out. However,
      there's xadd as a special case, where we don't check the src reg
      for being a pointer register, e.g. the following will pass:
      
        0: (b7) r0 = 0
        1: (7b) *(u64 *)(r1 +48) = r0
        2: (18) r2 = 0xffff897e82304400 ; map
        4: (db) lock *(u64 *)(r1 +48) += r2
        5: (95) exit
      
      We could store the pointer into skb->cb, lose the type context,
      and then read it out from there again to leak it eventually out
      of a map value. Or more easily in a different variant, too:
      
         0: (bf) r6 = r1
         1: (7a) *(u64 *)(r10 -8) = 0
         2: (bf) r2 = r10
         3: (07) r2 += -8
         4: (18) r1 = 0x0
         6: (85) call bpf_map_lookup_elem#1
         7: (15) if r0 == 0x0 goto pc+3
         R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R6=ctx R10=fp
         8: (b7) r3 = 0
         9: (7b) *(u64 *)(r0 +0) = r3
        10: (db) lock *(u64 *)(r0 +0) += r6
        11: (b7) r0 = 0
        12: (95) exit
      
        from 7 to 11: R0=inv,min_value=0,max_value=0 R6=ctx R10=fp
        11: (b7) r0 = 0
        12: (95) exit
      
      Prevent this by checking xadd src reg for pointer types. Also
      add a couple of test cases related to this.
      
      Fixes: 1be7f75d ("bpf: enable non-root eBPF programs")
      Fixes: 17a52670 ("bpf: verifier (add verifier core)")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6bdf6abc
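The check described above can be pictured with a heavily reduced model. This is only a sketch, not the verifier's code: the register types, `check_xadd_src()` and its return convention are illustrative assumptions, and `allow_ptr_leaks` simply stands for "the loader is privileged".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of register types; the real verifier has more. */
enum toy_reg_type { NOT_A_POINTER, PTR_TO_CTX, PTR_TO_MAP_VALUE, PTR_TO_STACK };

static bool is_pointer(enum toy_reg_type t)
{
	return t != NOT_A_POINTER;
}

/*
 * Toy xadd source check: returns 0 when the instruction is accepted and
 * -1 when it is rejected because the source register holds a pointer
 * that an unprivileged program could leak through memory.
 */
static int check_xadd_src(enum toy_reg_type src, bool allow_ptr_leaks)
{
	if (!allow_ptr_leaks && is_pointer(src))
		return -1;
	return 0;
}
```

With this model, `lock *(u64 *)(r0 +0) += r6` with `r6` holding the ctx pointer is rejected for an unprivileged loader, while the same instruction with a plain number in the source register is accepted.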
  2. 20 Jun, 2017 1 commit
  3. 17 Jun, 2017 1 commit
    • M
      perf unwind: Report module before querying isactivation in dwfl unwind · 9126cbba
      Milian Wolff committed
      The PC returned by dwfl_frame_pc() may map into a not-yet-reported
      module. We have to report it before we continue unwinding. But when we
      query for the isactivation flag in dwfl_frame_pc, libdw will actually do
      one more unwinding step internally which can then break and lead to
      missed frames or broken stacks.
      
      With libunwind we get e.g.:
      
      ~~~~~
        heaptrack_gui  2228 135073.400474:     613969 cycles:
      	          108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
      	          1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
      	          109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
      	          10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
      	          1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
      	           92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	           93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	           93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
      	          2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
      	           f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
      	          1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
      	           78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
      	           20439 __libc_start_main (/usr/lib/libc-2.25.so)
      	           78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)
      
        heaptrack_gui  2228 135073.401156:     569521 cycles:
      	          131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0)
      	          1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0)
      	          21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0)
      	          2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0)
      	          279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0)
      	           e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0)
      	           f5a1c QGuiApplicationPrivate::createPlatformIntegration (/usr/lib/libQt5Gui.so.5.8.0)
      	           f650c QGuiApplicationPrivate::createEventDispatcher (/usr/lib/libQt5Gui.so.5.8.0)
      	          298524 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
      	           f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
      	          1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
      	           78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
      	           20439 __libc_start_main (/usr/lib/libc-2.25.so)
      	           78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)
      ~~~~~
      
      Note the two frames 1589e8 and 78622 in the first sample. These are
      missing when unwinding with libdw. The second sample's breakage is
      more obvious:
      
      ~~~~~
        heaptrack_gui  2228 135073.400474:     613969 cycles:
      	          108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
      	          1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
      	          109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
      	          10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
      	          1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
      	           92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	           93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	           93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
      	          2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
      	           f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
      	           20439 __libc_start_main (/usr/lib/libc-2.25.so)
      	           78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)
      
      heaptrack_gui  2228 135073.401156:     569521 cycles:
      	          131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0)
      	          1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0)
      	          21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0)
      	          2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0)
      	          279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0)
      	           e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0)
      	          723dbf [unknown] ([unknown])
      ~~~~~
      
      This patch fixes this issue and the libdw unwinder mimics the libunwind
      behavior more closely.
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Acked-by: Jan Kratochvil <jan.kratochvil@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/20170602143753.16907-2-milian.wolff@kdab.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9126cbba
  4. 16 Jun, 2017 1 commit
  5. 15 Jun, 2017 2 commits
    • J
      perf tools: Fix build with ARCH=x86_64 · 7a759cd8
      Jiada Wang committed
      With commit 0a943cb1 ("tools build: Add HOSTARCH Makefile variable"),
      when building for ARCH=x86_64, ARCH=x86_64 is passed to perf instead of
      ARCH=x86, so the perf build process searches for header files in
      tools/arch/x86_64/include, which doesn't exist.
      
      The following build failure is seen:
      
        In file included from util/event.c:2:0:
          tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h: No such file or directory
          compilation terminated.
      
      Fix this issue by using SRCARCH instead of ARCH in perf, just like the
      main kernel Makefile and tools/objtool's.
      Signed-off-by: Jiada Wang <jiada_wang@mentor.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Eugeniu Rosca <erosca@de.adit-jv.com>
      Cc: Jan Stancek <jstancek@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Rui Teng <rui.teng@linux.vnet.ibm.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 0a943cb1 ("tools build: Add HOSTARCH Makefile variable")
      Link: http://lkml.kernel.org/r/1491793357-14977-2-git-send-email-jiada_wang@mentor.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7a759cd8
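The ARCH versus SRCARCH distinction above can be sketched as a small mapping function. This is an illustration in C rather than Makefile syntax; `srcarch_for()` is a hypothetical name, and only the x86 aliases are modelled.

```c
#include <assert.h>
#include <string.h>

/*
 * Maps a user-facing ARCH value to the source directory name used under
 * arch/ and tools/arch/. Only the x86 aliases are handled here; every
 * other value is passed through unchanged.
 */
static const char *srcarch_for(const char *arch)
{
	if (strcmp(arch, "x86_64") == 0 || strcmp(arch, "i386") == 0)
		return "x86";
	return arch;
}
```

Under this mapping, a build invoked with ARCH=x86_64 looks for headers in tools/arch/x86/include rather than the non-existent tools/arch/x86_64/include.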
    • A
      perf evsel: Fix probing of precise_ip level for default cycles event · 7a1ac110
      Arnaldo Carvalho de Melo committed
      Since commit 18e7a45a ("perf/x86: Reject non sampling events with
      precise_ip"), sys_perf_event_open() returns -EINVAL for an attribute
      with (attr.precise_ip > 0 && attr.sample_period == 0), which is exactly
      what is passed by the routine used to probe the max precise level when
      no events were passed to 'perf record' or 'perf top', i.e.:
      
      	perf_evsel__new_cycles()
      		perf_event_attr__set_max_precise_ip()
      
      Starting with the aforementioned commit, the x86 code in
      x86_pmu_hw_config(), which is called all the way from
      sys_perf_event_open(), does:
      
                      /* There's no sense in having PEBS for non sampling events: */
                      if (!is_sampling_event(event))
                              return -EINVAL;
      
      Which makes it fail for cycles:ppp, cycles:pp and cycles:p, always using
      just the non precise cycles variant.
      
      To make sure that this is the case, I tested it, before this patch,
      with:
      
        # perf probe -L x86_pmu_hw_config
        <x86_pmu_hw_config@/home/acme/git/linux/arch/x86/events/core.c:0>
              0  int x86_pmu_hw_config(struct perf_event *event)
              1  {
              2         if (event->attr.precise_ip) {
      <SNIP>
             17                 if (event->attr.precise_ip > precise)
             18                         return -EOPNOTSUPP;
      
                                /* There's no sense in having PEBS for non sampling events: */
             21                 if (!is_sampling_event(event))
             22                         return -EINVAL;
                        }
      <SNIP>
        # perf probe x86_pmu_hw_config:22
        Added new events:
          probe:x86_pmu_hw_config (on x86_pmu_hw_config:22)
          probe:x86_pmu_hw_config_1 (on x86_pmu_hw_config:22)
      
        You can now use it in all perf tools, such as:
      
              perf record -e probe:x86_pmu_hw_config_1 -aR sleep 1
      
        # perf trace -e perf_event_open,probe:x86_pmu_hwconfig*/max-stack=16/ perf record usleep 1
           0.000 ( 0.015 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1      ) ...
           0.015 (         ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
                                             x86_pmu_hw_config ([kernel.kallsyms])
                                             hsw_hw_config ([kernel.kallsyms])
                                             x86_pmu_event_init ([kernel.kallsyms])
                                             perf_try_init_event ([kernel.kallsyms])
                                             perf_event_alloc ([kernel.kallsyms])
                                             SYSC_perf_event_open ([kernel.kallsyms])
                                             sys_perf_event_open ([kernel.kallsyms])
                                             do_syscall_64 ([kernel.kallsyms])
                                             return_from_SYSCALL_64 ([kernel.kallsyms])
                                             syscall (/usr/lib64/libc-2.24.so)
                                             perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
                                             perf_evsel__new_cycles (/home/acme/bin/perf)
                                             perf_evlist__add_default (/home/acme/bin/perf)
                                             cmd_record (/home/acme/bin/perf)
                                             run_builtin (/home/acme/bin/perf)
                                             handle_internal_command (/home/acme/bin/perf)
           0.000 ( 0.021 ms): perf/4150  ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
           0.023 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1      ) ...
           0.025 (         ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
                                             x86_pmu_hw_config ([kernel.kallsyms])
                                             hsw_hw_config ([kernel.kallsyms])
                                             x86_pmu_event_init ([kernel.kallsyms])
                                             perf_try_init_event ([kernel.kallsyms])
                                             perf_event_alloc ([kernel.kallsyms])
                                             SYSC_perf_event_open ([kernel.kallsyms])
                                             sys_perf_event_open ([kernel.kallsyms])
                                             do_syscall_64 ([kernel.kallsyms])
                                             return_from_SYSCALL_64 ([kernel.kallsyms])
                                             syscall (/usr/lib64/libc-2.24.so)
                                             perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
                                             perf_evsel__new_cycles (/home/acme/bin/perf)
                                             perf_evlist__add_default (/home/acme/bin/perf)
                                             cmd_record (/home/acme/bin/perf)
                                             run_builtin (/home/acme/bin/perf)
                                             handle_internal_command (/home/acme/bin/perf)
           0.023 ( 0.004 ms): perf/4150  ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
           0.028 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1      ) ...
           0.030 (         ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
                                             x86_pmu_hw_config ([kernel.kallsyms])
                                             hsw_hw_config ([kernel.kallsyms])
                                             x86_pmu_event_init ([kernel.kallsyms])
                                             perf_try_init_event ([kernel.kallsyms])
                                             perf_event_alloc ([kernel.kallsyms])
                                             SYSC_perf_event_open ([kernel.kallsyms])
                                             sys_perf_event_open ([kernel.kallsyms])
                                             do_syscall_64 ([kernel.kallsyms])
                                             return_from_SYSCALL_64 ([kernel.kallsyms])
                                             syscall (/usr/lib64/libc-2.24.so)
                                             perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
                                             perf_evsel__new_cycles (/home/acme/bin/perf)
                                             perf_evlist__add_default (/home/acme/bin/perf)
                                             cmd_record (/home/acme/bin/perf)
                                             run_builtin (/home/acme/bin/perf)
                                             handle_internal_command (/home/acme/bin/perf)
           0.028 ( 0.004 ms): perf/4150  ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
          41.018 ( 0.012 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8b5dd0, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
          41.065 ( 0.011 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
          41.080 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
          41.103 ( 0.010 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
          41.115 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
          41.122 ( 0.004 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
          41.128 ( 0.008 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ]
        #
      
      I.e. that return -EINVAL in x86_pmu_hw_config() is hit three times.
      
      So fix it by just setting attr.sample_period while probing.
      
      Now, after this patch:
      
        # perf trace --max-stack=2 -e perf_event_open,probe:x86_pmu_hw_config* perf record usleep 1
        [ perf record: Woken up 1 times to write data ]
           0.000 ( 0.017 ms): perf/8469 perf_event_open(attr_uptr: 0x7ffe36c27d10, pid: -1, cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 4
                                             syscall (/usr/lib64/libc-2.24.so)
                                             perf_event_open_cloexec_flag (/home/acme/bin/perf)
           0.050 ( 0.031 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
                                             syscall (/usr/lib64/libc-2.24.so)
                                             perf_evlist__config (/home/acme/bin/perf)
           0.092 ( 0.040 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
                                             syscall (/usr/lib64/libc-2.24.so)
                                             perf_evlist__config (/home/acme/bin/perf)
           0.143 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, cpu: -1, group_fd: -1           ) = 4
                                             syscall (/usr/lib64/libc-2.24.so)
                                             perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
           0.161 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
                                             syscall (/usr/lib64/libc-2.24.so)
                                             perf_evsel__open (/home/acme/bin/perf)
           0.171 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
                                             syscall (/usr/lib64/libc-2.24.so)
                                             perf_evsel__open (/home/acme/bin/perf)
           0.180 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
                                             syscall (/usr/lib64/libc-2.24.so)
                                             perf_evsel__open (/home/acme/bin/perf)
           0.190 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
                                             syscall (/usr/lib64/libc-2.24.so)
                                             perf_evsel__open (/home/acme/bin/perf)
        [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
        #
      
      The probe one called from perf_event_attr__set_max_precise_ip() works
      the first time, with attr.precise_ip = 3, with the next ones being the
      per cpu ones for the cycles:ppp event.
      
      And here is the text from a report and alternative proposed patch by
      Thomas-Mich Richter:
      
       ---
      
      On s390 the counter and sampling facilities do not support a precise IP
      skid level and sometimes return EOPNOTSUPP when structure member
      precise_ip in struct perf_event_attr is not set to zero.
      
      On s390 the command 'perf record -- true' fails with error EOPNOTSUPP.
      This happens only when no events are specified on the command line.
      
      The functions called are
      ...
        --> perf_evlist__add_default
            --> perf_evsel__new_cycles
                --> perf_event_attr__set_max_precise_ip
      
      The last function determines the value of structure member precise_ip by
      invoking the perf_event_open() system call and checking the return code.
      The first successful open is the value for precise_ip.
      
      However the value is determined without setting member sample_period and
      indicates no sampling.
      
      On s390 the counter facility and sampling facility are different.  The
      above procedure determines a precise_ip value of 3 using the counter
      facility. Later it uses the sampling facility with a value of 3 and
      fails with EOPNOTSUPP.
      
       ---
      
      v2: Older compilers (e.g. gcc 4.4.7) don't support referencing members
          of unnamed union members in the container struct initialization, so
          move from:
      
      	struct perf_event_attr attr = {
      		...
      		.sample_period = 1,
      	};
      
      to right after it as:
      
      	struct perf_event_attr attr = {
      		...
      	};
      
      	attr.sample_period = 1;
      
      v3: We need to reset .sample_period to 0 to let the users of
      perf_evsel__new_cycles() properly set up attr.sample_period or
      attr.sample_freq. Reported by Ingo Molnar.
      Reported-and-Acked-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Acked-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 18e7a45a ("perf/x86: Reject non sampling events with precise_ip")
      Link: http://lkml.kernel.org/n/tip-yv6nnkl7tzqocrm0hl3x7vf1@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7a1ac110
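The probing logic discussed in this commit, including the v2 and v3 notes, can be sketched as follows. This is a toy model under stated assumptions: `struct toy_attr` stands in for the two relevant members of struct perf_event_attr, `try_open` is a hypothetical callback replacing the real system call, and `fake_open()` imitates a kernel that rejects precise non-sampling events and supports precise_ip up to 2.

```c
#include <assert.h>

/* Toy stand-in for the two struct perf_event_attr members discussed above. */
struct toy_attr {
	unsigned int precise_ip;
	unsigned long long sample_period;
};

/* Hypothetical replacement for the syscall: 0 on success, -1 on failure. */
typedef int (*open_fn)(const struct toy_attr *attr);

static void set_max_precise_ip(struct toy_attr *attr, open_fn try_open)
{
	attr->sample_period = 1;	/* probe as a sampling event */
	attr->precise_ip = 3;
	while (attr->precise_ip != 0) {
		if (try_open(attr) == 0)
			break;		/* first level that opens wins */
		--attr->precise_ip;
	}
	attr->sample_period = 0;	/* v3: caller picks period or freq later */
}

/* Fake kernel: no precise non-sampling events, precise_ip <= 2 supported. */
static int fake_open(const struct toy_attr *attr)
{
	if (attr->precise_ip > 0 && attr->sample_period == 0)
		return -1;
	return attr->precise_ip <= 2 ? 0 : -1;
}
```

Without the `sample_period = 1` line, every precise level would be refused by `fake_open()` and the loop would always fall back to precise_ip = 0, which is the behavior the traces above show before the patch.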
  6. 09 Jun, 2017 10 commits
  7. 08 Jun, 2017 6 commits
  8. 06 Jun, 2017 7 commits
    • M
      perf report: Ensure the perf DSO mapping matches what libdw sees · 2538b9e2
      Milian Wolff committed
      In some situations the libdw unwinder stopped working properly.  I.e.
      with libunwind we see:
      
      ~~~~~
      heaptrack_gui  2228 135073.400112:     641314 cycles:
      	            e8ed _dl_fixup (/usr/lib/ld-2.25.so)
      	           15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
      	           ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
      	           608f3 _GLOBAL__sub_I_kdynamicjobtracker.cpp (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
      	            f199 call_init.part.0 (/usr/lib/ld-2.25.so)
      	            f2a5 _dl_init (/usr/lib/ld-2.25.so)
      	             db9 _dl_start_user (/usr/lib/ld-2.25.so)
      ~~~~~
      
      But with libdw and without this patch this sample is not properly
      unwound:
      
      ~~~~~
      heaptrack_gui  2228 135073.400112:     641314 cycles:
      	            e8ed _dl_fixup (/usr/lib/ld-2.25.so)
      	           15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
      	           ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
      ~~~~~
      
      Debug output showed me that libdw found a module for the last frame
      address, but it thinks it belongs to /usr/lib/ld-2.25.so. This patch
      double-checks what libdw sees and what perf knows. If the mappings
      mismatch, we now report the elf known to perf. This fixes the situation
      above, and the libdw unwinder produces the same stack as libunwind.
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/20170602143753.16907-1-milian.wolff@kdab.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2538b9e2
    • M
      perf report: Include partial stacks unwound with libdw · 5ea0416f
      Milian Wolff committed
      So far the whole stack was thrown away when any error occurred before
      the maximum stack depth was unwound. This is actually a very common
      scenario though. The stacks that got unwound so far are still
      interesting. This removes a large chunk of differences when comparing
      perf script output for libunwind and libdw perf unwinding.
      
      E.g. with libunwind:
      
      ~~~~~
      heaptrack_gui  2228 135073.388524:     479408 cycles:
              ffffffff811749ed perf_iterate_ctx ([kernel.kallsyms])
              ffffffff81181662 perf_event_mmap ([kernel.kallsyms])
              ffffffff811cf5ed mmap_region ([kernel.kallsyms])
              ffffffff811cfe6b do_mmap ([kernel.kallsyms])
              ffffffff811b0dca vm_mmap_pgoff ([kernel.kallsyms])
              ffffffff811cdb0c sys_mmap_pgoff ([kernel.kallsyms])
              ffffffff81033acb sys_mmap ([kernel.kallsyms])
              ffffffff81631d37 entry_SYSCALL_64_fastpath ([kernel.kallsyms])
                         192ca mmap64 (/usr/lib/ld-2.25.so)
                          59a9 _dl_map_object_from_fd (/usr/lib/ld-2.25.so)
                          83d0 _dl_map_object (/usr/lib/ld-2.25.so)
                          cda1 openaux (/usr/lib/ld-2.25.so)
                         1834f _dl_catch_error (/usr/lib/ld-2.25.so)
                          cfe2 _dl_map_object_deps (/usr/lib/ld-2.25.so)
                          3481 dl_main (/usr/lib/ld-2.25.so)
                         17387 _dl_sysdep_start (/usr/lib/ld-2.25.so)
                          4d37 _dl_start (/usr/lib/ld-2.25.so)
                           d87 _start (/usr/lib/ld-2.25.so)
      
      heaptrack_gui  2228 135073.388677:     611329 cycles:
                         1a3e0 strcmp (/usr/lib/ld-2.25.so)
                          82b2 _dl_map_object (/usr/lib/ld-2.25.so)
                          cda1 openaux (/usr/lib/ld-2.25.so)
                         1834f _dl_catch_error (/usr/lib/ld-2.25.so)
                          cfe2 _dl_map_object_deps (/usr/lib/ld-2.25.so)
                          3481 dl_main (/usr/lib/ld-2.25.so)
                         17387 _dl_sysdep_start (/usr/lib/ld-2.25.so)
                          4d37 _dl_start (/usr/lib/ld-2.25.so)
                           d87 _start (/usr/lib/ld-2.25.so)
      ~~~~~
      
      With libdw without this patch:
      
      ~~~~~
      heaptrack_gui  2228 135073.388524:     479408 cycles:
              ffffffff811749ed perf_iterate_ctx ([kernel.kallsyms])
              ffffffff81181662 perf_event_mmap ([kernel.kallsyms])
              ffffffff811cf5ed mmap_region ([kernel.kallsyms])
              ffffffff811cfe6b do_mmap ([kernel.kallsyms])
              ffffffff811b0dca vm_mmap_pgoff ([kernel.kallsyms])
              ffffffff811cdb0c sys_mmap_pgoff ([kernel.kallsyms])
              ffffffff81033acb sys_mmap ([kernel.kallsyms])
              ffffffff81631d37 entry_SYSCALL_64_fastpath ([kernel.kallsyms])
      
      heaptrack_gui  2228 135073.388677:     611329 cycles:
      ~~~~~
      
      With this patch applied, the libdw unwinder will produce the same
      output as the libunwind unwinder.
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/20170601210021.20046-1-milian.wolff@kdab.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5ea0416f
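The change in policy described in this commit, keeping the frames unwound so far instead of discarding the whole stack on error, can be sketched with a toy unwinder. The names `next_frame_fn`, `unwind_keep_partial()` and `three_then_error()` are hypothetical and do not correspond to libdw or perf functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame source: returns 0 and stores a PC, or -1 on error. */
typedef int (*next_frame_fn)(size_t depth, unsigned long *pc);

/*
 * Collects up to max_depth frames into pcs. If an error occurs part-way,
 * the frames gathered so far are kept and their count is returned,
 * instead of reporting an empty stack.
 */
static size_t unwind_keep_partial(next_frame_fn next, unsigned long *pcs,
				  size_t max_depth)
{
	size_t n = 0;

	while (n < max_depth) {
		if (next(n, &pcs[n]) != 0)
			break;
		n++;
	}
	return n;
}

/* Fake unwinder that fails after three frames. */
static int three_then_error(size_t depth, unsigned long *pc)
{
	if (depth >= 3)
		return -1;
	*pc = 0x1000 + depth;
	return 0;
}
```

Here a caller asking for eight frames still receives the three that were unwound before the failure.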
    • K
      perf annotate: Add missing powerpc triplet · 6db47fde
      Kim Phillips committed
      On an Ubuntu xenial system, 'perf annotate' says to install powerpc
      objdump on a system that already has binutils-powerpc-linux-gnu
      installed.  Make perf aware of the missing triplet for the
      powerpc-linux-gnu target.
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20170529142754.7fbfb1152fd8f2663de0ea70@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6db47fde
    • J
      perf test: Disable breakpoint signal tests for powerpc · 598762cf
      Jiri Olsa committed
      The following tests are failing on powerpc:
      
        # perf test break
        18: Breakpoint overflow signal handler  : FAILED!
        19: Breakpoint overflow sampling        : FAILED!
      
      The powerpc kernel so far does not have support to even create
      instruction breakpoints using the perf event interface, so those tests
      fail early in the config phase.
      
      I added a '->is_supported()' callback to the test struct to be able to
      disable specific tests. It seems better than putting ifdefs directly
      into the test array.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170601205450.GA398@krava
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      598762cf
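The '->is_supported()' idea can be sketched as an optional callback consulted before a test runs. The struct layout, the result codes and the helper names below are illustrative assumptions, not the definitions used by perf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
enum { TOY_OK = 0, TOY_FAIL = -1, TOY_SKIP = -2 };

struct toy_test {
	const char *desc;
	int (*func)(void);
	bool (*is_supported)(void);	/* optional; NULL means always supported */
};

/* Runs the test unless its is_supported() callback says otherwise. */
static int run_toy_test(const struct toy_test *t)
{
	if (t->is_supported && !t->is_supported())
		return TOY_SKIP;
	return t->func();
}

/* Stand-ins for a test that cannot work on some architecture. */
static int always_fails(void) { return TOY_FAIL; }
static bool not_supported(void) { return false; }
```

Keeping the decision in a callback means the test array stays identical on every architecture, and the per-architecture knowledge lives next to the test itself.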
    • N
      perf symbols: Use correct filename for compressed modules in build-id cache · a09935b8
      Namhyung Kim committed
      decompress_kmodule() decompresses kernel modules in order to load
      symbols from them.  In the DSO_BINARY_TYPE__BUILD_ID_CACHE case, it needs
      the full file path to extract the file extension to determine the
      decompression method.  But overwriting 'name' makes the decompression
      fail, since it might point to a non-existing old file.
      
      Instead, use dso->long_name for having the correct extension and use the
      real filename to decompress.
      
      In the DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP case, both names should
      be the same.  This allows resolving symbols in the old modules.
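
      The separation of the two names can be sketched roughly as follows.
      This is an illustration under assumptions, not the patch itself:
      'path_ext', 'struct decomp_request' and 'make_request' are hypothetical
      names, and the cache path used in the usage note is made up.

```c
#include <string.h>

/*
 * Returns the extension after the last '.' in the final path component,
 * or NULL when that component has no extension.
 */
static const char *path_ext(const char *path)
{
	const char *base = strrchr(path, '/');
	const char *dot;

	base = base ? base + 1 : path;
	dot = strrchr(base, '.');
	return (dot && dot[1]) ? dot + 1 : NULL;
}

/* Hypothetical pairing of "which method" and "which file". */
struct decomp_request {
	const char *ext;	/* selects the decompression method */
	const char *file;	/* file that is actually opened */
};

/* Takes the extension from the long name, the data from the real file. */
static struct decomp_request make_request(const char *long_name,
					   const char *real_file)
{
	struct decomp_request req = { path_ext(long_name), real_file };

	return req;
}
```

      With a long name ending in 'scsi_mod.ko.gz' and a real file such as a
      (hypothetical) build-id cache entry without any extension, the request
      still selects "gz" while opening the cache entry.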
      
      Before:
      
        $ perf report -i perf.data.old | grep scsi_mod
           0.00%  cc1      [scsi_mod]    [k] 0x0000000000004aa6
           0.00%  as       [scsi_mod]    [k] 0x00000000000099e1
           0.00%  cc1      [scsi_mod]    [k] 0x0000000000009830
           0.00%  cc1      [scsi_mod]    [k] 0x0000000000001b8f
      
      After:
      
           0.00%  cc1      [scsi_mod]    [k] scsi_handle_queue_ramp_up
           0.00%  as       [scsi_mod]    [k] scsi_sg_alloc
           0.00%  cc1      [scsi_mod]    [k] scsi_setup_cmnd
           0.00%  cc1      [scsi_mod]    [k] scsi_get_command
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170531120105.21731-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf symbols: Set module info when build-id event found · 6b335e8f
      Namhyung Kim committed
      Like machine__findnew_module_dso(), it should set necessary info for
      kernel modules to find symbol info from the file.  Factor out
      dso__set_module_info() to do it.
      
      This is needed for dso__needs_decompress() to detect such DSOs.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170531120105.21731-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf header: Set proper module name when build-id event found · 1deec1bd
      Namhyung Kim committed
      When perf processes a build-id event, it creates DSOs with the build-id.
      But it didn't set the module short name (like '[module-name]'), so when
      processing a kernel mmap event of the module, it cannot find the DSO, as
      it only checks the short names.
      
      That leads perf to create a duplicate DSO without the build-id info, and
      it will look up the system path even if the DSO is already in the build-id
      cache.  After the kernel is updated, perf cannot find the DSO and cannot
      show symbols in it anymore.
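
      For illustration, a '[module-name]' short name could be derived from a
      module path along these lines.  This is a sketch under assumptions:
      'module_short_name' is a hypothetical helper, not perf's function, and
      it skips details a real implementation has to handle.

```c
#include <stdio.h>
#include <string.h>

/*
 * Writes "[name]" for a path like ".../name.ko" or ".../name.ko.gz" into
 * buf.  Returns 0 on success, -1 if the path does not look like a kernel
 * module or the buffer is too small.
 */
static int module_short_name(const char *path, char *buf, size_t size)
{
	const char *base = strrchr(path, '/');
	const char *ko;
	int len;

	base = base ? base + 1 : path;

	/* Simplification: the first ".ko" in the file name ends the name. */
	ko = strstr(base, ".ko");
	if (!ko || ko == base)
		return -1;

	len = snprintf(buf, size, "[%.*s]", (int)(ko - base), base);
	return (len < 0 || (size_t)len >= size) ? -1 : 0;
}
```

      Setting such a short name when the build-id event is processed lets a
      later lookup by short name find the existing DSO instead of creating a
      second one.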
      
      You can see this if you have an old data file (w/ old kernel version):
      
        $ perf report -i perf.data.old -v |& grep scsi_mod
        build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz : cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
        Failed to open /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz, continuing without symbols
        ...
      
      The second message didn't show the build-id.  With this patch:
      
        $ perf report -i perf.data.old -v |& grep scsi_mod
        build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz: cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
        /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz with build id cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1 not found, continuing without symbols
        ...
      
      Now it shows the build-id but still cannot load the symbol table.  This
      is a different problem which will be fixed in the next patch.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170531120105.21731-1-namhyung@kernel.org
      [ Fix the build on older compilers (debian <= 8, fedora <= 21, etc) wrt kmod_path var init ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  9. 02 Jun, 2017 2 commits
    • A
      perf stat: Only print NMI watchdog hint when enabled · 918c7b06
      Andi Kleen committed
      Only print the NMI watchdog hint when the watchdog is actually enabled.
      
      This avoids printing the hint unnecessarily.
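
      One way to make such a check is sketched below; it is not the code
      from the patch.  It assumes the watchdog state is exposed as an
      integer in /proc/sys/kernel/nmi_watchdog; 'sysctl_flag_enabled' and
      'print_nmi_hint' are hypothetical helpers and the hint text is a
      placeholder.

```c
#include <stdbool.h>
#include <stdio.h>

/* Returns true only if 'path' can be read and holds a positive integer. */
static bool sysctl_flag_enabled(const char *path)
{
	FILE *fp = fopen(path, "r");
	int value = 0;
	bool enabled;

	if (!fp)
		return false;	/* missing file: treat as disabled */

	enabled = fscanf(fp, "%d", &value) == 1 && value > 0;
	fclose(fp);
	return enabled;
}

/* Prints a placeholder hint only when the watchdog appears to be enabled. */
static void print_nmi_hint(FILE *out)
{
	if (sysctl_flag_enabled("/proc/sys/kernel/nmi_watchdog"))
		fprintf(out, "hint: the NMI watchdog is enabled\n");
}
```

      Treating an unreadable file as "disabled" keeps the hint quiet on
      systems that have no NMI watchdog at all.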
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/n/tip-lnw7edxnqsphkmeew857wz1i@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • K
      perf annotate: Fix branch instruction with multiple operands · b13bbeee
      Kim Phillips committed
      'perf annotate' is dropping the cr* fields from branch instructions.
      
      Fix it by adding support to display branch instructions having
      multiple operands.
      
      Power Arch objdump of int_sqrt:
      
       20.36 | c0000000004d2694:   subf   r10,r10,r3
             | c0000000004d2698: v bgt    cr6,c0000000004d26a0 <int_sqrt+0x40>
        1.82 | c0000000004d269c:   mr     r3,r10
       29.18 | c0000000004d26a0:   mr     r10,r8
             | c0000000004d26a4: v bgt    cr7,c0000000004d26ac <int_sqrt+0x4c>
             | c0000000004d26a8:   mr     r10,r7
      
      Power Arch Before Patch:
      
       20.36 |       subf   r10,r10,r3
             |     v bgt    40
        1.82 |       mr     r3,r10
       29.18 | 40:   mr     r10,r8
             |     v bgt    4c
             |       mr     r10,r7
      
      Power Arch After patch:
      
       20.36 |       subf   r10,r10,r3
             |     v bgt    cr6,40
        1.82 |       mr     r3,r10
       29.18 | 40:   mr     r10,r8
             |     v bgt    cr7,4c
             |       mr     r10,r7
      
      Also support AArch64 conditional branch instructions, which can
      have up to three operands:
      
      Aarch64 Non-simplified (raw objdump) view:
      
             │ffff0000083cd11c: ↑ cbz    w0, ffff0000083cd100 <security_fil▒
      ...
        4.44 │ffff000│083cd134: ↓ tbnz   w0, #26, ffff0000083cd190 <securit▒
      ...
        1.37 │ffff000│083cd144: ↓ tbnz   w22, #5, ffff0000083cd1a4 <securit▒
             │ffff000│083cd148:   mov    w19, #0x20000                   //▒
        1.02 │ffff000│083cd14c: ↓ tbz    w22, #2, ffff0000083cd1ac <securit▒
      ...
        0.68 │ffff000└──3cd16c: ↑ cbnz   w0, ffff0000083cd120 <security_fil▒
      
      Aarch64 Simplified, before this patch:
      
             │    ↑ cbz    40
      ...
        4.44 │   │↓ tbnz   w0, #26, ffff0000083cd190 <security_file_permiss▒
      ...
        1.37 │   │↓ tbnz   w22, #5, ffff0000083cd1a4 <security_file_permiss▒
             │   │  mov    w19, #0x20000                   // #131072
        1.02 │   │↓ tbz    w22, #2, ffff0000083cd1ac <security_file_permiss▒
      ...
        0.68 │   └──cbnz   60
      
      The cbz operand is missing, and the tbz doesn't get simplified processing
      at all because the parsing function failed to match an address.
      
      Aarch64 Simplified, After this patch applied:
      
             │    ↑ cbz    w0, 40
      ...
        4.44 │   │↓ tbnz   w0, #26, d0
      ...
        1.37 │   │↓ tbnz   w22, #5, e4
             │   │  mov    w19, #0x20000                   // #131072
        1.02 │   │↓ tbz    w22, #2, ec
      ...
        0.68 │   └──cbnz   w0, 60
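
      The parsing idea can be sketched as below, assuming the branch target
      is always the last comma-separated operand.  'branch_target' is a
      hypothetical helper and not necessarily how the patch implements it; a
      symbol name that itself contains a comma would defeat this simple
      version.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parses the hexadecimal branch target, assumed to be the last
 * comma-separated operand of 'raw'.  The offset in 'raw' at which the
 * target operand starts is stored in *target_off (0 when the target is
 * the only operand), so the caller can keep the leading operands.
 */
static unsigned long long branch_target(const char *raw, size_t *target_off)
{
	const char *comma = strrchr(raw, ',');
	const char *target = comma ? comma + 1 : raw;

	/* Skip the blanks that may follow the comma. */
	while (*target == ' ')
		target++;

	*target_off = (size_t)(target - raw);

	/* strtoull() stops at the blank before "<symbol+offset>". */
	return strtoull(target, NULL, 16);
}
```

      For "cr6,c0000000004d26a0 <int_sqrt+0x40>" this yields the address
      0xc0000000004d26a0 and an offset of 4, so "cr6," can be printed
      unchanged in front of the simplified target.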
      Originally-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reported-by: Anton Blanchard <anton@samba.org>
      Reported-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Link: http://lkml.kernel.org/r/20170601092959.f60d98912e8a1b66fd1e4c0e@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  10. 01 Jun, 2017 1 commit
    • J
      perf trace: Add mmap alias for s390 · 54265664
      Jiri Olsa committed
      The s390 architecture maps sys_mmap (nr 90) into sys_old_mmap.  For this
      reason perf trace can't find the proper syscall event to get args format
      from and displays it wrongly as 'continued'.
      
      To fix that, fill the "alias" field with "old_mmap" for trace's mmap record
      to get the correct translation.
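
      The alias fallback can be pictured with the sketch below.  The struct
      layout, the table and the helpers ('event_exists', 'resolve_event') are
      hypothetical simplifications, not perf trace's actual code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified formatting record. */
struct syscall_fmt {
	const char *name;
	const char *alias;	/* alternative event name, or NULL */
};

static const struct syscall_fmt fmts[] = {
	{ "mmap", "old_mmap" },
	{ "read", NULL },
};

/* Stand-in for "does the kernel expose a syscall event with this name?". */
static int event_exists(const char *const *events, size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!strcmp(events[i], name))
			return 1;
	return 0;
}

/* Returns the event name to use: the primary name, else the alias, else NULL. */
static const char *resolve_event(const struct syscall_fmt *fmt,
				 const char *const *events, size_t n)
{
	if (event_exists(events, n, fmt->name))
		return fmt->name;
	if (fmt->alias && event_exists(events, n, fmt->alias))
		return fmt->alias;
	return NULL;
}
```

      On a system that only exposes "old_mmap", the mmap record resolves to
      the alias, so the argument format can still be found.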
      
      Before:
           0.042 ( 0.011 ms): vest/43052 fstat(statbuf: 0x3ffff89fd90                ) = 0
           0.042 ( 0.028 ms): vest/43052  ... [continued]: mmap()) = 0x3fffd6e2000
           0.072 ( 0.025 ms): vest/43052 read(buf: 0x3fffd6e2000, count: 4096        ) = 6
      
      After:
           0.045 ( 0.011 ms): fstat(statbuf: 0x3ffff8a0930                           ) = 0
           0.057 ( 0.018 ms): mmap(arg: 0x3ffff8a0858                                ) = 0x3fffd14a000
           0.076 ( 0.025 ms): read(buf: 0x3fffd14a000, count: 4096                   ) = 6
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170531113557.19175-1-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  11. 27 May, 2017 2 commits
  12. 26 May, 2017 1 commit
  13. 24 May, 2017 5 commits
    • I
      tools/include: Sync kernel ABI headers with tooling headers · 6e30437b
      Ingo Molnar committed
      Sync (copy) the following v4.12 kernel headers to the tooling headers:
      
        arch/x86/include/asm/disabled-features.h:
        arch/x86/include/uapi/asm/kvm.h:
        arch/powerpc/include/uapi/asm/kvm.h:
        arch/s390/include/uapi/asm/kvm.h:
        arch/arm/include/uapi/asm/kvm.h:
        arch/arm64/include/uapi/asm/kvm.h:
      
         - 'struct kvm_sync_regs' got changed in an ABI-incompatible way,
           fortunately none of the (in-kernel) tooling relied on it
      
         - new KVM_DEV calls added
      
        arch/x86/include/asm/required-features.h:
      
         - 5-level paging hardware ABI detail added
      
        arch/x86/include/asm/cpufeatures.h:
      
         - new CPU feature added
      
        arch/x86/include/uapi/asm/vmx.h:
      
         - new VMX exit conditions
      
      None of the changes requires fixes in the tooling source code.
      
      This addresses the following warnings:
      
        Warning: include/uapi/linux/stat.h differs from kernel
        Warning: arch/x86/include/asm/disabled-features.h differs from kernel
        Warning: arch/x86/include/asm/required-features.h differs from kernel
        Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
        Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel
        Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel
        Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
        Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel
        Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel
        Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel
      
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524065721.j2mlch6bgk5klgbc@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • N
      perf tools: Put caller above callee in --children mode · 7111ffff
      Namhyung Kim committed
      __hpp__sort_acc() sorts entries using callchain depth in order to
      put callers above callees in children mode.  But it assumed the callchain
      order was callee-first.  Now the default (for children) is caller-first, so
      the order of entries is reversed.
      
      For example, consider the following case:
      
        $ perf report --no-children
        ...
        # Overhead  Command  Shared Object        Symbol
        # ........  .......  ...................  ..........................
        #
            99.44%  a.out    a.out                [.] main
                    |
                    ---main
                       __libc_start_main
                       _start
      
      Then children mode should show '_start' above '__libc_start_main' since
      it's the caller (parent) of __libc_start_main.  But it's reversed:
      
        # Children      Self  Command  Shared Object    Symbol
        # ........  ........  .......  ...............  .....................
        #
            99.61%     0.00%  a.out    libc-2.25.so     [.] __libc_start_main
            99.61%     0.00%  a.out    a.out            [.] _start
            99.54%    99.44%  a.out    a.out            [.] main
      
      This patch fixes it.
      
        # Children      Self  Command  Shared Object    Symbol
        # ........  ........  .......  ...............  .....................
        #
            99.61%     0.00%  a.out    a.out            [.] _start
            99.61%     0.00%  a.out    libc-2.25.so     [.] __libc_start_main
            99.54%    99.44%  a.out    a.out            [.] main
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524062129.32529-8-namhyung@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • M
      perf report: Do not drop last inlined frame · 4d53b9d5
      Milian Wolff committed
      The very last inlined frame, i.e. the one furthest away from the
      non-inlined frame, was silently dropped. This is apparent when
      comparing the output of `perf script` and `addr2line`:
      
      ~~~~~
        $ perf script --inline
        ...
        a.out 26722 80836.309329:      72425 cycles:
                           21561 __hypot_finite (/usr/lib/libm-2.25.so)
                            ace3 hypot (/usr/lib/libm-2.25.so)
                             a4a main (a.out)
                                 std::abs<double>
                                 std::_Norm_helper<true>::_S_do_it<double>
                                 std::norm<double>
                                 main
                           20510 __libc_start_main (/usr/lib/libc-2.25.so)
                             bd9 _start (a.out)
      
        $ addr2line -a -f -i -e /tmp/a.out a4a | c++filt
        0x0000000000000a4a
        std::__complex_abs(doublecomplex )
        /usr/include/c++/6.3.1/complex:589
        double std::abs<double>(std::complex<double> const&)
        /usr/include/c++/6.3.1/complex:597
        double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&)
        /usr/include/c++/6.3.1/complex:654
        double std::norm<double>(std::complex<double> const&)
        /usr/include/c++/6.3.1/complex:664
        main
        /tmp/inlining.cpp:14
      ~~~~~
      
      Note how `std::__complex_abs` is missing from the `perf script`
      output. This is similarly showing up in `perf report`. The patch
      here fixes this issue, and the output becomes:
      
      ~~~~~
        a.out 26722 80836.309329:      72425 cycles:
                           21561 __hypot_finite (/usr/lib/libm-2.25.so)
                            ace3 hypot (/usr/lib/libm-2.25.so)
                             a4a main (a.out)
                                 std::__complex_abs
                                 std::abs<double>
                                 std::_Norm_helper<true>::_S_do_it<double>
                                 std::norm<double>
                                 main
                           20510 __libc_start_main (/usr/lib/libc-2.25.so)
                             bd9 _start (a.out)
      ~~~~~
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524062129.32529-7-namhyung@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • M
      perf report: Always honor callchain order for inlined nodes · 28071f51
      Milian Wolff committed
      So far, the inlined nodes were only reversed when we built perf
      against libbfd. If that was not available, the addr2line fallback
      code path was missing the inline_list__reverse call.
      
      Now we always add the nodes in the correct order within
      inline_list__append. This removes the need to reverse the list
      and also ensures that all callers construct the list in the right
      order.
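
      The "insert in the right order" approach can be sketched as follows.
      This is an illustration rather than the patch: it uses a hand-rolled
      singly linked list, assumes frames arrive innermost-first, and
      'inline_list_add' and 'inline_list_free' are hypothetical names.

```c
#include <stdbool.h>
#include <stdlib.h>

struct inline_node {
	const char *funcname;
	struct inline_node *next;
};

struct inline_list {
	struct inline_node *head, *tail;
};

/*
 * Adds 'funcname' to the list.  Frames are assumed to be passed
 * innermost-first; with 'caller_first' set each frame is prepended so the
 * outermost caller ends up at the head, otherwise frames are appended and
 * keep their callee-first order.  Returns 0 on success, -1 on allocation
 * failure.
 */
static int inline_list_add(struct inline_list *list, const char *funcname,
			   bool caller_first)
{
	struct inline_node *node = calloc(1, sizeof(*node));

	if (!node)
		return -1;
	node->funcname = funcname;

	if (!list->head) {
		list->head = list->tail = node;
	} else if (caller_first) {
		node->next = list->head;
		list->head = node;
	} else {
		list->tail->next = node;
		list->tail = node;
	}
	return 0;
}

/* Frees all nodes; the function name strings are not owned by the list. */
static void inline_list_free(struct inline_list *list)
{
	struct inline_node *node = list->head;

	while (node) {
		struct inline_node *next = node->next;

		free(node);
		node = next;
	}
	list->head = list->tail = NULL;
}
```

      Because the order is decided at insertion time, no separate reversal
      pass is needed, whichever backend produced the frames.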
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524062129.32529-6-namhyung@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • N
      perf script: Add --inline option for debugging · 325fbff5
      Namhyung Kim committed
      The --inline option shows inlined functions in callchains.
      
      For example:
      
        $ perf script
        a.out  5644 11611.467597:     309961 cycles:u:
                           790 main (/home/namhyung/tmp/perf/a.out)
                         20511 __libc_start_main (/usr/lib/libc-2.25.so)
                           8ba _start (/home/namhyung/tmp/perf/a.out)
        ...
      
        $ perf script --inline
        a.out  5644 11611.467597:     309961 cycles:u:
                           790 main (/home/namhyung/tmp/perf/a.out)
                               std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()
                               std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
                               std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
                               main
                         20511 __libc_start_main (/usr/lib/libc-2.25.so)
                           8ba _start (/home/namhyung/tmp/perf/a.out)
        ...
      Reviewed-and-tested-by: Milian Wolff <milian.wolff@kdab.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524062129.32529-5-namhyung@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>