1. 16 Mar, 2017 3 commits
    • perf script: Add 'brstackinsn' for branch stacks · 48d02a1d
      Andi Kleen authored
      Implement printing instruction sequences as hex dump for branch stacks.
      
      This relies on the x86 instruction decoder used by the PT decoder to
      find the lengths of instructions to dump them individually.
      
      This is good enough for pattern matching.
      
      This allows studying hot paths for individual samples, together with
      branch misprediction and cycle count / IPC information if available (on
      Skylake systems).
      
        % perf record -b ...
        % perf script -F brstackinsn
        ...
          read_hpet+67:
                ffffffff9905b843        insn: 74 ea                     # PRED
                ffffffff9905b82f        insn: 85 c9
                ffffffff9905b831        insn: 74 12
                ffffffff9905b833        insn: f3 90
                ffffffff9905b835        insn: 48 8b 0f
                ffffffff9905b838        insn: 48 89 ca
                ffffffff9905b83b        insn: 48 c1 ea 20
                ffffffff9905b83f        insn: 39 f2
                ffffffff9905b841        insn: 89 d0
                ffffffff9905b843        insn: 74 ea                     # PRED
      
      Only works when no special branch filters are specified.
      
      Occasionally the path does not reach up to the sample IP, as the LBRs
      may be frozen before executing a final jump. In this case we print a
      special message.
      
      The instruction dumper piggybacks on the existing infrastructure from
      the Intel PT decoder.
      
      An earlier iteration of this patch relied on a disassembler, but this
      version only uses the existing instruction decoder.
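
Since the dump is meant for pattern matching, a minimal sketch of consuming it might look like the following. It assumes only the line layout shown in the sample output above (address, `insn:`, hex bytes, optional `#` annotation); the helper names `parse_insn_line`, `parse_dump` and `find_opcode` are hypothetical and not part of perf.

```python
def parse_insn_line(line):
    """Return (address, opcode bytes, annotation), or None for non-insn lines."""
    body, _, note = line.partition("#")
    fields = body.split()
    if len(fields) < 3 or fields[1] != "insn:":
        return None
    try:
        addr = int(fields[0], 16)
        code = bytes(int(b, 16) for b in fields[2:])
    except ValueError:
        return None
    return addr, code, note.strip()


def parse_dump(text):
    """Yield one parsed tuple per instruction line, skipping everything else."""
    for line in text.splitlines():
        parsed = parse_insn_line(line)
        if parsed is not None:
            yield parsed


def find_opcode(text, opcode):
    """Return the addresses whose instruction bytes equal `opcode` exactly."""
    return [addr for addr, code, _ in parse_dump(text) if code == opcode]


# Excerpt copied from the sample output in the commit message.
SAMPLE = """\
  read_hpet+67:
        ffffffff9905b843        insn: 74 ea                     # PRED
        ffffffff9905b82f        insn: 85 c9
        ffffffff9905b831        insn: 74 12
        ffffffff9905b833        insn: f3 90
"""

# f3 90 is the x86 PAUSE instruction, typical for a spin-wait loop.
print([hex(a) for a in find_opcode(SAMPLE, bytes.fromhex("f390"))])
```

Matching on raw bytes keeps the sketch independent of any disassembler, which is in the spirit of the hex-dump output described above.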
      
      Committer note:
      
      Added hint about how to get suitable perf.data files for use with
      '-F brstackinsn':
      
        $ perf record usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
        $
        $ perf script -F brstackinsn
        Display of branch stack assembler requested, but non all-branch filter set
        Hint: run 'perf record -b ...'
        $
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Make perf_event__synthesize_mmap_events() scale · 88b897a3
      Stephane Eranian authored
      This patch significantly improves the execution time of
      perf_event__synthesize_mmap_events() when running perf record on systems
      where processes have lots of threads.
      
      It just happens that reading /proc/pid/maps uses an O(N^2) algorithm to
      generate each map line in the maps file.  If you have 1000 threads, then you
      necessarily have 1000 stacks.  For each vma, the kernel needs to check whether
      it corresponds to a thread's stack.  With a large number of threads, this can
      take a very long time. I have seen latencies well over 10 minutes.
      
      As of today, perf does not use the fact that a mapping is a stack, therefore we
      can work around the issue by using /proc/pid/task/pid/maps.  This entry does
      not try to map a vma to a stack and is thus much faster with no loss of
      functionality.
      
      The proc-map-timeout logic is kept in case users still want some upper limit.
      
      In V2, we fix the file path from /proc/pid/tasks/pid/maps to the actual
      /proc/pid/task/pid/maps, tasks -> task.  Thanks Arnaldo for catching this.
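
A minimal sketch of the workaround, assuming the standard `/proc/<pid>/maps` line layout (address range, permissions, offset, device, inode, optional pathname). The function names and the sample line are hypothetical and only illustrate the path choice; perf's own implementation is in C.

```python
def task_maps_path(pid):
    """Per-task maps file for the main thread, as described in the commit."""
    return f"/proc/{pid}/task/{pid}/maps"


def parse_maps_line(line):
    """Split one maps line into its fields; pathname may be empty."""
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "perms": fields[1],
        "offset": int(fields[2], 16),
        "pathname": fields[5].strip() if len(fields) > 5 else "",
    }


# Hypothetical sample line, not taken from the commit.
line = "00400000-0040b000 r-xp 00000000 08:01 1234 /usr/bin/cat"
print(task_maps_path(1234), parse_maps_line(line)["pathname"])
```

Both files share the same line format, which is why switching the path loses no information that perf uses.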
      
      Committer note:
      
      This problem seems to have been eliminated in the kernel since commit
      b18cb64e ("fs/proc: Stop trying to report thread stacks").
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170315135059.GC2177@redhat.com
      Link: http://lkml.kernel.org/r/1489598233-25586-1-git-send-email-eranian@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Introduce util func is_sdt_event() · af9100ad
      Ravi Bangoria authored
      Factor out the SDT event name checking routine as is_sdt_event().
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170314150658.7065-2-ravi.bangoria@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 15 Mar, 2017 4 commits
    • perf kretprobes: Offset from reloc_sym if kernel supports it · 7ab31d94
      Naveen N. Rao authored
      We indicate support for accepting sym+offset with kretprobes through a
      line in the ftrace README. Parse that line to detect support and choose
      the appropriate format for kprobe_events.
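
The detection step can be sketched as follows. This is only an illustration: the function names are hypothetical, the check matches the kretprobe line exactly as quoted in the example session, and the "old" README text is a made-up stand-in for a kernel that lacks that line. The two event strings are the ones shown being written to kprobe_events.

```python
def kretprobe_accepts_offset(readme_text):
    """True if the README advertises sym+offset for kretprobes."""
    for line in readme_text.splitlines():
        line = line.strip()
        if line.startswith("place (kretprobe):") and "+<offset>" in line:
            return True
    return False


def kretprobe_event(name, symbol, reloc_form, readme_text):
    """Pick the kprobe_events definition depending on README support."""
    if kretprobe_accepts_offset(readme_text):
        target = reloc_form          # e.g. _text+4469712
    else:
        target = f"{symbol}+0"       # fall back to the plain symbol
    return f"r:probe/{name} {target}"


NEW_README = "place (kretprobe): [<module>:]<symbol>[+<offset>]|<memaddr>\n"
OLD_README = "place: [<module>:]<symbol>[+<offset>]|<memaddr>\n"  # hypothetical

print(kretprobe_event("do_open", "do_open", "_text+4469712", NEW_README))
print(kretprobe_event("do_open", "do_open", "_text+4469712", OLD_README))
```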
      
      As an example, without this perf patch, but with the ftrace changes:
      
        naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/tracing/README | grep kretprobe
        place (kretprobe): [<module>:]<symbol>[+<offset>]|<memaddr>
        naveen@ubuntu:~/linux/tools/perf$
        naveen@ubuntu:~/linux/tools/perf$ sudo ./perf probe -v do_open%return
        probe-definition(0): do_open%return
        symbol:do_open file:(null) line:0 offset:0 return:1 lazy:(null)
        0 arguments
        Looking at the vmlinux_path (8 entries long)
        Using /boot/vmlinux for symbols
        Open Debuginfo file: /boot/vmlinux
        Try to find probe point from debuginfo.
        Matched function: do_open [2d0c7d8]
        Probe point found: do_open+0
        Matched function: do_open [35d76b5]
        found inline addr: 0xc0000000004ba984
        Failed to find "do_open%return",
         because do_open is an inlined function and has no return point.
        An error occurred in debuginfo analysis (-22).
        Trying to use symbols.
        Opening /sys/kernel/debug/tracing//kprobe_events write=1
        Writing event: r:probe/do_open do_open+0
        Writing event: r:probe/do_open_1 do_open+0
        Added new events:
          probe:do_open        (on do_open%return)
          probe:do_open_1      (on do_open%return)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe:do_open_1 -aR sleep 1
      
        naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/kprobes/list
        c000000000041370  k  kretprobe_trampoline+0x0    [OPTIMIZED]
        c0000000004433d0  r  do_open+0x0    [DISABLED]
        c0000000004433d0  r  do_open+0x0    [DISABLED]
      
      And after this patch (and the subsequent powerpc patch):
      
        naveen@ubuntu:~/linux/tools/perf$ sudo ./perf probe -v do_open%return
        probe-definition(0): do_open%return
        symbol:do_open file:(null) line:0 offset:0 return:1 lazy:(null)
        0 arguments
        Looking at the vmlinux_path (8 entries long)
        Using /boot/vmlinux for symbols
        Open Debuginfo file: /boot/vmlinux
        Try to find probe point from debuginfo.
        Matched function: do_open [2d0c7d8]
        Probe point found: do_open+0
        Matched function: do_open [35d76b5]
        found inline addr: 0xc0000000004ba984
        Failed to find "do_open%return",
         because do_open is an inlined function and has no return point.
        An error occurred in debuginfo analysis (-22).
        Trying to use symbols.
        Opening /sys/kernel/debug/tracing//README write=0
        Opening /sys/kernel/debug/tracing//kprobe_events write=1
        Writing event: r:probe/do_open _text+4469712
        Writing event: r:probe/do_open_1 _text+4956248
        Added new events:
          probe:do_open        (on do_open%return)
          probe:do_open_1      (on do_open%return)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe:do_open_1 -aR sleep 1
      
        naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/kprobes/list
        c000000000041370  k  kretprobe_trampoline+0x0    [OPTIMIZED]
        c0000000004433d0  r  do_open+0x0    [DISABLED]
        c0000000004ba058  r  do_open+0x8    [DISABLED]
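
The `_text+<offset>` values in the second run line up with the addresses in the kprobes list if `_text` sits at 0xc000000000000000. That base address is an inference from the numbers above, not something the commit states, and `reloc_sym_probe` is a hypothetical helper:

```python
# Assumed address of _text, inferred from the example output.
TEXT_BASE = 0xC000000000000000


def reloc_sym_probe(addr, reloc_base=TEXT_BASE, reloc_sym="_text"):
    """Express an absolute address as reloc_sym+decimal_offset."""
    return f"{reloc_sym}+{addr - reloc_base}"


print(reloc_sym_probe(0xC0000000004433D0))  # _text+4469712
print(reloc_sym_probe(0xC0000000004BA058))  # _text+4956248
```

Offsetting from the relocation symbol lets the two do_open instances resolve to distinct addresses, which the plain `do_open+0` form in the first run could not do.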
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/496ef9f33c1ab16286ece9dd62aa672807aef91c.1488961018.git.naveen.n.rao@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Factor out the ftrace README scanning · 3da3ea7a
      Naveen N. Rao authored
      Simplify and separate out the ftrace README scanning logic into a
      separate helper. This is used subsequently to scan for all patterns of
      interest and to cache the result.
      
      Since we are only interested in availability of probe argument type x,
      we will only scan for that.
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/6dc30edc747ba82a236593be6cf3a046fa9453b5.1488961018.git.naveen.n.rao@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add 'cgroup_id' sort order keyword · d890a98c
      Hari Bathini authored
      This patch introduces a cgroup identifier entry field in perf report to
      identify or distinguish data of different cgroups. It uses the device
      number and inode number of cgroup namespace, included in perf data with
      the new PERF_RECORD_NAMESPACES event, as cgroup identifier.
      
      With the assumption that each container is created with its own cgroup
      namespace, this allows assessment/analysis of multiple containers at
      once.
      
      A simple test for this would be to clone a few processes passing
      SIGCHLD & CLONE_NEWCGROUP flags to each of them, execute a shell and run
      different workloads on each of those contexts, while running the perf
      record command with the --namespaces option.
      
      Shown below is the output of perf report, sorted with cgroup identifier,
      on perf.data generated with the above test scenario, clearly indicating
      one context's considerable use of kernel memory in comparison with
      others:
      
      	$ perf report -s cgroup_id,sample --stdio
      	#
      	# Total Lost Samples: 0
      	#
      	# Samples: 5K of event 'kmem:kmalloc'
      	# Event count (approx.): 5965
      	#
      	# Overhead  cgroup id (dev/inode)       Samples
      	# ........  .....................  ............
      	#
      	    81.27%  3/0xeffffffb                   4848
      	    16.24%  3/0xf00000d0                    969
      	     1.16%  3/0xf00000ce                     69
      	     0.82%  3/0xf00000cf                     49
      	     0.50%  0/0x0                            30
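
The grouping behind this report can be sketched as below. The sample counts are the ones in the table above; the helper names and the idea of representing each sample as a `(dev, inode)` pair are assumptions made for illustration.

```python
from collections import Counter


def cgroup_id(dev, ino):
    """Format the identifier as shown in the report: dev/0xinode."""
    return f"{dev}/{ino:#x}"


def overhead_by_cgroup(samples):
    """Return (overhead, cgroup id, sample count) rows, largest first."""
    counts = Counter(cgroup_id(dev, ino) for dev, ino in samples)
    total = sum(counts.values())
    return [(f"{100.0 * n / total:.2f}%", key, n)
            for key, n in counts.most_common()]


# One (dev, inode) pair per sample, using the counts from the report.
samples = ([(3, 0xEFFFFFFB)] * 4848 + [(3, 0xF00000D0)] * 969 +
           [(3, 0xF00000CE)] * 69 + [(3, 0xF00000CF)] * 49 +
           [(0, 0x0)] * 30)

for row in overhead_by_cgroup(samples):
    print(*row)
```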
      
      While this is a start, there is scope for improving this further. For
      example, instead of the cgroup namespace's device and inode numbers, the
      device and inode numbers of some or all namespaces may be used to
      distinguish which processes are running in a given container context.
      
      Also, scripts to map device and inode info to containers sound
      plausible for better tracing of containers.
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891933338.25309.756882900782042645.stgit@hbathini.in.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Synthesize namespace events for current processes · e907caf3
      Hari Bathini authored
      Synthesize PERF_RECORD_NAMESPACES events for processes that were running prior
      to invocation of perf record. The data for this is taken from /proc/$PID/ns.
      These changes make way for analyzing events with regard to namespaces.
      
      Committer notes:
      
      Check if 'tool' is NULL in perf_event__synthesize_namespaces(), as in the
      test__mmap_thread_lookup case, i.e. 'perf test "Lookup mmap thread"'.
      
      Testing it:
      
        # ps axH > /tmp/allthreads
        # perf record -a --namespaces usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.169 MB perf.data (8 samples) ]
        # perf report -D | grep PERF_RECORD_NAMESPACES | wc -l
        602
        # wc -l /tmp/allthreads
        601 /tmp/allthreads
        # tail /tmp/allthreads
        16951 pts/4    T      0:00 git rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^
        16952 pts/4    T      0:00 /bin/sh /usr/libexec/git-core/git-rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^
        17176 pts/4    T      0:00 git commit --amend --no-post-rewrite
        17204 pts/4    T      0:00 vim /home/acme/git/linux/.git/COMMIT_EDITMSG
        18939 ?        S      0:00 [kworker/2:1]
        18947 ?        S      0:00 [kworker/3:0]
        18974 ?        S      0:00 [kworker/1:0]
        19047 ?        S      0:00 [kworker/0:1]
        19152 pts/6    S+     0:00 weechat
        19153 pts/7    R+     0:00 ps axH
        # perf report -D | grep PERF_RECORD_NAMESPACES | tail
        0 0 0x125068 [0xa0]: PERF_RECORD_NAMESPACES 17176/17176 - nr_namespaces: 7
        0 0 0x1255b8 [0xa0]: PERF_RECORD_NAMESPACES 17204/17204 - nr_namespaces: 7
        0 0 0x125df0 [0xa0]: PERF_RECORD_NAMESPACES 18939/18939 - nr_namespaces: 7
        0 0 0x125f00 [0xa0]: PERF_RECORD_NAMESPACES 18947/18947 - nr_namespaces: 7
        0 0 0x126010 [0xa0]: PERF_RECORD_NAMESPACES 18974/18974 - nr_namespaces: 7
        0 0 0x126120 [0xa0]: PERF_RECORD_NAMESPACES 19047/19047 - nr_namespaces: 7
        0 0 0x126230 [0xa0]: PERF_RECORD_NAMESPACES 19152/19152 - nr_namespaces: 7
        0 0 0x129330 [0xa0]: PERF_RECORD_NAMESPACES 19154/19154 - nr_namespaces: 7
        0 0 0x12a1f8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
        0 0 0x12b0b8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
        #
      
      Hmm, investigate why we got two records for the 19155 pid/tid...
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891931111.25309.11073854609798681633.stgit@hbathini.in.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 14 Mar, 2017 1 commit
    • perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info · f3b3614a
      Hari Bathini authored
      Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
      by the kernel when fork, clone, setns or unshare are invoked. And update
      perf-record documentation with the new option to record namespace
      events.
      
      Committer notes:
      
      Combined it with a later patch to allow printing it via 'perf report -D'
      and be able to test the feature introduced in this patch. Also had to
      move perf_ns__name() here, which was introduced in another later patch.
      
      Also used PRIu64 and PRIx64 to fix the build in some environments wrt:
      
        util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
           ret  += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
                                               ^
      Testing it:
      
        # perf record --namespaces -a
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
        #
        # perf report -D
        <SNIP>
        3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
                      [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                       4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
      
        0x1151e0 [0x30]: event: 9
        .
        . ... raw event: size 48 bytes
        .  0000:  09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00  ......0..q.h....
        .  0010:  a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00  .9...9...(.c....
        .  0020:  03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00  ................
        <SNIP>
              NAMESPACES events:          1
        <SNIP>
        #
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 13 Mar, 2017 1 commit
  5. 04 Mar, 2017 16 commits
  6. 28 Feb, 2017 2 commits
  7. 20 Feb, 2017 2 commits
  8. 18 Feb, 2017 2 commits
  9. 17 Feb, 2017 6 commits
    • perf tools: Replace _SC_NPROCESSORS_CONF with max_present_cpu in cpu_topology_map · da8a58b5
      Jan Stancek authored
      There are 2 problems wrt. cpu_topology_map on systems with sparse CPUs:
      
      1. offline/absent CPUs will have their socket_id and core_id set to -1
         which triggers:
         "socket_id number is too big.You may need to upgrade the perf tool."
      
      2. size of cpu_topology_map (perf_env.cpu[]) is allocated based on
         _SC_NPROCESSORS_CONF, but can be indexed with CPU ids going above.
         Users of perf_env.cpu[] are using CPU id as index. This can lead
         to read beyond what was allocated:
         ==19991== Invalid read of size 4
         ==19991==    at 0x490CEB: check_cpu_topology (topology.c:69)
         ==19991==    by 0x490CEB: test_session_topology (topology.c:106)
         ...
      
      For example:
        _SC_NPROCESSORS_CONF == 16
        available: 2 nodes (0-1)
        node 0 cpus: 0 6 8 10 16 22 24 26
        node 0 size: 12004 MB
        node 0 free: 9470 MB
        node 1 cpus: 1 7 9 11 23 25 27
        node 1 size: 12093 MB
        node 1 free: 9406 MB
        node distances:
        node   0   1
          0:  10  20
          1:  20  10
      
      This patch changes HEADER_NRCPUS.nr_cpus_available from _SC_NPROCESSORS_CONF
      to max_present_cpu and updates any user of cpu_topology_map to iterate
      with nr_cpus_avail.
      
      As a consequence HEADER_CPU_TOPOLOGY core_id and socket_id lists get longer,
      but maintain compatibility with pre-patch state - index to cpu_topology_map is
      CPU id.
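
The indexing problem can be reproduced with the CPU numbers from the example above. This sketch assumes that the maximum present CPU count is the highest present CPU id plus one, and that socket ids follow the NUMA node numbering on this machine (consistent with the test output, but an assumption). `build_socket_map` is a hypothetical stand-in for filling `perf_env.cpu[]`.

```python
NODES = [
    [0, 6, 8, 10, 16, 22, 24, 26],  # node 0 cpus, from the example
    [1, 7, 9, 11, 23, 25, 27],      # node 1 cpus, from the example
]
NR_CONF = 16                         # _SC_NPROCESSORS_CONF in the example


def max_present_cpu(nodes):
    """Highest CPU id plus one, so every CPU id is a valid index."""
    return max(cpu for cpus in nodes for cpu in cpus) + 1


def build_socket_map(nodes, size):
    """Table indexed by CPU id; -1 marks offline/absent CPUs."""
    table = [-1] * size
    for socket_id, cpus in enumerate(nodes):
        for cpu in cpus:
            table[cpu] = socket_id  # IndexError when cpu >= size
    return table


try:
    build_socket_map(NODES, NR_CONF)
except IndexError:
    print("a table sized by _SC_NPROCESSORS_CONF is too small")

table = build_socket_map(NODES, max_present_cpu(NODES))
print(len(table), table[27])
```

Python raises on the out-of-range index, whereas the C code silently reads past the allocation, which is what valgrind reports above.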
      
        perf test 36 -v
        36: Session topology                           :
        --- start ---
        test child forked, pid 22211
        templ file: /tmp/perf-test-gmdX5i
        CPU 0, core 0, socket 0
        CPU 1, core 0, socket 1
        CPU 6, core 10, socket 0
        CPU 7, core 10, socket 1
        CPU 8, core 1, socket 0
        CPU 9, core 1, socket 1
        CPU 10, core 9, socket 0
        CPU 11, core 9, socket 1
        CPU 16, core 0, socket 0
        CPU 22, core 10, socket 0
        CPU 23, core 10, socket 1
        CPU 24, core 1, socket 0
        CPU 25, core 1, socket 1
        CPU 26, core 9, socket 0
        CPU 27, core 9, socket 1
        test child finished with 0
        ---- end ----
        Session topology: Ok
      Signed-off-by: Jan Stancek <jstancek@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/d7c05c6445fca74a8442c2c73cfffd349c52c44f.1487146877.git.jstancek@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf header: Make build_cpu_topology skip offline/absent CPUs · 43db2843
      Jan Stancek authored
      When build_cpu_topo() encounters offline/absent CPUs, it fails to find any
      sysfs entries and returns failure.
      
      This leads to build_cpu_topology() and write_cpu_topology() failing as
      well.
      
      Because HEADER_CPU_TOPOLOGY has not been written, read leaves cpu_topology_map
      NULL and we get NULL ptr deref at:
      
        ...
         cmd_test
          __cmd_test
           test_and_print
            run_test
             test_session_topology
              check_cpu_topology
      
        36: Session topology                           :
        --- start ---
        test child forked, pid 14902
        templ file: /tmp/perf-test-4CKocW
        failed to write feature HEADER_CPU_TOPOLOGY
        perf: Segmentation fault
        Obtained 9 stack frames.
        ./perf(sighandler_dump_stack+0x41) [0x5095f1]
        /lib64/libc.so.6(+0x35250) [0x7f4b7c3c9250]
        ./perf(test_session_topology+0x1db) [0x490ceb]
        ./perf() [0x475b68]
        ./perf(cmd_test+0x5b9) [0x4763c9]
        ./perf() [0x4945a3]
        ./perf(main+0x69f) [0x427e8f]
        /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f4b7c3b5b35]
        ./perf() [0x427fb9]
        test child interrupted
        ---- end ----
        Session topology: FAILED!
      
      This patch makes build_cpu_topology() skip offline/absent CPUs, by checking
      their presence against cpu_map built from online CPUs.
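
A minimal sketch of the skip logic, under stated assumptions: the online set is given as a CPU list string in the `0-1,6-11` style, the particular string below is reconstructed from the CPU ids in the related topology example rather than quoted from it, and `read_topo` is a hypothetical callback standing in for the sysfs reads.

```python
def parse_cpu_list(text):
    """Expand a list such as '0-1,6-11,16' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def build_topology(nr_cpus, online, read_topo):
    """Collect topology only for online CPUs instead of failing on gaps."""
    topo = {}
    for cpu in range(nr_cpus):
        if cpu not in online:
            continue  # offline/absent: no sysfs topology entries to read
        topo[cpu] = read_topo(cpu)
    return topo


online = parse_cpu_list("0-1,6-11,16,22-27")


def fake_read_topo(cpu):
    """Pretend sysfs lookup that fails for CPUs that are not online."""
    if cpu not in online:
        raise FileNotFoundError(f"no topology entries for cpu{cpu}")
    return {"cpu": cpu}


topo = build_topology(28, online, fake_read_topo)
print(len(topo), sorted(topo)[:4])
```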
      Signed-off-by: Jan Stancek <jstancek@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/a271b770175524f4961d4903af33798358a4a518.1487146877.git.jstancek@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cpumap: Add cpu__max_present_cpu() · 92a7e127
      Jan Stancek authored
      Similar to cpu__max_cpu() (which returns the max possible CPU), returns
      the max present CPU.
      Signed-off-by: Jan Stancek <jstancek@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/8ea4601b5cacc49927235b4ebac424bd6eeccb06.1487146877.git.jstancek@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf session: Fix DEBUG=1 build with clang · 8074bf51
      Arnaldo Carvalho de Melo authored
      The struct branch_stack->branch_stack.cycles field is a u64 :16
      bitfield, and this somehow confuses clang 4.0 when checking the
      arguments of a printf format, so cast the :16 to unsigned short to help
      it.
      
      Silences this:
      
        util/session.c:935:4: error: format specifies type 'unsigned short' but the argument has type 'u64' (aka 'unsigned long') [-Werror,-Wformat]
                                e->flags.cycles,
                                ^~~~~~~~~~~~~~~
        1 error generated.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-eo2t4uhlbne105z72tvyzkp1@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf python: Filter out -specs=/a/b/c from the python binding cc options · 4be92cf0
      Arnaldo Carvalho de Melo authored
      The -specs=/path/to/file option can be used to change what gcc puts in
      the cc, ld, etc. command lines, but clang does not support it. Filter it
      out in the setup.py file by changing python2's internal variable where it
      keeps its initial CFLAGS value.
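
The filtering itself can be sketched as a small pure function. This is an illustration only: `strip_specs` is a hypothetical name, the surrounding flags are made up, and how setup.py reaches the interpreter's stored CFLAGS value is not shown here. The `-specs=` path is the one from the error message in this commit.

```python
def strip_specs(flags):
    """Drop every -specs=<file> option from a space-separated flag string."""
    return " ".join(opt for opt in flags.split()
                    if not opt.startswith("-specs="))


cflags = "-O2 -g -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fPIC"
print(strip_specs(cflags))  # -O2 -g -fPIC
```

Splitting on whitespace is enough here because the option and its path form a single token; flags containing quoted spaces would need a shell-aware splitter.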
      
      With this, all of perf can be built on at least Fedora 25, fixing this
      problem:
      
          GEN      /tmp/build/perf/python/perf.so
          CC       /tmp/build/perf/builtin-buildid-list.o
        clang-4.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
        clang-4.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
        error: command 'clang' failed with exit status 1
      
      Now I need to change all the containers where I have clang to build
      perf with it, so that we can check that in other distros (opensuse, debian,
      ubuntu, etc) this also works.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-g9lhgr162ao8ao29vvf0hgm1@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools perf scripting python: clang doesn't have -spec, remove it · 8bd8c653
      Arnaldo Carvalho de Melo authored
      GCC has a -specs option to override which options are passed to cc, etc.
      Some distros, such as Fedora, use it, so the option ends up being passed
      to clang, which does not support it and stops the build:
      
        CC       /tmp/build/perf/util/scripting-engines/trace-event-python.o
      clang-4.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
      
      So filter this out when the compiler used is clang, this way we
      can build the python scripting support in tools/perf/.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-2gosxoiouf24pnlknp7w7q4z@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  10. 15 Feb, 2017 3 commits