1. 27 Mar, 2017 8 commits
  2. 25 Mar, 2017 3 commits
  3. 23 Mar, 2017 7 commits
  4. 22 Mar, 2017 8 commits
  5. 21 Mar, 2017 7 commits
  6. 17 Mar, 2017 4 commits
  7. 16 Mar, 2017 3 commits
    • A
      perf script: Add 'brstackinsn' for branch stacks · 48d02a1d
      Authored by Andi Kleen
      Implement printing instruction sequences as hex dump for branch stacks.
      
      This relies on the x86 instruction decoder used by the PT decoder to
      find the lengths of instructions to dump them individually.
      
      This is good enough for pattern matching.
      
      This allows studying hot paths for individual samples, together with
      branch misprediction and cycle count / IPC information if available (on
      Skylake systems).
      
        % perf record -b ...
        % perf script -F brstackinsn
        ...
          read_hpet+67:
                ffffffff9905b843        insn: 74 ea                     # PRED
                ffffffff9905b82f        insn: 85 c9
                ffffffff9905b831        insn: 74 12
                ffffffff9905b833        insn: f3 90
                ffffffff9905b835        insn: 48 8b 0f
                ffffffff9905b838        insn: 48 89 ca
                ffffffff9905b83b        insn: 48 c1 ea 20
                ffffffff9905b83f        insn: 39 f2
                ffffffff9905b841        insn: 89 d0
                ffffffff9905b843        insn: 74 ea                     # PRED
      
      Only works when no special branch filters are specified.
      
      Occasionally the path does not reach up to the sample IP, as the LBRs
      may be frozen before executing a final jump. In this case we print a
      special message.
      
      The instruction dumper piggybacks on the existing infrastructure from
      the Intel PT decoder.
      
      An earlier iteration of this patch relied on a disassembler, but this
      version only uses the existing instruction decoder.
      
      Committer note:
      
      Added hint about how to get suitable perf.data files for use with
      '-F brstackinsn':
      
        $ perf record usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
        $
        $ perf script -F brstackinsn
        Display of branch stack assembler requested, but non all-branch filter set
        Hint: run 'perf record -b ...'
        $
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
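
The commit message notes that the hex dump is good enough for pattern matching. As a minimal sketch of what that could look like, the following Python helper parses instruction lines in the format shown in the sample output above (address, `insn:`, hex bytes, optional `#` annotation). The names `InsnLine` and `parse_insn_line` are hypothetical and are not part of perf; the line format is assumed from the example in this commit message only.

```python
import re
from typing import NamedTuple, Optional


class InsnLine(NamedTuple):
    """One decoded line of 'perf script -F brstackinsn' output (hypothetical type)."""
    addr: int            # instruction address
    raw: bytes           # raw instruction bytes
    note: Optional[str]  # trailing annotation such as 'PRED', if any


# Assumed layout: "<hex address>  insn: <hex bytes...>  [# annotation]"
_INSN_RE = re.compile(r"\s*([0-9a-f]+)\s+insn:\s+((?:[0-9a-f]{2}\s*)+)$")


def parse_insn_line(line: str) -> Optional[InsnLine]:
    """Return an InsnLine, or None for lines that are not instruction lines."""
    body, _, note = line.partition("#")
    match = _INSN_RE.match(body)
    if match is None:
        return None  # e.g. a symbol header such as "read_hpet+67:"
    raw = bytes.fromhex("".join(match.group(2).split()))
    return InsnLine(int(match.group(1), 16), raw, note.strip() or None)


if __name__ == "__main__":
    sample = "      ffffffff9905b843        insn: 74 ea                     # PRED"
    print(parse_insn_line(sample))
```

With lines parsed this way, a script could, for example, search the `raw` bytes of each sample for a particular opcode sequence.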
    • S
      perf tools: Make perf_event__synthesize_mmap_events() scale · 88b897a3
      Authored by Stephane Eranian
      This patch significantly improves the execution time of
      perf_event__synthesize_mmap_events() when running perf record on systems
      where processes have lots of threads.
      
      It just happens that reading /proc/pid/maps uses an O(N^2) algorithm to
      generate each map line in the maps file.  If you have 1000 threads, then you
      necessarily have 1000 stacks.  For each vma, you need to check if it
      corresponds to a thread's stack.  With a large number of threads, this can take
      a very long time. I have seen latencies >> 10 minutes.
      
      As of today, perf does not use the fact that a mapping is a stack, therefore we
      can work around the issue by using /proc/pid/task/pid/maps.  This entry does
      not try to map a vma to stack and is thus much faster with no loss of
      functionality.
      
      The proc-map-timeout logic is kept in case users still want some upper limit.
      
      In V2, we fix the file path from /proc/pid/tasks/pid/maps to the actual
      /proc/pid/task/pid/maps (tasks -> task).  Thanks to Arnaldo for catching this.
      
      Committer note:
      
      This problem seems to have been eliminated in the kernel since commit
      b18cb64e ("fs/proc: Stop trying to report thread stacks").
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170315135059.GC2177@redhat.com
      Link: http://lkml.kernel.org/r/1489598233-25586-1-git-send-email-eranian@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
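
To illustrate the work-around described in this commit, here is a minimal Python sketch that builds the per-task maps path and parses one line of the standard maps format (address range, permissions, offset, device, inode, optional pathname). It is not perf's C implementation; `task_maps_path` and `parse_maps_line` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class MapEntry(NamedTuple):
    """One mapping from a maps file (hypothetical type for illustration)."""
    start: int
    end: int
    perms: str
    offset: int
    path: str


def task_maps_path(pid: int) -> str:
    """Per-task maps file named in the commit message: /proc/pid/task/pid/maps."""
    return f"/proc/{pid}/task/{pid}/maps"


def parse_maps_line(line: str) -> MapEntry:
    """Parse 'start-end perms offset dev inode [pathname]' into a MapEntry."""
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    # Anonymous mappings have no pathname field.
    path = fields[5].strip() if len(fields) > 5 else ""
    return MapEntry(start, end, fields[1], int(fields[2], 16), path)


if __name__ == "__main__":
    import os

    # Linux only: dump the executable mappings of the current process.
    with open(task_maps_path(os.getpid())) as maps:
        for entry in map(parse_maps_line, maps):
            if "x" in entry.perms:
                print(hex(entry.start), hex(entry.end), entry.path)
```

The parsing is identical for /proc/pid/maps and the per-task file; only the path changes, which is why the switch costs no functionality for a consumer that ignores stack annotations.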
    • R
      perf probe: Introduce util func is_sdt_event() · af9100ad
      Authored by Ravi Bangoria
      Factor out the SDT event name checking routine as is_sdt_event().
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170314150658.7065-2-ravi.bangoria@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>