1. 13 Feb, 2016 6 commits
  2. 12 Feb, 2016 4 commits
    • perf hists: Do column alignment on the format iterator · 89fee709
      Arnaldo Carvalho de Melo committed
      We were doing column alignment in the format function for each cell,
      returning a string padded with spaces so that when the next column is
      printed the cursor is at its column alignment.
      
      This ends up needlessly printing trailing spaces. Do it in the format
      iterator instead, which is where we know whether padding is needed,
      i.e. whether there are more columns to be printed.

      This eliminates the need for trimming lines when doing a dump using 'P'
      in the TUI browser and also produces far saner results with things like
      piping 'perf report' to 'less'.

      Right now only the formatters for sym->name and the 'locked' column
      (perf mem report) are converted, as those are the ones that end up at
      the end of lines in the default 'perf report', 'perf top' and 'perf mem
      report' tools. The others will be done in a subsequent patch.

      In the end, the 'width' parameter for the formatters now means, in
      'printf' terms, the 'precision', where before it was the field 'width'.
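
The width-versus-precision distinction can be pictured with a small Python sketch, since Python's `%` operator follows the printf rules for strings. The names `format_cell` and `format_row` are hypothetical and are not perf functions. The sketch assumes each formatter only truncates to `width` (printf precision) and that the row iterator adds padding only when another column follows.

```python
def format_cell(value, width):
    # 'width' is used as a printf precision: truncate, never pad.
    return "%.*s" % (width, value)


def format_row(cells, widths, sep=" "):
    # The iterator knows whether more columns follow, so it pads every
    # cell except the last one and the line carries no trailing spaces.
    out = []
    last = len(cells) - 1
    for i, (value, width) in enumerate(zip(cells, widths)):
        text = format_cell(value, width)
        if i != last:
            text = "%-*s" % (width, text)  # field width: pad on the right
        out.append(text)
    return sep.join(out)


row = format_row(["100.00%", "cat", "[k] get_mymodule_val"], [8, 7, 30])
print(repr(row))
```

With this split, the last column is never padded, so nothing has to be trimmed afterwards.
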
      Reported-by: Dave Jones <davej@codemonkey.org.uk>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/n/tip-s7iwl2gj23w92l6tibnrcqzr@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add comment explaining the repsep_snprintf function · 37d9bb58
      Arnaldo Carvalho de Melo committed
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-4j67nvlfwbnkg85b969ewnkr@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf symbols: Fix symbols searching for module in buildid-cache · e7ee4047
      Wang Nan committed
      Before this patch, if a sample is triggered inside a module not in
      /lib/modules/`uname -r`/, even if the module is in buildid-cache, 'perf
      report' will still be unable to find the correct symbol.  For example:
      
        # rm -rf ~/.debug/
        # perf buildid-cache -a ./mymodule.ko
        # perf probe -m ./mymodule.ko -a get_mymodule_val
        Added new event:
          probe:get_mymodule_val (on get_mymodule_val in mymodule)
      
        You can now use it in all perf tools, such as:
      
       	perf record -e probe:get_mymodule_val -aR sleep 1
      
        # perf record -e probe:get_mymodule_val cat /proc/mymodule
        mymodule:3
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]
      
        # perf report --stdio
        [SNIP]
        #
        # Overhead  Command  Shared Object     Symbol
        # ........  .......  ................  ......................
        #
          100.00%  cat      [mymodule]        [k] 0x0000000000000001
      
        # perf report -vvvv --stdio
        dso__load_sym: adjusting symbol: st_value: 0 sh_addr: 0 sh_offset: 0x70
        symbol__new: get_mymodule_val 0x70-0x8a
        [SNIP]
      
      This is caused by dso__load() -> dso__load_sym(). In dso__load(), kmod
      is true only when the file is found in some well-known directories. All
      files loaded from buildid-cache are treated as user programs, so the
      subsequent dso__load_sym() sets map->pgoff incorrectly.

      This patch gives kernel modules in buildid-cache a chance to adjust the
      value of kmod. After dso__load() gets the type of symbols, if it is
      buildid, check the last 3 chars of the original filename against '.ko',
      and adjust the value of kmod if the file is a kernel module.
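
The suffix check described above can be sketched in Python as follows. This is an illustration only: `is_kernel_module`, `adjust_kmod` and the `"buildid"` type string are hypothetical names, not the identifiers used in perf's C code.

```python
def is_kernel_module(original_filename):
    # Compare the last 3 characters against '.ko', as the commit describes.
    return original_filename[-3:] == ".ko"


def adjust_kmod(kmod, symtab_type, original_filename):
    # Only DSOs whose symbols come from the buildid-cache get a second
    # chance; every other case keeps the value computed earlier.
    if symtab_type == "buildid" and is_kernel_module(original_filename):
        return True
    return kmod


print(adjust_kmod(False, "buildid", "./mymodule.ko"))
```
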
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Cody P Schafer <dev@codyps.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kirill Smelkov <kirr@nexedi.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1454680939-24963-3-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf config: Add '--system' and '--user' options to select which config file is used · c7ac2417
      Taeung Song committed
      The '--system' option means $(sysconfdir)/perfconfig and '--user' means
      $HOME/.perfconfig. If neither is used, both the system and user config
      files are read.  E.g.:

          # perf config [<file-option>] [options]

          With a specific config file:

          # perf config --user | --system

          or both user and system config files:

          # perf config
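
The file selection described above can be sketched in Python. The function `config_files` is hypothetical, `/etc` as the value of `$(sysconfdir)` is an assumption, and so are the system-before-user order and the rejection of both flags at once. None of this is taken from the perf sources.

```python
def config_files(use_system=False, use_user=False,
                 sysconfdir="/etc", home="/home/user"):
    # Assumption: the two options are mutually exclusive.
    if use_system and use_user:
        raise ValueError("only one config file option at a time")
    system_path = f"{sysconfdir}/perfconfig"
    user_path = f"{home}/.perfconfig"
    if use_system:
        return [system_path]
    if use_user:
        return [user_path]
    # Neither option given: both files are read.
    return [system_path, user_path]


print(config_files())
print(config_files(use_user=True))
```
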
      Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1455126685-32367-2-git-send-email-treeze.taeung@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 05 Feb, 2016 4 commits
    • perf jit: add source line info support · 598b7c69
      Stephane Eranian committed
      This patch adds source line information support to perf for jitted code.
      
      The source line info must be emitted by the runtime, such as JVMTI.
      
      perf inject extracts the source line info from the jitdump file and adds
      the corresponding .debug_line section to the ELF image generated for
      each jitted function.

      The source line info enables matching any address in the profile with a
      source file and line number.
      
      The improvement is visible in perf annotate with the source code
      displayed alongside the assembly code.
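
Matching an address to a source line can be pictured with a sorted line table and a binary search. The Python sketch below is only an illustration: the table contents, the file name `Foo.java` and the function `lookup` are made up, and perf itself goes through DWARF line information rather than a table like this.

```python
import bisect

# Hypothetical line table: (start address, source file, line), sorted by address.
LINE_TABLE = [
    (0x1000, "Foo.java", 10),
    (0x1010, "Foo.java", 11),
    (0x1024, "Foo.java", 14),
]


def lookup(addr, table=LINE_TABLE):
    # Find the last row whose start address is <= addr.
    starts = [row[0] for row in table]
    idx = bisect.bisect_right(starts, addr) - 1
    if idx < 0:
        return None  # address lies before the first known row
    _, filename, line = table[idx]
    return filename, line


print(lookup(0x1012))
```
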
      
      The dwarf code leverages the support from OProfile which is also
      released under GPLv2.  Copyright 2007 OProfile authors.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Carl Love <cel@us.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John McCutchan <johnmccutchan@google.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sonny Rao <sonnyrao@chromium.org>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1448874143-7269-5-git-send-email-eranian@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf inject: Add jitdump mmap injection support · 9b07e27f
      Stephane Eranian committed
      This patch adds a --jit/-j option to perf inject.
      
      This option injects MMAP records into the perf.data file to cover the
      jitted code mmaps. It also emits ELF images for each function in the
      jitdump file.  Those images are created where the jitdump file is.  The
      MMAP records point to that location as well.
      
      Typical flow:
      
        $ perf record -k mono -- java -agentpath:libpjvmti.so java_class
        $ perf inject --jit -i perf.data -o perf.data.jitted
        $ perf report -i perf.data.jitted
      
      Note that jitdump.h support is not limited to Java, it works with any
      jitted environment modified to emit the jitdump file format, including
      those where code can be jitted multiple times and moved around.

      The jitdump.h format is adapted from the OProfile project.

      The genelf.c (ELF binary generation) depends on MD5 hash encoding for
      the buildid. To enable this, libssl-dev must be installed. If not, then
      genelf.c defaults to using urandom to generate the buildid, which is not
      ideal.  The Makefile auto-detects the presence of libssl-dev.
      
      This version mmaps the jitdump file to create a marker MMAP record in
      the perf.data file. The marker is used to detect jitdump and cause perf
      inject to inject the jitted mmaps and generate ELF images for jitted
      functions.
      
      In V8, the following fixes and changes were made among other things:
      
        - The jitdump header format includes a new flags field to be used
          to carry information about the configuration of the runtime agent.
          Contributed by: Adrian Hunter <adrian.hunter@intel.com>
      
        - Fix mmap pgoff: MMAP event pgoff must be the offset within the ELF file
          at which the code resides.
          Contributed by: Adrian Hunter <adrian.hunter@intel.com>
      
        - Fix ELF virtual addresses: perf tools expect the ELF virtual addresses of dynamic
          objects to match the file offset.
          Contributed by: Adrian Hunter <adrian.hunter@intel.com>
      
        - JIT MMAP injection does not obey finished_round semantics: it injects all
          MMAP events in one go, so drop the finished_round events from the output
          perf.data file.
          Contributed by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Carl Love <cel@us.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John McCutchan <johnmccutchan@google.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sonny Rao <sonnyrao@chromium.org>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1448874143-7269-3-git-send-email-eranian@google.com
      [ Moved inject.build_ids ordering bits to a separate patch, fixed the NO_LIBELF=1 build ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf symbols: add Java demangling support · e9c4bcdd
      Stephane Eranian committed
      Add Java function descriptor demangling support, something bfd cannot
      do.
      
      Use the JAVA_DEMANGLE_NORET flag to avoid decoding the return type of
      functions.
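
To make "descriptor demangling" concrete, here is a minimal Python sketch that decodes a JVM method descriptor such as `([B[B)Z`. It is not perf's implementation: `parse_type` and `demangle_method` are hypothetical names, and the `no_ret` argument only mirrors the intent of JAVA_DEMANGLE_NORET, which is to leave out the return type.

```python
BASE_TYPES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def parse_type(desc, pos):
    """Decode one JVM type starting at desc[pos]; return (name, next_pos)."""
    dims = 0
    while desc[pos] == "[":  # each '[' adds one array dimension
        dims += 1
        pos += 1
    if desc[pos] == "L":  # object type: Lpkg/Class;
        end = desc.index(";", pos)
        name = desc[pos + 1:end].replace("/", ".")
        pos = end + 1
    else:  # primitive type, single letter
        name = BASE_TYPES[desc[pos]]
        pos += 1
    return name + "[]" * dims, pos


def demangle_method(name, desc, no_ret=False):
    """Render a method name plus descriptor as readable text."""
    if not desc.startswith("("):
        raise ValueError("not a method descriptor: %r" % desc)
    pos = 1
    params = []
    while desc[pos] != ")":
        type_name, pos = parse_type(desc, pos)
        params.append(type_name)
    ret, _ = parse_type(desc, pos + 1)
    text = "%s(%s)" % (name, ", ".join(params))
    return text if no_ret else "%s %s" % (ret, text)


print(demangle_method("equals", "([B[B)Z"))
print(demangle_method("indexOf", "(Ljava/lang/String;I)I", no_ret=True))
```
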
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Carl Love <cel@us.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John McCutchan <johnmccutchan@google.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sonny Rao <sonnyrao@chromium.org>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1448874143-7269-2-git-send-email-eranian@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: handle spaces in file names obtained from /proc/pid/maps · 89fee59b
      Marcin Ślusarz committed
      Steam frequently puts game binaries in folders with spaces.
      
      Note: "(deleted)" markers are now treated as part of the file name.
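
A short Python sketch shows why the path has to be taken as "everything after the fifth field". It is not the sscanf-based code in perf, and `parse_maps_line` and the sample line are made up for illustration. As in the note above, a trailing "(deleted)" stays part of the file name.

```python
def parse_maps_line(line):
    # Fields: address range, perms, offset, dev, inode, then the path.
    # The path may contain spaces, so split at most five times and keep
    # the remainder intact.
    parts = line.rstrip("\n").split(None, 5)
    addr, perms, offset, dev, inode = parts[:5]
    path = parts[5] if len(parts) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return {
        "start": start,
        "end": end,
        "perms": perms,
        "offset": int(offset, 16),
        "dev": dev,
        "inode": int(inode),
        "path": path,
    }


sample = "00400000-0040c000 r-xp 00000000 08:01 1234 /home/u/Steam Games/game bin (deleted)\n"
print(parse_maps_line(sample)["path"])
```
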
      Signed-off-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Fixes: 60648033 ("perf tools: Use sscanf for parsing /proc/pid/maps")
      Link: http://lkml.kernel.org/r/20160119190303.GA17579@marcin-Inspiron-7720
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 04 Feb, 2016 1 commit
    • perf stat: Fix interval output values · 51fd2df1
      Jiri Olsa committed
      We broke interval data displays with commit:
      
        3f416f22 ("perf stat: Do not clean event's private stats")
      
      This commit removed stats cleaning, which is important for the '-r'
      option to carry counter data over the whole run. But it's necessary to
      clean it for interval mode, otherwise the displayed value is the average
      of all previous values.
      
      Before:
        $ perf stat -e cycles -a -I 1000 record
        #           time             counts unit events
             1.000240796         75,216,287      cycles
             2.000512791        107,823,524      cycles
      
        $ perf stat report
        #           time             counts unit events
             1.000240796         75,216,287      cycles
             2.000512791         91,519,906      cycles
      
      Now:
        $ perf stat report
        #           time             counts unit events
             1.000240796         75,216,287      cycles
             2.000512791        107,823,524      cycles
      
      Notice the second value being bigger (91,.. < 107,..).
      
      This could be easily verified by using perf script which displays raw
      stat data:
      
        $ perf script
        CPU  THREAD       VAL         ENA         RUN        TIME EVENT
          0      -1  23855779  1000209530  1000209530  1000240796 cycles
          1      -1  33340397  1000224964  1000224964  1000240796 cycles
          2      -1  15835415  1000226695  1000226695  1000240796 cycles
          3      -1   2184696  1000228245  1000228245  1000240796 cycles
          0      -1  97014312  2000514533  2000514533  2000512791 cycles
          1      -1  46121497  2000543795  2000543795  2000512791 cycles
          2      -1  32269530  2000543566  2000543566  2000512791 cycles
          3      -1   7634472  2000544108  2000544108  2000512791 cycles
      
      The sum of the first 4 values is the first interval aggregated value:
      
        23855779 + 33340397 + 15835415 + 2184696 = 75,216,287
      
      The sum of the second 4 values minus first value is the second interval
      aggregated value:
      
        97014312 + 46121497 + 32269530 + 7634472 - 75216287 = 107,823,524
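
The arithmetic above can be re-checked with a few lines of Python that use only the raw values from the 'perf script' output. The last line also shows that averaging the two interval values gives the 91,519,906 printed before the fix.

```python
first = [23855779, 33340397, 15835415, 2184696]
second = [97014312, 46121497, 32269530, 7634472]

interval1 = sum(first)               # first interval aggregate
interval2 = sum(second) - interval1  # second interval aggregate
stale_avg = (interval1 + interval2) / 2  # what the broken code displayed

print(interval1, interval2, round(stale_avg))
```
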
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1454485436-20639-1-git-send-email-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 03 Feb, 2016 22 commits
  6. 02 Feb, 2016 3 commits
    • perf tools: Fix thread lifetime related segfault in intel_pt · 3a4acda1
      Adrian Hunter committed
      intel_pt_process_auxtrace_info() creates a pt->unknown_thread thread
      that eventually needs to be freed by the last thread__put() on it, when
      its refcount hits zero. This may happen in the
      intel_pt_process_auxtrace_info() error handling path, which triggers the
      following segfault. The same would happen in intel_pt_free, when tools
      using this intel_pt codebase free up resources:
      
        # perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
        0  a  anaconda-ks.cfg  bin   perf.data	perf.data.old  perf-f23-bringup.todo
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.217 MB perf.data ]
        #
        # perf script -F event,comm,pid,tid,time,addr,ip,sym,dso,iregs
        Samples for 'instructions:u' event do not have IREGS attribute set. Cannot print 'iregs' field.
        intel_pt_synth_events: failed to synthesize 'instructions' event type
        Segmentation fault (core dumped)
        #
      
      The problem is: there's a union in 'struct thread' that combines a
      list_head and an rb_node. The standard life cycle of a thread is: init
      rb_node in the constructor, insert it into the machine->threads rbtree
      using rb_node, move it to machine->dead_threads using list_head, clean
      up in the last thread__put(): list_del_init(&thread->node).

      With the above command, the thread is cleaned up before it is added to
      the list, which causes the above segfault.
      
      Since pt->unknown_thread will never live in an rbtree, initialize its
      list node so that when list_del_init() is done on it we don't segfault.
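
Why initializing the list node helps can be shown with a Python analogy of the kernel-style circular list. This is a sketch under simplifying assumptions: the `ListHead` class stands in for `struct list_head`, `None` links stand in for the stale union contents, and the `AttributeError` plays the role of the segfault.

```python
class ListHead:
    def __init__(self):
        # Stand-in for a node whose list links were never set up.
        self.next = None
        self.prev = None


def init_list_head(node):
    # Like INIT_LIST_HEAD(): an empty node links to itself.
    node.next = node
    node.prev = node


def list_del_init(node):
    # Unlink the node from its neighbours, then re-initialize it.
    node.next.prev = node.prev
    node.prev.next = node.next
    init_list_head(node)


node = ListHead()
init_list_head(node)   # the step the fix adds for pt->unknown_thread
list_del_init(node)    # safe: the node only unlinks from itself
print(node.next is node and node.prev is node)
```

Calling `list_del_init()` on a node that skipped `init_list_head()` fails in this sketch, which is the situation the patch avoids.
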
      
      After this patch:
      
        # perf script -F event,comm,pid,tid,time,addr,ip,sym,dso,iregs
        Samples for 'instructions:u' event do not have IREGS attribute set. Cannot print 'iregs' field.
        intel_pt_synth_events: failed to synthesize 'instructions' event type
        0x248 [0x88]: failed to process type: 70
        #
      Reported-by: Tong Zhang <ztong@vt.edu>
      Reported-by: Wang Nan <wangnan0@huawei.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Link: http://lkml.kernel.org/r/1454296865-19749-1-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf hists: Update hists' total period when adding entries · 0f58474e
      Namhyung Kim committed
      Currently the hist entry addition path doesn't update the total_period
      of hists; it is calculated during the 'resort' path.  But the resort
      path needs to know the total period before doing its job because it's
      used for calculating the percent limit of callchains in hist entries.

      So this patch updates the total period during the addition path.  It
      makes the percent limit of callchains work (again).
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1453909257-26015-3-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf hists: Fix min callchain hits calculation · 744070e0
      Namhyung Kim committed
      The total period should be obtained using hists__total_period() since it
      takes filtered entries into account.  In addition, if callchain mode is
      'fractal', the total period should be the entry's period.
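
The rule can be sketched in Python as below. The function name, its parameters and the `period * min_percent / 100` form of the threshold are assumptions made for illustration; the commit message only states which period is used as the base.

```python
def min_callchain_hits(total_period, entry_period, min_percent, mode):
    # In 'fractal' mode the base is the entry's own period; otherwise it
    # is the hists total period (which accounts for filtered entries).
    base = entry_period if mode == "fractal" else total_period
    return base * min_percent / 100.0


print(min_callchain_hits(1000, 200, 0.5, "graph"))
print(min_callchain_hits(1000, 200, 0.5, "fractal"))
```
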
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1453909257-26015-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>