1. 20 Apr, 2017 5 commits
  2. 14 Mar, 2017 1 commit
    • perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info · f3b3614a
      Hari Bathini committed
      Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
      by the kernel when fork, clone, setns or unshare are invoked. Also update
      the perf-record documentation with the new option to record namespace
      events.
      
      Committer notes:
      
      Combined it with a later patch to allow printing it via 'perf report -D'
      and be able to test the feature introduced in this patch. Also had to
      move perf_ns__name() here, which was introduced in another later patch.
      
      Also used PRIu64 and PRIx64 to fix the build in some environments wrt:
      
        util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
           ret  += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
                                               ^
      Testing it:
      
        # perf record --namespaces -a
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
        #
        # perf report -D
        <SNIP>
        3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
                      [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                       4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
      
        0x1151e0 [0x30]: event: 9
        .
        . ... raw event: size 48 bytes
        .  0000:  09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00  ......0..q.h....
        .  0010:  a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00  .9...9...(.c....
        .  0020:  03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00  ................
        <SNIP>
              NAMESPACES events:          1
        <SNIP>
        #
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 04 Mar, 2017 1 commit
  4. 15 Feb, 2017 1 commit
  5. 14 Feb, 2017 1 commit
  6. 12 Jan, 2017 1 commit
  7. 15 Nov, 2016 1 commit
    • perf report: Add branch flag to callchain cursor node · 410024db
      Jin Yao committed
      Since the branch ip has been added to the call stack for easier browsing,
      this patch adds more branch information: a flag that indicates whether
      this ip is a branch, together with the branch flag itself.
      
      We can then tell whether a cursor node represents a branch and which
      branch flag it has.
      
      The branch history code has a loop detection pass that removes loops. It
      would be useful to know how many loops were removed so that, in later
      steps, we can compute the average number of iterations.
      
      For example:
      
      Before remove_loops(),
      entry0: from = 0x100, to = 0x200
      entry1: from = 0x300, to = 0x250
      entry2: from = 0x300, to = 0x250
      entry3: from = 0x300, to = 0x250
      entry4: from = 0x700, to = 0x800
      
      After remove_loops()
      entry0: from = 0x100, to = 0x200
      entry1: from = 0x300, to = 0x250
      entry2: from = 0x700, to = 0x800
      
      The original entry2 and entry3 are removed. So the number of iterations
      (from = 0x300, to = 0x250) is equal to removed number + 1 (2 + 1).
      
      iterations = removed number + 1;
      average iterations = Sum(iterations) / number of samples
      
      This formula ignores other cases, for example iterations that cross
      multiple buffers or one buffer that contains 2+ loops, because in
      practice it is good enough.
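      
      The counting rule above can be sketched in Python. This is only an
      illustration and not the perf implementation: it assumes a loop shows
      up as consecutive identical (from, to) entries, as in the example, and
      the function names here are hypothetical.

```python
def remove_loops(entries):
    """Collapse consecutive repeats of the same (from, to) branch entry.

    Returns the surviving entries and, for each survivor, its iteration
    count, i.e. the number of removed repeats + 1.
    Assumption: a loop appears as adjacent identical entries.
    """
    kept = []
    iterations = []
    for entry in entries:
        if kept and kept[-1] == entry:
            iterations[-1] += 1      # one more repeat removed
        else:
            kept.append(entry)
            iterations.append(1)     # nothing removed yet: 0 + 1
    return kept, iterations


def average_iterations(per_sample_iterations):
    """Sum(iterations) / number of samples."""
    return sum(per_sample_iterations) / len(per_sample_iterations)


entries = [
    (0x100, 0x200),
    (0x300, 0x250),
    (0x300, 0x250),
    (0x300, 0x250),
    (0x700, 0x800),
]
kept, iterations = remove_loops(entries)
```

      For the entries in the example, the (0x300, 0x250) branch keeps one
      entry and is credited with 3 iterations.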
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linux-kernel@vger.kernel.org
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Link: http://lkml.kernel.org/n/1477876794-30749-2-git-send-email-yao.jin@linux.intel.com
      [ Renamed 'iter' to 'nr_loop_iter' for clarity ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  8. 03 Oct, 2016 1 commit
    • perf tools: Experiment with cppcheck · 18ef15c6
      Arnaldo Carvalho de Melo committed
      Experimenting a bit using cppcheck[1], a static checker brought to my
      attention by Colin, reducing the scope of some variables, reducing the
      number of source code lines in the process:
      
        $ cppcheck --enable=style tools/perf/util/thread.c
        Checking tools/perf/util/thread.c...
        [tools/perf/util/thread.c:17]: (style) The scope of the variable 'leader' can be reduced.
        [tools/perf/util/thread.c:133]: (style) The scope of the variable 'err' can be reduced.
        [tools/perf/util/thread.c:273]: (style) The scope of the variable 'err' can be reduced.
      
      Will continue later, but these are already useful, keep them.
      
      1: https://sourceforge.net/p/cppcheck/wiki/Home/
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Colin Ian King <colin.king@canonical.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ixws7lbycihhpmq9cc949ti6@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  9. 05 Sep, 2016 2 commits
  10. 27 Jul, 2016 1 commit
  11. 22 Jun, 2016 1 commit
  12. 07 Jun, 2016 1 commit
    • perf unwind: Move unwind__prepare_access from thread_new into thread__insert_map · 8132a2a8
      He Kuang committed
      To determine the libunwind methods to use, we should get the
      32bit/64bit information from maps of a thread. When a thread is newly
      created, the information is not prepared. This patch moves
      unwind__prepare_access() into thread__insert_map() so we can get the
      information we need from maps. Meanwhile, make thread__insert_map()
      return a value and show messages on error.
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1464924803-22214-5-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  13. 20 May, 2016 3 commits
  14. 17 May, 2016 2 commits
  15. 06 May, 2016 3 commits
    • perf callchain: Fix incorrect ordering of entries · 9919a65e
      Chris Phlipot committed
      The existing implementation of thread__resolve_callchain, under certain
      circumstances, can assemble callchain entries in the incorrect order.
      
      The callchain entries are resolved incorrectly for a sample when all of
      the following conditions are met:
      
      1. callchain_param.order is set to ORDER_CALLER
      
      2. thread__resolve_callchain_sample is able to resolve callchain entries
         for the sample.
      
      3. unwind__get_entries is also able to resolve callchain entries for the
         sample.
      
      The fix is accomplished by reversing the order in which
      thread__resolve_callchain_sample and unwind__get_entries are called when
      callchain_param.order is set to ORDER_CALLER.
      
      Unwind specific code from thread__resolve_callchain is also moved into a
      new static function to improve readability of the fix.
      
      How to Reproduce the Existing Bug:
      
      Modifying perf script to print call trees in the opposite order, or
      applying the remaining patches from this series and comparing the
      results output from export-to-postgresql.py, are the easiest ways to see
      the bug; however, it can still be seen in current builds using perf
      report.
      
      Here is how I can reproduce the bug using perf report:
      
        # perf record --call-graph=dwarf stress -c 1 -t 5
      
      When I run this command:
      
        # perf report --call-graph=flat,0,0,callee
      
      This callchain, containing kernel (handle_irq_event, etc) and userspace
      samples (__libc_start_main, etc) is contained in the output, which looks
      correct (callee order):
      
                      gen8_irq_handler
                      handle_irq_event_percpu
                      handle_irq_event
                      handle_edge_irq
                      handle_irq
                      do_IRQ
                      ret_from_intr
                      __random
                      rand
                      0x558f2a04dded
                      0x558f2a04c774
                      __libc_start_main
                      0x558f2a04dcd9
      
      Now run this command using caller order:
      
        # perf report --call-graph=flat,0,0,caller
      
      The exact reverse of the above is expected when using caller order
      (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the
      bottom), but it is nowhere to be found in the output.
      
      Instead you see this:
      
                      ret_from_intr
                      do_IRQ
                      handle_irq
                      handle_edge_irq
                      handle_irq_event
                      handle_irq_event_percpu
                      gen8_irq_handler
                      0x558f2a04dcd9
                      __libc_start_main
                      0x558f2a04c774
                      0x558f2a04dded
                      rand
                      __random
      
      Notice how internally the kernel symbols are reversed and the user space
      symbols are reversed, but the kernel symbols still appear above the user
      space symbols.
      
      If this patch is applied and perf script is re-run, you will see the
      expected output (with "0x558f2a04dcd9" at the top and "gen8_irq_handler"
      at the bottom):
      
                      0x558f2a04dcd9
                      __libc_start_main
                      0x558f2a04c774
                      0x558f2a04dded
                      rand
                      __random
                      ret_from_intr
                      do_IRQ
                      handle_irq
                      handle_edge_irq
                      handle_irq_event
                      handle_irq_event_percpu
                      gen8_irq_handler
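      
      The ordering problem described above can be illustrated with a small
      Python sketch. It is a model of the symptom, not of the perf code: the
      function names are hypothetical, and it assumes the kernel and user
      parts of a callchain are each available in callee order.

```python
def callchain_in_order(kernel, user, caller_order):
    """Join the kernel and user parts of a callchain.

    Both inputs are in callee order (innermost frame first). Caller order
    is the exact reverse of the whole callee-order chain, so the user
    frames must end up above the kernel frames.
    """
    callee_chain = kernel + user
    return callee_chain[::-1] if caller_order else callee_chain


def buggy_caller_order(kernel, user):
    """The symptom from the report: each part is reversed on its own, but
    the kernel part still comes before the user part."""
    return kernel[::-1] + user[::-1]


kernel = ["gen8_irq_handler", "handle_irq_event", "do_IRQ", "ret_from_intr"]
user = ["__random", "rand", "__libc_start_main", "0x558f2a04dcd9"]

fixed = callchain_in_order(kernel, user, caller_order=True)
broken = buggy_caller_order(kernel, user)
```

      Here `fixed` starts with "0x558f2a04dcd9" and ends with
      "gen8_irq_handler", while `broken` starts with "ret_from_intr" and keeps
      the kernel frames above the user frames.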
      Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1461831551-12213-2-git-send-email-cphlipot0@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf hists: Move sort__has_parent into struct perf_hpp_list · de7e6a7c
      Jiri Olsa committed
      Now that sort dimensions are private to struct hists, we need to make
      the dimension booleans hists-specific as well.
      
      Move sort__has_parent into struct perf_hpp_list.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1462276488-26683-3-git-send-email-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Introduce number of threads member · d2c11034
      Arnaldo Carvalho de Melo committed
      To be used, for instance, for pre-allocating an rb_tree array for
      sorting by other keys besides the current pid one.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ja0ifkwue7ttjhbwijn6g6eu@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  16. 27 Apr, 2016 1 commit
  17. 19 Apr, 2016 1 commit
  18. 18 Apr, 2016 2 commits
  19. 15 Apr, 2016 1 commit
    • perf callchain: Start moving away from global per thread cursors · 91d7b2de
      Arnaldo Carvalho de Melo committed
      The recent perf_evsel__fprintf_callchain() move to evsel.c added several
      new symbol requirements to the python binding, for instance:
      
        # perf test -v python
        16: Try 'import perf' in python, checking link problems      :
        --- start ---
        test child forked, pid 18030
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        ImportError: /tmp/build/perf/python/perf.so: undefined symbol:
        callchain_cursor
        test child finished with -1
        ---- end ----
        Try 'import perf' in python, checking link problems: FAILED!
        #
      
      This would require linking against callchain.c to access the global
      callchain_cursor variables.
      
      Since lots of functions already receive a callchain_cursor struct
      pointer as a parameter, make that be the case for some more functions
      so that we can start phasing out usage of yet another global variable.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-djko3097eyg2rn66v2qcqfvn@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  20. 23 Mar, 2016 1 commit
  21. 14 Dec, 2015 1 commit
  22. 11 Dec, 2015 1 commit
    • perf tools: Clear struct machine during machine__init() · 93b0ba3c
      Wang Nan committed
      Many test cases use a stack-allocated 'struct machine', including:
        test__hists_link
        test__hists_filter
        test__mmap_thread_lookup
        test__thread_mg_share
        test__hists_output
        test__hists_cumulate
      
      Also, non-test code (for example, machine__new_host()) uses 'malloc()'
      to allocate struct machine.
      
      These are dangerous operations that cause some tests to fail or hang in
      machines__exit(). For example, in
      
       machines__exit ->
         machine__destroy_kernel_maps ->
           map_groups__remove ->
             maps__remove ->
               pthread_rwlock_wrlock
      
      an incorrectly initialized lock causes unintended behavior.
      
      This patch zeroes that structure with memset() in machine__init() to
      ensure all fields in 'struct machine' are initialized to zero.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1449541544-67621-17-git-send-email-wangnan0@huawei.com
      [ Use memset, see 'man bzero' ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  23. 10 Dec, 2015 1 commit
  24. 08 Dec, 2015 1 commit
  25. 27 Nov, 2015 2 commits
    • perf machine: Adjust dso->long_name for offline module · c03d5184
      Wang Nan committed
      Something unexpected may happen if a statically linked perf is copied to
      a production environment:
      
        # ./perf probe -m ./mymodule.ko my_func
        [mymodule] with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols
        Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko
          Error: Failed to add events.
        # ./perf buildid-cache -a ./mymodule.ko
        # ./perf probe -m ./mymodule.ko my_func
        Added new event:
          probe:my_func        (on my_func in /home/wangnan/kmodule/mymodule.ko)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:my_func -aR sleep 1
      
      Where:
      
        # ldd ./perf
       	not a dynamic executable
        # strace -e open ./perf probe -m ./mymodule.ko my_func
        ...
        open("/home/wangnan/kmodule/mymodule.ko", O_RDONLY) = 3
        open("/home/wangnan/kmodule/../lib64/elfutils/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
        ...
        open("/lib64/tls/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
        open("/lib64/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
        open("/usr/lib64/tls/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
        open("/usr/lib64/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
        open("[mymodule]", O_RDONLY)            = -1 ENOENT (No such file or directory)
        open("/home/wangnan/.debug/.build-id/32/6ab42550ef3d24944f53c817533728367effeb", O_RDONLY) = -1 ENOENT (No such file or directory)
        open("[mymodule]", O_RDONLY)            = -1 ENOENT (No such file or directory)
      
      In the above example, the probe fails before we put the module into the
      buildid-cache. However, the user would expect it to succeed in both
      cases, because perf is actually able to find the probe points.
      
      The reason is that perf won't use the module's full path if it failed
      to open debuginfo. In:
      
           convert_to_probe_trace_events ->
              find_probe_trace_events_from_map ->
                  get_target_map ->
                      kernel_get_module_map ->
                          machine__findnew_module_map ->
                              map_groups__find_by_name
      
      map_groups__find_by_name() is able to find the map of that module, but
      this information is found from /proc/modules before it knows the real
      path of the offline module. Therefore, map->dso->long_name is set to
      something like '[mymodule]', which prevents dso__load() from finding the
      real path of the module file.
      
      On the other hand, if dso__load() can get the offline module through
      the buildid cache, it can read the symbol table from that ko. Even if
      debuginfo is not available, 'perf probe' can succeed if the '.symtab'
      can be found.
      
      This patch improves machine__findnew_module_map(): when dso->long_name
      starts with '[' (the path of the module was not found when parsing
      /proc/modules), it fixes it with dso__set_long_name(), so that the
      following dso__load() can find the symbol table.
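      
      The adjustment rule can be sketched in Python as follows. This is a
      simplified model under stated assumptions, not the perf code: the
      function name is hypothetical and any additional checks done by the
      real helper are not modeled.

```python
def adjusted_long_name(long_name, module_path):
    """Return the name a later symbol load should use.

    A placeholder such as '[mymodule]' (taken from the module list, where
    no path is known) is replaced by the real path given on the command
    line. A name that is already a path is left untouched.
    """
    if long_name.startswith("["):
        return module_path
    return long_name


placeholder = adjusted_long_name("[mymodule]", "/home/wangnan/kmodule/mymodule.ko")
unchanged = adjusted_long_name("/lib/modules/mymodule.ko", "/home/wangnan/kmodule/mymodule.ko")
```

      Only the '[mymodule]' style placeholder is rewritten, which is why a
      name that already holds a path is not affected.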
      
      This patch won't interfere with buildid matching. Here is the test
      result:
      
        # ./perf probe -m ./mymodule.ko my_func
        Added new event:
          probe:my_func        (on my_func in /home/wangnan/kmodule/mymodule.ko)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:my_func -aR sleep 1
      
        # ./perf probe -d '*'
        Removed event: probe:my_func
        # mv ./mymodule.{ko,.bak}
        # mv ./moduleb.ko mymodule.ko
        # ./perf probe -m ./mymodule.ko my_func
        /home/wangnan/kmodule/mymodule.ko with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols
        Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko
          Error: Failed to add events.
      
        # ./perf probe -v -m ./mymodule.ko my_func
        probe-definition(0): my_func
        symbol:my_func file:(null) line:0 offset:0 return:0 lazy:(null)
        0 arguments
        Could not open debuginfo. Try to use symbols.
        symsrc__init: build id mismatch for /home/wangnan/kmodule/mymodule.ko.
        /home/wangnan/kmodule/mymodule.ko with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols
        Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko
          Error: Failed to add events. Reason: No such file or directory (Code: -2)
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1448510397-187965-1-git-send-email-wangnan0@huawei.com
      [ Renamed adjust_dso_long_name() to dso__adjust_kmod_long_name() ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf callchain: Honor hide_unresolved · b49a8fe5
      Namhyung Kim committed
      If the user requested to hide unresolved entries, skip unresolved
      callchains as well as hist entries.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1448521700-32062-3-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  26. 20 Nov, 2015 3 commits
    • perf machine: Fix machine__findnew_module_map to put dso · 566c69c3
      Masami Hiramatsu committed
      Fix machine__findnew_module_map to drop the reference to the dso because
      it is already referenced by both machine__findnew_module_dso() and
      map__new2().
      
      Refcnt debugger shows:
      
        ==== [1] ====
        Unreclaimed dso: 0x1ffd980
        Refcount +1 => 1 at
          ./perf(dso__new+0x1ff) [0x4a62df]
          ./perf(__dsos__addnew+0x29) [0x4a6e19]
          ./perf() [0x4b8b91]
          ./perf(modules__parse+0xfc) [0x4a9d5c]
          ./perf() [0x4b8460]
          ./perf(machine__create_kernel_maps+0x150) [0x4bb550]
          ./perf(machine__new_host+0xfa) [0x4bb75a]
          ./perf(init_probe_symbol_maps+0x93) [0x506623]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
          ./perf() [0x4220a9]
      
      This map_groups__insert(0x4b8b91) already gets a reference to the new
      dso:
      
        ----
        eu-addr2line -e ./perf -f 0x4b8b91
        map_groups__insert inlined at util/machine.c:586 in
        machine__create_module
        util/map.h:207
        ----
      
      So this dso refcnt will be released when map_groups gets released.
      
        [snip]
        Refcount +1 => 2 at
          ./perf(dso__get+0x34) [0x4a65f4]
          ./perf() [0x4b8b35]
          ./perf(modules__parse+0xfc) [0x4a9d5c]
          ./perf() [0x4b8460]
          ./perf(machine__create_kernel_maps+0x150) [0x4bb550]
          ./perf(machine__new_host+0xfa) [0x4bb75a]
          ./perf(init_probe_symbol_maps+0x93) [0x506623]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
          ./perf() [0x4220a9]
      
      Here, machine__findnew_module_dso(0x4b8b35) gets the dso (and stores it
      in a local variable):
      
        ----
        # eu-addr2line -e ./perf -f 0x4b8b35
        machine__findnew_module_dso inlined at util/machine.c:578 in
        machine__create_module
        util/machine.c:514
        ----
      
        Refcount +1 => 3 at
          ./perf(dso__get+0x34) [0x4a65f4]
          ./perf(map__new2+0x76) [0x4be1c6]
          ./perf() [0x4b8b4f]
          ./perf(modules__parse+0xfc) [0x4a9d5c]
          ./perf() [0x4b8460]
          ./perf(machine__create_kernel_maps+0x150) [0x4bb550]
          ./perf(machine__new_host+0xfa) [0x4bb75a]
          ./perf(init_probe_symbol_maps+0x93) [0x506623]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
          ./perf() [0x4220a9]
      
      But also map__new2() gets the dso which will be put when the map is
      released.
      
      So, we have to drop the constructor reference obtained in
      machine__findnew_module_dso().
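      
      The reference counting argument above can be modeled with a short
      Python sketch. It is an illustration only: the classes and function
      names are hypothetical stand-ins, and it assumes three references are
      taken (creation, lookup, map) as shown by the refcnt debugger output.

```python
class Dso:
    def __init__(self, name):
        self.name = name
        self.refcnt = 1              # reference owned by whoever created it

    def get(self):
        self.refcnt += 1
        return self

    def put(self):
        self.refcnt -= 1


class Map:
    def __init__(self, dso):
        self.dso = dso.get()         # the map keeps its own reference

    def release(self):
        self.dso.put()


def findnew_module_map(dsos, name, drop_lookup_ref):
    if name not in dsos:
        dsos[name] = Dso(name)       # the dsos table owns the creation reference
    dso = dsos[name].get()           # the lookup hands back a counted reference
    new_map = Map(dso)
    if drop_lookup_ref:
        dso.put()                    # the fix: the local variable is done with it
    return new_map


def leaked_references(drop_lookup_ref):
    dsos = {}
    new_map = findnew_module_map(dsos, "mymodule", drop_lookup_ref)
    dso = dsos["mymodule"]
    new_map.release()                # map teardown drops the map's reference
    dsos.pop("mymodule").put()       # table teardown drops the creation reference
    return dso.refcnt
```

      Without the extra put, one reference survives teardown, which is what
      the debugger reports as an unreclaimed dso.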
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151118064035.30709.58824.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Fix machine__create_kernel_maps to put kernel dso refcount · 1154c957
      Masami Hiramatsu committed
      Fix machine__create_kernel_maps() to put kernel dso because the dso has
      been gotten via __machine__create_kernel_maps().
      
      Refcnt debugger shows:
        ==== [0] ====
        Unreclaimed dso: 0x3036ab0
        Refcount +1 => 1 at
          ./perf(dso__new+0x1ff) [0x4a62df]
          ./perf(__dsos__addnew+0x29) [0x4a6e19]
          ./perf(dsos__findnew+0xd1) [0x4a7181]
          ./perf(machine__findnew_kernel+0x27) [0x4a5e17]
          ./perf() [0x4b8cf2]
          ./perf(machine__create_kernel_maps+0x28) [0x4bb428]
          ./perf(machine__new_host+0xfa) [0x4bb74a]
          ./perf(init_probe_symbol_maps+0x93) [0x506613]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5]
          ./perf() [0x4220a9]
        [snip]
        Refcount +1 => 2 at
          ./perf(dsos__findnew+0x7e) [0x4a712e]
          ./perf(machine__findnew_kernel+0x27) [0x4a5e17]
          ./perf() [0x4b8cf2]
          ./perf(machine__create_kernel_maps+0x28) [0x4bb428]
          ./perf(machine__new_host+0xfa) [0x4bb74a]
          ./perf(init_probe_symbol_maps+0x93) [0x506613]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5]
          ./perf() [0x4220a9]
        [snip]
        Refcount -1 => 1 at
          ./perf(dso__put+0x2f) [0x4a664f]
          ./perf(machine__delete+0xfe) [0x4b93ee]
          ./perf(exit_probe_symbol_maps+0x28) [0x5066b8]
          ./perf() [0x45628a]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5]
          ./perf() [0x4220a9]
      
      Actually, dsos__findnew gets the dso before returning it, so the dso
      user (in this case machine__create_kernel_maps) has to put the dso after
      use.
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151118064033.30709.98954.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Fix to destroy kernel maps when machine exits · ebe9729c
      Masami Hiramatsu committed
      Actually machine__exit forgot to call machine__destroy_kernel_maps.
      
      This fixes some memory leaks on map as below.
      
      Without this fix.
        ----
        ./perf probe vfs_read
        Added new event:
          probe:vfs_read       (on vfs_read)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:vfs_read -aR sleep 1
      
        REFCNT: BUG: Unreclaimed objects found.
        REFCNT: Total 4 objects are not reclaimed.
           To see all backtraces, rerun with -v option
        ----
      With this fix.
        ----
        ./perf probe vfs_read
        Added new event:
          probe:vfs_read       (on vfs_read)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:vfs_read -aR sleep 1
      
        REFCNT: BUG: Unreclaimed objects found.
        REFCNT: Total 2 objects are not reclaimed.
           To see all backtraces, rerun with -v option
        ----
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151118064024.30709.43577.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>