1. 15 10月, 2014 1 次提交
    • A
      perf evsel: Subclassing · ce8ccff5
      Arnaldo Carvalho de Melo 提交于
      Provide a method to be called at tool start to config the perf_evsel
      instance size, together with optional constructor and destructor.
      
      This will be used so that perf_evsel doesn't always include a struct
      hists, tools that works with hists/hist_entries, like report, top and
      annotate, will, at start, tell the evsel class the size they need per
      instance.
      
      v2: Don't use exit as a name of a member of function parameter, as this
          breaks the build on at least fedora14 and rhel6.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-7t8cay0ieryox4gqosie85ek@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ce8ccff5
  2. 14 10月, 2014 1 次提交
    • A
      perf session: Remove last reference to hists struct · 2a1731fb
      Arnaldo Carvalho de Melo 提交于
      Now perf_session doesn't require that the evsels in its evlist are hists
      containing ones.
      
      Tools that are hists based and want to do per evsel events_stats
      updates, if at some point this turns into a necessity, should do it in
      the tool specific code, keeping the session class hists agnostic.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-cli1bgwpo82mdikuhy3djsuy@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2a1731fb
  3. 11 10月, 2014 2 次提交
    • A
      perf tools: Move events_stats struct to event.h · 4318bcb7
      Arnaldo Carvalho de Melo 提交于
      This is the only bit of hist.h that session.[ch] will end up using, so
      move it out of hist.h to make that abundantly clear.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-l9ftsl21ggw0c1g2ig87otmd@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4318bcb7
    • A
      perf session: Don't count per evsel events · c2329ade
      Arnaldo Carvalho de Melo 提交于
      PERF_RECORD_SAMPLE was not being counted here and is the only per-evsel
      thing anyway, the other events were not mapping to a evsel.
      
      With this we don't require that evsels used with a perf_session need to
      have space for hists, like the ones in annotate, report, top.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-kzchpz0l1mhrsfpkirz086m2@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c2329ade
  4. 10 10月, 2014 1 次提交
    • A
      perf evsel: Add hists helper · 4ea062ed
      Arnaldo Carvalho de Melo 提交于
      Not all tools need a hists instance per perf_evsel, so lets pave the way
      to remove evsel->hists while leaving a way to access the hists from a
      specially allocated evsel, one that comes with space at the end where
      lives the evsel.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-qlktkhe31w4mgtbd84035sr2@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4ea062ed
  5. 03 10月, 2014 1 次提交
    • J
      perf callchain: Move callchain_param to util object in to fix python test · 23aadb1f
      Jiri Olsa 提交于
      In following commit we changed the location of callchains data:
      
        72a128aa
        perf tools: Move callchain config from record_opts to callchain_param
      
      Now all callchains stuff stays in callchain_param struct, which adds its
      dependency for evsel.c object and breaks python perf.so usage
      (unresolved callchain_param).
      
      Moving callchain_param into callchain.c and adding it into
      python-ext-sources unleash just another dependency hell, so I ended up
      adding callchain_param into util.c for now.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Milian Wolff <mail@milianw.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1412179229-19466-2-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      23aadb1f
  6. 02 10月, 2014 2 次提交
    • A
      perf record: Fix error message for --filter option not coming after tracepoint · 281f92f2
      Arnaldo Carvalho de Melo 提交于
        [root@zoo ~]# perf record --filter "common_pid != PERF_PID" -a
        -F option should follow a -e tracepoint option.
      
      The -F option is for --freq, not --filter. Fix it up to show:
      
        [root@zoo ~]# perf record --filter "common_pid != PERF_PID" -a
        --filter option should follow a -e tracepoint option
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-z0yrm8stn9w3423nkov3eksg@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      281f92f2
    • W
      perf symbols: Improve DSO long names lookup speed with rbtree · 4598a0a6
      Waiman Long 提交于
      With workload that spawns and destroys many threads and processes, it
      was found that perf-mem could took a long time to post-process the perf
      data after the target workload had completed its operation.
      
      The performance bottleneck was found to be the lookup and insertion of
      the new DSO structures (thousands of them in this case).
      
      In a dual-socket Ivy-Bridge E7-4890 v2 machine (30-core, 60-thread), the
      perf profile below shows what perf was doing after the profiled AIM7
      shared workload completed:
      
      -     83.94%  perf  libc-2.11.3.so     [.] __strcmp_sse42
         - __strcmp_sse42
            - 99.82% map__new
                 machine__process_mmap_event
                 perf_session_deliver_event
                 perf_session__process_event
                 __perf_session__process_events
                 cmd_record
                 cmd_mem
                 run_builtin
                 main
                 __libc_start_main
      -     13.17%  perf  perf               [.] __dsos__findnew
           __dsos__findnew
           map__new
           machine__process_mmap_event
           perf_session_deliver_event
           perf_session__process_event
           __perf_session__process_events
           cmd_record
           cmd_mem
           run_builtin
           main
           __libc_start_main
      
      So about 97% of CPU times were spent in the map__new() function trying
      to insert new DSO entry into the DSO linked list. The whole
      post-processing step took about 9 minutes.
      
      The DSO structures are currently searched linearly. So the total
      processing time will be proportional to n^2.
      
      To overcome this performance problem, the DSO code is modified to also
      put the DSO structures in a RB tree sorted by its long name in
      additional to being in a simple linked list. With this change, the
      processing time will become proportional to n*log(n) which will be much
      quicker for large n. However, the short name will still be searched
      using the old linear searching method.  With that patch in place, the
      same perf-mem post-processing step took less than 30 seconds to
      complete.
      Signed-off-by: NWaiman Long <Waiman.Long@hp.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Douglas Hatch <doug.hatch@hp.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott J Norton <scott.norton@hp.com>
      Link: http://lkml.kernel.org/r/1412098575-27863-3-git-send-email-Waiman.Long@hp.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4598a0a6
  7. 30 9月, 2014 2 次提交
  8. 26 9月, 2014 15 次提交
  9. 18 9月, 2014 12 次提交
  10. 24 8月, 2014 1 次提交
    • J
      perf tools: Add +field argument support for --field option · 2f3f9bcf
      Jiri Olsa 提交于
      Adding support to add field(s) to default field order via using the '+'
      prefix, like for report:
      
        $ perf report
        Samples: 10  of event 'cycles', Event count (approx.): 4463799
        Overhead  Command  Shared Object      Symbol
          32.40%  ls       [kernel.kallsyms]  [k] filemap_fault
          28.19%  ls       [kernel.kallsyms]  [k] get_page_from_freelist
          23.38%  ls       [kernel.kallsyms]  [k] enqueue_entity
          15.04%  ls       [kernel.kallsyms]  [k] mmap_region
      
        $ perf report -F +period,sample
        Samples: 10  of event 'cycles', Event count (approx.): 4463799
        Overhead        Period       Samples  Command  Shared Object      Symbol
          32.40%       1446493             1  ls       [kernel.kallsyms]  [k] filemap_fault
          28.19%       1258486             1  ls       [kernel.kallsyms]  [k] get_page_from_freelist
          23.38%       1043754             1  ls       [kernel.kallsyms]  [k] enqueue_entity
          15.04%        671160             1  ls       [kernel.kallsyms]  [k] mmap_region
      
      Works in general for commands using --field option.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1408715919-25990-2-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2f3f9bcf
  11. 23 8月, 2014 2 次提交