1. 13 7月, 2013 3 次提交
  2. 09 7月, 2013 11 次提交
    • R
      perf tools: Fix perf version generation · a4147f0f
      Robert Richter 提交于
      The tag of the perf version is wrongly determined, always the latest tag
      is taken regardless of the HEAD commit:
      
       $ perf --version
       perf version 3.9.rc8.gd7f5d3
       $ git describe d7f5d3
       v3.9-rc7-154-gd7f5d33
       $ head -n 4 Makefile
       VERSION = 3
       PATCHLEVEL = 9
       SUBLEVEL = 0
       EXTRAVERSION = -rc7
      
      In other cases no tag might be found.
      
      This patch fixes this.
      
      This new implementation handles also the case if there are no tags at
      all found in the git repo but there is a commit id.
      Signed-off-by: NRobert Richter <robert.richter@calxeda.com>
      Link: http://lkml.kernel.org/r/1368006214-12912-1-git-send-email-rric@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a4147f0f
    • W
      perf symbols: Fix vdso list searching · f9ceffb6
      Waiman Long 提交于
      When "perf record" was used on a large machine with a lot of CPUs, the
      perf post-processing time (the time after the workload was done until
      the perf command itself exited) could take a lot of minutes and even
      hours depending on how large the resulting perf.data file was.
      
      While running AIM7 1500-user high_systime workload on a 80-core x86-64
      system with a 3.9 kernel (with only the -s -a options used), the
      workload itself took about 2 minutes to run and the perf.data file had a
      size of 1108.746 MB. However, the post-processing step took more than 10
      minutes.
      
      With a gprof-profiled perf binary, the time spent by perf was as
      follows:
      
        %   cumulative   self              self     total
       time   seconds   seconds    calls   s/call   s/call  name
       96.90    822.10   822.10   192156     0.00     0.00  dsos__find
        0.81    828.96     6.86 172089958     0.00     0.00  rb_next
        0.41    832.44     3.48 48539289     0.00     0.00  rb_erase
      
      So 97% (822 seconds) of the time was spent in a single dsos_find()
      function. After analyzing the call-graph data below:
      
       -----------------------------------------------
                       0.00  822.12  192156/192156      map__new [6]
       [7]     96.9    0.00  822.12  192156         vdso__dso_findnew [7]
                     822.10    0.00  192156/192156      dsos__find [8]
                       0.01    0.00  192156/192156      dsos__add [62]
                       0.01    0.00  192156/192366      dso__new [61]
                       0.00    0.00       1/45282525     memdup [31]
                       0.00    0.00  192156/192230      dso__set_long_name [91]
       -----------------------------------------------
                     822.10    0.00  192156/192156      vdso__dso_findnew [7]
       [8]     96.9  822.10    0.00  192156         dsos__find [8]
       -----------------------------------------------
      
      It was found that the vdso__dso_findnew() function failed to locate
      VDSO__MAP_NAME ("[vdso]") in the dso list and have to insert a new
      entry at the end for 192156 times. This problem is due to the fact that
      there are 2 types of name in the dso entry - short name and long name.
      The initial dso__new() adds "[vdso]" to both the short and long names.
      After that, vdso__dso_findnew() modifies the long name to something
      like /tmp/perf-vdso.so-NoXkDj. The dsos__find() function only compares
      the long name. As a result, the same vdso entry is duplicated many
      time in the dso list. This bug increases memory consumption as well
      as slows the symbol processing time to a crawl.
      
      To resolve this problem, the dsos__find() function interface was
      modified to enable searching either the long name or the short
      name. The vdso__dso_findnew() will now search only the short name
      while the other call sites search for the long name as before.
      
      With this change, the cpu time of perf was reduced from 848.38s to
      15.77s and dsos__find() only accounted for 0.06% of the total time.
      
        0.06     15.73     0.01   192151     0.00     0.00  dsos__find
      Signed-off-by: NWaiman Long <Waiman.Long@hp.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Cc: "Chandramouleeswaran, Aswin" <aswin@hp.com>
      Cc: "Norton, Scott J" <scott.norton@hp.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1368110568-64714-1-git-send-email-Waiman.Long@hp.com
      [ replaced TRUE/FALSE with stdbool.h equivalents, fixing builds where
        those macros are not present (NO_LIBPYTHON=1 NO_LIBPERL=1), fix from Jiri Olsa ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f9ceffb6
    • A
      perf evsel: Fix missing increment in sample parsing · 54bd2692
      Adrian Hunter 提交于
      The final sample format bit used to be PERF_SAMPLE_STACK_USER which
      neglected to do a final increment of the array pointer.  The result is
      that the following parsing might start at the wrong place.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1372944040-32690-16-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      54bd2692
    • A
      perf tools: Update symbol_conf.nr_events when processing attribute events · 7e0d6fc9
      Adrian Hunter 提交于
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1372944040-32690-11-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7e0d6fc9
    • A
      perf tools: Fix new_term() missing free on error path · 4be8be6b
      Adrian Hunter 提交于
      On the error path, newly allocated 'term' must be freed.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1372944040-32690-6-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4be8be6b
    • A
      perf tools: Fix parse_events_terms() segfault on error path · b2c34fde
      Adrian Hunter 提交于
      On the error path, 'data.terms' may not have been initialised.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1372944040-32690-5-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b2c34fde
    • D
      perf evsel: Fix count parameter to read call in event_format__new · 7cab84e8
      David Ahern 提交于
      per realloc above the length of the buffer is alloc_size, not BUFSIZ.
      Adjust length per size as done for buf start.
      
      Addresses some valgrind complaints:
      
      ==1870== Syscall param read(buf) points to unaddressable byte(s)
      ==1870==    at 0x4E3F610: __read_nocancel (in /lib64/libpthread-2.14.90.so)
      ==1870==    by 0x44AEE1: event_format__new (unistd.h:45)
      ==1870==    by 0x44B025: perf_evsel__newtp (evsel.c:158)
      ==1870==    by 0x451919: add_tracepoint_event (parse-events.c:395)
      ==1870==    by 0x479815: parse_events_parse (parse-events.y:292)
      ==1870==    by 0x45463A: parse_events_option (parse-events.c:861)
      ==1870==    by 0x44FEE4: get_value (parse-options.c:113)
      ==1870==    by 0x450767: parse_options_step (parse-options.c:192)
      ==1870==    by 0x450C40: parse_options (parse-options.c:422)
      ==1870==    by 0x42735F: cmd_record (builtin-record.c:918)
      ==1870==    by 0x419D72: run_builtin (perf.c:319)
      ==1870==    by 0x4195F2: main (perf.c:376)
      ==1870==  Address 0xcffebf0 is 0 bytes after a block of size 8,192 alloc'd
      ==1870==    at 0x4C2A62F: malloc (vg_replace_malloc.c:270)
      ==1870==    by 0x4C2A7A3: realloc (vg_replace_malloc.c:662)
      ==1870==    by 0x44AF07: event_format__new (evsel.c:121)
      ==1870==    by 0x44B025: perf_evsel__newtp (evsel.c:158)
      ==1870==    by 0x451919: add_tracepoint_event (parse-events.c:395)
      ==1870==    by 0x479815: parse_events_parse (parse-events.y:292)
      ==1870==    by 0x45463A: parse_events_option (parse-events.c:861)
      ==1870==    by 0x44FEE4: get_value (parse-options.c:113)
      ==1870==    by 0x450767: parse_options_step (parse-options.c:192)
      ==1870==    by 0x450C40: parse_options (parse-options.c:422)
      ==1870==    by 0x42735F: cmd_record (builtin-record.c:918)
      ==1870==    by 0x419D72: run_builtin (perf.c:319)
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1372793245-4136-2-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7cab84e8
    • J
      perf tools: Fix -x/--exclude-other option for report command · 0276c22a
      Jiri Olsa 提交于
      Currently we have symbol_conf.exclude_other being set as true every time
      so the -x/--exclude-other has nothing to do.
      
      Also we have no way to see the data with symbol_conf.exclude_other being
      false which is useful sometimes.
      
      Fixing it by making symbol_conf.exclude_other false by default.
      
      1) Example without -x option:
      
        $ perf report -i perf.data.delete -p perf_session__delete -s parent
      
        +  99.91%  [other]
        +   0.08%  perf_session__delete
        +   0.00%  perf_session__delete_dead_threads
        +   0.00%  perf_session__delete_threads
      
      2) Example with -x option:
      
        $ ./perf report -i perf.data.delete -p perf_session__delete -s parent -x
      
        +  96.22%  perf_session__delete
        +   1.89%  perf_session__delete_dead_threads
        +   1.89%  perf_session__delete_threads
      
      In Example 1) we get the sorted out data together with the rest
      "[other]". This could help us estimate how much time we spent in the
      sorted data.
      
      In Example 2) the total is just the sorted data.
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/n/tip-sg8fvu0fyqohf9ur9l38lhkw@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0276c22a
    • N
      perf evlist: Enhance perf_evlist__start_workload() · bcf3145f
      Namhyung Kim 提交于
      When perf tries to start a workload, it relies on a pipe which the
      workload was blocked for reading.  After closing the pipe on the parent,
      the workload (child) can start the actual work via exec().
      
      However, if another process was forked after creating a workload, this
      mechanism cannot work since the other process (child) also inherits the
      pipe, so that closing the pipe in parent cannot unblock the workload.
      Fix it by using explicit write call can then closing it.
      
      For similar reason, the pipe fd on parent should be marked as CLOEXEC so
      that it can be closed after another child exec'ed.
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1372230862-15861-13-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      bcf3145f
    • J
      perf record: Remove -A/--append option · 563aecb2
      Jiri Olsa 提交于
      It's no longer working and needed.
      
      Quite straightforward discussion/vote was in here:
      http://marc.info/?t=137028288300004&r=1&w=2Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/n/tip-8fgdva12hl8w3xzzpsvvg7nx@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      563aecb2
    • J
      perf tools: Include termios.h explicitly · 756bbc84
      Joonsoo Kim 提交于
      Building perf for android fails because it can't find the definition of
      struct winsize.
      
      This definition is in termios.h, so I add this header to util.h to solve
      the problem.
      
      It is missed by commit '2c803e52' which moves get_term_dimensions() from
      builtin-top.c to util.c, but missed to move termios.h header.
      Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1371603750-15053-3-git-send-email-iamjoonsoo.kim@lge.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      756bbc84
  3. 30 5月, 2013 2 次提交
  4. 29 5月, 2013 2 次提交
  5. 28 5月, 2013 21 次提交
  6. 01 4月, 2013 1 次提交