1. 23 10月, 2009 2 次提交
    • F
      perf tools: Bind callchains to the first sort dimension column · a4fb581b
      Frederic Weisbecker 提交于
      Currently, the callchains are displayed using a constant left
      margin. So depending on the current sort dimension
      configuration, callchains may appear to be well attached to the
      first sort dimension column field which is mostly the case,
      except when the first dimension of sorting is done by comm,
      because these are right aligned.
      
      This patch binds the callchain to the first letter in the first
      column, whatever type of column it is (dso, comm, symbol).
      Before:
      
           0.80%             perf  [k] __lock_acquire
                   __lock_acquire
                   lock_acquire
                   |
                   |--58.33%-- _spin_lock
                   |          |
                   |          |--28.57%-- inotify_should_send_event
                   |          |          fsnotify
                   |          |          __fsnotify_parent
      
      After:
      
           0.80%             perf  [k] __lock_acquire
                             __lock_acquire
                             lock_acquire
                             |
                             |--58.33%-- _spin_lock
                             |          |
                             |          |--28.57%-- inotify_should_send_event
                             |          |          fsnotify
                             |          |          __fsnotify_parent
      
      Also, for clarity, we don't put anymore the callchain as is but:
      
      - If we have a top level ancestor in the callchain, start it
        with a first ascii hook.
      
        Before:
      
           0.80%             perf  [kernel]                        [k] __lock_acquire
                             __lock_acquire
                               lock_acquire
                             |
                             |--58.33%-- _spin_lock
                             |          |
                             |          |--28.57%-- inotify_should_send_event
                             |          |          fsnotify
                            [..]       [..]
      
         After:
      
           0.80%             perf  [kernel]                         [k] __lock_acquire
                             |
                             --- __lock_acquire
                                 lock_acquire
                                |
                                |--58.33%-- _spin_lock
                                |          |
                                |          |--28.57%-- inotify_should_send_event
                                |          |          fsnotify
                               [..]       [..]
      
      - Otherwise, if we have several top level ancestors, then
        display these like we did before:
      
             1.69%           Xorg
                             |
                             |--21.21%-- vread_hpet
                             |          0x7fffd85b46fc
                             |          0x7fffd85b494d
                             |          0x7f4fafb4e54d
                             |
                             |--15.15%-- exaOffscreenAlloc
                             |
                             |--9.09%-- I830WaitLpRing
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      LKML-Reference: <1256246604-17156-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a4fb581b
    • F
      perf tools: Fix missing top level callchain · af0a6fa4
      Frederic Weisbecker 提交于
      While recursively printing the branches of each callchains, we
      forget to display the root. It is never printed.
      
      Say we have:
      
          symbol
          f1
          f2
           |
           -------- f3
           |        f4
           |
           ---------f5
                    f6
      
      Actually we never see that, instead it displays:
      
          symbol
          |
          --------- f3
          |         f4
          |
          --------- f5
                    f6
      
      However f1 is always the same than "symbol" and if we are
      sorting by symbols first then "symbol", f1 and f2 will be well
      aligned like in the above example, so displaying f1 looks
      redundant here.
      
      But if we are sorting by something else first (dso, comm,
      etc...), displaying f1 doesn't look redundant but rather
      necessary because the symbol is not well aligned anymore with
      its callchain:
      
           comm     dso        symbol
           f1
           f2
           |
           --------- [...]
      
      And we want the callchain to be obvious.
      So we fix the bug by printing the root branch, but we also
      filter its first entry if we are sorting by symbols first.
      Reported-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1256246604-17156-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      af0a6fa4
  2. 21 10月, 2009 2 次提交
  3. 20 10月, 2009 3 次提交
    • A
      perf tools: Add ->unmap_ip operation to struct map · ed52ce2e
      Arnaldo Carvalho de Melo 提交于
      We need this because we get section relative addresses when
      reading the symtabs, but when a tool like 'perf annotate' needs
      to match these address to what 'objdump -dS' produces we need
      the address + section back again.
      
      So in annotate now we look at the 'struct hist_entry' instances
      (that weren't really being used) so that we iterate only over
      the symbols that had some hit and get the map where that
      particular hit happened so that we can get the right address to
      match with annotate.
      
      Verified that at least:
      
       perf annotate mmap_read_counter # Uses the ~/bin/perf binary
       perf annotate --vmlinux /home/acme/git/build/perf/vmlinux intel_pmu_enable_all
      
      on a 'perf record perf top' session seems to work.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1255979877-12533-1-git-send-email-acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ed52ce2e
    • A
      perf timechart: Improve the visual appearance of scheduler delays · 2e600d01
      Arjan van de Ven 提交于
      [from KS feedback]
      
      Currently, scheduler delays are shown in a mostly transparent,
      light yellow color. This color is rather hard to see on several
      screens, especially projectors.
      
      This patch changes the color of the scheduler delays to be a
      much more "hard" yellow that survived the kernel summit
      projector.
      Reported-by: NLinus Torvalds <torvalds@osdl.org>
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091020064731.20ae126a@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2e600d01
    • A
      perf tools: Add missing tools/perf/util/include/string.h · 20639c15
      Arnaldo Carvalho de Melo 提交于
      To cure a bunch of:
      
      In file included from util/include/linux/bitmap.h:1,
                       from util/header.h:8,
                       from builtin-trace.c:7:
      util/include/../../../../include/linux/bitmap.h:8:26: error:
      linux/string.h: No such file or directory make: ***
      [builtin-trace.o] Error 1 make: *** Waiting for unfinished
      jobs....
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1255972296-11500-1-git-send-email-acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      20639c15
  4. 19 10月, 2009 3 次提交
    • F
      perf tools: Use DECLARE_BITMAP instead of an open-coded array · db9f11e3
      Frederic Weisbecker 提交于
      Use DECLARE_BITMAP instead of an open coded array for our bitmap
      of featured sections.
      
      This makes the array an unsigned long instead of a u64 but since
      we use a 256 bits bitmap, the array size shouldn't vary between
      different boxes.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1255795038-13751-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      db9f11e3
    • F
      perf tools: Introduce bitmask'ed additional headers · 2ba08250
      Frederic Weisbecker 提交于
      This provides a new set of bitmasked headers. A new field is
      added in the perf headers that implements a bitmap storing
      optional features present in the perf.data file.
      
      The layout can be pictured like this:
      
      (Usual perf headers)(Features bitmap)[Feature 0][Feature
      n][Feature 255]
      
      If the bit n is set, then the feature n is used in this file.
      They are all set in order. This brings a backward and forward
      compatibility.
      
      The trace_info section has moved into such optional features,
      this is the first and only one for now.
      
      This is backward compatible with the .32 file version although
      it doesn't support the previous separate trace.info file.
      
      And finally it doesn't support the current interim development
      version.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1255792354-11304-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2ba08250
    • F
      perf tools: Use kernel bitmap library · 5a116dd2
      Frederic Weisbecker 提交于
      Use the kernel bitmap library for internal perf tools uses.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1255792354-11304-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5a116dd2
  5. 16 10月, 2009 1 次提交
    • I
      perf tools: Bump version to 0.0.2 · 210f9cb2
      Ingo Molnar 提交于
      We released the first version of perf with 0.0.1 in v2.6.31,
      time to double our version number to 0.0.2 ;-)
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      210f9cb2
  6. 15 10月, 2009 14 次提交
  7. 13 10月, 2009 3 次提交
  8. 12 10月, 2009 1 次提交
    • R
      perf tools: Fix const char type propagation · cbef79a8
      Randy Dunlap 提交于
      The following perf build warnings/errors in function
      argument types:
      
        builtin-sched.c:1894: warning: passing argument 1 of 'sort_dimension__add' discards qualifiers from pointer target type
        util/trace-event-parse.c:685: warning: passing argument 2 of 'read_expected' discards qualifiers from pointer target type
        util/trace-event-parse.c:741: warning: passing argument 4 of 'test_type_token' discards qualifiers from pointer target type
        util/trace-event-parse.c:706: warning: passing argument 2 of 'read_expected_item' discards qualifiers from pointer target type
      
      ... trigger because older GCC is not able to prove that
      sort_dimension__add() does not change the string.
      
      Some goes for test_type_token().
      
      Fix this by improving type consistency.
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20091005131729.78444bfb.randy.dunlap@oracle.com>
      [ Also remove ugly type cast now unnecessary. ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cbef79a8
  9. 09 10月, 2009 4 次提交
    • F
      perf tools: Provide backward compatibility with previous perf.data version · 26dd2cb0
      Frederic Weisbecker 提交于
      We have merged the trace.info file into perf.data by adding one
      section in the perf headers. This makes it incompatible with
      previous version: the new perf tools can't read the older
      perf.data.
      
      To support the previous format, we check the headers size. If they
      have the same size than in the previous format, then ignore the
      trace info section that doesn't exist.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1255032449-12022-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      26dd2cb0
    • F
      perf tools: Fix thread comm resolution in perf sched · 97ea1a7f
      Frederic Weisbecker 提交于
      This reverts commit 9a92b479 ("perf
      tools: Improve thread comm resolution in perf sched") and fixes the
      real bug.
      
      The bug was elsewhere:
      
      We are failing to resolve thread names in perf sched because the
      table of threads we are building, on top of comm events, has a per
      process granularity. But perf sched, unlike the other perf tools,
      needs a per thread granularity as we are profiling every tasks
      individually.
      
      So fix it by building our threads table using the tid instead of
      the pid as the thread identifier.
      
      v2: Revert the previous fix - it is not really needed
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1255028657-11158-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      97ea1a7f
    • A
      perf tools: Improve kernel/modules symbol lookup · 2e538c4a
      Arnaldo Carvalho de Melo 提交于
      This removes the ovelapping of vmlinux addresses with modules,
      using the ELF section name when using --vmlinux and creating a
      unique DSO name when using /proc/kallsyms ([kernel].N).
      
      This is done by creating multiple 'struct map' instances for
      address ranges backed by DSOs that have just the symbols for that
      range and a name that is derived from the ELF section name.o
      
      Now it is possible to ask for just the symbols in some particular
      kernel section:
      
      $ perf report -m --vmlinux ../build/tip-recvmmsg/vmlinux \
      	--dsos [kernel].vsyscall_fn | head -15
          52.73%             Xorg  [.] vread_hpet
          18.61%          firefox  [.] vread_hpet
          14.50%     npviewer.bin  [.] vread_hpet
           6.83%           compiz  [.] vread_hpet
           5.73%         glxgears  [.] vread_hpet
           0.63%             java  [.] vread_hpet
           0.30%   gnome-terminal  [.] vread_hpet
           0.23%             perf  [.] vread_hpet
           0.18%            xchat  [.] vread_hpet
      $
      
      Now we don't have to first lookup the list of modules and then, if
      it fails, vmlinux symbols, its just a simple lookup for the map
      then the symbols, just like for threads.
      
      Reports generated using /proc/kallsyms and --vmlinux should provide
      the same results, modulo the DSO name for sections other than
      ".text".
      
      But they don't right now because things like:
      
       ffffffff81011c20-ffffffff81012068 system_call
       ffffffff81011c30-ffffffff81011c9b system_call_after_swapgs
       ffffffff81011c9c-ffffffff81011cb6 system_call_fastpath
       ffffffff81011cb7-ffffffff81011cbb ret_from_sys_call
      
      I.e. overlapping symbols, again some ASM special case that we have
      to fixup.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1254934136-8503-1-git-send-email-acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2e538c4a
    • A
      perf tools: Up the verbose level for some really verbose stuff · da21d1b5
      Arnaldo Carvalho de Melo 提交于
      Like printing every symbol created.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1254923340-4870-1-git-send-email-acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      da21d1b5
  10. 08 10月, 2009 2 次提交
    • F
      perf tools: Improve thread comm resolution in perf sched · 9a92b479
      Frederic Weisbecker 提交于
      When we get sched traces that involve a task that was already
      created before opening the event, we won't have the comm event for
      it.
      
      So if we can't find the comm event for a given thread, we look at
      the traces that may contain these informations.
      
      Before:
      
       ata/1:371             |      0.000 ms |        1 | avg: 3988.693 ms | max: 3988.693 ms |
       kondemand/1:421       |      0.096 ms |        3 | avg:  345.346 ms | max: 1035.989 ms |
       kondemand/0:420       |      0.025 ms |        3 | avg:  421.332 ms | max:  964.014 ms |
       :5124:5124            |      0.103 ms |        5 | avg:   74.082 ms | max:  277.194 ms |
       :6244:6244            |      0.691 ms |        9 | avg:  125.655 ms | max:  271.306 ms |
       firefox:5080          |      0.924 ms |        5 | avg:   53.833 ms | max:  257.828 ms |
       npviewer.bin:6225     |     21.871 ms |       53 | avg:   22.462 ms | max:  220.835 ms |
       :6245:6245            |      9.631 ms |       21 | avg:   41.864 ms | max:  213.349 ms |
      
      After:
      
       ata/1:371             |      0.000 ms |        1 | avg: 3988.693 ms | max: 3988.693 ms |
       kondemand/1:421       |      0.096 ms |        3 | avg:  345.346 ms | max: 1035.989 ms |
       kondemand/0:420       |      0.025 ms |        3 | avg:  421.332 ms | max:  964.014 ms |
       firefox:5124          |      0.103 ms |        5 | avg:   74.082 ms | max:  277.194 ms |
       npviewer.bin:6244     |      0.691 ms |        9 | avg:  125.655 ms | max:  271.306 ms |
       firefox:5080          |      0.924 ms |        5 | avg:   53.833 ms | max:  257.828 ms |
       npviewer.bin:6225     |     21.871 ms |       53 | avg:   22.462 ms | max:  220.835 ms |
       npviewer.bin:6245     |      9.631 ms |       21 | avg:   41.864 ms | max:  213.349 ms |
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1255012632-7882-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9a92b479
    • F
      perf tools: Unify perf.data mapping and events handling · 016e92fb
      Frederic Weisbecker 提交于
      This librarizes the perf.data file mapping and handling in various
      perf tools, roughly reducing the amount of code and fixing the
      places that mmap from beginning of the file whereas we want to mmap
      from the beginning of the data, leading to page fault because the
      mmap window is too small since the trace info are written in the
      file too.
      
      TODO:
      
       - convert perf timechart too
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arjan van de Ven <arjan@infradead.org>
      LKML-Reference: <20091007104729.GD5043@nowhere>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      016e92fb
  11. 07 10月, 2009 1 次提交
    • F
      perf tools: Merge trace.info content into perf.data · 03456a15
      Frederic Weisbecker 提交于
      This drops the trace.info file and move its contents into the
      common perf.data file.
      
      This is done by creating a new trace_info section into this file. A
      user of perf headers needs to call perf_header__set_trace_info() to
      save the trace meta informations into the perf.data file.
      
      A file created by perf after his patch is unsupported by previous
      version because the size of the headers have increased.
      
      That said, it's two new fields that have been added in the end of
      the headers, and those could be ignored by previous versions if
      they just handled the dynamic header size and then ignore the
      unknow part. The offsets guarantee the compatibility. We'll do a
      -stable fix for that.
      
      But current previous versions handle the header size using its
      static size, not dynamic, then it's not backward compatible with
      trace records.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20091006213643.GA5343@nowhere>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      03456a15
  12. 06 10月, 2009 4 次提交