1. 15 Oct, 2009 13 commits
  2. 13 Oct, 2009 4 commits
  3. 12 Oct, 2009 4 commits
    • perf tools: Fix the NO_64BIT build on pure 64-bit systems · 55621ccf
      Ingo Molnar committed
      Randy Dunlap reported that 'make NO_64BIT=1' fails to build
      a pure 32-bit binary on 64-bit/64-bit x86 systems.
      
      The reason is that we don't pass in -m32 and GCC defaults
      to -m64.
      
      So pass it in - and also extend the warning message about libelf
      dependencies - glibc-dev[el] is needed as well beyond the libelf
      library.
      Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: Message-Id: <20091005131729.78444bfb.randy.dunlap@oracle.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Add -C option to measure on a specific CPU · 55ffb7a6
      Mike Galbraith committed
      To refresh, trying to sched record only one CPU results in bogus
      latencies as below.
      
      I fixed^Wmade it stop doing the bad thing today, by
      following task migration events properly.
      
      Before:
      
        marge:/root/tmp # taskset -c 1 perf sched record -C 0 -- sleep 10
        marge:/root/tmp # perf sched lat
         -----------------------------------------------------------------------------------------
          Task                  |   Runtime ms  | Switches | Average delay ms | Maximum delay ms |
         -----------------------------------------------------------------------------------------
          Xorg:4943             |      1.290 ms |        1 | avg: 1670.132 ms | max: 1670.132 ms |
          hald-addon-stor:3569  |      0.091 ms |        3 | avg:  658.609 ms | max: 1975.797 ms |
          hald-addon-stor:3573  |      0.209 ms |        4 | avg:  499.138 ms | max: 1990.565 ms |
          audispd:4270          |      0.012 ms |        1 | avg:    0.015 ms | max:    0.015 ms |
        ....
      
        marge:/root/tmp # perf sched trace|grep 'Xorg:4943'
                 swapper-0     [000]   401.184013288: sched_stat_runtime: task: Xorg:4943 runtime: 1233188 [ns], vruntime: 19105169779 [ns]
         rt2870TimerQHan-4947  [000]   402.854140127: sched_stat_wait: task: Xorg:4943 wait: 580073 [ns]
         rt2870TimerQHan-4947  [000]   402.854141770: sched_migrate_task: task Xorg:4943 [140] from: 1  to: 0
         rt2870TimerQHan-4947  [000]   402.854143854: sched_stat_wait: task: Xorg:4943 wait: 0 [ns]
         rt2870TimerQHan-4947  [000]   402.854145397: sched_switch: task rt2870TimerQHan:4947 [140] (D) ==> Xorg:4943 [140]
                    Xorg-4943  [000]   402.854193133: sched_stat_runtime: task: Xorg:4943 runtime: 56546 [ns], vruntime: 11766332500 [ns]
                    Xorg-4943  [000]   402.854196842: sched_switch: task Xorg:4943 [140] (S) ==> swapper:0 [140]
      
      After:
      
        marge:/root/tmp # taskset -c 1 perf sched record -C 0 -- sleep 10
        marge:/root/tmp # perf sched lat
         -----------------------------------------------------------------------------------------
          Task                  |   Runtime ms  | Switches | Average delay ms | Maximum delay ms |
         -----------------------------------------------------------------------------------------
          amarokapp:11150       |    271.297 ms |      878 | avg:    0.130 ms | max:    1.057 ms |
          konsole:5965          |      1.370 ms |       12 | avg:    0.092 ms | max:    0.855 ms |
          Xorg:4943             |    179.980 ms |     1109 | avg:    0.087 ms | max:    1.206 ms |
          hald-addon-stor:3574  |      0.212 ms |        9 | avg:    0.040 ms | max:    0.169 ms |
          hald-addon-stor:3570  |      0.223 ms |        9 | avg:    0.037 ms | max:    0.223 ms |
          klauncher:5864        |      0.550 ms |        8 | avg:    0.032 ms | max:    0.048 ms |
      
      The 'Maximum delay ms' results are now sane.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Fix counter sample frequency breakage · 7e4ff9e3
      Mike Galbraith committed
      Commit 42e59d7d switched to a default sample frequency of
      1KHz, which overrides any user supplied count, causing sched, top
      and timechart to miss events due to their discrete events
      being flagged PERF_SAMPLE_PERIOD.
      
      Override the default sample frequency when the user provides a
      period count, and make both record and top honor that user
      supplied option.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arjan van de Ven <arjan@infradead.org>
      LKML-Reference: <1255326963.15107.2.camel@marge.simson.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
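The precedence rule described in this commit can be sketched as below. This is a minimal Python model, not perf's C code: `resolve_sampling` and `DEFAULT_FREQ_HZ` are hypothetical names, and letting an explicit period win over an explicit frequency is an assumption made for the illustration.

```python
from typing import Optional, Tuple

DEFAULT_FREQ_HZ = 1000  # the 1 kHz default mentioned in the commit message


def resolve_sampling(user_period: Optional[int],
                     user_freq: Optional[int]) -> Tuple[str, int]:
    """Pick the sampling mode for a counter (hypothetical helper).

    Returns ("period", n) to sample every n events,
    or ("freq", hz) to sample at roughly hz samples per second.
    """
    if user_period is not None:
        # An explicit event count must not be overridden by the default
        # frequency, otherwise discrete events get dropped.
        return ("period", user_period)
    if user_freq is not None:
        return ("freq", user_freq)
    # Nothing supplied by the user: fall back to the default frequency.
    return ("freq", DEFAULT_FREQ_HZ)
```

With a period of 1, every event is recorded, which is what tools that consume discrete events need.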
    • perf tools: Fix const char type propagation · cbef79a8
      Randy Dunlap committed
      The following perf build warnings/errors in function
      argument types:
      
        builtin-sched.c:1894: warning: passing argument 1 of 'sort_dimension__add' discards qualifiers from pointer target type
        util/trace-event-parse.c:685: warning: passing argument 2 of 'read_expected' discards qualifiers from pointer target type
        util/trace-event-parse.c:741: warning: passing argument 4 of 'test_type_token' discards qualifiers from pointer target type
        util/trace-event-parse.c:706: warning: passing argument 2 of 'read_expected_item' discards qualifiers from pointer target type
      
      ... trigger because older GCC is not able to prove that
      sort_dimension__add() does not change the string.
      
      Same goes for test_type_token().
      
      Fix this by improving type consistency.
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20091005131729.78444bfb.randy.dunlap@oracle.com>
      [ Also remove ugly type cast now unnecessary. ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 09 Oct, 2009 4 commits
    • perf tools: Provide backward compatibility with previous perf.data version · 26dd2cb0
      Frederic Weisbecker committed
      We have merged the trace.info file into perf.data by adding one
      section in the perf headers. This makes it incompatible with
      previous versions: the new perf tools can't read the older
      perf.data.
      
      To support the previous format, we check the header size. If it
      is the same size as in the previous format, we ignore the
      trace info section, which doesn't exist.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1255032449-12022-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
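The size check described in this commit can be illustrated with a small reader. This is only a sketch under stated assumptions: the header layout, the `DEMOFILE` magic and the section names are hypothetical and do not reproduce the real perf.data format. The point is the technique: the stored header size tells the reader whether the trailing trace info section is present.

```python
import struct

PREFIX = struct.Struct("<8sQ")   # magic, total header size in bytes
SECTION = struct.Struct("<QQ")   # (offset, size) of one file section
OLD_SECTIONS = ("attrs", "data", "event_types")   # hypothetical old layout
NEW_SECTIONS = OLD_SECTIONS + ("trace_info",)     # one section appended


def header_size(names):
    return PREFIX.size + SECTION.size * len(names)


def pack_header(names, sections):
    """Serialize a header that contains only the sections listed in names."""
    out = PREFIX.pack(b"DEMOFILE", header_size(names))
    for name in names:
        out += SECTION.pack(*sections[name])
    return out


def parse_header(buf):
    """Parse either layout, keyed on the header size stored in the file."""
    _magic, size = PREFIX.unpack_from(buf, 0)
    if size == header_size(OLD_SECTIONS):
        names = OLD_SECTIONS      # old file: there is no trace info section
    elif size == header_size(NEW_SECTIONS):
        names = NEW_SECTIONS
    else:
        raise ValueError(f"unexpected header size {size}")
    parsed = {}
    for i, name in enumerate(names):
        parsed[name] = SECTION.unpack_from(buf, PREFIX.size + i * SECTION.size)
    # Treat a missing trace info section as empty instead of reading past
    # the end of an old header.
    parsed.setdefault("trace_info", (0, 0))
    return parsed
```

Callers can then test `parsed["trace_info"][1] == 0` to decide whether there is any trace metadata to read.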
    • perf tools: Fix thread comm resolution in perf sched · 97ea1a7f
      Frederic Weisbecker committed
      This reverts commit 9a92b479 ("perf
      tools: Improve thread comm resolution in perf sched") and fixes the
      real bug.
      
      The bug was elsewhere:
      
      We are failing to resolve thread names in perf sched because the
      table of threads we are building, on top of comm events, has a per
      process granularity. But perf sched, unlike the other perf tools,
      needs a per thread granularity as we are profiling every task
      individually.
      
      So fix it by building our threads table using the tid instead of
      the pid as the thread identifier.
      
      v2: Revert the previous fix - it is not really needed
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1255028657-11158-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
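The difference between per-process and per-thread granularity can be shown with a toy threads table. This is a hedged Python sketch, not perf's data structure: the event tuples, `build_table` and `resolve` are hypothetical, and the ids are made up for the example.

```python
def build_table(comm_events, key):
    """Build a comm lookup table from (pid, tid, comm) events.

    key is "pid" for per-process granularity or "tid" for per-thread.
    """
    table = {}
    for pid, tid, comm in comm_events:
        ident = tid if key == "tid" else pid
        table[ident] = comm   # a later event for the same key overwrites
    return table


def resolve(table, tid):
    """Return the comm for a thread, or a ":<tid>" placeholder if unknown."""
    return table.get(tid, f":{tid}")


# One process (pid 100) with two threads that have different names.
events = [(100, 100, "main"), (100, 101, "worker")]

by_pid = build_table(events, "pid")   # both threads collapse into one entry
by_tid = build_table(events, "tid")   # one entry per thread
```

Keyed by pid, thread 101 has no entry of its own, so it falls back to the placeholder, and thread 100 even picks up the wrong name. Keyed by tid, both threads resolve.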
    • perf tools: Improve kernel/modules symbol lookup · 2e538c4a
      Arnaldo Carvalho de Melo committed
      This removes the overlapping of vmlinux addresses with modules,
      using the ELF section name when using --vmlinux and creating a
      unique DSO name when using /proc/kallsyms ([kernel].N).
      
      This is done by creating multiple 'struct map' instances for
      address ranges backed by DSOs that have just the symbols for that
      range and a name that is derived from the ELF section name.
      
      Now it is possible to ask for just the symbols in some particular
      kernel section:
      
      $ perf report -m --vmlinux ../build/tip-recvmmsg/vmlinux \
      	--dsos [kernel].vsyscall_fn | head -15
          52.73%             Xorg  [.] vread_hpet
          18.61%          firefox  [.] vread_hpet
          14.50%     npviewer.bin  [.] vread_hpet
           6.83%           compiz  [.] vread_hpet
           5.73%         glxgears  [.] vread_hpet
           0.63%             java  [.] vread_hpet
           0.30%   gnome-terminal  [.] vread_hpet
           0.23%             perf  [.] vread_hpet
           0.18%            xchat  [.] vread_hpet
      $
      
      Now we don't have to first look up the list of modules and then, if
      it fails, the vmlinux symbols; it's just a simple lookup for the map
      and then the symbols, just like for threads.
      
      Reports generated using /proc/kallsyms and --vmlinux should provide
      the same results, modulo the DSO name for sections other than
      ".text".
      
      But they don't right now because of things like:
      
       ffffffff81011c20-ffffffff81012068 system_call
       ffffffff81011c30-ffffffff81011c9b system_call_after_swapgs
       ffffffff81011c9c-ffffffff81011cb6 system_call_fastpath
       ffffffff81011cb7-ffffffff81011cbb ret_from_sys_call
      
      I.e. overlapping symbols, again some ASM special case that we have
      to fixup.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1254934136-8503-1-git-send-email-acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
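The overlapping symbols listed in this commit message can be found mechanically. The sketch below is an illustration in Python, not part of perf: `find_overlaps` is a hypothetical helper, and it assumes the end addresses shown in the commit message are inclusive.

```python
# (start, end, name) ranges copied from the commit message above.
symbols = [
    (0xffffffff81011c20, 0xffffffff81012068, "system_call"),
    (0xffffffff81011c30, 0xffffffff81011c9b, "system_call_after_swapgs"),
    (0xffffffff81011c9c, 0xffffffff81011cb6, "system_call_fastpath"),
    (0xffffffff81011cb7, 0xffffffff81011cbb, "ret_from_sys_call"),
]


def find_overlaps(syms):
    """Return (outer, inner) name pairs where inner starts inside outer."""
    overlaps = []
    widest = None   # the symbol seen so far that reaches the highest address
    for sym in sorted(syms):
        if widest is not None and sym[0] <= widest[1]:
            overlaps.append((widest[2], sym[2]))
        if widest is None or sym[1] > widest[1]:
            widest = sym
    return overlaps


for outer, inner in find_overlaps(symbols):
    print(f"{inner} lies inside {outer}")
```

Here `system_call` spans all three of the other symbols, so an address-to-symbol lookup has more than one candidate for those addresses until the ranges are fixed up.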
    • perf tools: Up the verbose level for some really verbose stuff · da21d1b5
      Arnaldo Carvalho de Melo committed
      Like printing every symbol created.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1254923340-4870-1-git-send-email-acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 08 Oct, 2009 2 commits
    • perf tools: Improve thread comm resolution in perf sched · 9a92b479
      Frederic Weisbecker committed
      When we get sched traces that involve a task that was already
      created before opening the event, we won't have the comm event for
      it.
      
      So if we can't find the comm event for a given thread, we look at
      the traces that may contain this information.
      
      Before:
      
       ata/1:371             |      0.000 ms |        1 | avg: 3988.693 ms | max: 3988.693 ms |
       kondemand/1:421       |      0.096 ms |        3 | avg:  345.346 ms | max: 1035.989 ms |
       kondemand/0:420       |      0.025 ms |        3 | avg:  421.332 ms | max:  964.014 ms |
       :5124:5124            |      0.103 ms |        5 | avg:   74.082 ms | max:  277.194 ms |
       :6244:6244            |      0.691 ms |        9 | avg:  125.655 ms | max:  271.306 ms |
       firefox:5080          |      0.924 ms |        5 | avg:   53.833 ms | max:  257.828 ms |
       npviewer.bin:6225     |     21.871 ms |       53 | avg:   22.462 ms | max:  220.835 ms |
       :6245:6245            |      9.631 ms |       21 | avg:   41.864 ms | max:  213.349 ms |
      
      After:
      
       ata/1:371             |      0.000 ms |        1 | avg: 3988.693 ms | max: 3988.693 ms |
       kondemand/1:421       |      0.096 ms |        3 | avg:  345.346 ms | max: 1035.989 ms |
       kondemand/0:420       |      0.025 ms |        3 | avg:  421.332 ms | max:  964.014 ms |
       firefox:5124          |      0.103 ms |        5 | avg:   74.082 ms | max:  277.194 ms |
       npviewer.bin:6244     |      0.691 ms |        9 | avg:  125.655 ms | max:  271.306 ms |
       firefox:5080          |      0.924 ms |        5 | avg:   53.833 ms | max:  257.828 ms |
       npviewer.bin:6225     |     21.871 ms |       53 | avg:   22.462 ms | max:  220.835 ms |
       npviewer.bin:6245     |      9.631 ms |       21 | avg:   41.864 ms | max:  213.349 ms |
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1255012632-7882-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Unify perf.data mapping and events handling · 016e92fb
      Frederic Weisbecker committed
      This librarizes the perf.data file mapping and handling in the
      various perf tools. It reduces the amount of code and fixes the
      places that mmap from the beginning of the file when we want to
      mmap from the beginning of the data. Those places led to a page
      fault because the mmap window is too small now that the trace info
      is written in the file too.
      
      TODO:
      
       - convert perf timechart too
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arjan van de Ven <arjan@infradead.org>
      LKML-Reference: <20091007104729.GD5043@nowhere>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 07 Oct, 2009 2 commits
    • perf tools: Merge trace.info content into perf.data · 03456a15
      Frederic Weisbecker committed
      This drops the trace.info file and moves its contents into the
      common perf.data file.
      
      This is done by creating a new trace_info section in this file. A
      user of perf headers needs to call perf_header__set_trace_info() to
      save the trace meta information into the perf.data file.
      
      A file created by perf after this patch is unsupported by previous
      versions because the size of the headers has increased.
      
      That said, it's two new fields that have been added at the end of
      the headers, and those could be ignored by previous versions if
      they just handled the dynamic header size and then ignored the
      unknown part. The offsets guarantee the compatibility. We'll do a
      -stable fix for that.
      
      But previous versions currently handle the header size using its
      static size, not a dynamic one, so they are not backward compatible
      with trace records.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20091006213643.GA5343@nowhere>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Start the perf.data mapping at data offset in perf trace · b209aa1f
      Frederic Weisbecker committed
      Currently, we are mapping perf.data at the beginning of the file
      and using the data offset as a buffer offset.
      
      This may exceed the mapping area if the data offset is greater than
      page_size * mmap_window and result in a page fault (which happens
      if we merge trace.info into perf.data).
      
      Instead, let's start the mapping at the page that matches our data
      offset.
      
      v2: Drop junk from another patch (trace_report() removal)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <1254856886-10348-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
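Starting the mapping at the page that contains the data offset comes down to a little arithmetic. The following is a minimal sketch using Python's `mmap` module, not perf's C code: `map_from_data_offset` is a hypothetical helper, and `mmap.ALLOCATIONGRANULARITY` stands in for the page size that mmap offsets must be aligned to.

```python
import mmap


def map_from_data_offset(fd, data_offset, length):
    """Map `length` payload bytes that start at `data_offset` in the file.

    Returns (mapping, head), where head is the index of the first payload
    byte inside the mapping.
    """
    gran = mmap.ALLOCATIONGRANULARITY   # mmap offsets must be a multiple of this
    # Round the data offset down to the start of the page that contains it.
    file_offset = (data_offset // gran) * gran
    # The payload begins this many bytes into the mapped window.
    head = data_offset - file_offset
    mapping = mmap.mmap(fd, head + length,
                        access=mmap.ACCESS_READ, offset=file_offset)
    return mapping, head
```

Because `head` is always smaller than one page, the window no longer has to cover the whole header region, however large the data offset becomes.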
  7. 06 Oct, 2009 11 commits