1. 02 Oct, 2009 1 commit
    • perf tools: Rewrite and improve support for kernel modules · 439d473b
      Committed by Arnaldo Carvalho de Melo
      Representing modules as struct map entries, backed by a DSO, etc,
      using /proc/modules to find where the module is loaded.
      
      DSOs now can have a short and long name, so that in verbose mode we
      can show exactly which .ko or vmlinux image was used.
      
      As kernel modules now are a DSO separate from the kernel, we can
      ask for just the hits for a particular set of kernel modules, just
      like we can do with shared libraries:
      
      [root@doppio linux-2.6-tip]# perf report -n --vmlinux
      /home/acme/git/build/tip-recvmmsg/vmlinux --modules --dsos \[drm\] | head -15
          84.58%      13266             Xorg  [k] drm_clflush_pages
           4.02%        630             Xorg  [k] trace_kmalloc.clone.0
           3.95%        619             Xorg  [k] drm_ioctl
           2.07%        324             Xorg  [k] drm_addbufs
           1.68%        263             Xorg  [k] drm_gem_close_ioctl
           0.77%        120             Xorg  [k] drm_setmaster_ioctl
           0.70%        110             Xorg  [k] drm_lastclose
           0.68%        106             Xorg  [k] drm_open
           0.54%         85             Xorg  [k] drm_mm_search_free
      [root@doppio linux-2.6-tip]#
      
      Specifying --dsos /lib/modules/2.6.31-tip/kernel/drivers/gpu/drm/drm.ko
      would have the same effect. Allowing specifying just 'drm.ko' is left
      for another patch.
      
      Processing kallsyms so that per kernel module struct map are
      instantiated was also left for another patch. That will allow
      removing the module name from each of its symbols.
      
      struct symbol was reduced by removing the ->module backpointer and
      moving it (well now the map) to struct symbol_entry in perf top,
      that is its only user right now.
      
      The total linecount went down by ~500 lines.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Avi Kivity <avi@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
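
The commit above finds each module's load address through /proc/modules. As a rough illustration only (this is not perf's code), one line of that file could be parsed as below. The struct module_info type and the function name are hypothetical, and the field order (name, size, refcount, dependencies, state, address) is assumed from the usual /proc/modules layout.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record for one loaded module (not perf's struct map). */
struct module_info {
	char name[64];
	unsigned long long size;	/* module size in bytes */
	unsigned long long start;	/* load address */
};

/*
 * Parse one /proc/modules line of the assumed form
 *   name size refcount deps state address
 * Returns 0 on success, -1 if the line does not match.
 */
int parse_proc_modules_line(const char *line, struct module_info *mi)
{
	int n;

	assert(line != NULL && mi != NULL);
	/* refcount, deps and state are skipped; %llx accepts a 0x prefix */
	n = sscanf(line, "%63s %llu %*s %*s %*s %llx",
		   mi->name, &mi->size, &mi->start);
	return n == 3 ? 0 : -1;
}
```

A caller would read /proc/modules line by line and create one map per successfully parsed entry.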
  2. 30 Sep, 2009 2 commits
    • perf tools: Use rb_tree for maps · 1b46cddf
      Committed by Arnaldo Carvalho de Melo
      Threads can have many maps, and kernel modules will be represented
      as a tree of maps as well.
      
      Ah, and for a perf.data with 146607 samples:
      
      Before:
      
      [root@doppio ~]# perf stat -r 5 perf report > /dev/null
      
       Performance counter stats for 'perf report' (5 runs):
      
           699.823680  task-clock-msecs         #      0.991 CPUs    ( +-   0.454% )
                   74  context-switches         #      0.000 M/sec   ( +-   1.709% )
                    2  CPU-migrations           #      0.000 M/sec   ( +-  17.008% )
                23114  page-faults              #      0.033 M/sec   ( +-   0.000% )
           1381257019  cycles                   #   1973.721 M/sec   ( +-   0.290% )
           1456894438  instructions             #      1.055 IPC     ( +-   0.007% )
             18779818  cache-references         #     26.835 M/sec   ( +-   0.380% )
               641799  cache-misses             #      0.917 M/sec   ( +-   1.200% )
      
          0.705972729  seconds time elapsed   ( +-   0.501% )
      
      [root@doppio ~]#
      
      After:
      
       Performance counter stats for 'perf report' (5 runs):
      
           691.261451  task-clock-msecs         #      0.993 CPUs    ( +-   0.307% )
                   72  context-switches         #      0.000 M/sec   ( +-   0.829% )
                    6  CPU-migrations           #      0.000 M/sec   ( +-  18.409% )
                23127  page-faults              #      0.033 M/sec   ( +-   0.000% )
           1366395876  cycles                   #   1976.670 M/sec   ( +-   0.153% )
           1443136016  instructions             #      1.056 IPC     ( +-   0.012% )
             17956402  cache-references         #     25.976 M/sec   ( +-   0.325% )
               661924  cache-misses             #      0.958 M/sec   ( +-   1.335% )
      
          0.696127275  seconds time elapsed   ( +-   0.377% )
      
      I.e. we see some speedup too.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      LKML-Reference: <20090928174846.GA3361@ghostprotocols.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
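
To illustrate why an ordered tree helps when a thread has many maps, here is a minimal sketch of address lookup in a tree keyed by start address. It uses a plain unbalanced binary search tree as a stand-in for the kernel-style rb_tree the commit uses, and struct map_node with its helpers is hypothetical. Maps are assumed not to overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for perf's struct map. */
struct map_node {
	unsigned long long start, end;	/* covers [start, end) */
	const char *dso_name;
	struct map_node *left, *right;
};

/* Insert a node ordered by start address; returns the (possibly new) root. */
struct map_node *maps_insert(struct map_node *root, struct map_node *node)
{
	struct map_node **p = &root;

	assert(node != NULL);
	node->left = node->right = NULL;
	while (*p != NULL)
		p = node->start < (*p)->start ? &(*p)->left : &(*p)->right;
	*p = node;
	return root;
}

/* Find the map containing addr, or NULL if no map covers it. */
struct map_node *maps_find(struct map_node *root, unsigned long long addr)
{
	while (root != NULL) {
		if (addr < root->start)
			root = root->left;
		else if (addr >= root->end)
			root = root->right;
		else
			return root;
	}
	return NULL;
}
```

A red-black tree adds rebalancing on insert, which keeps lookups logarithmic even when maps arrive in sorted order. The lookup loop itself stays the same.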
    • perf tools: Put common histogram functions in their own file · 3d1d07ec
      Committed by John Kacur
      Move histogram related functions into their own files (hist.c and
      hist.h) and make use of them in builtin-annotate.c and
      builtin-report.c.
      Signed-off-by: John Kacur <jkacur@redhat.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <alpine.LFD.2.00.0909281531180.8316@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 25 Sep, 2009 3 commits
  4. 24 Sep, 2009 2 commits
  5. 23 Sep, 2009 1 commit
    • perf tools: Fix module symbol loading bug · 508c4d08
      Committed by Mike Galbraith
      Avi Kivity reported 'perf annotate' failures with modules: the
      requested function was not annotated.
      
      If there are no modules currently loaded, or the last module
      scanned is not loaded, dso__load_modules() steps on the value from
      dso__load_vmlinux(), so we happily load the kallsyms symbols on top
      of what we've already loaded.
      
      Fix that such that the total count of symbols loaded is returned.
      Should module symbol load fail after parsing of vmlinux, it's a
      hard failure, so do not silently fall back to kallsyms.
      Reported-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: rostedt@goodmis.org
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      LKML-Reference: <1253697658.11461.36.camel@marge.simson.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
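
The return-value rule described in this fix can be sketched as a small pure function. This is only an illustration of the described behaviour, not the actual dso__load_modules() change; the function name and the convention that negative values mean errors are assumptions.

```c
#include <assert.h>

/*
 * Combine the symbol counts from vmlinux and from modules.
 * Counts are >= 0 on success, negative on error (assumed convention).
 */
int total_symbols_loaded(int vmlinux_count, int modules_count)
{
	if (vmlinux_count <= 0)
		return vmlinux_count;	/* nothing from vmlinux: caller may try kallsyms */
	if (modules_count < 0)
		return modules_count;	/* hard failure, no silent fallback */
	/* zero loaded modules must not clobber the vmlinux count */
	assert(modules_count >= 0);
	return vmlinux_count + modules_count;
}
```

With this shape, a run with no modules loaded still reports the vmlinux symbols, so the caller has no reason to load kallsyms on top of them.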
  6. 21 Sep, 2009 5 commits
    • perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Committed by Ingo Molnar
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has outgrown its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in all, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks go to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf util: SVG performance improvements · 611a546b
      Committed by Arjan van de Ven
      Tweak the output SVG to increase performance in SVG viewers by
      limiting the different types of font sizes and by smarter
      transformations on the text.
      
      At least with Inkscape this gives a notable performance improvement
      during zoom and scrolling.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20090920181438.3a49cb93@linux.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf util: Make the timechart SVG width dynamic · 5094b655
      Committed by Arjan van de Ven
      This patch adds a command line option for timechart that allows the
      user to specify the width of the SVG file.
      
      This patch also makes sure that each second of recording has at
      least 200 units (pixels at 96 DPI) of width.  This impacts
      recordings longer than 5 seconds; recordings shorter than 5 seconds
      will scale up to have a width of 1000 units for the whole recording
      (as before).
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20090920181416.69570c5d@linux.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
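
The default width rule described above (at least 200 units per second, never less than 1000 units in total) can be written as a small function. This is a sketch of the stated rule only: the names are hypothetical and the user-supplied width option mentioned in the commit is not modelled.

```c
#include <assert.h>

#define SVG_MIN_WIDTH_UNITS	1000.0	/* total width used for short recordings */
#define SVG_UNITS_PER_SECOND	200.0	/* minimum width per recorded second */

/* Default SVG width in units for a recording of the given length. */
double timechart_svg_width(double seconds)
{
	double width;

	assert(seconds >= 0.0);
	width = seconds * SVG_UNITS_PER_SECOND;
	return width < SVG_MIN_WIDTH_UNITS ? SVG_MIN_WIDTH_UNITS : width;
}
```

The two rules meet at exactly 5 seconds (5 x 200 = 1000), which is why only longer recordings are affected.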
    • perf timechart: Show the duration of scheduler delays in the SVG · a92fe7b3
      Committed by Arjan van de Ven
      Given that scheduler latencies are the hot thing nowadays, show the
      duration of said latencies in the SVG in text form.
      
      In addition, if the latency is more than 10 msec, pick a brighter
      yellow color as a way to point these long delays out.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20090920181353.796f4509@linux.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf timechart: Show the name of the waker/wakee in timechart · 4f1202c8
      Committed by Arjan van de Ven
      Timechart currently shows thin green lines for sending or receiving
      wakeups. This patch also prints (in a very small font) the name of
      the process that is being woken/wakes up this process.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20090920181328.68baa978@linux.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 20 Sep, 2009 2 commits
  8. 19 Sep, 2009 5 commits
  9. 18 Sep, 2009 3 commits
  10. 16 Sep, 2009 3 commits
    • perf sched: Add 'perf sched map' scheduling event map printout · 0ec04e16
      Committed by Ingo Molnar
      This prints a textual context-switching outline of a workload
      captured via perf sched record.
      
      For example, on a 16 CPU box it outputs:
      
         N1  O1  .   .   .   S1  .   .   .   B0  .  *I0  C1  .   M1  .    23002.773423 secs
         N1  O1  .  *Q0  .   S1  .   .   .   B0  .   I0  C1  .   M1  .    23002.773423 secs
         N1  O1  .   Q0  .   S1  .   .   .   B0  .  *R1  C1  .   M1  .    23002.773485 secs
         N1  O1  .   Q0  .   S1  .  *S0  .   B0  .   R1  C1  .   M1  .    23002.773478 secs
        *L0  O1  .   Q0  .   S1  .   S0  .   B0  .   R1  C1  .   M1  .    23002.773523 secs
         L0  O1  .  *.   .   S1  .   S0  .   B0  .   R1  C1  .   M1  .    23002.773531 secs
         L0  O1  .   .   .   S1  .   S0  .   B0  .   R1  C1 *T1  M1  .    23002.773547 secs T1 => irqbalance:2089
         L0  O1  .   .   .   S1  .   S0  .  *P0  .   R1  C1  T1  M1  .    23002.773549 secs
        *N1  O1  .   .   .   S1  .   S0  .   P0  .   R1  C1  T1  M1  .    23002.773566 secs
         N1  O1  .   .   .  *J0  .   S0  .   P0  .   R1  C1  T1  M1  .    23002.773571 secs
         N1  O1  .   .   .   J0  .   S0 *B0  P0  .   R1  C1  T1  M1  .    23002.773592 secs
         N1  O1  .   .   .   J0  .  *U0  B0  P0  .   R1  C1  T1  M1  .    23002.773582 secs
         N1  O1  .   .   .  *S1  .   U0  B0  P0  .   R1  C1  T1  M1  .    23002.773604 secs
         N1  O1  .   .   .   S1  .   U0  B0 *.   .   R1  C1  T1  M1  .    23002.773615 secs
         N1  O1  .   .   .   S1  .   U0  B0  .   .  *K0  C1  T1  M1  .    23002.773631 secs
         N1  O1  .  *M0  .   S1  .   U0  B0  .   .   K0  C1  T1  M1  .    23002.773624 secs
         N1  O1  .   M0  .   S1  .   U0 *.   .   .   K0  C1  T1  M1  .    23002.773644 secs
         N1  O1  .   M0  .   S1  .   U0  .   .   .  *R1  C1  T1  M1  .    23002.773662 secs
         N1  O1  .   M0  .   S1  .  *.   .   .   .   R1  C1  T1  M1  .    23002.773648 secs
         N1  O1  .  *.   .   S1  .   .   .   .   .   R1  C1  T1  M1  .    23002.773680 secs
         N1  O1  .   .   .  *L0  .   .   .   .   .   R1  C1  T1  M1  .    23002.773717 secs
        *N0  O1  .   .   .   L0  .   .   .   .   .   R1  C1  T1  M1  .    23002.773709 secs
        *N1  O1  .   .   .   L0  .   .   .   .   .   R1  C1  T1  M1  .    23002.773747 secs
      
      Columns stand for individual CPUs, from CPU0 to CPU15, and the
      two-letter shortcuts stand for tasks that are running on a CPU.
      
      '*' denotes the CPU that had the event.
      
      A dot signals an idle CPU.
      
      New tasks are assigned new two-letter shortcuts, and the mapping is
      printed when a task first occurs. In the above example 'T1' stood for
      irqbalance:
      
            T1 => irqbalance:2089
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Make idle thread and comm/pid names more consistent · 80ed0987
      Committed by Ingo Molnar
      Peter noticed that we have 3 ways of referring to the idle thread:
      
       [idle]:0
       swapper:0
       swapper-0
      
      Standardize on 'swapper:0'.
      Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Account for lost events, increase default buffering · dc02bf71
      Committed by Ingo Molnar
      Output statistics about lost events and state machine weirdness:
      
         TOTAL:                |  14974.910 ms |    46384 |
        ---------------------------------------------------
         INFO: 8.865% lost events (19132 out of 215819, in 8 chunks)
         INFO: 0.198% state machine bugs (49 out of 24708) (due to lost events?)
      
      And increase buffering to -m 1024 (4 MB) by default. Since we
      use output multiplexing that kind of space is needed.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
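
The percentages in the INFO lines and the "-m 1024 (4 MB)" figure can be checked with a few lines of arithmetic. The helper names are hypothetical, and the 4 MB figure assumes 4096-byte pages, which the commit does not state explicitly.

```c
#include <assert.h>

/* Percentage of `part` in `total`; returns 0.0 when total is 0. */
double percent_of(unsigned long part, unsigned long total)
{
	assert(part <= total);
	return total ? 100.0 * (double)part / (double)total : 0.0;
}

/* Buffer size in bytes for `pages` mmap data pages of `page_size` bytes. */
unsigned long mmap_buffer_bytes(unsigned long pages, unsigned long page_size)
{
	return pages * page_size;
}
```

For the sample output, percent_of(19132, 215819) is about 8.865 and percent_of(49, 24708) is about 0.198, matching the printed values, and 1024 pages of 4096 bytes give 4 MiB.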
  11. 14 Sep, 2009 1 commit
    • perf tools: Implement counter output multiplexing · ea57c4f5
      Committed by Ingo Molnar
      Finish the -M/--multiplex option implementation:
      
       - separate it out from group_fd
      
       - correctly set it via the ioctl and don't mmap counters that
         are multiplexed
      
       - modify the perf record event loop to deal with buffer-less
         counters.
      
       - remove the -g option from perf sched record
      
       - account for unordered events in perf sched latency
      
       - (add -f to perf sched record to ease measurements)
      
       - skip idle threads (pid==0) in latency output
      
      The result is better latency output by 'perf sched latency':
      
       -----------------------------------------------------------------------------------
        Task              |  Runtime ms | Switches | Average delay ms | Maximum delay ms |
       -----------------------------------------------------------------------------------
        ksoftirqd/8       |    0.071 ms |        2 | avg:    0.458 ms | max:    0.913 ms |
        at-spi-registry   |    0.609 ms |       19 | avg:    0.013 ms | max:    0.023 ms |
        perf              |    3.316 ms |       16 | avg:    0.013 ms | max:    0.054 ms |
        Xorg              |    0.392 ms |       19 | avg:    0.011 ms | max:    0.018 ms |
        sleep             |    0.537 ms |        2 | avg:    0.009 ms | max:    0.009 ms |
       -----------------------------------------------------------------------------------
        TOTAL:            |    4.925 ms |       58 |
       ---------------------------------------------
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  12. 13 Sep, 2009 6 commits
    • perf sched: Implement the 'perf sched record' subcommand · 1fc35b29
      Committed by Ingo Molnar
      Implement the 'perf sched record' subcommand that adds a
      default list of events, turns on raw sampling and system-wide
      tracing and passes off the rest of the command to perf record.
      
      This is more convenient than having to specify the events all
      the time.
      
      Before:
      
       $ perf record -a -R -e sched:sched_switch:r -e sched:sched_stat_wait:r -e sched:sched_stat_sleep:r -e sched:sched_stat_iowait:r -e sched:sched_process_exit:r -e sched:sched_process_fork:r -e sched:sched_wakeup:r -e sched:sched_migrate_task:r -c 1 sleep 1
      
      After:
      
       $ perf sched record -f sleep 1
      
      Also fix an assumption in the event string parser that assumed
      that strings passed in can be modified. (In this case they won't
      be as they come from a readonly constant section.)
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Clean up PID sorting logic · b5fae128
      Committed by Ingo Molnar
      Use a sort list for thread atoms insertion as well, instead of
      hardcoding the sort on PID.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Display time in milliseconds, reorganize output · d9340c1d
      Committed by Ingo Molnar
      After:
      
      -----------------------------------------------------------------------------------
       Task              |  runtime ms | switches | average delay ms | maximum delay ms |
      -----------------------------------------------------------------------------------
       migration/0       |    0.000 ms |        1 | avg:    0.047 ms | max:    0.047 ms |
       ksoftirqd/0       |    0.000 ms |        1 | avg:    0.039 ms | max:    0.039 ms |
       migration/1       |    0.000 ms |        3 | avg:    0.013 ms | max:    0.016 ms |
       migration/3       |    0.000 ms |        2 | avg:    0.003 ms | max:    0.004 ms |
       migration/4       |    0.000 ms |        1 | avg:    0.022 ms | max:    0.022 ms |
       distccd           |    0.000 ms |        1 | avg:    0.004 ms | max:    0.004 ms |
       distccd           |    0.000 ms |        1 | avg:    0.014 ms | max:    0.014 ms |
       distccd           |    0.000 ms |        2 | avg:    0.000 ms | max:    0.000 ms |
       distccd           |    0.000 ms |        2 | avg:    0.012 ms | max:    0.019 ms |
       distccd           |    0.000 ms |        1 | avg:    0.002 ms | max:    0.002 ms |
       as                |    0.000 ms |        2 | avg:    0.019 ms | max:    0.019 ms |
       as                |    0.000 ms |        3 | avg:    0.015 ms | max:    0.017 ms |
       as                |    0.000 ms |        1 | avg:    0.009 ms | max:    0.009 ms |
       perf              |    0.000 ms |        1 | avg:    0.001 ms | max:    0.001 ms |
       gcc               |    0.000 ms |        1 | avg:    0.021 ms | max:    0.021 ms |
       run-mozilla.sh    |    0.000 ms |        2 | avg:    0.010 ms | max:    0.017 ms |
       mozilla-plugin-   |    0.000 ms |        1 | avg:    0.006 ms | max:    0.006 ms |
       gcc               |    0.000 ms |        2 | avg:    0.013 ms | max:    0.013 ms |
      -----------------------------------------------------------------------------------
      
      (The runtime ms column is not filled in yet.)
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Fix bad event alignment · 46538818
      Committed by Frederic Weisbecker
      perf sched raises the following error when it meets a sched
      switch event:
      
      perf: builtin-sched.c:286: register_pid: Assertion `!(pid >= 65536)' failed.
      Abandon
      
      Currently in x86-64, the sched switch events have a hole in the
      middle of the structure:
      
      	u16 common_type;
      	u8 common_flags;
      	u8 common_preempt_count;
      	u32 common_pid;
      	u32 common_tgid;
      
      	char prev_comm[16];
      	u32 prev_pid;
      	u32 prev_prio;
      			<--- there
      	u64 prev_state;
      	char next_comm[16];
      	u32 next_pid;
      	u32 next_prio;
      
      GCC inserts a 4-byte hole there so that prev_state is u64
      aligned, and the events are exported to userspace with this
      hole.
      
      But in userspace, from perf sched, we fetch it using a
      structure that has a new field at the beginning: u32 size. This
      is because our trace is exported with its size as a field. But
      now that we have this new field, the hole in the middle
      disappears because prev_state ends up naturally aligned.
      
      And since we are using a pointer to the raw trace using this
      struct, instead of reading prev_state, we are reading the hole.
      
      We could fix it by keeping the size separate from the struct,
      but there are actually a lot of other potential problems: some
      fields may be saved as long on a 64-bit system and later read
      as long on a 32-bit system. Also, this direct cast doesn't
      account for endianness differences between the traced host
      machine and the machine on which we do the post-processing.
      
      So instead of using such dangerous direct casts, fetch the
      values using the trace parsing API that already takes care of
      all these problems.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
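
The padding effect described in this commit can be reproduced with offsetof. The sketch below copies the field list from the commit message into two hypothetical structs, one without and one with the leading u32 size, and measures the gap in front of prev_state. The numbers in the comments assume an ABI where uint64_t is 8-byte aligned, as on x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as exported by the kernel (fields from the commit message). */
struct sched_switch_kernel {
	uint16_t common_type;
	uint8_t  common_flags;
	uint8_t  common_preempt_count;
	uint32_t common_pid;
	uint32_t common_tgid;
	char     prev_comm[16];
	uint32_t prev_pid;
	uint32_t prev_prio;
	uint64_t prev_state;	/* 4 bytes of padding precede this on x86-64 */
	char     next_comm[16];
	uint32_t next_pid;
	uint32_t next_prio;
};

/* Same fields behind a leading u32 size, like the old userspace struct. */
struct sched_switch_user {
	uint32_t size;
	uint16_t common_type;
	uint8_t  common_flags;
	uint8_t  common_preempt_count;
	uint32_t common_pid;
	uint32_t common_tgid;
	char     prev_comm[16];
	uint32_t prev_pid;
	uint32_t prev_prio;
	uint64_t prev_state;	/* already aligned, so no padding is inserted */
	char     next_comm[16];
	uint32_t next_pid;
	uint32_t next_prio;
};

/* Padding bytes between the end of prev_prio and prev_state. */
size_t kernel_hole(void)
{
	size_t end_prio = offsetof(struct sched_switch_kernel, prev_prio) + sizeof(uint32_t);
	size_t state = offsetof(struct sched_switch_kernel, prev_state);

	assert(state >= end_prio);
	return state - end_prio;
}

size_t user_hole(void)
{
	size_t end_prio = offsetof(struct sched_switch_user, prev_prio) + sizeof(uint32_t);
	size_t state = offsetof(struct sched_switch_user, prev_state);

	assert(state >= end_prio);
	return state - end_prio;
}
```

Under that assumption kernel_hole() is 4 and user_hole() is 0, so overlaying the size-prefixed struct on the raw record makes prev_state land on the padding bytes. That is the misread the commit describes.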
    • perf tools: Allow the specification of all tracepoints at once · bcd3279f
      Committed by Frederic Weisbecker
      Currently, when one wants to activate every tracepoint
      counter of a subsystem from perf record, the following sequence
      is needed:
      
        perf record -e subsys:ev1 -e subsys:ev2 -e subsys:ev3
      
      This may annoy the most patient of us.
      
      Now we can just do:
      
        perf record -e subsys:*
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Import schedbench.c · ec156764
      Committed by Ingo Molnar
      Import the schedbench.c tool that I wrote some time ago to
      simulate scheduler behavior but never finished. It's a good
      basis for perf sched nevertheless.
      
      Most of its guts are not hooked up to the perf event loop
      yet - that will be done in the patches to come.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  13. 05 Sep, 2009 1 commit
  14. 03 Sep, 2009 3 commits
    • perf trace: Fix read_string() · 6f4596d9
      Committed by Ingo Molnar
      We did not account for the terminating \0. Depending on what malloc()
      gave us this resulted in corrupted version string printouts.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
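
The class of bug fixed here, forgetting room for the terminating '\0' when copying a string out of raw data, can be illustrated with a small helper. This is not the actual read_string(), which reads from the input file. The function below is a hypothetical in-memory variant showing the len + 1 allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy the NUL-terminated string at the start of buf (at most buflen bytes)
 * into freshly allocated memory. Returns NULL if no terminator is found or
 * the allocation fails. The caller frees the result.
 */
char *dup_string_from_buffer(const char *buf, size_t buflen)
{
	const char *end;
	size_t len;
	char *str;

	assert(buf != NULL);
	end = memchr(buf, '\0', buflen);
	if (end == NULL)
		return NULL;

	len = (size_t)(end - buf);
	str = malloc(len + 1);		/* +1 for the terminating '\0' */
	if (str == NULL)
		return NULL;
	memcpy(str, buf, len + 1);	/* copies the terminator as well */
	return str;
}
```

Allocating only len bytes would leave the result unterminated, and whatever malloc() placed after the buffer would show up in the printout.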
    • perf trace: Print out in nanoseconds · 00fc9786
      Committed by Ingo Molnar
      Print out more accurate timestamps - usecs does not cut it
      anymore on fast enough boxes ;-)
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Seek to the end of the header area · 2e01d179
      Committed by Ingo Molnar
      Leave the input fd at the data area.
      
      It does not matter right now - but seeking at the end of it
      certainly did not make sense.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 02 Sep, 2009 1 commit
    • perf tools: Work around strict aliasing related warnings · 65014ab3
      Committed by Ingo Molnar
      Older versions of GCC are rather stupid about strict aliasing:
      
        util/trace-event-parse.c: In function 'parse_cmdlines':
        util/trace-event-parse.c:93: warning: dereferencing type-punned pointer will break strict-aliasing rules
        util/trace-event-parse.c: In function 'parse_proc_kallsyms':
        util/trace-event-parse.c:155: warning: dereferencing type-punned pointer will break strict-aliasing rules
        util/trace-event-parse.c:157: warning: dereferencing type-punned pointer will break strict-aliasing rules
        util/trace-event-parse.c:158: warning: dereferencing type-punned pointer will break strict-aliasing rules
        util/trace-event-parse.c: In function 'parse_ftrace_printk':
        util/trace-event-parse.c:294: warning: dereferencing type-punned pointer will break strict-aliasing rules
        util/trace-event-parse.c:295: warning: dereferencing type-punned pointer will break strict-aliasing rules
        make: *** [util/trace-event-parse.o] Error 1
      
      Make it clear to GCC what we intend with those pointers, by passing
      them through an explicit (void *) cast.
      
      We might want to add -fno-strict-aliasing as well, like the kernel
      itself does.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
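
For comparison, a common way to avoid type-punned dereferences altogether is to copy the bytes with memcpy. Note that this is a different technique from the explicit (void *) cast this commit applies. The helper name is hypothetical, and the value is read in host byte order.

```c
#include <assert.h>
#include <string.h>

/* Read an unsigned long long from raw, possibly unaligned bytes without type punning. */
unsigned long long load_u64(const void *bytes)
{
	unsigned long long value;

	assert(bytes != NULL);
	memcpy(&value, bytes, sizeof(value));	/* well-defined for any source type */
	return value;
}
```

Compilers typically turn such a fixed-size memcpy into a single load, so the approach usually costs nothing at run time.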
  16. 31 Aug, 2009 1 commit
    • perf tools: Complete support for dynamic strings · 561f732c
      Committed by Frederic Weisbecker
      Complete support for __str_loc type strings of ftrace events,
      which have dynamic offset values set for each of them inside
      their samples.
      
      Before:
              geany-5759  [000]     0.000000: lock_release: name
              geany-5759  [000]     0.000000: lock_release: name
              geany-5759  [000]     0.000000: lock_release: name
        kondemand/0-362   [000]     0.000000: lock_release: name
            pdflush-421   [000]     0.000000: lock_release: name
      
      After:
              geany-5759  [000]     0.000000: lock_release: &u->lock
              geany-5759  [000]     0.000000: lock_release: key
              geany-5759  [000]     0.000000: lock_release: &group->notification_mutex
        kondemand/0-362   [000]     0.000000: lock_release: &rq->lock
            pdflush-421   [000]     0.000000: lock_release: &rq->lock
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1251693921-6579-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
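
Resolving a dynamic string means reading a per-sample location word and following it into the record. The sketch below assumes that word is a 32-bit value in host byte order, with the string's offset from the start of the record in the low 16 bits and its length in the high 16 bits. That encoding and the function name are assumptions, not taken from this commit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Return a pointer to the dynamic string referenced by the location word
 * stored at field_offset, or NULL if it would fall outside the record.
 */
const char *dynamic_string(const unsigned char *record, size_t record_len,
			   size_t field_offset)
{
	uint32_t loc;
	size_t str_offset;

	assert(record != NULL);
	if (field_offset + sizeof(loc) > record_len)
		return NULL;
	memcpy(&loc, record + field_offset, sizeof(loc));

	str_offset = loc & 0xffff;	/* assumed: low 16 bits hold the offset */
	if (str_offset >= record_len)
		return NULL;
	/* require a terminator inside the record before handing the pointer out */
	if (memchr(record + str_offset, '\0', record_len - str_offset) == NULL)
		return NULL;
	return (const char *)(record + str_offset);
}
```

Without this step a printer only has the field's name to show, which matches the "Before" output above, where every line ends in "name".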