1. 13 Sep, 2009 21 commits
    • perf sched: Add 'perf sched trace', improve documentation · c13f0d3c
      Ingo Molnar authored
      Alias 'perf sched trace' to 'perf trace', for workflow completeness.
      
      Add a bit of documentation for perf sched.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Implement the 'perf sched record' subcommand · 1fc35b29
      Ingo Molnar authored
      Implement the 'perf sched record' subcommand that adds a
      default list of events, turns on raw sampling and system-wide
      tracing and passes off the rest of the command to perf record.
      
      This is more convenient than having to specify the events all
      the time.
      
      Before:
      
       $ perf record -a -R -e sched:sched_switch:r -e sched:sched_stat_wait:r -e sched:sched_stat_sleep:r -e sched:sched_stat_iowait:r -e sched:sched_process_exit:r -e sched:sched_process_fork:r -e sched:sched_wakeup:r -e sched:sched_migrate_task:r -c 1 sleep 1
      
      After:
      
       $ perf sched record -f sleep 1
      
      Also fix the event string parser, which assumed that the
      strings passed in can be modified. (In this case they won't
      be, as they come from a read-only constant section.)
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Clean up PID sorting logic · b5fae128
      Ingo Molnar authored
      Use a sort list for thread atoms insertion as well - instead of
      hardcoded for PID.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Finish latency => atom rename and misc cleanups · b1ffe8f3
      Ingo Molnar authored
      - Rename 'latency' field/variable names to the better 'atom' ones
      
       - Reduce the number of #include lines and consolidate them
      
       - Gather file scope variables at the top of the file
      
       - Remove unused bits
      
      No change in functionality.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Add 'perf sched latency' and 'perf sched replay' · f2858d8a
      Ingo Molnar authored
      Separate the option parsing cleanly and add two variants:
      
       - 'perf sched latency' (can be abbreviated via 'perf sched lat')
       - 'perf sched replay'  (can be abbreviated via 'perf sched rep')
      
      Also add a repeat count option to replay and add a separate
      set of options for replay.
      
      Do the sorting setup only in the latency sub-command.
      
      Display separate help screens for 'perf sched' and
      'perf sched replay -h' - i.e. further separation of the
      sub-commands.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Implement multidimensional sorting · daa1d7a5
      Frederic Weisbecker authored
      Implement multidimensional sorting in perf sched so that
      you can sort by number of switches, average latency,
      maximum latency, or runtime.
      
      perf sched -l -s avg,max  (this is the default)
      
      -----------------------------------------------------------------------------------
       Task              |  Runtime ms | Switches | Average delay ms | Maximum delay ms |
      -----------------------------------------------------------------------------------
       gnome-power-man   |    0.113 ms |        1 | avg: 4998.531 ms | max: 4998.531 ms |
       xfdesktop         |    1.190 ms |        7 | avg:  136.475 ms | max:  940.933 ms |
       xfce-mcs-manage   |    2.194 ms |       22 | avg:   38.534 ms | max:  735.174 ms |
       notification-da   |    2.749 ms |       31 | avg:   27.436 ms | max:  731.791 ms |
       xfce4-session     |    3.343 ms |       28 | avg:   26.796 ms | max:  734.891 ms |
       xfwm4             |    3.159 ms |       22 | avg:   12.406 ms | max:  241.333 ms |
       xchat             |   42.789 ms |      214 | avg:   11.886 ms | max:  100.349 ms |
       xfce4-terminal    |    5.386 ms |       22 | avg:   11.414 ms | max:  241.611 ms |
       firefox           |  151.992 ms |      123 | avg:    9.543 ms | max:  153.717 ms |
       xfce4-panel       |   24.324 ms |       47 | avg:    8.189 ms | max:  242.352 ms |
       :5090             |    6.932 ms |      111 | avg:    8.131 ms | max:  102.665 ms |
       events/0          |    0.758 ms |       12 | avg:    1.964 ms | max:   21.879 ms |
       Xorg              |  280.558 ms |      340 | avg:    1.864 ms | max:   99.526 ms |
       geany             |   63.391 ms |      295 | avg:    1.099 ms | max:    9.334 ms |
       reiserfs/0        |    0.039 ms |        2 | avg:    0.854 ms | max:    1.487 ms |
       kondemand/0       |    8.251 ms |      245 | avg:    0.691 ms | max:   34.372 ms |
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
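The idea of sorting on an ordered list of keys (as in `-s avg,max`) can be sketched as below. This is only an illustration: it uses `qsort()` on a plain array rather than whatever container perf sched uses internally, and `struct task_stats`, `cmp_avg`, `cmp_max`, `sort_keys` and `sort_tasks` are hypothetical names, not identifiers from builtin-sched.c. The descending order is an assumption taken from the sample output above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-task summary; field names are illustrative only. */
struct task_stats {
	const char *comm;
	double avg_ms;
	double max_ms;
};

typedef int (*sort_fn)(const struct task_stats *a, const struct task_stats *b);

/* Descending order: the task with the larger value sorts first. */
static int cmp_avg(const struct task_stats *a, const struct task_stats *b)
{
	return (a->avg_ms < b->avg_ms) - (a->avg_ms > b->avg_ms);
}

static int cmp_max(const struct task_stats *a, const struct task_stats *b)
{
	return (a->max_ms < b->max_ms) - (a->max_ms > b->max_ms);
}

/* Ordered key list, mirroring "-s avg,max". */
static const sort_fn sort_keys[] = { cmp_avg, cmp_max };

/* Try each key in order; the first key that differs decides. */
static int cmp_tasks(const void *pa, const void *pb)
{
	size_t i;

	for (i = 0; i < sizeof(sort_keys) / sizeof(sort_keys[0]); i++) {
		int ret = sort_keys[i](pa, pb);

		if (ret)
			return ret;
	}
	return 0;
}

void sort_tasks(struct task_stats *tasks, size_t nr)
{
	qsort(tasks, nr, sizeof(*tasks), cmp_tasks);
}
```

Calling `sort_tasks(tasks, nr)` orders the array by average delay first and breaks ties on maximum delay; adding a dimension means appending another comparator to `sort_keys`.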
    • perf sched: Fix nsec to msec conversion · 73622626
      Frederic Weisbecker authored
      We are dividing a time in ns by 1e9. This is a nsec to sec
      conversion. What we want is msecs. Fix it by dividing by 1e6.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
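For reference, the unit arithmetic behind this fix is small enough to show directly. The helper name `nsec_to_msec` is made up for this sketch and is not claimed to exist in the perf sources.

```c
#include <stdint.h>

/*
 * Hypothetical helper: 1 ms = 1e6 ns, so dividing by 1e6 yields
 * milliseconds. Dividing by 1e9 would yield seconds instead.
 */
double nsec_to_msec(uint64_t nsecs)
{
	return (double)nsecs / 1e6;
}
```

With this, a 2,000,000 ns delay is reported as 2.0 ms rather than 0.002.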
    • perf sched: Export the total, max latency and total runtime to thread atoms list · 66685678
      Frederic Weisbecker authored
      Add a field in the thread atom list that keeps track of the
      total and max latencies and also the total runtime. This makes
      the output faster and also prepares for sorting.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Add involuntarily sleeping task in work atoms · c6ced611
      Frederic Weisbecker authored
      Currently in perf sched, we are measuring the scheduler wakeup
      latencies.
      
      Now we also want to measure the time a task waits to be scheduled
      after it gets preempted.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Rename struct lat_snapshot to struct work atoms · 17562205
      Frederic Weisbecker authored
      To measure the latencies, we capture the sched atoms data into
      a specific structure named struct lat_snapshot.

      As this structure can be used for other purposes of scheduler
      profiling and mirrors what happens in a thread work atom, let's
      rename it to struct work_atom and propagate this renaming to
      other function and structure names to keep it coherent.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Output runtime and context switch totals · 3e304147
      Ingo Molnar authored
      After:
      
      -----------------------------------------------------------------------------------
       Task              |  Runtime ms | Switches | Average delay ms | Maximum delay ms |
      -----------------------------------------------------------------------------------
       make              |    0.678 ms |       13 | avg:    0.018 ms | max:    0.050 ms |
       gcc               |    0.014 ms |        2 | avg:    0.320 ms | max:    0.627 ms |
       gcc               |    0.000 ms |        2 | avg:    0.185 ms | max:    0.369 ms |
      ...
      -----------------------------------------------------------------------------------
       TOTAL:            |   21.316 ms |       63 |
      ---------------------------------------------
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Add runtime stats · ea92ed5a
      Ingo Molnar authored
      Extend the latency tracking structure with scheduling atom
      runtime info - and sum it up during per task display.
      
      (Also clean up a few details.)
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Display time in milliseconds, reorganize output · d9340c1d
      Ingo Molnar authored
      After:
      
      -----------------------------------------------------------------------------------
       Task              |  runtime ms | switches | average delay ms | maximum delay ms |
      -----------------------------------------------------------------------------------
       migration/0       |    0.000 ms |        1 | avg:    0.047 ms | max:    0.047 ms |
       ksoftirqd/0       |    0.000 ms |        1 | avg:    0.039 ms | max:    0.039 ms |
       migration/1       |    0.000 ms |        3 | avg:    0.013 ms | max:    0.016 ms |
       migration/3       |    0.000 ms |        2 | avg:    0.003 ms | max:    0.004 ms |
       migration/4       |    0.000 ms |        1 | avg:    0.022 ms | max:    0.022 ms |
       distccd           |    0.000 ms |        1 | avg:    0.004 ms | max:    0.004 ms |
       distccd           |    0.000 ms |        1 | avg:    0.014 ms | max:    0.014 ms |
       distccd           |    0.000 ms |        2 | avg:    0.000 ms | max:    0.000 ms |
       distccd           |    0.000 ms |        2 | avg:    0.012 ms | max:    0.019 ms |
       distccd           |    0.000 ms |        1 | avg:    0.002 ms | max:    0.002 ms |
       as                |    0.000 ms |        2 | avg:    0.019 ms | max:    0.019 ms |
       as                |    0.000 ms |        3 | avg:    0.015 ms | max:    0.017 ms |
       as                |    0.000 ms |        1 | avg:    0.009 ms | max:    0.009 ms |
       perf              |    0.000 ms |        1 | avg:    0.001 ms | max:    0.001 ms |
       gcc               |    0.000 ms |        1 | avg:    0.021 ms | max:    0.021 ms |
       run-mozilla.sh    |    0.000 ms |        2 | avg:    0.010 ms | max:    0.017 ms |
       mozilla-plugin-   |    0.000 ms |        1 | avg:    0.006 ms | max:    0.006 ms |
       gcc               |    0.000 ms |        2 | avg:    0.013 ms | max:    0.013 ms |
      -----------------------------------------------------------------------------------
      
      (The runtime ms column is not filled in yet.)
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Clean up latency and replay sub-commands · 46f392c9
      Ingo Molnar authored
      - Separate the latency and the replay commands more cleanly
      
       - Use consistent naming
      
       - Display help page on 'perf sched' outlining commands,
         instead of aborting
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Add sched latency profiling · cdce9d73
      Frederic Weisbecker authored
      Add the -l --latency option that reports statistics about the
      scheduler latencies.
      
      For now, the latencies are measured in the following sequence
      scope:
      
      - task A is sleeping (D or S state)
      - task B wakes up A
               ^
               |
               |
      
         latency timeframe
      
               |
               |
               v
      - task A is scheduled in
      
      Start by recording all scheduler events:
      
      	perf record -e sched:*
      
      and then fetch the results:
      
      	perf sched -l
      
       Tasks                     count          total              avg            max
      
      migration/0                  2             39849            19924           28826
      ksoftirqd/0                  7            756383           108054          373014
      migration/1                  5             45391             9078           10452
      ksoftirqd/1                  2            399055           199527          359130
      events/0                     8           4780110           597513         4500250
      events/1                     9           6353057           705895         2986012
      kblockd/0                   42          37805097           900121         5077684
      
      The snapshots are in nanoseconds.
      
      - Count: number of snapshots taken for the given task
      - Total: total latencies in nanosec
      - Avg  : average of latency between wake up and sched in
      - Max  : max snapshot latency
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
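The count/total/avg/max columns described in this commit can be derived from one (wakeup, sched-in) timestamp pair per snapshot. The following is a minimal sketch under that assumption; `struct lat_stats`, `lat_record` and `lat_avg_ns` are hypothetical names chosen for illustration, not the structures used by perf sched.

```c
#include <stdint.h>

/* Hypothetical per-task accumulator; all times are in nanoseconds. */
struct lat_stats {
	uint64_t count;		/* number of snapshots taken */
	uint64_t total_ns;	/* sum of all latencies */
	uint64_t max_ns;	/* largest single latency */
};

/* One snapshot: the task was woken at wakeup_ns and scheduled in at sched_in_ns. */
void lat_record(struct lat_stats *s, uint64_t wakeup_ns, uint64_t sched_in_ns)
{
	uint64_t delta = sched_in_ns - wakeup_ns;

	s->count++;
	s->total_ns += delta;
	if (delta > s->max_ns)
		s->max_ns = delta;
}

/* Average latency between wakeup and sched-in; 0 if nothing was recorded. */
uint64_t lat_avg_ns(const struct lat_stats *s)
{
	return s->count ? s->total_ns / s->count : 0;
}
```

Feeding each wakeup/sched-in pair of a task through `lat_record()` gives the four numbers needed for one row of the table.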
    • perf sched: Make it easier to plug in new sub profilers · 419ab0d6
      Frederic Weisbecker authored
      Create a sched event structure of handlers in which the various
      sched event readers can plug their own callbacks.

      This makes it easier to add new perf sched sub-commands.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
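A structure of handler callbacks is a common C pattern, and a rough sketch of it may help here. Every name below (`struct sched_handlers`, `struct sched_event`, `dispatch_event`, `counting_ops`, and so on) is invented for this example and is not taken from the actual patch.

```c
#include <stddef.h>

enum sched_event_type { SCHED_EVENT_SWITCH, SCHED_EVENT_WAKEUP };

/* Hypothetical, heavily simplified event record. */
struct sched_event {
	enum sched_event_type type;
	int pid;
};

/* Each sub-profiler fills in only the callbacks it cares about. */
struct sched_handlers {
	void (*switch_event)(const struct sched_event *ev);
	void (*wakeup_event)(const struct sched_event *ev);
};

/* Example sub-profiler that only counts events. */
int nr_switches;
int nr_wakeups;

static void count_switch(const struct sched_event *ev)
{
	(void)ev;
	nr_switches++;
}

static void count_wakeup(const struct sched_event *ev)
{
	(void)ev;
	nr_wakeups++;
}

const struct sched_handlers counting_ops = {
	.switch_event = count_switch,
	.wakeup_event = count_wakeup,
};

/* The reader does not know which profiler is active; it only dispatches. */
void dispatch_event(const struct sched_handlers *ops, const struct sched_event *ev)
{
	switch (ev->type) {
	case SCHED_EVENT_SWITCH:
		if (ops->switch_event)
			ops->switch_event(ev);
		break;
	case SCHED_EVENT_WAKEUP:
		if (ops->wakeup_event)
			ops->wakeup_event(ev);
		break;
	}
}

void dispatch_all(const struct sched_handlers *ops,
		  const struct sched_event *events, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		dispatch_event(ops, &events[i]);
}
```

A new sub-command would then provide its own `struct sched_handlers` instance and reuse the same event reading loop.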
    • perf sched: Fix bad event alignment · 46538818
      Frederic Weisbecker authored
      perf sched raises the following error when it meets a sched
      switch event:
      
      perf: builtin-sched.c:286: register_pid: Assertion `!(pid >= 65536)' failed.
      Abandon
      
      Currently in x86-64, the sched switch events have a hole in the
      middle of the structure:
      
      	u16 common_type;
      	u8 common_flags;
      	u8 common_preempt_count;
      	u32 common_pid;
      	u32 common_tgid;
      
      	char prev_comm[16];
      	u32 prev_pid;
      	u32 prev_prio;
      			<--- there
      	u64 prev_state;
      	char next_comm[16];
      	u32 next_pid;
      	u32 next_prio;
      
      Gcc inserts a 4-byte hole there so that prev_state is u64
      aligned. And the events are exported to userspace with this
      hole.

      But in userspace, from perf sched, we fetch it using a
      structure that has a new field at the beginning: u32 size. This
      is because our trace is exported with its size as a field. But
      now that we have this new field, the hole in the middle
      disappears because prev_state becomes naturally aligned.

      And since we access the raw trace through a pointer to this
      struct, instead of reading prev_state, we are reading the hole.

      We could fix it by keeping the size separate from the struct,
      but actually there are a lot of other potential problems: some
      fields may be saved as long on a 64-bit system and later read
      as long on a 32-bit system. Also this direct cast doesn't care
      about the endianness differences between the traced host
      machine and the machine on which we do the post-processing.
      
      So instead of using such dangerous direct casts, fetch the
      values using the trace parsing API that already takes care of
      all these problems.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
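The layout mismatch described in this commit can be reproduced with `offsetof()`. The sketch below assumes an ABI such as x86-64, where `uint64_t` is 8-byte aligned; the struct names and the two helper functions are hypothetical, and only the field list is taken from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Field layout as listed in the commit message (kernel side). */
struct sched_switch_kernel {
	uint16_t common_type;
	uint8_t  common_flags;
	uint8_t  common_preempt_count;
	uint32_t common_pid;
	uint32_t common_tgid;
	char     prev_comm[16];
	uint32_t prev_pid;
	uint32_t prev_prio;
	uint64_t prev_state;	/* the compiler pads before this field */
	char     next_comm[16];
	uint32_t next_pid;
	uint32_t next_prio;
};

/* Hypothetical userspace overlay: the same fields behind a leading u32 size. */
struct sched_switch_sized {
	uint32_t size;
	uint16_t common_type;
	uint8_t  common_flags;
	uint8_t  common_preempt_count;
	uint32_t common_pid;
	uint32_t common_tgid;
	char     prev_comm[16];
	uint32_t prev_pid;
	uint32_t prev_prio;
	uint64_t prev_state;	/* already aligned here, so no padding */
	char     next_comm[16];
	uint32_t next_pid;
	uint32_t next_prio;
};

/* Offset of prev_state from the start of the event, as the kernel wrote it. */
size_t kernel_prev_state_offset(void)
{
	return offsetof(struct sched_switch_kernel, prev_state);
}

/* Offset of prev_state from the start of the event, as the overlay reads it. */
size_t overlay_prev_state_offset(void)
{
	return offsetof(struct sched_switch_sized, prev_state) - sizeof(uint32_t);
}
```

On such an ABI the two functions disagree by exactly the 4 bytes of padding, so a cast through the overlay lands on the hole instead of on `prev_state`.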
    • perf sched: Tighten up the code · ad236fd2
      Ingo Molnar authored
      Various small cleanups - removal of debug printks and dead
      functions, etc.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Implement the scheduling workload replay engine · fbf94829
      Ingo Molnar authored
      Integrate the schedbench.c bits with the raw trace events
      that we get from the perf machinery, and activate the
      workload replayer/simulator.
      
      Example of a captured 'make -j' workload:
      
      $ perf sched
      
        run measurement overhead: 90 nsecs
        sleep measurement overhead: 2724743 nsecs
        the run test took 1000081 nsecs
        the sleep test took 2981111 nsecs
        version = 0.5
        ...
        nr_run_events:        70
        nr_sleep_events:      66
        nr_wakeup_events:     9
        target-less wakeups:  71
        multi-target wakeups: 47
        run events optimized: 139
        task      0 (                perf:      6607), nr_events: 2
        task      1 (                perf:      6608), nr_events: 6
        task      2 (                    :         0), nr_events: 1
        task      3 (                make:      6609), nr_events: 5
        task      4 (                  sh:      6610), nr_events: 4
        task      5 (                make:      6611), nr_events: 6
        task      6 (                  sh:      6612), nr_events: 4
        task      7 (                make:      6613), nr_events: 5
        task      8 (        migration/11:        25), nr_events: 1
        task      9 (        migration/13:        29), nr_events: 1
        task     10 (        migration/15:        33), nr_events: 1
        task     11 (         migration/9:        21), nr_events: 1
        task     12 (                  sh:      6614), nr_events: 4
        task     13 (                make:      6615), nr_events: 5
        task     14 (                  sh:      6616), nr_events: 4
        task     15 (                make:      6617), nr_events: 7
        task     16 (         migration/3:         9), nr_events: 1
        task     17 (         migration/5:        13), nr_events: 1
        task     18 (         migration/7:        17), nr_events: 1
        task     19 (         migration/1:         5), nr_events: 1
        task     20 (                  sh:      6618), nr_events: 4
        task     21 (                make:      6619), nr_events: 5
        task     22 (                  sh:      6620), nr_events: 4
        task     23 (                make:      6621), nr_events: 10
        task     24 (                  sh:      6623), nr_events: 3
        task     25 (                 gcc:      6624), nr_events: 4
        task     26 (                 gcc:      6625), nr_events: 4
        task     27 (                 gcc:      6626), nr_events: 5
        task     28 (            collect2:      6627), nr_events: 5
        task     29 (                  sh:      6622), nr_events: 1
        task     30 (                make:      6628), nr_events: 7
        task     31 (                  sh:      6630), nr_events: 4
        task     32 (                 gcc:      6631), nr_events: 4
        task     33 (                  sh:      6629), nr_events: 1
        task     34 (                 gcc:      6632), nr_events: 4
        task     35 (                 gcc:      6633), nr_events: 4
        task     36 (            collect2:      6634), nr_events: 4
        task     37 (                make:      6635), nr_events: 8
        task     38 (                  sh:      6637), nr_events: 4
        task     39 (                  sh:      6636), nr_events: 1
        task     40 (                 gcc:      6638), nr_events: 4
        task     41 (                 gcc:      6639), nr_events: 4
        task     42 (                 gcc:      6640), nr_events: 4
        task     43 (            collect2:      6641), nr_events: 4
        task     44 (                make:      6642), nr_events: 6
        task     45 (                  sh:      6643), nr_events: 5
        task     46 (                  sh:      6644), nr_events: 3
        task     47 (                  sh:      6645), nr_events: 4
        task     48 (                make:      6646), nr_events: 6
        task     49 (                  sh:      6647), nr_events: 3
        task     50 (                make:      6648), nr_events: 5
        task     51 (                  sh:      6649), nr_events: 5
        task     52 (                  sh:      6650), nr_events: 6
        task     53 (                make:      6651), nr_events: 4
        task     54 (                make:      6652), nr_events: 5
        task     55 (                make:      6653), nr_events: 4
        task     56 (                make:      6654), nr_events: 4
        task     57 (                make:      6655), nr_events: 5
        task     58 (                  sh:      6656), nr_events: 4
        task     59 (                 gcc:      6657), nr_events: 9
        task     60 (         ksoftirqd/3:        10), nr_events: 1
        task     61 (                 gcc:      6658), nr_events: 4
        task     62 (                make:      6659), nr_events: 5
        task     63 (                  sh:      6660), nr_events: 3
        task     64 (                 gcc:      6661), nr_events: 5
        task     65 (            collect2:      6662), nr_events: 4
        ------------------------------------------------------------
        #1  : 256.745, ravg: 256.74, cpu: 0.00 / 0.00
        #2  : 439.372, ravg: 275.01, cpu: 0.00 / 0.00
        #3  : 411.971, ravg: 288.70, cpu: 0.00 / 0.00
        #4  : 385.500, ravg: 298.38, cpu: 0.00 / 0.00
        #5  : 366.526, ravg: 305.20, cpu: 0.00 / 0.00
        #6  : 381.281, ravg: 312.81, cpu: 0.00 / 0.00
        #7  : 410.756, ravg: 322.60, cpu: 0.00 / 0.00
        #8  : 368.009, ravg: 327.14, cpu: 0.00 / 0.00
        #9  : 408.098, ravg: 335.24, cpu: 0.00 / 0.00
        #10 : 368.582, ravg: 338.57, cpu: 0.00 / 0.00
      
      I.e. we successfully analyzed the trace, replayed it
      via real threads and measured the replayed workload's
      scheduling properties.
      
      This is how it looked in 'top' output:
      
         PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
        7164 mingo     20   0 1434m 8080  888 R 57.0  0.1   0:02.04 :perf
        7165 mingo     20   0 1434m 8080  888 R 41.8  0.1   0:01.52 :perf
        7228 mingo     20   0 1434m 8080  888 R 39.8  0.1   0:01.44 :gcc
        7225 mingo     20   0 1434m 8080  888 R 33.8  0.1   0:01.26 :gcc
        7202 mingo     20   0 1434m 8080  888 R 31.2  0.1   0:01.16 :sh
        7222 mingo     20   0 1434m 8080  888 R 25.2  0.1   0:00.96 :sh
        7211 mingo     20   0 1434m 8080  888 R 21.9  0.1   0:00.82 :sh
        7213 mingo     20   0 1434m 8080  888 D 19.2  0.1   0:00.74 :sh
        7194 mingo     20   0 1434m 8080  888 D 18.6  0.1   0:00.72 :make
      
      There are still various kinks in it - more patches to come.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Import schedbench.c · ec156764
      Ingo Molnar authored
      Import the schedbench.c tool that I wrote some time ago to
      simulate scheduler behavior but never finished. It's a good
      basis for perf sched nevertheless.
      
      Most of its guts are not hooked up to the perf event loop
      yet - that will be done in the patches to come.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Add 'perf sched' tool · 0a02ad93
      Ingo Molnar authored
      This turn-key tool allows scheduler measurements to be
      conducted and the results be displayed numerically.
      
      First baby step towards that goal: clone the new command off of
      perf trace.
      
      Fix a few other details along the way:
      
       - add (minimal) perf trace documentation
      
       - reorder a few places
      
       - list perf trace in the mainporcelain list as well
         as it's a very useful utility.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 03 Sep, 2009 3 commits
    • perf trace: Fix parsing of perf.data · 8886f42d
      Ingo Molnar authored
      We started parsing perf.data at head 0. This caused -D to
      segfault and it could possibly also cause incorrect trace
      entries to be displayed.
      
      Parse it at data_offset instead.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf trace: Sample timestamps as well · 6ddf259d
      Ingo Molnar authored
      Before:
      
                  perf-21082 [013]     0.000000: sched_wakeup_new: task perf:21083 [120] success=1 [015]
                  perf-21082 [013]     0.000000: sched_migrate_task: task perf:21082 [120] from: 13  to: 15
                  perf-21082 [013]     0.000000: sched_process_fork: parent perf:21082  child perf:21083
                  true-21083 [015]     0.000000: sched_wakeup: task migration/15:33 [0] success=1 [015]
                  perf-21082 [013]     0.000000: sched_switch: task perf:21082 [120] (S) ==> swapper:0 [140]
                  true-21083 [015]     0.000000: sched_switch: task perf:21083 [120] (R) ==> migration/15:33 [0]
                  true-21083 [011]     0.000000: sched_process_exit: task true:21083 [120]
      
      After:
      
                  perf-21082 [013] 14674.797613: sched_wakeup_new: task perf:21083 [120] success=1 [015]
                  perf-21082 [013] 14674.797506: sched_migrate_task: task perf:21082 [120] from: 13  to: 15
                  perf-21082 [013] 14674.797610: sched_process_fork: parent perf:21082  child perf:21083
                  true-21083 [015] 14674.797725: sched_wakeup: task migration/15:33 [0] success=1 [015]
                  perf-21082 [013] 14674.797722: sched_switch: task perf:21082 [120] (S) ==> swapper:0 [140]
                  true-21083 [015] 14674.797729: sched_switch: task perf:21083 [120] (R) ==> migration/15:33 [0]
                  true-21083 [011] 14674.798159: sched_process_exit: task true:21083 [120]
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf trace: Sample the CPU too · cd6feeea
      Ingo Molnar authored
      Sample, record, parse and print the CPU field - it had all zeroes before.
      
      Before (watch the second column, the CPU values):
      
                  perf-32685 [000]     0.000000: sched_wakeup_new: task perf:32686 [120] success=1 [011]
                  perf-32685 [000]     0.000000: sched_migrate_task: task perf:32685 [120] from: 1  to: 11
                  perf-32685 [000]     0.000000: sched_process_fork: parent perf:32685  child perf:32686
                  true-32686 [000]     0.000000: sched_wakeup: task migration/11:25 [0] success=1 [011]
                  true-32686 [000]     0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015]
                  true-32686 [000]     0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015]
                  perf-32685 [000]     0.000000: sched_switch: task perf:32685 [120] (S) ==> swapper:0 [140]
                  true-32686 [000]     0.000000: sched_switch: task perf:32686 [120] (R) ==> migration/11:25 [0]
                  true-32686 [000]     0.000000: sched_switch: task perf:32686 [120] (R) ==> distccd:12793 [125]
                  true-32686 [000]     0.000000: sched_switch: task true:32686 [120] (R) ==> distccd:12793 [125]
                  true-32686 [000]     0.000000: sched_process_exit: task true:32686 [120]
                  true-32686 [000]     0.000000: sched_stat_wait: task: distccd:12793 wait: 6767985949080 [ns]
                  true-32686 [000]     0.000000: sched_stat_wait: task: distccd:12793 wait: 6767986139446 [ns]
                  true-32686 [000]     0.000000: sched_stat_sleep: task: distccd:12793 sleep: 132844 [ns]
                  true-32686 [000]     0.000000: sched_stat_sleep: task: distccd:12793 sleep: 131724 [ns]
      
      After:
      
                  perf-32685 [001]     0.000000: sched_wakeup_new: task perf:32686 [120] success=1 [011]
                  perf-32685 [001]     0.000000: sched_migrate_task: task perf:32685 [120] from: 1  to: 11
                  perf-32685 [001]     0.000000: sched_process_fork: parent perf:32685  child perf:32686
                  true-32686 [011]     0.000000: sched_wakeup: task migration/11:25 [0] success=1 [011]
                  true-32686 [015]     0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015]
                  true-32686 [015]     0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015]
                  perf-32685 [001]     0.000000: sched_switch: task perf:32685 [120] (S) ==> swapper:0 [140]
                  true-32686 [011]     0.000000: sched_switch: task perf:32686 [120] (R) ==> migration/11:25 [0]
                  true-32686 [015]     0.000000: sched_switch: task perf:32686 [120] (R) ==> distccd:12793 [125]
                  true-32686 [015]     0.000000: sched_switch: task true:32686 [120] (R) ==> distccd:12793 [125]
                  true-32686 [015]     0.000000: sched_process_exit: task true:32686 [120]
                  true-32686 [015]     0.000000: sched_stat_wait: task: distccd:12793 wait: 6767985949080 [ns]
                  true-32686 [015]     0.000000: sched_stat_wait: task: distccd:12793 wait: 6767986139446 [ns]
                  true-32686 [015]     0.000000: sched_stat_sleep: task: distccd:12793 sleep: 132844 [ns]
                  true-32686 [015]     0.000000: sched_stat_sleep: task: distccd:12793 sleep: 131724 [ns]
      
      So we can now see how this workload migrated between CPUs.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      cd6feeea
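The CPU column that the commit above starts filling in can be read back from the text output with a small parser. The following is a minimal sketch in Python, not part of perf: the line layout is inferred only from the sample output shown in the commit message, and the names `TRACE_LINE`, `parse_trace_line` and `cpus_seen` are hypothetical.

```python
import re

# Layout assumed from the sample output above (not a documented format):
#   <comm>-<pid> [<cpu>] <timestamp>: <event>: <event-specific text>
TRACE_LINE = re.compile(
    r"^\s*(?P<comm>.+?)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<event>\w+):\s*(?P<rest>.*)$"
)


def parse_trace_line(line):
    """Split one output line into its fields, or return None if it does not match."""
    m = TRACE_LINE.match(line)
    if m is None:
        return None
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "cpu": int(m.group("cpu")),   # "[011]" becomes 11
        "ts": float(m.group("ts")),
        "event": m.group("event"),
        "rest": m.group("rest"),
    }


def cpus_seen(lines):
    """Map each pid to the sequence of CPUs it was sampled on, in order,
    collapsing consecutive repeats so each entry marks a change of CPU."""
    seen = {}
    for line in lines:
        rec = parse_trace_line(line)
        if rec is None:
            continue  # skip anything that is not a trace line
        cpus = seen.setdefault(rec["pid"], [])
        if not cpus or cpus[-1] != rec["cpu"]:
            cpus.append(rec["cpu"])
    return seen


if __name__ == "__main__":
    # Three lines copied from the "After" output of the commit above.
    sample = [
        "perf-32685 [001]     0.000000: sched_process_fork: parent perf:32685  child perf:32686",
        "true-32686 [011]     0.000000: sched_wakeup: task migration/11:25 [0] success=1 [011]",
        "true-32686 [015]     0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015]",
    ]
    for pid, cpus in cpus_seen(sample).items():
        print(pid, cpus)
```

With the "Before" output every record would report CPU 0, so a per-pid CPU sequence like this only becomes meaningful once the CPU field is actually sampled.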
  3. 31 Aug, 2009 1 commit
    • F
      perf tools: Resolve idle thread cmdline for perf trace · 3a2684ca
      Frederic Weisbecker committed
      The cmd-trace tool used the cmdline file and resolved the idle
      thread using a hardcoded check for task pid 0.
      
      Now we have a centralized way to do that from perf, using the
      register_idle_thread() API.
      
      Before:
      	:0-0     [000]     0.000000: irq_handler_entry: irq=0 handler=name
      	:0-0     [000]     0.000000: irq_handler_entry: irq=0 handler=name
      
      After:
      	[idle]-0     [000]     0.000000: irq_handler_entry: irq=0 handler=name
      	[idle]-0     [000]     0.000000: irq_handler_entry: irq=0 handler=name
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1251693921-6579-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3a2684ca
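The idea in the commit above, one shared table that knows pid 0 is the idle thread instead of a special case in each tool, can be sketched as follows. This is an illustration in Python, not perf's C code: the class `ThreadTable` and its methods are hypothetical, only the name `register_idle_thread` is borrowed from the commit message, and the `:<pid>` fallback is an assumption inferred from the `:0-0` shown in the "Before" output.

```python
class ThreadTable:
    """Hypothetical pid -> command-name table shared by all tools."""

    def __init__(self):
        self._comms = {}

    def register(self, pid, comm):
        """Record the command name for a pid."""
        self._comms[pid] = comm

    def register_idle_thread(self):
        """Give pid 0 a readable name once, in one central place."""
        self.register(0, "[idle]")

    def comm(self, pid):
        """Return the known name, or ':<pid>' when the pid was never resolved."""
        return self._comms.get(pid, f":{pid}")


if __name__ == "__main__":
    threads = ThreadTable()
    print(f"{threads.comm(0)}-0")   # unresolved, as in the "Before" output
    threads.register_idle_thread()
    print(f"{threads.comm(0)}-0")   # resolved, as in the "After" output
```

Keeping the pid 0 rule inside the table means every consumer of the table gets the same name without repeating the check.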
  4. 22 Aug, 2009 1 commit
    • M
      perf trace: Add OPT_END to option array of perf-trace · 1909629f
      Masami Hiramatsu committed
      Add OPT_END to the option array of perf-trace to fix a SEGV when
      showing the perf-trace help message.
      
      Without this patch:
       ./perf trace -h
      
       usage: perf trace [<options>] <command>
      
          -D, --dump-raw-trace  dump raw trace in ASCII
          -v, --verbose         be more verbose (show symbol address, etc)
          -f, Segmentation fault
      
      With this patch:
       ./perf trace -h
      
       usage: perf trace [<options>] <command>
      
          -D, --dump-raw-trace  dump raw trace in ASCII
          -v, --verbose         be more verbose (show symbol address, etc)
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Zhaolei <zhaolei@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20090821185603.11039.62109.stgit@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1909629f
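The bug fixed above is typical of a table that is walked until a terminator entry rather than by a stored length. The sketch below is a Python stand-in for that pattern, not perf's option parser: `OPT_END` here is a plain sentinel value, `usage_lines` is a hypothetical helper, and Python raises `IndexError` where the C code read past the end of the array and crashed.

```python
OPT_END = None  # sentinel marking the end of an option table (stand-in for the C macro)


def usage_lines(options):
    """Format one help line per option, stopping at the OPT_END sentinel.

    The walk deliberately uses no length check, mirroring how a
    sentinel-terminated C array is traversed.
    """
    lines = []
    i = 0
    while options[i] is not OPT_END:
        short, long_name, help_text = options[i]
        lines.append(f"    -{short}, --{long_name:<16}{help_text}")
        i += 1
    return lines


# Table with the terminator: the walk stops cleanly after two entries.
TERMINATED = [
    ("D", "dump-raw-trace", "dump raw trace in ASCII"),
    ("v", "verbose", "be more verbose (show symbol address, etc)"),
    OPT_END,
]

# Same table without the terminator: the walk runs off the end.
UNTERMINATED = TERMINATED[:-1]

if __name__ == "__main__":
    print("\n".join(usage_lines(TERMINATED)))
    try:
        usage_lines(UNTERMINATED)
    except IndexError:
        print("walked past the end of the table (the C version segfaulted here)")
```

In C the missing terminator surfaces as undefined behaviour, which matches the help output above being cut off partway through a garbage entry.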
  5. 18 Aug, 2009 1 commit
  6. 17 Aug, 2009 1 commit
    • F
      perf tools: Add perf trace · 5f9c39dc
      Frederic Weisbecker committed
      This adds perf trace into the set of perf tools.
      
      It is written to fetch the tracepoint samples from perf events
      and display them, according to the events information given by
      the debugfs files through the util/trace* tools.
      
      It is a rough first shot and doesn't yet handle the cpu and
      timestamp fields, among other things.
      
      Example:
      
       perf record -f -e workqueue:workqueue_execution:record -F 1 -a
       perf trace
      
             kblockd/0-236   [000]     0.000000: workqueue_execution: thread=:236 func=cfq_kick_queue+0x0
           kondemand/0-360   [000]     0.000000: workqueue_execution: thread=:360 func=do_dbs_timer+0x0
           kondemand/0-360   [000]     0.000000: workqueue_execution: thread=:360 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
           kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
      
      Todo:
      
      - A lot of things!
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Jon Masters <jonathan@jonmasters.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Zhaolei <zhaolei@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Anton Blanchard <anton@samba.org>
      LKML-Reference: <1250518688-7207-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      5f9c39dc