1. 11 5月, 2010 1 次提交
    • A
      perf hist: Introduce hists class and move lots of methods to it · 1c02c4d2
      Arnaldo Carvalho de Melo 提交于
      In cbbc79a5 we introduced support for multiple events by introducing a
      new "event_stat_id" struct and then made several perf_session methods
      receive a point to it instead of a pointer to perf_session, and kept the
      event_stats and hists rb_tree in perf_session.
      
      While working on the new newt based browser, I realised that it would be
      better to introduce a new class, "hists" (short for "histograms"),
      renaming the "event_stat_id" struct and the perf_session methods that
      were really "hists" methods, as they manipulate only struct hists
      members, not touching anything in the other perf_session members.
      
      Other optimizations, such as calculating the maximum lenght of a symbol
      name present in an hists instance will be possible as we add them,
      avoiding a re-traversal just for finding that information.
      
      The rationale for the name "hists" to replace "event_stat_id" is that we
      may have multiple sets of hists for the same event_stat id, as, for
      instance, the 'perf diff' tool has, so event stat id is not what
      characterizes what this struct and the functions that manipulate it do.
      
      Cc: Eric B Munson <ebmunson@us.ibm.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1c02c4d2
  2. 10 5月, 2010 12 次提交
  3. 09 5月, 2010 13 次提交
    • T
      perf/live-mode: Handle payload-less events · 794e43b5
      Tom Zanussi 提交于
      Some events, such as the PERF_RECORD_FINISHED_ROUND event consist of
      only an event header and no data.  In this case, a 0-length payload
      will be read, and the 0 return value will be wrongly interpreted as an
      'unexpected end of event stream'.
      
      This patch allows for proper handling of data-less events by skipping
      0-length reads.
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      LKML-Reference: <1273038527.6383.51.camel@tropicana>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      794e43b5
    • F
      tracing: Factorize lock events in a lock class · 2c193c73
      Frederic Weisbecker 提交于
      lock_acquired, lock_contended and lock_release now share the
      same prototype and format. Let's factorize them into a lock
      event class.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      2c193c73
    • F
      tracing: Drop the nested field from lock_release event · 93135439
      Frederic Weisbecker 提交于
      Drop the nested field as we don't use it. Every nested state can
      be computed from a state machine on post processing already.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      93135439
    • F
      tracing: Drop lock_acquired waittime field · 883a2a31
      Frederic Weisbecker 提交于
      Drop the waittime field from the lock_acquired event, we can
      calculate it by substracting the lock_acquired event timestamp
      with the matching lock_acquire one.
      
      It is not needed and takes useless space in the traces.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      883a2a31
    • F
      perf lock: Always check min AND max wait time · 90c0e5fc
      Frederic Weisbecker 提交于
      When a lock is acquired after beeing contended, we update the
      wait time statistics for the given lock.
      But if the min wait time is updated, we don't check the max wait
      time. This is wrong because the first time we update the wait time,
      we want to update both min and max wait time.
      
      Before:
      	Name   acquired  contended total wait (ns)   max wait (ns)   min wait (ns)
      	key          8          1           21656           0           21656
      
      After:
      	Name   acquired  contended total wait (ns)   max wait (ns)   min wait (ns)
      	key          8          1           21656           21656           21656
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      90c0e5fc
    • F
      perf: Fix perf lock bad rate · 5efe08cf
      Frederic Weisbecker 提交于
      Fix the cast made to get the bad rate. It is made in the result
      instead of the operands. We need the operands to be cast in double,
      otherwise the result will always be zero.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      5efe08cf
    • F
      perf: Humanize lock flags in perf lock · 84c7a217
      Frederic Weisbecker 提交于
      Use an enum instead of plain constants for lock flags.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      84c7a217
    • F
      perf: Cleanup perf lock broken states · 10350ec3
      Frederic Weisbecker 提交于
      Use enum to get a human view of bad_hist indexes and
      put bad histogram output in its own function.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      10350ec3
    • H
      perf lock: Add "info" subcommand for dumping misc information · 26242d85
      Hitoshi Mitake 提交于
      This adds the "info" subcommand to perf lock which can be used
      to dump metadata like threads or addresses of lock instances.
      "map" was removed because info should do the work for it.
      
      This will be useful not only for debugging but also for ordinary
      analyzing.
      
      v2: adding example of usage
      % sudo ./perf lock info -t
       | Thread ID: comm
       | 	 0: swapper
       |         1: init
       |        18: migration/5
       |        29: events/2
       |        32: events/5
       |        33: events/6
      ...
      
      % sudo ./perf lock info -m
      | Address of instance: name of class
      |  0xffff8800b95adae0: &(&sighand->siglock)->rlock
      |  0xffff8800bbb41ae0: &(&sighand->siglock)->rlock
      |  0xffff8800bf165ae0: &(&sighand->siglock)->rlock
      |  0xffff8800b9576a98: &p->cred_guard_mutex
      |  0xffff8800bb890a08: &(&p->alloc_lock)->rlock
      |  0xffff8800b9522a08: &(&p->alloc_lock)->rlock
      |  0xffff8800bb8aaa08: &(&p->alloc_lock)->rlock
      |  0xffff8800bba72a08: &(&p->alloc_lock)->rlock
      |  0xffff8800bf18ea08: &(&p->alloc_lock)->rlock
      |  0xffff8800b8a0d8a0: &(&ip->i_lock)->mr_lock
      |  0xffff88009bf818a0: &(&ip->i_lock)->mr_lock
      |  0xffff88004c66b8a0: &(&ip->i_lock)->mr_lock
      |  0xffff8800bb6478a0: &(shost->host_lock)->rlock
      
      v3: fixed some problems Frederic pointed out
       * better rbtree tracking in dump_threads()
       * removed printf() and used pr_info() and pr_debug()
      Signed-off-by: NHitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      LKML-Reference: <1272863520-16179-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      26242d85
    • F
      perf: Provide a new deterministic events reordering algorithm · d6b17beb
      Frederic Weisbecker 提交于
      The current events reordering algorithm is based on a heuristic that
      gets broken once we deal with a very fast flow of events.
      
      Indeed the time period based flushing is not suitable anymore
      in the following case, assuming we have a flush period of two
      seconds.
      
          CPU 0           |        CPU 1
                          |
        cnt1 timestamps   |      cnt1 timestamps
                          |
          0               |         0
          1               |         1
          2               |         2
          3               |         3
          [...]           |        [...]
          4 seconds later
      
      If we spend too much time to read the buffers (case of a lot of
      events to record in each buffers or when we have a lot of CPU buffers
      to read), in the next pass the CPU 0 buffer could contain a slice
      of several seconds of events. We'll read them all and notice we've
      reached the period to flush. In the above example we flush the first
      half of the CPU 0 buffer, then we read the CPU 1 buffer where we
      have events that were on the flush slice and then the reordering
      fails.
      
      It's simple to reproduce with:
      
      	perf lock record perf bench sched messaging
      
      To solve this, we use a new solution that doesn't rely on an
      heuristical time slice period anymore but on a deterministic basis
      based on how perf record does its job.
      
      perf record saves the buffers through passes. A pass is a tour
      on every buffers from every CPUs. This is made in order: for
      each CPU we read the buffers of every counters. So the more
      buffers we visit, the later will be the timstamps of their events.
      
      When perf record finishes a pass it records a
      PERF_RECORD_FINISHED_ROUND pseudo event.
      We record the max timestamp t found in the pass n. Assuming these
      timestamps are monotonic across cpus, we know that if a buffer
      still has events with timestamps below t, they will be all available
      and then read in the pass n + 1.
      Hence when we start to read the pass n + 2, we can safely flush every
      events with timestamps below t.
      
            ============ PASS n =================
               CPU 0         |   CPU 1
                             |
            cnt1 timestamps  |   cnt2 timestamps
                  1          |         2
                  2          |         3
                  -          |         4  <--- max recorded
      
            ============ PASS n + 1 ==============
               CPU 0         |   CPU 1
                             |
            cnt1 timestamps  |   cnt2 timestamps
                  3          |         5
                  4          |         6
                  5          |         7 <---- max recorded
      
              Flush every events below timestamp 4
      
            ============ PASS n + 2 ==============
               CPU 0         |   CPU 1
                             |
            cnt1 timestamps  |   cnt2 timestamps
                  6          |         8
                  7          |         9
                  -          |         10
      
              Flush every events below timestamp 7
              etc...
      
      It also works on perf.data versions that don't have
      PERF_RECORD_FINISHED_ROUND pseudo events. The difference is that
      the events will be only flushed in the end of the perf.data
      processing. It will then consume more memory and scale less with
      large perf.data files.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      d6b17beb
    • F
      perf: Introduce a new "round of buffers read" pseudo event · 98402807
      Frederic Weisbecker 提交于
      In order to provide a more rubust and deterministic reordering
      algorithm, we need to know when we reach a point where we just
      did a pass through over every counter buffers to read every thing
      they had.
      
      This patch introduces a new PERF_RECORD_FINISHED_ROUND pseudo event
      that only consist in an event header and doesn't need to contain
      anything.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      98402807
    • P
      perf report: Document '--call-graph' better for usage · e157eb83
      Pekka Enberg 提交于
      This patch improves 'perf report -h' output for the
      '--call-graph' command line option by enumerating the
      different output types.
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1273332783-4268-1-git-send-email-penberg@cs.helsinki.fi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e157eb83
    • M
      kprobes: Move enable/disable_kprobe() out from debugfs code · c0614829
      Masami Hiramatsu 提交于
      Move enable/disable_kprobe() API out from debugfs related code,
      because these interfaces are not related to debugfs interface.
      
      This fixes a compiler warning.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Acked-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com>
      Acked-by: NTony Luck <tony.luck@intel.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      LKML-Reference: <20100427223312.2322.60512.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c0614829
  4. 08 5月, 2010 7 次提交
    • C
      x86, perf: P4 PMU -- check for proper event index in RAW events · c7993165
      Cyrill Gorcunov 提交于
      RAW events are special and we should be ready for user passing
      in insane event index values.
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100508112717.315897547@openvz.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c7993165
    • C
      x86, perf: P4 PMU -- Get rid of redundant check for array index · 3f51b711
      Cyrill Gorcunov 提交于
      The caller already has done such a check.
      And it was wrong anyway, it had to be '>=' rather than '>'
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100508112717.130386882@openvz.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3f51b711
    • C
      x86, perf: P4 PMU -- protect sensible procedures from preemption · 137351e0
      Cyrill Gorcunov 提交于
      Steven reported:
      
      |
      | I'm getting:
      |
      | Pid: 3477, comm: perf Not tainted 2.6.34-rc6 #2727
      | Call Trace:
      |  [<ffffffff811c7565>] debug_smp_processor_id+0xd5/0xf0
      |  [<ffffffff81019874>] p4_hw_config+0x2b/0x15c
      |  [<ffffffff8107acbc>] ? trace_hardirqs_on_caller+0x12b/0x14f
      |  [<ffffffff81019143>] hw_perf_event_init+0x468/0x7be
      |  [<ffffffff810782fd>] ? debug_mutex_init+0x31/0x3c
      |  [<ffffffff810c68b2>] T.850+0x273/0x42e
      |  [<ffffffff810c6cab>] sys_perf_event_open+0x23e/0x3f1
      |  [<ffffffff81009e6a>] ? sysret_check+0x2e/0x69
      |  [<ffffffff81009e32>] system_call_fastpath+0x16/0x1b
      |
      | When running perf record in latest tip/perf/core
      |
      
      Due to the fact that p4 counters are shared between HT threads
      we synthetically divide the whole set of counters into two
      non-intersected subsets. And while we're "borrowing" counters
      from these subsets we should not be preempted (well, strictly
      speaking in p4_hw_config we just pre-set reference to the
      subset which allow to save some cycles in schedule routine
      if it happens on the same cpu). So use get_cpu/put_cpu pair.
      
      Also p4_pmu_schedule_events should use smp_processor_id rather
      than raw_ version. This allow us to catch up preemption issue
      (if there will ever be).
      Reported-by: NSteven Rostedt <rostedt@goodmis.org>
      Tested-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100508112716.963478928@openvz.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      137351e0
    • C
      x86, perf: P4 PMU -- configure predefined events · de902d96
      Cyrill Gorcunov 提交于
      If an event is not RAW we should not exit p4_hw_config
      early but call x86_setup_perfctr as well.
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      de902d96
    • P
      perf_event: Make software events work again · 6e85158c
      Paul Mackerras 提交于
      Commit 6bde9b6c ("perf: Add
      group scheduling transactional APIs") added code to allow a
      group to be scheduled in a single transaction.  However, it
      introduced a bug in handling events whose pmu does not implement
      transactions -- at the end of scheduling in the events in the
      group, in the non-transactional case the code now falls through
      to the group_error label, and proceeds to unschedule all the
      events in the group and return failure.
      
      This fixes it by returning 0 (success) in the non-transactional
      case.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: eranian@gmail.com
      LKML-Reference: <20100508105800.GB10650@brick.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6e85158c
    • I
    • A
      perf list: Improve the raw hw event descriptor documentation · 1cf4a063
      Arnaldo Carvalho de Melo 提交于
      It was x86 specific and imcomplete at that, improve the situation by
      making it clear where the example provided applies and by adding the
      URLs for the Intel and AMD manuals where this is discussed in depth.
      Acked-by: NRobert Richter <robert.richter@amd.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Reported-by: Robert Richter <robert.richter@amd.com
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1cf4a063
  5. 07 5月, 2010 7 次提交