1. 09 Aug 2010, 1 commit
  2. 05 Jul 2010, 1 commit
    • perf, x86: P4 PMU -- redesign cache events · 39ef13a4
      Authored by Cyrill Gorcunov
      To support cache events we have reserved the low 6 bits in
      hw_perf_event::config (which is actually part of the CCCR
      register configuration).
      
      These bits represent the Replay Event metric enumerated in
      enum P4_PEBS_METRIC. The caller need not care which exact
      bits are set or how -- the caller just chooses one
      P4_PEBS_METRIC entity and puts it into the config. The
      kernel tracks it and sets the appropriate additional MSR
      registers (metrics) when needed.
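      
      A rough sketch of that packing (a minimal illustration under
      assumed names -- P4_CONFIG_METRIC_MASK and both helpers below
      are made up for this sketch, not the driver's own definitions):
      
          #include <stdint.h>
      
          /* Assumed: the low 6 bits of hw_perf_event::config carry one
           * enum P4_PEBS_METRIC value; the kernel later programs the
           * matching metric MSRs from it. */
          #define P4_CONFIG_METRIC_MASK 0x3fULL
      
          static inline uint64_t p4_config_set_metric(uint64_t config,
                                                      unsigned int metric)
          {
                  /* replace whatever metric was there before */
                  return (config & ~P4_CONFIG_METRIC_MASK) |
                         (metric & P4_CONFIG_METRIC_MASK);
          }
      
          static inline unsigned int p4_config_get_metric(uint64_t config)
          {
                  return (unsigned int)(config & P4_CONFIG_METRIC_MASK);
          }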
      
      The reason for this redesign was the PEBS enable bit, which
      should not be set until DS (and PEBS sampling) support is
      implemented properly.
      
      TODO
      ====
      
       - PEBS sampling (note it's tricky and works with _one_ counter only,
         so for HT machines it will not be easy to handle both threads)
      
       - tracking of PEBS register state: a user might need to turn
         PEBS off completely (ie no PEBS enable, no UOP_tag) while some
         other event may need it; such events clash and should not
         run simultaneously, and at the moment we just don't support them
      
       - eventually export the user space bits in a separate header, which
         will allow user apps to configure raw events more conveniently.
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: Lin Ming <ming.m.lin@intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1278295769.9540.15.camel@minggr.sh.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 09 Jun 2010, 1 commit
  4. 19 May 2010, 2 commits
  5. 18 May 2010, 2 commits
  6. 15 May 2010, 1 commit
  7. 13 May 2010, 1 commit
    • x86, perf: P4 PMU -- use hash for p4_get_escr_idx() · 72001990
      Authored by Cyrill Gorcunov
      A linear search over all P4 MSRs would be fine if we did not
      use it in the event scheduling routine, which is pretty time
      critical. Let's use a hash instead; it should speed scheduling
      up significantly.
      
      v2: Steven proposed a gentler approach than issuing BUG on
          error, so we use WARN_ONCE now
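      
      A minimal sketch of the idea (the table size, the hash, and all
      names here are assumptions for illustration, not the kernel's
      actual implementation): build a lookup table once at init, keyed
      by a hash of the ESCR MSR address, so the scheduler does an O(1)
      lookup instead of a linear scan, and warn once on a miss:
      
          #include <stdint.h>
          #include <stdio.h>
      
          #define P4_ESCR_TABLE_SIZE 256   /* power of two, assumed */
          static int escr_idx_table[P4_ESCR_TABLE_SIZE];
      
          static unsigned int p4_escr_hash(unsigned int addr)
          {
                  /* Knuth multiplicative hash folded to the table size;
                   * a real table must also handle collisions */
                  return (addr * 2654435761u) >> 24;
          }
      
          /* fill the table once from the list of known ESCR MSRs */
          static void p4_escr_table_init(const unsigned int *escrs, int nr)
          {
                  int i;
      
                  for (i = 0; i < nr; i++)
                          escr_idx_table[p4_escr_hash(escrs[i])] = i + 1;
          }
      
          /* O(1) replacement for the old linear search */
          static int p4_get_escr_idx(unsigned int addr)
          {
                  int idx = escr_idx_table[p4_escr_hash(addr)];
      
                  if (!idx) {
                          /* the kernel uses WARN_ONCE() here, per the v2 note */
                          fprintf(stderr, "unknown ESCR MSR 0x%x\n", addr);
                          return -1;
                  }
                  return idx - 1;
          }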
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100512174242.GA5190@lenovo>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 08 May 2010, 4 commits
    • x86, perf: P4 PMU -- check for proper event index in RAW events · c7993165
      Authored by Cyrill Gorcunov
      RAW events are special and we should be ready for users passing
      in insane event index values.
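      
      A hedged fragment showing the kind of validation this implies
      (the helper name and surrounding code are illustrative, not the
      exact kernel change; p4_event_bind_map is the driver's event
      table):
      
          #define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
      
          /* reject user-supplied event indices that fall outside the
           * driver's event table before using them as an array index */
          static int p4_validate_raw_event_idx(unsigned int v)
          {
                  if (v >= ARRAY_SIZE(p4_event_bind_map))
                          return -EINVAL;
                  return 0;
          }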
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100508112717.315897547@openvz.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, perf: P4 PMU -- Get rid of redundant check for array index · 3f51b711
      Authored by Cyrill Gorcunov
      The caller has already done such a check.
      And it was wrong anyway: it had to be '>=' rather than '>'.
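      
      For the record, the off-by-one in question looks like this
      (illustrative fragment; 'map' stands in for the driver's table):
      
          /* wrong: lets idx == ARRAY_SIZE(map) through, one past the end */
          if (idx > ARRAY_SIZE(map))
                  return -EINVAL;
      
          /* right: valid indices are 0 .. ARRAY_SIZE(map) - 1 */
          if (idx >= ARRAY_SIZE(map))
                  return -EINVAL;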
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100508112717.130386882@openvz.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, perf: P4 PMU -- protect sensible procedures from preemption · 137351e0
      Authored by Cyrill Gorcunov
      Steven reported:
      
      |
      | I'm getting:
      |
      | Pid: 3477, comm: perf Not tainted 2.6.34-rc6 #2727
      | Call Trace:
      |  [<ffffffff811c7565>] debug_smp_processor_id+0xd5/0xf0
      |  [<ffffffff81019874>] p4_hw_config+0x2b/0x15c
      |  [<ffffffff8107acbc>] ? trace_hardirqs_on_caller+0x12b/0x14f
      |  [<ffffffff81019143>] hw_perf_event_init+0x468/0x7be
      |  [<ffffffff810782fd>] ? debug_mutex_init+0x31/0x3c
      |  [<ffffffff810c68b2>] T.850+0x273/0x42e
      |  [<ffffffff810c6cab>] sys_perf_event_open+0x23e/0x3f1
      |  [<ffffffff81009e6a>] ? sysret_check+0x2e/0x69
      |  [<ffffffff81009e32>] system_call_fastpath+0x16/0x1b
      |
      | When running perf record in latest tip/perf/core
      |
      
      Because p4 counters are shared between HT threads, we
      synthetically divide the whole set of counters into two
      non-intersecting subsets. And while we're "borrowing" counters
      from these subsets we should not be preempted (well, strictly
      speaking, in p4_hw_config we just pre-set a reference to the
      subset, which allows us to save some cycles in the schedule
      routine if it happens on the same cpu). So use a
      get_cpu/put_cpu pair.
      
      Also, p4_pmu_schedule_events should use smp_processor_id rather
      than the raw_ version. This lets us catch a preemption issue
      (if there ever is one).
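      
      A minimal sketch of the pattern (the function body and the
      helper p4_subset_for_cpu() are made up for illustration;
      get_cpu() and put_cpu() are the real kernel primitives):
      
          /* pin ourselves to the current cpu while picking the
           * HT-sibling counter subset, then re-enable preemption */
          static int p4_hw_config(struct perf_event *event)
          {
                  int cpu = get_cpu();   /* disables preemption */
      
                  /* safe: we cannot be migrated to another cpu here */
                  event->hw.config |= p4_subset_for_cpu(cpu);
      
                  put_cpu();             /* re-enables preemption */
                  return 0;
          }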
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Tested-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100508112716.963478928@openvz.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, perf: P4 PMU -- configure predefined events · de902d96
      Authored by Cyrill Gorcunov
      If an event is not RAW we should not exit p4_hw_config
      early but call x86_setup_perfctr as well.
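      
      A sketch of the control-flow fix (bodies elided; only the
      early-return shape matters, and it is illustrative rather than
      the exact kernel diff):
      
          static int p4_hw_config(struct perf_event *event)
          {
                  /* ... P4-specific ESCR/CCCR setup ... */
      
                  if (event->attr.type == PERF_TYPE_RAW) {
                          /* ... raw event handling ... */
                          return 0;      /* raw events finish here */
                  }
      
                  /* predefined (non-RAW) events must fall through so
                   * x86_setup_perfctr() can finish the configuration */
                  return x86_setup_perfctr(event);
          }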
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 07 May 2010, 1 commit
  10. 03 Apr 2010, 3 commits
  11. 26 Mar 2010, 2 commits
    • perf, x86: Add Nehalem PMU programming errata workaround · 11164cd4
      Authored by Peter Zijlstra
      Implement the workaround for Intel Errata AAK100 and AAP53.
      
      Also, remove the Core-i7 name for Nehalem events, since there
      are also Westmere-based i7 chips.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <1269608924.12097.147.camel@laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, perf: Add raw events support for the P4 PMU · d814f301
      Authored by Cyrill Gorcunov
      Adding raw event support led to a complete code refactoring.
      I hope it became more readable than it was.
      
      The list of changes:
      
      1)  The 64bit config field is enough to hold all the information
          we need to track event details. To achieve this we use our
          *own* enum for event selection in the ESCR register and map
          this key to the proper value at the moment the event is
          enabled (see the packing sketch after the notes below).
      
          For the same reason we use the 12 LSB bits of the CCCR
          register -- to track exactly which cache trace event was
          requested. We clear these bits at the real 'write' moment.
      
      2)  There is no per-cpu area reserved for the P4 PMU anymore. We
          don't need it: everything is held by config.
      
      3)  Now we may use any available counter, ie we try to grab any
          possible counter.
      
      v2:
        - Lin Ming reported the lack of ESCR selector in CCCR for cache events
      
      v3:
        - Don't lose cache event codes in the config unpacking
          procedure; we may need them one day, so no obscure hacks
          behind our back -- better to clear reserved bits explicitly
          when needed (thanks Ming for pointing this out)
      
        - Lin Ming fixed misplaced opcodes in cache events
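      
      A rough sketch of the packing from change 1 (field positions,
      masks, and helper names are assumptions for illustration, not
      the driver's actual layout):
      
          #include <stdint.h>
      
          /* Assumed layout: ESCR event key in the high half of the
           * 64bit config, CCCR bits in the low half, with the cache
           * trace event kept in the 12 LSB of the CCCR part. */
          #define P4_CONFIG_ESCR_SHIFT  32
          #define P4_CONFIG_CACHE_MASK  0xfffULL
      
          static inline uint64_t p4_config_pack(uint32_t escr_key,
                                                uint32_t cccr)
          {
                  return ((uint64_t)escr_key << P4_CONFIG_ESCR_SHIFT) | cccr;
          }
      
          static inline uint32_t p4_config_unpack_escr(uint64_t config)
          {
                  return (uint32_t)(config >> P4_CONFIG_ESCR_SHIFT);
          }
      
          /* per v3: keep the cache event code in config and mask the
           * reserved bits out only when writing the real CCCR MSR */
          static inline uint32_t p4_config_cccr_for_write(uint64_t config)
          {
                  return (uint32_t)config & ~(uint32_t)P4_CONFIG_CACHE_MASK;
          }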
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Tested-by: Lin Ming <ming.m.lin@intel.com>
      Signed-off-by: Lin Ming <ming.m.lin@intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1269403766.3409.6.camel@minggr.sh.intel.com>
      [ v4: did a few whitespace fixlets ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  12. 19 Mar 2010, 4 commits
  13. 15 Mar 2010, 1 commit
  14. 13 Mar 2010, 1 commit
  15. 12 Mar 2010, 1 commit
    • perf, x86: Implement initial P4 PMU driver · a072738e
      Authored by Cyrill Gorcunov
      The Netburst PMU is way different from the "architectural
      performance monitoring" specification that current CPUs use.
      P4 uses a tuple of ESCR+CCCR+COUNTER MSR registers to handle
      performance monitoring events.
      
      A few implementation details:
      
      1) We need a separate x86_pmu::hw_config helper in struct
         x86_pmu since register bit-fields are quite different from P6,
         Core and later cpu series.
      
      2) For the same reason an x86_pmu::schedule_events helper is
         introduced.
      
      3) hw_perf_event::config consists of packed ESCR+CCCR values.
         This is allowed since in reality both registers only use half
         of their size. Of course, before making a real write into a
         particular MSR we need to unpack the value and extend it to
         the proper size.
      
      4) The tuple of packed ESCR+CCCR in hw_perf_event::config
         doesn't describe the memory address of the ESCR MSR register,
         so we need to keep a mapping between the tuples in use and
         the available ESCRs (various P4 events may use the same
         ESCRs, but not simultaneously). For this sake every active
         event has a per-cpu map of hw_perf_event::idx <--> ESCR
         addresses (see the sketch after this list).
      
      5) Since hw_perf_event::idx is an offset into the counter/control
         registers we need to lift X86_PMC_MAX_GENERIC up, otherwise the
         kernel strips it down to 8 registers and an armed event may
         never be turned off (ie the bit in active_mask is set but the
         loop never reaches that index to check it); thanks to Peter
         Zijlstra
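      
      A hedged sketch of the per-cpu mapping from item 4 (the struct,
      the helper, and the table size are illustrative assumptions;
      DEFINE_PER_CPU and this_cpu_ptr are the real kernel primitives):
      
          #define MAX_P4_COUNTERS 18   /* assumed counter count */
      
          struct p4_pmu_cpu_state {
                  /* idx -> bound ESCR MSR address, 0 means free */
                  unsigned int escr_addr[MAX_P4_COUNTERS];
          };
      
          static DEFINE_PER_CPU(struct p4_pmu_cpu_state, p4_cpu_state);
      
          /* claim an ESCR for counter idx on this cpu; fail if some
           * other active event already holds that ESCR, since P4
           * events may share ESCRs but never simultaneously */
          static int p4_bind_escr(int idx, unsigned int escr_addr)
          {
                  struct p4_pmu_cpu_state *st = this_cpu_ptr(&p4_cpu_state);
                  int i;
      
                  for (i = 0; i < MAX_P4_COUNTERS; i++)
                          if (st->escr_addr[i] == escr_addr)
                                  return -EBUSY;
      
                  st->escr_addr[idx] = escr_addr;
                  return 0;
          }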
      
      Restrictions:
      
       - No cascaded counters support (do we ever need them?)
       - No dependent events support (so PERF_COUNT_HW_INSTRUCTIONS
         doesn't work for now)
       - There are events using the same counters which can't work
         simultaneously (we need to use intersecting ones due to broken
         counter 1)
       - No PERF_COUNT_HW_CACHE_ events yet
      
      Todo:
      
       - Implement dependent events
       - Need proper hashing for event opcodes (no linear search; fine
         for the debugging stage but not under real load)
       - Some events are counted per clock cycle -- we need to set a
         threshold for them and count every clock cycle just to get
         summary statistics (ie to behave the same way as other PMUs do)
       - Need to switch to using event_constraints
       - To support RAW events we need to encode a global list of P4 events
         into p4_templates
       - Cache events need to be added
      
      Event support status matrix:
      
       Event			status
       -----------------------------
       cycles			works
       cache-references	works
       cache-misses		works
       branch-misses		works
       bus-cycles		partially (does not work on 64bit cpu with HT enabled)
       instruction		doesn't work (needs dependent event [mop tagging])
       branches		doesn't work
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: Lin Ming <ming.m.lin@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100311165439.GB5129@lenovo>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>