1. 13 Sep 2013 (3 commits)
    • perf/x86/intel: Add Haswell TSX event aliases · 4b2c4f1f
      By Andi Kleen
      Add TSX event aliases, and export them from the kernel to perf.
      
      These are used by perf stat -T and allow more user-friendly
      access to events. The events are designed to be fairly generic
      and may also apply to other architectures implementing HTM.
      They all cover common situations that happen during tuning of
      transactional code.
      
      For Haswell we have to separate the HLE and RTM events,
      as they are separate in the PMU.
      
      This adds the following events:
      
       tx-start	Count transaction starts (used by perf stat -T)
       tx-commit	Count transaction commits
       tx-abort	Count all aborts
       tx-conflict	Count aborts due to a conflict with another CPU
       tx-capacity	Count capacity aborts (transaction too large)
      
      Then matching el-* events for HLE
      
       cycles-t	Transactional cycles (used by perf stat -T)
        * also exists on POWER8
      
       cycles-ct	Transactional cycles committed (used by perf stat -T)
        * according to Michael Ellerman, POWER8 has a cycles-transactional-committed
        * event; perf stat -T handles both cases
      
      Note that for useful abort profiling, 'precise' often has to be
      set, as Haswell can only report the abort point inside the
      transaction with precise=2.
      
      For some classes of aborts, like conflicts, this is not needed,
      as it makes more sense to look at the complete critical section.
      
      This gives a clean set of generalized events to examine transaction
      success and aborts. Haswell has additional events for TSX, but those
      are more specialized for very specific situations.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1378438661-24765-4-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel: Avoid checkpointed counters causing excessive TSX aborts · 2dbf0116
      By Andi Kleen
      With checkpointed counters there can be a situation where the counter
      is overflowing, aborts the transaction, is set back to a non-overflowing
      checkpoint, and then causes an interrupt. The interrupt doesn't see the
      overflow because it has been checkpointed.  This is then a spurious PMI,
      typically with an ugly NMI message.  It can also lead to excessive aborts.
      
      Avoid this problem by:
      
      - Using the full counter width for counting counters (earlier patch)
      
      - Forbid sampling for checkpointed counters. It's not too useful anyway,
        checkpointing is mainly for counting. The check is approximate
        (to still handle KVM), but should catch the majority of cases.
      
      - On a PMI always set back checkpointed counters to zero.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1378438661-24765-2-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel: Fix Silvermont offcore masks · 06c939c1
      By Peter Zijlstra
      Fengguang Wu reported:
      
      > sparse warnings: (new ones prefixed by >>)
      >
      > >> arch/x86/kernel/cpu/perf_event_intel.c:901:9: sparse: constant 0x768005ffff is so big it is long
      > >> arch/x86/kernel/cpu/perf_event_intel.c:902:9: sparse: constant 0x768005ffff is so big it is long
      >
      > vim +901 arch/x86/kernel/cpu/perf_event_intel.c
      >
      >    895	 },
      >    896	};
      >    897
      >    898	static struct extra_reg intel_slm_extra_regs[] __read_mostly =
      >    899	{
      >    900		/* must define OFFCORE_RSP_X first, see intel_fixup_er() */
      >  > 901		INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x768005ffff, RSP_0),
      >  > 902		INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0x768005ffff, RSP_1),
      >    903		EVENT_EXTRA_END
      >    904	};
      >    905
      
      Extend those constants to 64 bits.
      
      Reported-by: fengguang.wu@intel.com
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20130909112636.GQ31370@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 12 Sep 2013 (1 commit)
    • perf/x86: Add constraint for IVB CYCLE_ACTIVITY:CYCLES_LDM_PENDING · 6113af14
      By Stephane Eranian
      The IvyBridge event CYCLE_ACTIVITY:CYCLES_LDM_PENDING can only
      be measured on counters 0-3 when HT is off. (When HT is on, only
      counters 0-3 are available anyway, so the constraint is implicit.)

      If you program it on all eight counters for 1s on a 3GHz
      IVB laptop running a noploop, you see:
      
                 2 747 527 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
                 2 747 527 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
                 2 747 527 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
                 2 747 527 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
             3 280 563 608 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
             3 280 563 608 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
             3 280 563 608 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
             3 280 563 608 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
      
      Clearly the last 4 values are bogus.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: ak@linux.intel.com
      Cc: zheng.z.yan@intel.com
      Cc: dhsharp@google.com
      Link: http://lkml.kernel.org/r/20130911152222.GA28761@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 02 Sep 2013 (2 commits)
  4. 12 Aug 2013 (1 commit)
  5. 27 Jun 2013 (1 commit)
    • perf/x86: Fix shared register mutual exclusion enforcement · 2f7f73a5
      By Stephane Eranian
      This patch fixes a problem with the shared registers mutual
      exclusion code and incremental event scheduling by the
      generic perf_event code.
      
      There was a bug whereby the mutual exclusion on the shared
      registers was not enforced because of incremental scheduling
      abort due to event constraints. As an example on Intel
      Nehalem, consider the following events:
      
      group1= L1D_CACHE_LD:E_STATE,OFFCORE_RESPONSE_0:PF_RFO,L1D_CACHE_LD:I_STATE
      group2= L1D_CACHE_LD:I_STATE
      
      The L1D_CACHE_LD event can only be measured by 2 counters. Yet, there
      are 3 instances here. The first group can be scheduled and is committed.
      Then, the generic code tries to schedule group2 and this fails (because
      there is no more counter to support the 3rd instance of L1D_CACHE_LD).
      But in the x86_schedule_events() error path, put_event_constraints() is
      invoked on ALL the events, not just the ones that failed. That causes the
      "lock" on the shared offcore_response MSR to be released. Yet the first group
      is actually scheduled and is exposed to reprogramming of that shared MSR by
      the sibling HT thread. In other words, there is no guarantee on what is
      measured.
      
      This patch fixes the problem by tagging committed events with the
      PERF_X86_EVENT_COMMITTED tag. In the error path of x86_schedule_events(),
      only the events NOT tagged have their constraint released. The tag
      is eventually removed when the event is descheduled.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20130620164254.GA3556@quad
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 26 Jun 2013 (1 commit)
  7. 19 Jun 2013 (6 commits)
  8. 04 May 2013 (1 commit)
  9. 16 Apr 2013 (1 commit)
    • perf/x86: Fix offcore_rsp valid mask for SNB/IVB · f1923820
      By Stephane Eranian
      The valid mask for both offcore_response_0 and
      offcore_response_1 was wrong for SNB/SNB-EP and
      IVB/IVB-EP. It was possible to write to a
      reserved bit and cause a #GP fault, crashing
      the kernel.
      
      This patch fixes the problem by correctly marking the
      reserved bits in the valid mask for all the processors
      mentioned above.
      
      A distinction between desktop and server parts is introduced
      because bits 24-30 are only available on the server parts.
      
      This version of the patch is just a rebase to the perf/urgent tree
      and should apply to older kernels as well.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: jolsa@redhat.com
      Cc: gregkh@linuxfoundation.org
      Cc: security@kernel.org
      Cc: ak@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  10. 10 Apr 2013 (1 commit)
  11. 01 Apr 2013 (4 commits)
  12. 18 Mar 2013 (1 commit)
  13. 20 Feb 2013 (1 commit)
  14. 24 Jan 2013 (2 commits)
  15. 24 Oct 2012 (2 commits)
  16. 04 Oct 2012 (1 commit)
  17. 19 Sep 2012 (1 commit)
  18. 04 Sep 2012 (1 commit)
  19. 14 Aug 2012 (1 commit)
  20. 26 Jul 2012 (1 commit)
  21. 06 Jul 2012 (4 commits)
  22. 08 Jun 2012 (1 commit)
  23. 06 Jun 2012 (2 commits)