1. 13 Aug 2014, 3 commits
    • perf/x86: Fix data source encoding issues for load latency/precise store · 770eee1f
      Stephane Eranian authored
      This patch fixes issues introduced by Andi's previous 'Revamp PEBS'
      patch series.
      
      This patch fixes the following:
      
       - make precise_store_data_hsw() encode the mem op type whenever possible
       - make precise_store_data_hsw() set the default data source correctly

       - 0 is not a valid init value for the data source. Define PERF_MEM_NA
         as the default value (see the sketch below)
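
      A minimal sketch of how such a default can be composed from the
      PERF_MEM_S() helper in the perf UAPI headers (the exact placement of
      this define in the kernel tree is an assumption here):

      	/* Mark every data-source sub-field as explicitly "not
      	 * available" instead of leaving it 0, which tools would
      	 * mis-decode as "nothing reported". */
      	#define PERF_MEM_NA (PERF_MEM_S(OP, NA)   |\
      			     PERF_MEM_S(LVL, NA)   |\
      			     PERF_MEM_S(SNOOP, NA) |\
      			     PERF_MEM_S(LOCK, NA)  |\
      			     PERF_MEM_S(TLB, NA))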
      
      This bug was actually introduced by
      
          commit 722e76e6
          Author: Stephane Eranian <eranian@google.com>
          Date:   Thu May 15 17:56:44 2014 +0200
      
              fix Haswell precise store data source encoding
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1407785233-32193-4-git-send-email-eranian@google.com
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: ak@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86: Don't mark DataLA addresses as store · f3908b8c
      Andi Kleen authored
      Haswell supports reporting the data address for a range
      of PEBS events, including:
      
      	UOPS_RETIRED.ALL
      	MEM_UOPS_RETIRED.STLB_MISS_LOADS
      	MEM_UOPS_RETIRED.STLB_MISS_STORES
      	MEM_UOPS_RETIRED.LOCK_LOADS
      	MEM_UOPS_RETIRED.SPLIT_LOADS
      	MEM_UOPS_RETIRED.SPLIT_STORES
      	MEM_UOPS_RETIRED.ALL_LOADS
      	MEM_UOPS_RETIRED.ALL_STORES
      	MEM_LOAD_UOPS_RETIRED.L1_HIT
      	MEM_LOAD_UOPS_RETIRED.L2_HIT
      	MEM_LOAD_UOPS_RETIRED.L3_HIT
      	MEM_LOAD_UOPS_RETIRED.L1_MISS
      	MEM_LOAD_UOPS_RETIRED.L2_MISS
      	MEM_LOAD_UOPS_RETIRED.L3_MISS
      	MEM_LOAD_UOPS_RETIRED.HIT_LFB
      	MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS
      	MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT
      	MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM
      	MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_NONE
      	MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM
      
      This facility was already enabled earlier with the original Haswell
      perf changes.
      
      However, these addresses were always reported as stores by perf, which
      is wrong, as they could be loads too.  The hardware does not distinguish
      loads and stores for these instructions, so there's no (cheap) way for
      the profiler to find out.
      
      Change the type to PERF_MEM_OP_NA instead.
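
      As a hedged illustration (field names from the perf_mem_data_src UAPI
      union; the surrounding function is simplified away), the reported op
      for these DataLA-only events ends up looking like:

      	union perf_mem_data_src dse;

      	dse.val = PERF_MEM_NA;
      	/* DataLA only reports the address; the hardware does not say
      	 * whether it was a load or a store, so report OP_NA. */
      	dse.mem_op = PERF_MEM_OP_NA;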
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Reviewed-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Link: http://lkml.kernel.org/r/1407785233-32193-3-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86: Revamp PEBS event selection · 86a04461
      Andi Kleen authored
      The basic idea is that it does not make sense to list all PEBS
      events individually. The list is very long, sometimes outdated,
      and the hardware doesn't need it. If an event does not support
      PEBS it will simply not count; there is no security issue.
      
      We only need to list events that do something special, like
      supporting load or store addresses.
      
      This vastly simplifies the PEBS event selection. It also
      speeds up the scheduling because the scheduler doesn't
      have to walk as many constraints.
      
      Bugs fixed:
      
       - We do not allow setting forbidden flags with PEBS anymore
         (SDM 18.9.4), except for the special cycle event.
         This is done using a new constraint macro that also
         matches on the event flags (see the sketch after this list).
      
       - Correct DataLA and load/store/na flags reporting on Haswell
         [Requires a follow-on patch]
      
       - We did not allow all PEBS events on Haswell:
         We were missing some valid subevents in d1-d2 (MEM_LOAD_UOPS_RETIRED.*,
         MEM_LOAD_UOPS_L3_HIT_RETIRED.*)
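
      A sketch of the kind of flags-matching constraint macro described in
      the first item above (names modeled on the kernel's EVENT_CONSTRAINT()
      helpers; treat the exact identifiers as illustrative):

      	/* Match on the event flag bits (inv, cmask, edge, ...) as well
      	 * as the event/umask code, so PEBS constraints reject events
      	 * that set forbidden flags (SDM 18.9.4). */
      	#define INTEL_FLAGS_UEVENT_CONSTRAINT(c, n)	\
      		EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS)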
      
      This includes the changes proposed by Stephane earlier and obsoletes
      his patchkit (except for some changes on pre-Sandy Bridge/Silvermont
      CPUs).
      
      So far I have only done Sandy Bridge, Silvermont and later, mostly
      because these are the parts whose hardware behavior I could directly
      confirm with hardware architects. Also, I do not believe the older
      CPUs have any missing events in their PEBS list, so there's no
      pressing need to change them.
      
      I did not implement the flag proposed by Peter to allow
      setting forbidden flags. If really needed, this could
      be implemented on top of this patch.
      
      v2: Fix broken store events on SNB/IVB (Stephane Eranian)
      v3: More fixes. Rename some arguments (Stephane Eranian)
      v4: List most Haswell events individually again to report
      memory operation type correctly.
      Add new flags to describe load/store/na for datala.
      Update description.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Reviewed-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1407785233-32193-2-git-send-email-eranian@google.com
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Cc: Mark Davies <junk@eslaf.co.uk>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yan, Zheng <zheng.z.yan@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 16 Jul 2014, 1 commit
  3. 19 May 2014, 1 commit
  4. 06 Nov 2013, 1 commit
  5. 16 Oct 2013, 1 commit
    • perf/x86: Optimize intel_pmu_pebs_fixup_ip() · 9536c8d2
      Peter Zijlstra authored
      There have been reports of high NMI handler overhead, highlighted by
      kernel messages such as:
      
        [ 3697.380195] perf samples too long (10009 > 10000), lowering kernel.perf_event_max_sample_rate to 13000
        [ 3697.389509] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.331 msecs
      
      Don Zickus analyzed the source of the overhead and reported:
      
       > While there are a few places that are causing latencies, for now I focused on
       > the longest one first.  It seems to be 'copy_from_user_nmi'
       >
       > intel_pmu_handle_irq ->
       >	intel_pmu_drain_pebs_nhm ->
       >		__intel_pmu_drain_pebs_nhm ->
       >			__intel_pmu_pebs_event ->
       >				intel_pmu_pebs_fixup_ip ->
       >					copy_from_user_nmi
       >
       > In intel_pmu_pebs_fixup_ip(), if the while-loop goes over 50, the sum of
       > all the copy_from_user_nmi latencies seems to go over 1,000,000 cycles
       > (there are some cases where only 10 iterations are needed to go that high
       > too, but in general over 50 or so).  At this point copy_from_user_nmi
       > seems to account for over 90% of the nmi latency.
      
      The solution to that is to avoid having to call copy_from_user_nmi() for
      every instruction.
      
      Since we already limit the max basic block size, we can easily
      pre-allocate a piece of memory to copy the entire thing into in one
      go.
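
      In sketch form (simplified; insn_buffer stands for a pre-allocated
      per-CPU buffer sized for the maximum basic block, and the return-value
      convention of copy_from_user_nmi() is that of this kernel era):

      	u8 *buf = this_cpu_read(insn_buffer);
      	int size = ip - to;	/* bounded by the max basic block size */

      	/* One bulk copy instead of one copy_from_user_nmi() per insn. */
      	if (copy_from_user_nmi(buf, (void __user *)to, size) != size)
      		return 0;	/* partial copy: give up on the fixup */

      	kaddr = buf;		/* decode from the local copy from here on */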
      
      Don reported this test result:
      
       > Your patch made a huge difference in improvement.  The
       > copy_from_user_nmi() no longer hits millions of cycles.  I still
       > have a batch of 100,000-300,000 cycles.  My longest NMI paths used
       > to be dominated by copy_from_user_nmi, now it is not (I have to dig
       > up the new hot path).
      Reported-and-tested-by: Don Zickus <dzickus@redhat.com>
      Cc: jmario@redhat.com
      Cc: acme@infradead.org
      Cc: dave.hansen@linux.intel.com
      Cc: eranian@google.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20131016105755.GX10651@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 04 Oct 2013, 1 commit
  7. 20 Sep 2013, 2 commits
  8. 14 Sep 2013, 1 commit
  9. 13 Sep 2013, 2 commits
  10. 02 Sep 2013, 2 commits
  11. 27 Jun 2013, 1 commit
  12. 19 Jun 2013, 3 commits
  13. 01 Apr 2013, 3 commits
  14. 21 Mar 2013, 1 commit
    • perf/x86: Fix uninitialized pt_regs in intel_pmu_drain_bts_buffer() · 0e48026a
      Stephane Eranian authored
      This patch fixes an uninitialized pt_regs struct in the BTS drain
      function. The pt_regs struct is propagated all the way to the
      code_get_segment() function from perf_instruction_pointer()
      and may contain garbage.
      
      We cannot simply inherit the actual pt_regs from the interrupt
      because BTS must be flushed on context-switch or when the
      associated event is disabled. And there we do not have a pt_regs
      handy.
      
      Setting pt_regs to all zeroes may not be the best option but it
      is not clear what else to do given where the drain_bts_buffer()
      is called from.
      
      In V2, we move the memset() later in the code to avoid doing it
      when we end up returning early without doing the actual BTS
      processing. We also dropped the reg.val initialization because it
      is redundant with the memset(), as suggested by PeterZ.
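
      In sketch form (simplified from the description above), the drain
      function builds its sample regs roughly like this:

      	struct pt_regs regs;

      	/* No interrupt pt_regs exist when BTS is flushed on context
      	 * switch or event disable, so zero the struct rather than let
      	 * stack garbage reach perf_instruction_pointer(). */
      	memset(&regs, 0, sizeof(regs));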
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: peterz@infradead.org
      Cc: sqazi@google.com
      Cc: ak@linux.intel.com
      Cc: jolsa@redhat.com
      Link: http://lkml.kernel.org/r/20130319151038.GA25439@quad
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  15. 18 Mar 2013, 1 commit
    • perf,x86: fix wrmsr_on_cpu() warning on suspend/resume · 2a6e06b2
      Linus Torvalds authored
      Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after
      suspend/resume") fixed a crash when doing PEBS performance profiling
      after resuming, but in using init_debug_store_on_cpu() to restore the
      DS_AREA MSR it also resulted in a new WARN_ON() triggering.
      
      init_debug_store_on_cpu() uses "wrmsr_on_cpu()", which in turn uses CPU
      cross-calls to do the MSR update.  Which is not really valid at the
      early resume stage, and the warning is quite reasonable.  Now, it all
      happens to _work_, for the simple reason that smp_call_function_single()
      ends up just doing the call directly on the CPU when the CPU number
      matches, but we really should just do the wrmsr() directly instead.
      
      This duplicates the wrmsr() logic, but hopefully we can just remove the
      wrmsr_on_cpu() version eventually.
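
      The shape of the fix, as a simplified sketch (the MSR name comes from
      the kernel's DS handling; surrounding checks are omitted):

      	/* Early resume runs on the boot CPU with IRQs off; write the
      	 * DS_AREA MSR directly instead of going through the
      	 * cross-call machinery in wrmsr_on_cpu(). */
      	wrmsrl(MSR_IA32_DS_AREA, (unsigned long)ds);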
      Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 16 Mar 2013, 1 commit
  17. 19 Sep 2012, 1 commit
  18. 31 Jul 2012, 1 commit
    • perf/x86: Fix USER/KERNEL tagging of samples properly · d07bdfd3
      Peter Zijlstra authored
      Some PMUs don't provide a full register set for their sample,
      specifically 'advanced' PMUs like AMD IBS and Intel PEBS, which
      provide better accuracy than a regular interrupt.
      
      In this case we use the interrupt regs as a basis and overwrite some
      fields (typically IP) with different information.
      
      The perf core however uses user_mode() to distinguish user/kernel
      samples, and user_mode() relies on regs->cs. If the interrupt skid
      pushed us over a boundary, the new IP might not be in the same domain
      as the interrupt.
      
      Commit ce5c1fe9 ("perf/x86: Fix USER/KERNEL tagging of samples")
      tried to fix this by making the perf core use kernel_ip(). This
      however is wrong (TM), as pointed out by Linus, since it doesn't allow
      for VM86 and non-zero-based segments in IA32 mode.
      
      Therefore, provide a new helper to set the regs->ip field,
      set_linear_ip(), which massages the regs into a suitable state
      assuming the provided IP is in fact a linear address.
      
      Also modify perf_instruction_pointer() and perf_callchain_user() to
      deal with segment base offsets.
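
      A hedged sketch of such a helper (logic as described above; the exact
      VM86 and segment handling in the kernel may differ in detail):

      	static inline void set_linear_ip(struct pt_regs *regs, unsigned long ip)
      	{
      		/* ip is already linear: pick a flat code segment so
      		 * user_mode() checks on regs->cs stay truthful, and
      		 * drop any VM86 flag the interrupt regs carried. */
      		regs->cs = kernel_ip(ip) ? __KERNEL_CS : __USER_CS;
      		if (regs->flags & X86_VM_MASK)
      			regs->flags ^= X86_VM_MASK;
      		regs->ip = ip;
      	}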
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1341910954.3462.102.camel@twins
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  19. 06 Jul 2012, 1 commit
  20. 06 Jun 2012, 3 commits
  21. 09 May 2012, 1 commit
  22. 05 Mar 2012, 2 commits
  23. 03 Feb 2012, 1 commit
    • perf: Remove deprecated WARN_ON_ONCE() · 84f2b9b2
      Stephane Eranian authored
      With the new throttling/unthrottling code introduced with
      commit:
      
        e050e3f0 ("perf: Fix broken interrupt rate throttling")
      
      we occasionally hit the WARN_ON_ONCE() checks in:
      
        - intel_pmu_pebs_enable()
        - intel_pmu_lbr_enable()
        - x86_pmu_start()
      
      The assertions are no longer problematic. There is a valid path
      where they can trigger, but it is harmless.
      
      The assertion can be triggered with:
      
        $ perf record -e instructions:pp ....
      
      Leading to paths:
      
        intel_pmu_pebs_enable
        intel_pmu_enable_event
        x86_perf_event_set_period
        x86_pmu_start
        perf_adjust_freq_unthr_context
        perf_event_task_tick
        scheduler_tick
      
      And:
      
        intel_pmu_lbr_enable
        intel_pmu_enable_event
        x86_perf_event_set_period
        x86_pmu_start
        perf_adjust_freq_unthr_context
        perf_event_task_tick
        scheduler_tick
      
      cpuc->enabled is always on because when we get to
      perf_adjust_freq_unthr_context() the PMU is not totally
      disabled. Furthermore when we need to adjust a period,
      we only stop the event we need to change and not the
      entire PMU. Thus, when we re-enable, cpuc->enabled is
      already set. Note that when we stop the event, both
      pebs and lbr are stopped if necessary (and possible).
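
      Concretely, the change amounts to deleting one line from each of the
      three functions listed above (a sketch, with context trimmed):

      	/* Removed from intel_pmu_pebs_enable(), intel_pmu_lbr_enable()
      	 * and x86_pmu_start(): reachable harmlessly from the
      	 * unthrottling path, where the PMU as a whole stays enabled. */
      	WARN_ON_ONCE(cpuc->enabled);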
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Link: http://lkml.kernel.org/r/20120202110401.GA30911@quad
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  24. 14 Nov 2011, 1 commit
  25. 26 Sep 2011, 1 commit
  26. 01 Jul 2011, 2 commits
    • perf: Remove the perf_output_begin(.sample) argument · a7ac67ea
      Peter Zijlstra authored
      Since only samples call perf_output_sample(), it's much saner (and
      more correct) to put the sample logic in there than in the
      perf_output_begin()/perf_output_end() pair.
      
      Saves a useless argument, reduces conditionals and shrinks
      struct perf_output_handle, win!
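
      In signature terms the change looks roughly like this (the parameter
      list of that era is reconstructed from the description; treat it as a
      sketch):

      	/* before: every caller passed a 'sample' flag */
      	int perf_output_begin(struct perf_output_handle *handle,
      			      struct perf_event *event,
      			      unsigned int size, int sample);

      	/* after: the sample-only logic lives in perf_output_sample() */
      	int perf_output_begin(struct perf_output_handle *handle,
      			      struct perf_event *event,
      			      unsigned int size);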
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/n/tip-2crpvsx3cqu67q3zqjbnlpsc@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Remove the nmi parameter from the swevent and overflow interface · a8b0ca17
      Peter Zijlstra authored
      The nmi parameter indicated whether we could do wakeups from the
      current context; if not, we would set some state and self-IPI and let
      the resulting interrupt do the wakeup.
      
      For the various event classes:
      
        - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
          the PMI-tail (ARM etc.)
        - tracepoint: nmi=0; since tracepoint could be from NMI context.
        - software: nmi=[0,1]; some, like the schedule thing, cannot
          perform wakeups and hence need 0.
      
      As one can see, there is very little nmi=1 usage, and the down-side of
      not using it is that on some platforms some software events can have a
      jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).
      
      The up-side however is that we can remove the nmi parameter and save a
      bunch of conditionals in fast paths.
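
      For the overflow path this amounts to the following interface change
      (a sketch; the caller shown is illustrative):

      	/* before: the caller had to classify its context */
      	if (perf_event_overflow(event, 1 /* nmi */, &data, regs))
      		x86_pmu_stop(event, 0);

      	/* after: wakeups are deferred via irq_work when needed */
      	if (perf_event_overflow(event, &data, regs))
      		x86_pmu_stop(event, 0);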
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Michael Cree <mcree@orcon.net.nz>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  27. 16 Mar 2011, 1 commit