1. 08 Mar, 2016 · 1 commit
  2. 04 Aug, 2015 · 1 commit
  3. 12 Nov, 2014 · 1 commit
  4. 13 Aug, 2014 · 1 commit
    • perf/x86: Revamp PEBS event selection · 86a04461
      Andi Kleen authored
      The basic idea is that it does not make sense to list all PEBS
      events individually. The list is very long, sometimes outdated,
      and the hardware doesn't need it. If an event does not support
      PEBS it will simply not count; there is no security issue.
      
      We only need to list events that do something special, like
      supporting load or store addresses.
      
      This vastly simplifies the PEBS event selection. It also
      speeds up the scheduling because the scheduler doesn't
      have to walk as many constraints.
      
      Bugs fixed:
      
       - We do not allow setting forbidden flags with PEBS anymore
         (SDM 18.9.4), except for the special cycle event.
         This is done using a new constraint macro that also
         matches on the event flags (see the sketch after this list).
      
       - Correct DataLA and load/store/na flags reporting on Haswell
         [Requires a followon patch]
      
       - We did not allow all PEBS events on Haswell:
         We were missing some valid subevents in d1-d2 (MEM_LOAD_UOPS_RETIRED.*,
         MEM_LOAD_UOPS_RETIRED_L3_HIT_RETIRED.*)
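      
      For illustration, a minimal sketch of a flags-matching constraint,
      modeled loosely on the kernel's EVENT_CONSTRAINT machinery (the
      struct layout, macro names and flag-bit region below are simplified
      placeholders, not the exact macros this patch adds):
      
       struct event_constraint {
       	u64 code;	/* event select + flag bits the event must match */
       	u64 cmask;	/* which config bits participate in the match    */
       };
      
       /* Classic constraint: match only the event-select/umask fields. */
       #define EVENT_CONSTRAINT(c, m)	{ .code = (c), .cmask = (m) }
      
       /* Flag-matching variant: additionally compare the EVENTSEL flag
        * bits (USR/OS/EDGE/INV/CMASK, ...), so an event carrying a
        * forbidden flag fails the match and is rejected for PEBS. */
       #define ALL_EVENT_FLAGS	0xffff0000ULL	/* assumed flag-bit region */
       #define FLAGS_EVENT_CONSTRAINT(c, m) \
       	{ .code = (c), .cmask = (m) | ALL_EVENT_FLAGS }
      
       static int constraint_matches(struct event_constraint *ec, u64 config)
       {
       	return (config & ec->cmask) == ec->code;
       }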
      
      This includes the changes proposed by Stephane earlier and obsoletes
      his patchkit (except for some changes on pre-Sandy Bridge/Silvermont
      CPUs).
      
      So far I have only done Sandy Bridge, Silvermont and later parts,
      mostly because these are the ones whose hardware behavior I could
      confirm directly with hardware architects. Also, I do not believe
      the older CPUs are missing any events in their PEBS lists, so
      there's no pressing need to change them.
      
      I did not implement the flag proposed by Peter to allow
      setting forbidden flags. If really needed, this could
      be implemented on top of this patch.
      
      v2: Fix broken store events on SNB/IVB (Stephane Eranian)
      v3: More fixes. Rename some arguments (Stephane Eranian)
      v4: List most Haswell events individually again to report
      memory operation type correctly.
      Add new flags to describe load/store/na for datala.
      Update description.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Reviewed-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1407785233-32193-2-git-send-email-eranian@google.com
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Cc: Mark Davies <junk@eslaf.co.uk>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yan, Zheng <zheng.z.yan@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 19 Jun, 2013 · 1 commit
  6. 16 Feb, 2013 · 1 commit
  7. 07 Feb, 2013 · 1 commit
  8. 10 Aug, 2012 · 1 commit
    • perf: Factor __output_copy to be usable with specific copy function · 91d7753a
      Frederic Weisbecker authored
      Add a generic way to use the __output_copy function with a specific
      copy function via the DEFINE_PERF_OUTPUT_COPY macro.
      
      Use this to add a new __output_copy_user function that copies output
      from user pointers. On x86 the copy_from_user_nmi function is used;
      the rest of the architectures use __copy_from_user_inatomic.
      
      This new function will be used for the user stack dump on sample,
      coming in the next patches.
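      
      For illustration, a simplified sketch of the macro pattern (the real
      version operates on struct perf_output_handle inside the ring-buffer
      code and handles wrap-around, omitted here; the copy functions
      return the number of bytes NOT copied, copy_from_user() style):
      
       #define DEFINE_PERF_OUTPUT_COPY(func_name, copy_fn)		\
       static unsigned long						\
       func_name(struct perf_output_handle *handle,			\
       	   const void *buf, unsigned long len)			\
       {								\
       	unsigned long written = len - copy_fn(handle->addr, buf, len); \
       	handle->addr += written;				\
       	handle->size -= written;				\
       	return len - written;	/* bytes left uncopied */	\
       }
      
       /* Kernel-space copy cannot fault: plain memcpy, nothing left over. */
       static unsigned long memcpy_common(void *dst, const void *src,
       				    unsigned long n)
       {
       	memcpy(dst, src, n);
       	return 0;
       }
      
       DEFINE_PERF_OUTPUT_COPY(__output_copy, memcpy_common)
      
       /* User-space copy may fault: on x86 the copier would be
        * copy_from_user_nmi(); other architectures would use
        * __copy_from_user_inatomic(). */
       DEFINE_PERF_OUTPUT_COPY(__output_copy_user, copy_from_user_nmi)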
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-4-git-send-email-jolsa@redhat.com
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  9. 31 Jul, 2012 · 1 commit
    • perf/x86: Fix USER/KERNEL tagging of samples properly · d07bdfd3
      Peter Zijlstra authored
      Some PMUs don't provide a full register set for their sample,
      specifically 'advanced' PMUs like AMD IBS and Intel PEBS, which
      provide 'better' accuracy than regular interrupts.
      
      In this case we use the interrupt regs as basis and over-write some
      fields (typically IP) with different information.
      
      The perf core however uses user_mode() to distinguish user/kernel
      samples, and user_mode() relies on regs->cs. If the interrupt skid
      pushed us over a boundary, the new IP might not be in the same
      domain as the interrupt.
      
      Commit ce5c1fe9 ("perf/x86: Fix USER/KERNEL tagging of samples")
      tried to fix this by making the perf core use kernel_ip(). This
      however is wrong (TM), as pointed out by Linus, since it doesn't allow
      for VM86 and non-zero based segments in IA32 mode.
      
      Therefore, provide a new helper to set the regs->ip field,
      set_linear_ip(), which massages the regs into a suitable state
      assuming the provided IP is in fact a linear address.
      
      Also modify perf_instruction_pointer() and perf_callchain_user() to
      deal with segments base offsets.
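      
      A rough sketch of the direction (illustrative only: the real
      set_linear_ip() must also cope with V8086 mode and non-zero segment
      bases; the simple version below shows only the basic ip/cs massaging):
      
       /* Linear kernel addresses live above the user address range. */
       static int kernel_ip(unsigned long ip)
       {
       	return ip > TASK_SIZE_MAX;
       }
      
       /* Rewrite regs->ip with a linear address and fix up regs->cs so a
        * user_mode()-style check agrees with the new IP's domain. */
       static void set_linear_ip(struct pt_regs *regs, unsigned long ip)
       {
       	regs->ip = ip;
       	regs->cs = kernel_ip(ip) ? __KERNEL_CS : __USER_CS;
       }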
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1341910954.3462.102.camel@twins
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  10. 26 Jul, 2012 · 1 commit
  11. 06 Jul, 2012 · 3 commits
  12. 14 May, 2012 · 1 commit
  13. 09 May, 2012 · 1 commit
  14. 08 Mar, 2012 · 2 commits
  15. 02 Mar, 2012 · 1 commit
  16. 21 Dec, 2011 · 2 commits
  17. 07 Dec, 2011 · 2 commits
  18. 10 Oct, 2011 · 3 commits
  19. 06 Oct, 2011 · 1 commit
  20. 03 Jul, 2011 · 1 commit
    • x86: Save stack pointer in perf live regs savings · 9e46294d
      Frederic Weisbecker authored
      To prepare for fetching the stack pointer from the regs in
      dump_trace() when possible, instead of taking the local one, save
      the current stack pointer when perf saves the live regs.
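      
      A sketch of the resulting x86-64 macro shape (names follow the
      kernel's perf_arch_fetch_caller_regs(), but this is a simplified
      illustration, not the verbatim patch):
      
       #define perf_arch_fetch_caller_regs(regs, __ip) do {		\
       	(regs)->ip = (__ip);						\
       	(regs)->bp = (unsigned long)__builtin_frame_address(0);	\
       	/* the new part: also snapshot the live stack pointer */	\
       	asm volatile("mov %%rsp, %0" : "=m" ((regs)->sp));		\
       } while (0)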
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
  21. 26 Nov, 2010 · 1 commit
  22. 15 Oct, 2010 · 1 commit
  23. 09 Jun, 2010 · 2 commits
  24. 19 Apr, 2010 · 1 commit
  25. 03 Apr, 2010 · 2 commits
    • perf, x86: implement ARCH_PERFMON_EVENTSEL bit masks · a098f448
      Robert Richter authored
      ARCH_PERFMON_EVENTSEL bit masks are often used in the kernel. This
      patch adds macros for the bit masks and removes the local defines.
      The function intel_pmu_raw_event() becomes x86_pmu_raw_event(),
      which is generic across x86 models; the same is done for p6.
      Duplicate code is removed.
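      
      For reference, the EVENTSEL bit layout behind these masks (values as
      documented in the SDM's architectural performance monitoring
      chapter; the exact macro set in the patch may differ slightly):
      
       #define ARCH_PERFMON_EVENTSEL_EVENT	0x000000FFULL	/* event select   */
       #define ARCH_PERFMON_EVENTSEL_UMASK	0x0000FF00ULL	/* unit mask      */
       #define ARCH_PERFMON_EVENTSEL_USR	(1ULL << 16)	/* count ring 3   */
       #define ARCH_PERFMON_EVENTSEL_OS	(1ULL << 17)	/* count ring 0   */
       #define ARCH_PERFMON_EVENTSEL_EDGE	(1ULL << 18)	/* edge detect    */
       #define ARCH_PERFMON_EVENTSEL_PIN_CONTROL (1ULL << 19)
       #define ARCH_PERFMON_EVENTSEL_INT	(1ULL << 20)	/* APIC interrupt */
       #define ARCH_PERFMON_EVENTSEL_ANY	(1ULL << 21)	/* any-thread     */
       #define ARCH_PERFMON_EVENTSEL_ENABLE	(1ULL << 22)	/* counter enable */
       #define ARCH_PERFMON_EVENTSEL_INV	(1ULL << 23)	/* invert cmask   */
       #define ARCH_PERFMON_EVENTSEL_CMASK	0xFF000000ULL	/* counter mask   */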
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20100330092821.GH11907@erda.amd.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf, x86: Undo some *_counter* -> *_event* renames · 948b1bb8
      Robert Richter authored
      The big rename:
      
       cdd6c482 perf: Do the big rename: Performance Counters -> Performance Events
      
      accidentally renamed some members of structs that were named after
      registers in the spec. To avoid confusion, this patch reverts those
      changes. The related specs are the MSR descriptions in AMD's BKDGs
      and the ARCHITECTURAL PERFORMANCE MONITORING section in the Intel 64
      and IA-32 Architectures Software Developer's Manuals.
      
      This patch does:
      
       $ sed -i -e 's:num_events:num_counters:g' \
         arch/x86/include/asm/perf_event.h \
         arch/x86/kernel/cpu/perf_event_amd.c \
         arch/x86/kernel/cpu/perf_event.c \
         arch/x86/kernel/cpu/perf_event_intel.c \
         arch/x86/kernel/cpu/perf_event_p6.c \
         arch/x86/kernel/cpu/perf_event_p4.c \
         arch/x86/oprofile/op_model_ppro.c
      
       $ sed -i -e 's:event_bits:cntval_bits:g' -e 's:event_mask:cntval_mask:g' \
         arch/x86/kernel/cpu/perf_event_amd.c \
         arch/x86/kernel/cpu/perf_event.c \
         arch/x86/kernel/cpu/perf_event_intel.c \
         arch/x86/kernel/cpu/perf_event_p6.c \
         arch/x86/kernel/cpu/perf_event_p4.c
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1269880612-25800-2-git-send-email-robert.richter@amd.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  26. 12 Mar, 2010 · 1 commit
    • perf, x86: Implement initial P4 PMU driver · a072738e
      Cyrill Gorcunov authored
      The Netburst PMU is way different from the "architectural
      performance monitoring" specification that current CPUs use.
      P4 uses a tuple of ESCR+CCCR+COUNTER MSR registers to handle
      performance monitoring events.
      
      A few implementation details:
      
      1) We need a separate x86_pmu::hw_config helper in struct
         x86_pmu since register bit-fields are quite different from the
         P6, Core and later CPU series.
      
      2) For the same reason, an x86_pmu::schedule_events helper is
         introduced.
      
      3) hw_perf_event::config consists of packed ESCR+CCCR values.
         This is allowed since in reality both registers only use half
         of their size. Of course, before making a real write into a
         particular MSR we need to unpack the value and extend it to
         the proper size (see the packing sketch after this list).
      
      4) The packed ESCR+CCCR tuple in hw_perf_event::config
         doesn't describe the memory address of the ESCR MSR, so
         we need to keep a mapping between the tuples in use and the
         available ESCRs (various P4 events may use the same ESCR,
         but not simultaneously); for this sake every active event
         has a per-cpu map of hw_perf_event::idx <--> ESCR
         addresses.
      
      5) Since hw_perf_event::idx is an offset into the counter/control
         registers, we need to lift X86_PMC_MAX_GENERIC up; otherwise the
         kernel strips it down to 8 registers and an armed event may never
         be turned off (ie the bit in active_mask is set but the loop never
         reaches this index to check); thanks to Peter Zijlstra.
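      
      A minimal sketch of the packing from item 3 (helper names modeled
      on the driver's config helpers; simplified for illustration):
      
       /* ESCR and CCCR each fit comfortably in 32 bits, so one 64-bit
        * config word carries the pair: ESCR high, CCCR low. */
       static u64 p4_config_pack(u32 escr, u32 cccr)
       {
       	return ((u64)escr << 32) | cccr;
       }
      
       static u32 p4_config_unpack_escr(u64 config)
       {
       	return (u32)(config >> 32);
       }
      
       static u32 p4_config_unpack_cccr(u64 config)
       {
       	return (u32)(config & 0xffffffffULL);
       }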
      
      Restrictions:
      
       - No cascaded counters support (do we ever need them?)
       - No dependent events support (so PERF_COUNT_HW_INSTRUCTIONS
         doesn't work for now)
       - There are events with same counters which can't work simultaneously
         (need to use intersected ones due to broken counter 1)
       - No PERF_COUNT_HW_CACHE_ events yet
      
      Todo:
      
       - Implement dependent events
       - Need proper hashing for event opcodes (no linear search, good for
         debugging stage but not in real loads)
       - Some events counted during a clock cycle -- need to set threshold
         for them and count every clock cycle just to get summary statistics
         (ie to behave the same way as other PMUs do)
        - Need to switch to use event_constraints
       - To support RAW events we need to encode a global list of P4 events
         into p4_templates
       - Cache events need to be added
      
      Event support status matrix:
      
       Event			status
       -----------------------------
       cycles			works
       cache-references	works
       cache-misses		works
       branch-misses		works
       bus-cycles		partially (does not work on 64bit cpu with HT enabled)
       instruction		doesn't work (needs dependent event [mop tagging])
       branches		doesn't work
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: Lin Ming <ming.m.lin@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100311165439.GB5129@lenovo>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  27. 10 Mar, 2010 · 1 commit
    • perf, x86: use LBR for PEBS IP+1 fixup · ef21f683
      Peter Zijlstra authored
      Use the LBR to fix up the PEBS IP+1 issue.
      
      As said, PEBS reports the next instruction; here we use the LBR to
      find the last branch and from that construct the actual IP. If the
      IP matches the LBR-TO, we use LBR-FROM; otherwise we use the LBR-TO
      address as the beginning of the last basic block and decode forward.
      
      Once we find a match for the current IP, we use the previous
      location (a sketch of this walk follows below).
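      
      A sketch of that walk (illustrative; it assumes an insn_length()
      decoder helper and the most recent LBR pair, and skips the code
      copying and error handling the real fixup needs):
      
       static int pebs_fixup_ip(u64 *sample_ip, u64 lbr_from, u64 lbr_to)
       {
       	u64 prev, ip = lbr_to;	/* start of the last basic block */
      
       	/* Trivial case: the reported IP is exactly the branch target. */
       	if (*sample_ip == lbr_to) {
       		*sample_ip = lbr_from;
       		return 1;
       	}
      
       	/* Decode forward until we step onto the reported IP; the
       	 * instruction before it is the real sample IP. */
       	do {
       		prev = ip;
       		ip += insn_length(ip);	/* assumed decoder helper */
       	} while (ip < *sample_ip);
      
       	if (ip != *sample_ip)
       		return 0;	/* IP not in this block: fixup failed */
      
       	*sample_ip = prev;
       	return 1;
       }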
      
      This patch introduces a new ABI element: PERF_RECORD_MISC_EXACT, which
      conveys that the reported IP (PERF_SAMPLE_IP) is the exact instruction
      that caused the event (barring CPU errata).
      
      The fixup can fail due to various reasons:
      
       1) LBR contains invalid data (quite possible)
       2) part of the basic block got paged out
       3) the reported IP isn't part of the basic block (see 1)
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <20100304140100.619375431@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  28. 02 Mar, 2010 · 1 commit
  29. 01 Mar, 2010 · 3 commits