1. 10 Aug 2012, 1 commit
    • perf: Factor __output_copy to be usable with specific copy function · 91d7753a
      Committed by Frederic Weisbecker
      Add a generic way to use the __output_copy function with a specific copy
      function via the DEFINE_PERF_OUTPUT_COPY macro.
      
      Use this to add a new __output_copy_user function that copies output
      from user-space pointers. On x86 the copy_from_user_nmi function is
      used; the other architectures use __copy_from_user_inatomic (sketched
      after this entry).
      
      This new function will be used for the user stack dump on sample,
      coming in the next patches.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-4-git-send-email-jolsa@redhat.com
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      91d7753a
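      As an illustration of the mechanism described above, a macro of roughly
      this shape can stamp out one output-copy routine per low-level copy
      helper. This is a hedged sketch, not the exact kernel code: the loop
      structure, the memcpy_common / arch_perf_out_copy_user helper names and
      the elided ring-buffer page handling are assumptions.

        /*
         * Sketch only: memcpy_func() is expected to return the number of
         * bytes it did NOT copy, following the copy_from_user() convention.
         */
        #define DEFINE_PERF_OUTPUT_COPY(func_name, memcpy_func)               \
        static unsigned long                                                   \
        func_name(struct perf_output_handle *handle,                          \
                  const void *buf, unsigned long len)                         \
        {                                                                      \
                unsigned long size, written;                                   \
                                                                               \
                do {                                                           \
                        size    = min(handle->size, len);                      \
                        written = size - memcpy_func(handle->addr, buf, size); \
                                                                               \
                        len          -= written;                               \
                        buf          += written;                               \
                        handle->addr += written;                               \
                        handle->size -= written;                               \
                        /* (advancing to the next ring-buffer page elided) */  \
                } while (len && written == size);                              \
                                                                               \
                return len;     /* bytes left uncopied */                      \
        }

        /* kernel-space source: a plain memcpy wrapped to match the convention */
        DEFINE_PERF_OUTPUT_COPY(__output_copy, memcpy_common)

        /* user-space source: copy_from_user_nmi on x86,
         * __copy_from_user_inatomic on the other architectures */
        DEFINE_PERF_OUTPUT_COPY(__output_copy_user, arch_perf_out_copy_user)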
  2. 31 Jul 2012, 1 commit
    • perf/x86: Fix USER/KERNEL tagging of samples properly · d07bdfd3
      Committed by Peter Zijlstra
      Some PMUs don't provide a full register set for their sample,
      specifically 'advanced' PMUs like AMD IBS and Intel PEBS which provide
      'better' than regular interrupt accuracy.
      
      In this case we use the interrupt regs as a basis and overwrite some
      fields (typically IP) with different information.
      
      The perf core, however, uses user_mode() to distinguish user from
      kernel samples, and user_mode() relies on regs->cs. If the interrupt
      skid pushed us over a boundary, the new IP might not be in the same
      domain as the interrupt.
      
      Commit ce5c1fe9 ("perf/x86: Fix USER/KERNEL tagging of samples")
      tried to fix this by making the perf core use kernel_ip(). This
      however is wrong (TM), as pointed out by Linus, since it doesn't allow
      for VM86 and non-zero based segments in IA32 mode.
      
      Therefore, provide a new helper to set the regs->ip field,
      set_linear_ip(), which massages the regs into a suitable state
      assuming the provided IP is in fact a linear address.
      
      Also modify perf_instruction_pointer() and perf_callchain_user() to
      deal with segment base offsets (see the sketch after this entry).
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1341910954.3462.102.camel@twins
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d07bdfd3
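      The shape of the fix can be sketched as follows. This is illustrative
      only and assumes the helpers named in the commit message (kernel_ip(),
      get_segment_base()) plus the standard x86 definitions; the real code
      differs in detail.

        /* base of the code segment, for the cases plain user_mode() gets wrong */
        static unsigned long code_segment_base(struct pt_regs *regs)
        {
                /* vm86 mode: the segment base is simply cs << 4 */
                if (regs->flags & X86_VM_MASK)
                        return 0x10 * regs->cs;

                /* non-zero based user code segment: look up its base */
                if (user_mode(regs) && regs->cs != __USER_CS)
                        return get_segment_base(regs->cs);

                return 0;
        }

        /* store an IP that is already a linear address */
        static inline void set_linear_ip(struct pt_regs *regs, unsigned long ip)
        {
                /* fake cs/flags so code_segment_base() adds nothing and the
                 * user/kernel distinction still comes out right for this ip */
                regs->flags &= ~X86_VM_MASK;
                regs->cs = kernel_ip(ip) ? __KERNEL_CS : __USER_CS;
                regs->ip = ip;
        }

        unsigned long perf_instruction_pointer(struct pt_regs *regs)
        {
                return regs->ip + code_segment_base(regs);
        }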
  3. 26 Jul 2012, 1 commit
  4. 06 Jul 2012, 3 commits
  5. 14 May 2012, 1 commit
  6. 09 May 2012, 1 commit
  7. 08 Mar 2012, 2 commits
  8. 02 Mar 2012, 1 commit
  9. 21 Dec 2011, 2 commits
  10. 07 Dec 2011, 2 commits
  11. 10 Oct 2011, 3 commits
  12. 06 Oct 2011, 1 commit
  13. 03 Jul 2011, 1 commit
    • x86: Save stack pointer in perf live regs savings · 9e46294d
      Committed by Frederic Weisbecker
      In order to prepare for fetching the stack pointer from the regs in
      dump_trace() when possible, instead of taking the local one, save the
      current stack pointer when perf saves the live regs (sketched below).
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      9e46294d
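      The change amounts to the x86 perf_arch_fetch_caller_regs() macro also
      snapshotting the stack pointer, roughly as below (a sketch; the exact
      asm constraints and surrounding field setup may differ):

        #define perf_arch_fetch_caller_regs(regs, __ip)         {       \
                (regs)->ip = (__ip);                                    \
                (regs)->bp = caller_frame_pointer();                    \
                (regs)->cs = __KERNEL_CS;                               \
                (regs)->flags = 0;                                      \
                /* new: record the current stack pointer as well */     \
                asm volatile(                                           \
                        _ASM_MOV "%%" _ASM_SP ", %0\n"                  \
                        : "=m" ((regs)->sp)                             \
                        :: "memory"                                     \
                );                                                      \
        }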
  14. 26 Nov 2010, 1 commit
  15. 15 Oct 2010, 1 commit
  16. 09 Jun 2010, 2 commits
  17. 19 Apr 2010, 1 commit
  18. 03 Apr 2010, 2 commits
    • perf, x86: implement ARCH_PERFMON_EVENTSEL bit masks · a098f448
      Committed by Robert Richter
      ARCH_PERFMON_EVENTSEL bit masks are often used in the kernel. This
      patch adds macros for the bit masks (sketched below) and removes the
      local defines. The function intel_pmu_raw_event() becomes
      x86_pmu_raw_event(), which is generic across x86 models, including P6.
      Duplicate code is removed.
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20100330092821.GH11907@erda.amd.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a098f448
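      The masks follow the architectural PERFEVTSELx MSR layout, along the
      lines of the sketch below (illustrative; the exact macro set added by
      the patch may differ slightly):

        #define ARCH_PERFMON_EVENTSEL_EVENT     0x000000FFULL   /* event select   */
        #define ARCH_PERFMON_EVENTSEL_UMASK     0x0000FF00ULL   /* unit mask      */
        #define ARCH_PERFMON_EVENTSEL_USR       (1ULL << 16)    /* count user     */
        #define ARCH_PERFMON_EVENTSEL_OS        (1ULL << 17)    /* count kernel   */
        #define ARCH_PERFMON_EVENTSEL_EDGE      (1ULL << 18)    /* edge detect    */
        #define ARCH_PERFMON_EVENTSEL_INT       (1ULL << 20)    /* APIC interrupt */
        #define ARCH_PERFMON_EVENTSEL_ANY       (1ULL << 21)    /* any-thread (HT)*/
        #define ARCH_PERFMON_EVENTSEL_ENABLE    (1ULL << 22)    /* counter enable */
        #define ARCH_PERFMON_EVENTSEL_INV       (1ULL << 23)    /* invert cmask   */
        #define ARCH_PERFMON_EVENTSEL_CMASK     0xFF000000ULL   /* counter mask   */

        /* bits a raw event may set in its config, used by x86_pmu_raw_event() */
        #define X86_RAW_EVENT_MASK                      \
                (ARCH_PERFMON_EVENTSEL_EVENT |          \
                 ARCH_PERFMON_EVENTSEL_UMASK |          \
                 ARCH_PERFMON_EVENTSEL_EDGE  |          \
                 ARCH_PERFMON_EVENTSEL_INV   |          \
                 ARCH_PERFMON_EVENTSEL_CMASK)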
    • perf, x86: Undo some *_counter* -> *_event* renames · 948b1bb8
      Committed by Robert Richter
      The big rename:
      
       cdd6c482 perf: Do the big rename: Performance Counters -> Performance Events
      
      accidentally renamed some members of structs that were named after
      registers in the spec. To avoid confusion this patch reverts some
      changes. The related specs are MSR descriptions in AMD's BKDGs and the
      ARCHITECTURAL PERFORMANCE MONITORING section in the Intel 64 and IA-32
      Architectures Software Developer's Manuals.
      
      This patch does:
      
       $ sed -i -e 's:num_events:num_counters:g' \
         arch/x86/include/asm/perf_event.h \
         arch/x86/kernel/cpu/perf_event_amd.c \
         arch/x86/kernel/cpu/perf_event.c \
         arch/x86/kernel/cpu/perf_event_intel.c \
         arch/x86/kernel/cpu/perf_event_p6.c \
         arch/x86/kernel/cpu/perf_event_p4.c \
         arch/x86/oprofile/op_model_ppro.c
      
       $ sed -i -e 's:event_bits:cntval_bits:g' -e 's:event_mask:cntval_mask:g' \
         arch/x86/kernel/cpu/perf_event_amd.c \
         arch/x86/kernel/cpu/perf_event.c \
         arch/x86/kernel/cpu/perf_event_intel.c \
         arch/x86/kernel/cpu/perf_event_p6.c \
         arch/x86/kernel/cpu/perf_event_p4.c
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1269880612-25800-2-git-send-email-robert.richter@amd.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      948b1bb8
  19. 12 Mar 2010, 1 commit
    • perf, x86: Implement initial P4 PMU driver · a072738e
      Committed by Cyrill Gorcunov
      The Netburst PMU is way different from the "architectural
      performance monitoring" specification that current CPUs use.
      P4 uses a tuple of ESCR+CCCR+COUNTER MSR registers to handle
      performance monitoring events.
      
      A few implementation details:
      
      1) We need a separate x86_pmu::hw_config helper in struct
         x86_pmu since register bit-fields are quite different from P6,
         Core and later cpu series.
      
       2) For the same reason an x86_pmu::schedule_events helper is
          introduced.
      
       3) hw_perf_event::config consists of packed ESCR+CCCR values
          (see the packing sketch after this entry). It's allowed since
          both registers in reality only use half of their size. Of course,
          before making a real write into a particular MSR we need to
          unpack the value and extend it to a proper size.
      
       4) The tuple of packed ESCR+CCCR in hw_perf_event::config
          doesn't describe the memory address of the ESCR MSR register,
          so we need to keep a mapping between the tuples used and the
          available ESCRs (various P4 events may use the same ESCRs, but
          not simultaneously). For this sake every active event has a
          per-cpu map of hw_perf_event::idx <--> ESCR addresses.
      
       5) Since hw_perf_event::idx is an offset into the counter/control
          registers, we need to lift X86_PMC_MAX_GENERIC up, otherwise the
          kernel strips it down to 8 registers and an armed event may never
          be turned off (i.e. the bit in active_mask is set but the loop
          never reaches this index to check it). Thanks to Peter Zijlstra.
      
      Restrictions:
      
       - No cascaded counters support (do we ever need them?)
       - No dependent events support (so PERF_COUNT_HW_INSTRUCTIONS
         doesn't work for now)
       - There are events with the same counters which can't work simultaneously
         (need to use intersected ones due to broken counter 1)
       - No PERF_COUNT_HW_CACHE_ events yet
      
      Todo:
      
       - Implement dependent events
       - Need proper hashing for event opcodes (no linear search, good for
         debugging stage but not in real loads)
       - Some events counted during a clock cycle -- need to set threshold
         for them and count every clock cycle just to get summary statistics
         (ie to behave the same way as other PMUs do)
       - Need to switch to use event_constraints
       - To support RAW events we need to encode a global list of P4 events
         into p4_templates
       - Cache events need to be added
      
      Event support status matrix:
      
       Event			status
       -----------------------------
       cycles			works
       cache-references	works
       cache-misses		works
       branch-misses		works
       bus-cycles		partially (does not work on 64bit cpu with HT enabled)
       instruction		doesn't work (needs dependent event [mop tagging])
       branches		doesn't work
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: Lin Ming <ming.m.lin@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100311165439.GB5129@lenovo>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a072738e
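      Point 3 above (packing ESCR+CCCR into the single 64-bit
      hw_perf_event::config) can be pictured with helpers along these lines;
      this is a sketch of the idea rather than the driver's exact macros:

        /* both registers effectively use only 32 bits, so keep the ESCR
         * value in the upper half of config and the CCCR in the lower half,
         * and unpack them again before the real MSR writes */
        #define p4_config_pack_escr(v)          (((u64)(v)) << 32)
        #define p4_config_pack_cccr(v)          (((u64)(v)) & 0xffffffffULL)
        #define p4_config_unpack_escr(v)        (((u64)(v)) >> 32)
        #define p4_config_unpack_cccr(v)        (((u64)(v)) & 0xffffffffULL)

        /* e.g. when arming an event (addresses come from the per-cpu map):
         *   wrmsrl(escr_msr_addr, p4_config_unpack_escr(hwc->config));
         *   wrmsrl(cccr_msr_addr, p4_config_unpack_cccr(hwc->config));
         */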
  20. 10 Mar 2010, 1 commit
    • perf, x86: use LBR for PEBS IP+1 fixup · ef21f683
      Committed by Peter Zijlstra
      Use the LBR to fix up the PEBS IP+1 issue.
      
      As said, PEBS reports the next instruction; here we use the LBR to find
      the last branch and from that construct the actual IP. If the IP matches
      the LBR-TO address, we use LBR-FROM; otherwise we use the LBR-TO address
      as the beginning of the last basic block and decode forward.
      
      Once we find a match for the current IP, we use the previous location
      (sketched below).
      
      This patch introduces a new ABI element: PERF_RECORD_MISC_EXACT, which
      conveys that the reported IP (PERF_SAMPLE_IP) is the exact instruction
      that caused the event (barring CPU errata).
      
      The fixup can fail due to various reasons:
      
       1) LBR contains invalid data (quite possible)
       2) part of the basic block got paged out
       3) the reported IP isn't part of the basic block (see 1)
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <20100304140100.619375431@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ef21f683
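      In rough pseudo-C the fixup loop looks like this. It is a sketch only:
      the single-entry lbr_entry[] access and the insn_length() helper are
      assumptions standing in for the real LBR and instruction-decoder code,
      and error handling (e.g. the basic block being paged out) is elided.

        static int pebs_fixup_ip(struct pt_regs *regs)
        {
                unsigned long from = lbr_entry[0].from; /* last branch source */
                unsigned long to   = lbr_entry[0].to;   /* last branch target */
                unsigned long ip   = regs->ip;          /* PEBS: insn AFTER the event */
                unsigned long prev, cur;

                if (ip == to) {
                        /* the branch instruction itself caused the event */
                        regs->ip = from;
                        return 1;                       /* exact */
                }

                /* walk the basic block [to, ip) remembering the previous insn */
                prev = cur = to;
                while (cur < ip) {
                        prev = cur;
                        cur += insn_length(cur);        /* decode forward */
                }

                if (cur == ip) {
                        regs->ip = prev;                /* the insn before ip */
                        return 1;                       /* exact: tag PERF_RECORD_MISC_EXACT */
                }

                return 0;       /* LBR didn't cover ip; leave the skid uncorrected */
        }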
  21. 02 Mar 2010, 1 commit
  22. 01 Mar 2010, 3 commits
  23. 29 Jan 2010, 2 commits
    • perf_events, x86: Fix event constraint masks · ed8777fc
      Committed by Peter Zijlstra
      Since constraints are specified on the event number, not the event
      number and unit mask, shorten the constraint masks so that we'll
      actually match something (illustrated below).
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <20100127221121.967610372@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ed8777fc
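      The gist, with illustrative names and mask values rather than the exact
      macros of the time: a constraint's code carries only the 8-bit event
      number, so masking the config with the wider event+umask field could
      never produce a match.

        /* before (broken): the mask covered event select AND unit mask */
        #define INTEL_EVENT_CONSTRAINT_WIDE(c, n)   EVENT_CONSTRAINT(c, n, 0xffffULL)

        /* after: compare only the 8-bit event select field */
        #define INTEL_EVENT_CONSTRAINT(c, n)        EVENT_CONSTRAINT(c, n, 0xffULL)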
    • perf_events, x86: Improve x86 event scheduling · 1da53e02
      Committed by Stephane Eranian
      This patch improves event scheduling by maximizing the use of PMU
      registers regardless of the order in which events are created in a group.
      
      The algorithm takes into account the list of counter constraints for each
      event. It assigns events to counters from the most constrained, i.e. an
      event that works on only one counter, to the least constrained, i.e. an
      event that works on any counter (sketched after this entry).
      
      Intel Fixed counter events and the BTS special event are also handled via
      this algorithm which is designed to be fairly generic.
      
      The patch also updates the validation of an event to use the scheduling
      algorithm. This will cause early failure in perf_event_open().
      
      The 2nd version of this patch follows the model used by PPC, by running
      the scheduling algorithm and the actual assignment separately. Actual
      assignment takes place in hw_perf_enable() whereas scheduling is
      implemented in hw_perf_group_sched_in() and x86_pmu_enable().
      Signed-off-by: Stephane Eranian <eranian@google.com>
      [ fixup whitespace and style nits as well as adding is_x86_event() ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4b5430c6.0f975e0a.1bf9.ffff85fe@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1da53e02
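      A self-contained sketch of the constraint-weight-ordered assignment
      idea (not the kernel's data structures; the real x86_schedule_events()
      also deals with fixed counters, BTS and incremental re-use of previous
      assignments):

        #include <linux/types.h>

        struct sched_constraint {
                u64 idxmsk;     /* bitmask of counters this event may use */
                int weight;     /* number of bits set in idxmsk */
        };

        /* fills assign[] and returns 0, or -1 if the group can't be scheduled */
        static int sched_events(struct sched_constraint *c, int n, int *assign)
        {
                u64 used = 0;
                int w, i, idx;

                /* most constrained events (weight 1) first, least constrained last */
                for (w = 1; w <= 64; w++) {
                        for (i = 0; i < n; i++) {
                                if (c[i].weight != w)
                                        continue;

                                /* first counter the event may use that is still free */
                                for (idx = 0; idx < 64; idx++) {
                                        if ((c[i].idxmsk & (1ULL << idx)) &&
                                            !(used & (1ULL << idx)))
                                                break;
                                }
                                if (idx == 64)
                                        return -1;

                                used |= 1ULL << idx;
                                assign[i] = idx;
                        }
                }
                return 0;
        }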
  24. 21 Jan 2010, 1 commit
  25. 09 Oct 2009, 1 commit
  26. 21 Sep 2009, 1 commit
    • perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Committed by Ingo Molnar
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out of its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring and analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in all, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names (in an ABI-compatible fashion).
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      cdd6c482
  27. 09 Aug 2009, 1 commit
  28. 26 Jun 2009, 1 commit