1. 01 March 2010 (1 commit)
    • perf, x86: rename macro in ARCH_PERFMON_EVENTSEL_ENABLE · bb1165d6
      Committed by Robert Richter
      For consistency reasons this patch renames
      ARCH_PERFMON_EVENTSEL0_ENABLE to ARCH_PERFMON_EVENTSEL_ENABLE.
      
      The following is performed:
      
       $ sed -i -e s/ARCH_PERFMON_EVENTSEL0_ENABLE/ARCH_PERFMON_EVENTSEL_ENABLE/g \
         arch/x86/include/asm/perf_event.h arch/x86/kernel/cpu/perf_event.c \
         arch/x86/kernel/cpu/perf_event_p6.c \
         arch/x86/kernel/cpu/perfctr-watchdog.c \
         arch/x86/oprofile/op_model_amd.c arch/x86/oprofile/op_model_ppro.c
      Signed-off-by: Robert Richter <robert.richter@amd.com>
  2. 26 February 2010 (5 commits)
    • perf_events, x86: Split PMU definitions into separate files · f22f54f4
      Committed by Peter Zijlstra
      Split the amd, p6 and intel PMU code into separate files so that we can
      easily deal with the CONFIG_CPU_SUP_* options; this is needed to keep
      things building now that perf_event.c relies on symbols from amd.c.
      (A simplified sketch of the resulting layout follows this entry.)
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
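      A minimal sketch of how such a split can be wired together, assuming
      perf_event.c pulls the per-vendor implementations back in under their
      CONFIG_CPU_SUP_* guards. The file names match the commit; the exact
      include order is illustrative:

        /* perf_event.c: per-vendor code split out, pulled back in under the
         * matching CONFIG_CPU_SUP_* option, so unsupported vendors simply
         * compile to nothing. */
        #ifdef CONFIG_CPU_SUP_AMD
        #include "perf_event_amd.c"
        #endif

        #ifdef CONFIG_CPU_SUP_INTEL
        #include "perf_event_p6.c"
        #include "perf_event_intel.c"
        #endif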
    • perf_events, x86: Remove superfluous MSR writes · 6667661d
      Committed by Peter Zijlstra
      We re-program the event control register every time we reset the count;
      this appears to be superfluous, hence remove it.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_events: Simplify code by removing cpu argument to hw_perf_group_sched_in() · 6e37738a
      Committed by Peter Zijlstra
      Since the cpu argument to hw_perf_group_sched_in() is always
      smp_processor_id(), simplify the code a little by removing this argument
      and using the current cpu where needed. (A before/after sketch of the
      signature follows this entry.)
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1265890918.5396.3.camel@laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
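      A hedged before/after sketch of the callback signature described above;
      the parameter types follow the perf_event code of that era, but treat
      this as illustrative rather than the exact diff:

        /* before: every caller passed smp_processor_id() */
        int hw_perf_group_sched_in(struct perf_event *group_leader,
                                   struct perf_cpu_context *cpuctx,
                                   struct perf_event_context *ctx, int cpu);

        /* after: the cpu argument is gone; implementations look up
         * smp_processor_id() (the current CPU) where they need it */
        int hw_perf_group_sched_in(struct perf_event *group_leader,
                                   struct perf_cpu_context *cpuctx,
                                   struct perf_event_context *ctx);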
    • perf_events, x86: AMD event scheduling · 38331f62
      Committed by Stephane Eranian
      This patch adds correct AMD NorthBridge event scheduling.
      
      NB events are events measuring L3 cache and HyperTransport traffic. They
      are identified by an event code >= 0xe0. They measure events on the
      NorthBridge, which is shared by all cores on a package. NB events are
      counted on a shared set of counters. When an NB event is programmed in a
      counter, the data actually comes from a shared counter. Thus, access to
      those counters needs to be synchronized.

      We implement the synchronization such that no two cores can be measuring
      NB events using the same counters. Thus, we maintain a per-NB allocation
      table. The available slot is propagated using the event_constraint
      structure. (A simplified sketch of the allocation idea follows this
      entry.)
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4b703957.0702d00a.6bf2.7b7d@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
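      A minimal sketch of the per-NB allocation idea, assuming one ownership
      slot per shared counter claimed with an atomic compare-and-swap
      (cmpxchg() is the kernel helper); the structure and function names are
      illustrative stand-ins, not the kernel's exact ones:

        /* One allocation table per NorthBridge: owners[i] records which event
         * currently holds shared counter i, so two cores can never program
         * the same NB counter at the same time. */
        struct nb_alloc_sketch {
                int                nb_id;        /* NorthBridge id of this package */
                struct perf_event *owners[4];    /* one slot per shared NB counter */
        };

        static int claim_nb_counter(struct nb_alloc_sketch *nb,
                                    struct perf_event *event)
        {
                int i;

                for (i = 0; i < 4; i++) {
                        /* cmpxchg() only succeeds if the slot is still free */
                        if (cmpxchg(&nb->owners[i], NULL, event) == NULL)
                                return i;        /* counter i is now ours */
                }
                return -1;                       /* all shared counters are busy */
        }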
    • perf_events: Add new start/stop PMU callbacks · d76a0812
      Committed by Stephane Eranian
      In certain situations, the kernel may need to stop and start the same
      event rapidly. The current PMU callbacks do not distinguish between stop
      and release (i.e., stop + free the resource). Thus, a counter may be
      released and then immediately re-acquired. Event scheduling will again
      take place with no guarantee of assigning the same counter. On some
      processors, this may even lead to a failure to assign the event back,
      due to competition between cores.

      This patch adds a new pair of callbacks to stop and restart a counter
      without actually releasing the underlying counter resource. On stop, the
      counter is stopped and its value saved, and that's it. On start, the
      value is reloaded and the counter restarted (on x86, the actual restart
      is delayed until perf_enable()). (A sketch of the callback shape follows
      this entry.)
      Signed-off-by: Stephane Eranian <eranian@google.com>
      [ added fallback to ->enable/->disable for all other PMUs
        fixed x86_pmu_start() to call x86_pmu.enable()
        merged __x86_pmu_disable into x86_pmu_stop() ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4b703875.0a04d00a.7896.ffffb824@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
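      An illustrative, simplified shape of the callback set after the change,
      assuming a struct pmu of that era; only the fields relevant here are
      shown and the struct name is a stand-in:

        /* stop() halts the counter and saves its value but keeps the hardware
         * counter allocated; start() reloads the saved value and restarts it,
         * so no rescheduling happens in between. */
        struct pmu_sketch {
                int  (*enable)  (struct perf_event *event);  /* acquire + start  */
                void (*disable) (struct perf_event *event);  /* stop + release   */
                void (*start)   (struct perf_event *event);  /* restart in place */
                void (*stop)    (struct perf_event *event);  /* pause in place   */
                void (*read)    (struct perf_event *event);
        };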
  3. 04 February 2010 (3 commits)
  4. 29 January 2010 (16 commits)
  5. 28 January 2010 (1 commit)
    • perf: Fix inconsistency between IP and callchain sampling · 339ce1a4
      Committed by Anton Blanchard
      When running perf across all cpus with backtracing (-a -g), sometimes we
      get samples without associated backtraces:
      
          23.44%         init  [kernel]                     [k] restore
          11.46%         init                       eeba0c  [k] 0x00000000eeba0c
           6.77%      swapper  [kernel]                     [k] .perf_ctx_adjust_freq
           5.73%         init  [kernel]                     [k] .__trace_hcall_entry
           4.69%         perf  libc-2.9.so                  [.] 0x0000000006bb8c
                             |
                             |--11.11%-- 0xfffa941bbbc
      
      It turns out the backtrace code has a check for the idle task and the
      IP sampling does not. This creates problems when profiling an
      interrupt-heavy workload (in my case 10Gbit ethernet), since we get no
      backtraces for interrupts received while idle (i.e. most of the
      workload).

      Right now x86 and sh check that current is not NULL, which should never
      happen, so remove that too.

      The idle task's exclusion must be performed by the core code, on top of
      perf_event_attr::exclude_idle. (A minimal usage sketch of that attribute
      follows this entry.)
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      LKML-Reference: <20100118054707.GT12666@kryten>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
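      A minimal userspace sketch of the attribute mentioned above; it only
      shows how a sampling event can request idle exclusion through
      perf_event_attr, and is not part of the patch itself:

        #include <string.h>
        #include <linux/perf_event.h>

        /* Request IP + callchain samples and let the core code drop samples
         * taken while the idle task runs, via exclude_idle. */
        static void init_sampling_attr(struct perf_event_attr *attr)
        {
                memset(attr, 0, sizeof(*attr));
                attr->size         = sizeof(*attr);
                attr->type         = PERF_TYPE_HARDWARE;
                attr->config       = PERF_COUNT_HW_CPU_CYCLES;
                attr->sample_type  = PERF_SAMPLE_IP | PERF_SAMPLE_CALLCHAIN;
                attr->freq         = 1;
                attr->sample_freq  = 1000;
                attr->exclude_idle = 1;   /* don't count while idle */
        }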
  6. 21 January 2010 (1 commit)
  7. 13 January 2010 (1 commit)
  8. 31 December 2009 (1 commit)
    • perf: Pass appropriate frame pointer to dump_trace() · 48b5ba9c
      Committed by Frederic Weisbecker
      Pass the frame pointer from the regs of the interrupted path
      to dump_trace() while processing the stack trace.
      
      Currently, dump_trace() takes the current bp and starts the
      callchain from dump_trace() itself. This is wasteful because
      we need to walk through the entire NMI/DEBUG stack before
      retrieving the interrupted point.
      
      We can fix that by just using the frame pointer from the captured regs;
      it points exactly where we want to start. (A simplified sketch of the
      call follows this entry.)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1262235183-5320-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
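      A hedged sketch of the x86 kernel-callchain path after the change,
      assuming the dump_trace() signature of that era; callchain_store() and
      backtrace_ops are the surrounding perf_event.c helpers, and the body is
      illustrative rather than the exact diff:

        static void perf_callchain_kernel_sketch(struct pt_regs *regs,
                                                 struct perf_callchain_entry *entry)
        {
                callchain_store(entry, PERF_CONTEXT_KERNEL);
                callchain_store(entry, regs->ip);

                /* Start the walk from the interrupted context's frame pointer
                 * (regs->bp) instead of our own, so the NMI entry frames are
                 * not walked needlessly. */
                dump_trace(NULL, regs, NULL, regs->bp, &backtrace_ops, entry);
        }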
  9. 17 December 2009 (2 commits)
    • perf events, x86/stacktrace: Fix performance/softlockup by providing a special frame pointer-only stack walker · 06d65bda
      Committed by Frederic Weisbecker
      
      It's just wasteful for stacktrace users like perf to walk through every
      entry on the stack, whereas they only accept reliable ones, i.e. those
      that the frame pointer validates.

      Since perf requires purely reliable stacktraces, it needs a stack walker
      based on frame pointers only, to optimize the stacktrace processing.
      (A simplified sketch of such a walker follows this entry.)

      This might solve some near-lockup scenarios that can be triggered by
      call-graph tracing timer events.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1261024834-5336-2-git-send-regression-fweisbec@gmail.com>
      [ v2: fix for modular builds and small detail tidyup ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
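      A minimal sketch of a frame-pointer-only walk, assuming the usual x86
      layout where each frame starts with the saved caller %bp followed by the
      return address; kernel_text_address() is the kernel helper, the other
      names are simplified stand-ins for the kernel's stack_frame handling:

        struct stack_frame_sketch {
                struct stack_frame_sketch *next_frame;      /* saved caller %bp   */
                unsigned long              return_address;  /* address to report  */
        };

        /* Follow the %bp chain and report only the return addresses the chain
         * itself vouches for, instead of scanning every word on the stack. */
        static void walk_frame_pointers(unsigned long bp,
                                        void (*record)(void *data, unsigned long addr),
                                        void *data)
        {
                struct stack_frame_sketch *frame = (void *)bp;

                while (frame && kernel_text_address(frame->return_address)) {
                        record(data, frame->return_address);
                        frame = frame->next_frame;   /* hop to the caller's frame */
                }
        }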
    • perf events, x86/stacktrace: Make stack walking optional · 61c1917f
      Committed by Frederic Weisbecker
      The current print_context_stack helper that does the stack-walking job
      is good for usual stacktraces, as it walks through the whole stack and
      reports even addresses that look unreliable, which is nice when we don't
      have frame pointers, for example.

      But we have users like perf that only require reliable stacktraces, and
      those may want a more adapted stack walker, so let's make this function
      a callback in stacktrace_ops that users can tune for their needs. (A
      simplified sketch of the resulting structure follows this entry.)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1261024834-5336-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
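      A hedged sketch of the resulting shape, assuming the stacktrace_ops of
      that era; only the callbacks relevant here are shown, the struct name is
      a stand-in, and the walk_stack prototype mirrors print_context_stack:

        struct stacktrace_ops_sketch {
                void (*address)(void *data, unsigned long addr, int reliable);
                /* ... other reporting callbacks elided ... */

                /* The stack walker itself is now a per-user callback: generic
                 * users keep the scan-everything walker, while perf can plug
                 * in a frame-pointer-only one. */
                unsigned long (*walk_stack)(struct thread_info *tinfo,
                                            unsigned long *stack, unsigned long bp,
                                            const struct stacktrace_ops_sketch *ops,
                                            void *data, unsigned long *end,
                                            int *graph);
        };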
  10. 11 December 2009 (2 commits)
  11. 06 December 2009 (1 commit)
    • x86/perf: Exclude the debug stack from the callchains · 7f33f9c5
      Committed by Frederic Weisbecker
      Dumping the callchains from breakpoint events with perf gives strange
      results:
      
      3.75%             perf  [kernel]           [k] _raw_read_unlock
                             |
                             --- _raw_read_unlock
                                 perf_callchain
                                 perf_prepare_sample
                                 __perf_event_overflow
                                 perf_swevent_overflow
                                 perf_swevent_add
                                 perf_bp_event
                                 hw_breakpoint_exceptions_notify
                                 notifier_call_chain
                                 __atomic_notifier_call_chain
                                 atomic_notifier_call_chain
                                 notify_die
                                 do_debug
                                 debug
                                 munmap
      
      The callchain is polluted by the whole debug stack. Like the NMI stack,
      the debug stack is undesired, as it is part of the profiling path and
      not helpful to the user.

      Ignore it.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
  12. 04 December 2009 (1 commit)
  13. 25 November 2009 (1 commit)
    • perf_events, x86: Fix validate_event bug · 1261a02a
      Committed by Stephane Eranian
      validate_event() was failing on valid event combinations. The function
      was assuming that if x86_schedule_event() returned 0, it meant error.
      But x86_schedule_event() returns the counter index, and 0 is a perfectly
      valid value; an error is indicated by a negative return value. (A sketch
      of the corrected check follows this entry.)

      Furthermore, validate_event() was also failing for event groups because
      event->pmu was not set until after hw_perf_event_init().
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: paulus@samba.org
      Cc: perfmon2-devel@lists.sourceforge.net
      Cc: eranian@gmail.com
      LKML-Reference: <4b0bdf36.1818d00a.07cc.25ae@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      --
       arch/x86/kernel/cpu/perf_event.c |    4 ++--
       1 file changed, 2 insertions(+), 2 deletions(-)
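      A hedged sketch of the corrected check, assuming the x86 perf_event
      helpers of that era; the function body is illustrative, not the exact
      diff:

        static int validate_event_sketch(struct cpu_hw_events *fake_cpuc,
                                         struct perf_event *event)
        {
                /* x86_schedule_event() returns the assigned counter index on
                 * success; 0 is a valid counter, only negative values mean
                 * failure, so test for < 0 rather than for 0. */
                int idx = x86_schedule_event(fake_cpuc, &event->hw);

                return idx < 0 ? -ENOSPC : 0;
        }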
  14. 12 November 2009 (1 commit)
  15. 13 October 2009 (1 commit)
    • perf_events, x86: Fix event constraints code · 7a693d3f
      Committed by Ingo Molnar
      There was a namespace overlap due to a rename I did; this caused the
      following build warning, reported by Stephen Rothwell against linux-next
      x86_64 allmodconfig:
      
        arch/x86/kernel/cpu/perf_event.c: In function 'intel_get_event_idx':
        arch/x86/kernel/cpu/perf_event.c:1445: warning: 'event_constraint' is used uninitialized in this function
      
      This is a real bug, not just a warning: fix it by renaming the global
      event-constraints table pointer to 'event_constraints'.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Stephane Eranian <eranian@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091013144223.369d616d.sfr@canb.auug.org.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  16. 09 October 2009 (2 commits)
    • perf, x86: Add simple group validation · fe9081cc
      Committed by Peter Zijlstra
      Refuse to add events when the group wouldn't fit onto the PMU anymore.

      Naive implementation. (A sketch of the counting idea follows this
      entry.)
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@gmail.com>
      LKML-Reference: <1254911461.26976.239.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
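      A minimal sketch of the naive idea: count the hardware events already in
      the group plus the one being added, and refuse if they can no longer all
      fit on the PMU's counters. The sibling_list/group_entry fields reflect
      the perf code of that era; the helper name and body are illustrative:

        static int validate_group_sketch(struct perf_event *leader,
                                         struct perf_event *new_event,
                                         int num_counters)
        {
                struct perf_event *sibling;
                int n = 1;                               /* the group leader     */

                list_for_each_entry(sibling, &leader->sibling_list, group_entry)
                        n++;                             /* existing siblings    */

                n++;                                     /* the event being added */

                return n > num_counters ? -ENOSPC : 0;   /* would not fit        */
        }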
    • perf_events: Add event constraints support for Intel processors · b690081d
      Committed by Stephane Eranian
      On some Intel processors, not all events can be measured in all
      counters; some events can only be measured in one particular counter,
      for instance. Assigning an event to the wrong counter does not crash the
      machine, but it yields bogus counts, i.e. a silent error.

      This patch changes the event-to-counter assignment logic to take event
      constraints into account for Intel P6, Core and Nehalem processors.
      There are no constraints on Intel Atom. There are constraints on Intel
      Yonah (Core Duo), but they are not provided in this patch given that
      this processor is not yet supported by perf_events. (A sketch of the
      constraint-table idea follows this entry.)

      As a result of the constraints, it is possible for some event groups to
      never actually be loaded onto the PMU if they contain two events which
      can only be measured on a single counter. That situation can be detected
      with the scaling information extracted with read().
      Signed-off-by: Stephane Eranian <eranian@gmail.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1254840129-6198-3-git-send-email-eranian@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
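      A hedged sketch of the constraint-table idea: map an event code to the
      set of counters allowed to host it, and have the scheduler consult that
      mask. The two Core 2 style entries (FP_COMP_OPS_EXE restricted to
      counter 0, FP_ASSIST to counter 1) are examples; names and values are
      simplified stand-ins, not the kernel's exact table:

        struct event_constraint_sketch {
                unsigned int  event_code;    /* architectural event select value   */
                unsigned long counter_mask;  /* bit i set => counter i may host it */
        };

        static const struct event_constraint_sketch core2_constraints_sketch[] = {
                { 0x10, 0x1 },   /* FP_COMP_OPS_EXE: counter 0 only (example) */
                { 0x11, 0x2 },   /* FP_ASSIST:       counter 1 only (example) */
                { 0, 0 },        /* end of table */
        };

        /* The scheduler restricts its counter search to this mask; an empty
         * lookup means the event is unconstrained and any counter will do. */
        static unsigned long allowed_counters(unsigned int code)
        {
                const struct event_constraint_sketch *c;

                for (c = core2_constraints_sketch; c->counter_mask; c++)
                        if (c->event_code == code)
                                return c->counter_mask;

                return ~0UL;
        }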