1. 27 Jul 2011, 1 commit
  2. 01 Jul 2011, 7 commits
  3. 09 Jun 2011, 1 commit
  4. 04 Jun 2011, 1 commit
  5. 04 May 2011, 1 commit
  6. 29 Apr 2011, 1 commit
    •
      perf events: Add generic front-end and back-end stalled cycle event definitions · 8f622422
      Committed by Ingo Molnar
      Add two generic hardware events: front-end and back-end stalled cycles.
      
      These events measure conditions when the CPU is executing code but its
      capabilities are not fully utilized. Understanding such situations and
      analyzing them is an important sub-task of code optimization workflows.
      
      Both events limit performance: most front-end stalls tend to be caused
      by branch misprediction or instruction-fetch cache misses, while
      back-end stalls can be caused by various resource shortages or
      inefficient instruction scheduling.
      
      Front-end stalls are the more important ones: code cannot run fast
      if the instruction stream is not being kept up.
      
      An over-utilized back-end can cause front-end stalls, so it has
      to be watched as well.
      
      The exact composition depends heavily on the program logic and
      instruction mix.
      
      We use the terms 'stall', 'front-end' and 'back-end' loosely and
      try to use the best available events from specific CPUs that
      approximate these concepts.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/n/tip-7y40wib8n000io7hjpn1dsrm@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8f622422
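
      [ A minimal user-space sketch, not part of the commit, of counting
        one of the two new generic events through the perf_event_open()
        syscall; the wrapper function and the workload placeholder are
        illustrative assumptions: ]

      #include <stdio.h>
      #include <string.h>
      #include <unistd.h>
      #include <sys/ioctl.h>
      #include <sys/types.h>
      #include <sys/syscall.h>
      #include <linux/perf_event.h>

      /* No glibc wrapper exists for perf_event_open(); call it directly. */
      static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
                                  int cpu, int group_fd, unsigned long flags)
      {
              return syscall(__NR_perf_event_open, attr, pid, cpu,
                             group_fd, flags);
      }

      int main(void)
      {
              struct perf_event_attr attr;
              long long count;
              int fd;

              memset(&attr, 0, sizeof(attr));
              attr.type     = PERF_TYPE_HARDWARE;
              attr.size     = sizeof(attr);
              attr.config   = PERF_COUNT_HW_STALLED_CYCLES_FRONTEND;
              attr.disabled = 1;

              /* Count for the current task, on any CPU. */
              fd = perf_event_open(&attr, 0, -1, -1, 0);
              if (fd < 0)
                      return 1;

              ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
              /* ... run the workload being analyzed ... */
              ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

              read(fd, &count, sizeof(count));
              printf("front-end stalled cycles: %lld\n", count);
              close(fd);
              return 0;
      }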
  7. 27 Apr 2011, 1 commit
  8. 05 Apr 2011, 1 commit
    •
      jump label: Introduce static_branch() interface · d430d3d7
      Committed by Jason Baron
      Introduce:
      
      static __always_inline bool static_branch(struct jump_label_key *key);
      
      instead of the old JUMP_LABEL(key, label) macro.
      
      In this way, jump labels become really easy to use:
      
      Define:
      
              struct jump_label_key jump_key;
      
      Can be used as:
      
              if (static_branch(&jump_key))
                      do unlikely code
      
      enable/disable via:
      
              jump_label_inc(&jump_key);
              jump_label_dec(&jump_key);
      
      that's it!
      
      For the jump labels disabled case, the static_branch() becomes an
      atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(),
      atomic_dec() operations. We show testing results for this change below.
      
      Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct.
      
      Since we now require a 'struct jump_label_key *key', we can store a pointer into
      the jump table addresses. In this way, we can enable/disable jump labels, in
      basically constant time. This change allows us to completely remove the previous
      hashtable scheme. Thanks to Peter Zijlstra for this re-write.
      
      Testing:
      
      I ran a series of 'tbench 20' runs 5 times (with reboots) for 3
      configurations, where tracepoints were disabled.
      
      jump label configured in
      avg: 815.6
      
      jump label *not* configured in (using atomic reads)
      avg: 800.1
      
      jump label *not* configured in (regular reads)
      avg: 803.4
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20110316212947.GA8792@redhat.com>
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Suggested-by: H. Peter Anvin <hpa@linux.intel.com>
      Tested-by: David Daney <ddaney@caviumnetworks.com>
      Acked-by: Ralf Baechle <ralf@linux-mips.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      d430d3d7
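
      [ A consolidated kernel-style sketch of the interface described
        above; the key name and the do_tracing() unlikely-path function
        are illustrative assumptions, while the API calls follow the
        commit: ]

      #include <linux/jump_label.h>

      void do_tracing(void);                       /* hypothetical unlikely path */

      static struct jump_label_key tracing_key;    /* starts disabled */

      void hot_path(void)
      {
              /* Compiles to a NOP until the key is enabled. */
              if (static_branch(&tracing_key))
                      do_tracing();                /* unlikely, out-of-line */
      }

      void tracing_enable(void)  { jump_label_inc(&tracing_key); }
      void tracing_disable(void) { jump_label_dec(&tracing_key); }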
  9. 31 Mar 2011, 2 commits
    •
      Fix common misspellings · 25985edc
      Committed by Lucas De Marchi
      Fixes generated by 'codespell' and manually reviewed.
      Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
      25985edc
    •
      perf: Fix task context scheduling · ab711fe0
      Committed by Peter Zijlstra
      Jiri reported:
      
       |
       | - once an event is created by sys_perf_event_open, task context
       |   is created and it stays even if the event is closed, until the
       |   task is finished ... that's what I see in the code and I assume
       |   it's correct
       |
       | - when the task opens event, perf_sched_events jump label is
       |   incremented and following callbacks are started from scheduler
       |
       |         __perf_event_task_sched_in
       |         __perf_event_task_sched_out
       |
       |   These *in/*out callbacks set/unset the cpuctx->task_ctx
       |   value to the task context.
       |
       | - close is called on event on CPU 0:
       |         - the task is scheduled on CPU 0
       |         - __perf_event_task_sched_in is called
       |         - cpuctx->task_ctx is set
       |         - perf_sched_events jump label is decremented and == 0
       |         - __perf_event_task_sched_out is not called
       |         - cpuctx->task_ctx on CPU 0 stays set
       |
       | - exit is called on CPU 1:
       |         - the task is scheduled on CPU 1
       |         - perf_event_exit_task is called
       |         - task_ctx_sched_out unsets cpuctx->task_ctx on CPU 1
       |         - put_ctx destroys the context
       |
       | - another call of perf_rotate_context on CPU 0 will use an
       |   invalid task_ctx pointer, and eventually panic.
       |
      
      Cure this in the simplest possible way by partially reverting the
      jump_label optimization for the sched_out case.
      Reported-and-tested-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: <stable@kernel.org> # .37+
      LKML-Reference: <1301520405.4859.213.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ab711fe0
  10. 23 Mar 2011, 1 commit
    •
      perf_events: Fix stale ->cgrp pointer in update_cgrp_time_from_cpuctx() · 68cacd29
      Committed by Stephane Eranian
      This patch solves a stale pointer problem in
      update_cgrp_time_from_cpuctx(). The cpuctx->cgrp
      was not cleared on all possible event exit paths,
      including:
      
         close()
           perf_release()
             perf_release_kernel()
               list_del_event()
      
      This patch fixes list_del_event() to clear cpuctx->cgrp
      when there are no cgroup events left in the context.
      
      [ This second version makes the code compile when
        CONFIG_CGROUP_PERF is not enabled. We unconditionally define
        perf_cpu_context->cgrp. ]
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: perfmon2-devel@lists.sf.net
      Cc: paulus@samba.org
      Cc: davem@davemloft.net
      LKML-Reference: <20110323150306.GA1580@quad>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      68cacd29
  11. 16 Mar 2011, 1 commit
  12. 04 Mar 2011, 1 commit
    •
      perf: Add support for supplementary event registers · a7e3ed1e
      Committed by Andi Kleen
      Change logs against Andi's original version:
      
      - Extends perf_event_attr::config to config{,1,2} (Peter Zijlstra)
      - Fixed a major event scheduling issue. There cannot be a ref++ on an
        event that has already done ref++ once without calling
        put_constraint() in between. (Stephane Eranian)
      - Use thread_cpumask for percore allocation. (Lin Ming)
      - Use MSR names in the extra reg lists. (Lin Ming)
      - Remove redundant "c = NULL" in intel_percore_constraints
      - Fix comment of perf_event_attr::config1
      
      Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event
      that can be used to monitor any offcore accesses from a core.
      This is a very useful event for various tunings, and it's
      also needed to implement the generic LLC-* events correctly.
      
      Unfortunately this event requires programming a mask in a separate
      register. And worse, this separate register is per core, not per
      CPU thread.
      
      This patch:
      
      - Teaches perf_events that OFFCORE_RESPONSE needs extra parameters.
        The extra parameters are passed by user space in the
        perf_event_attr::config1 field.
      
      - Adds support to the Intel perf_event core to schedule per-core
        resources. This adds fairly generic infrastructure that can
        also be used for other per-core resources.
        The basic code is patterned after the similar AMD northbridge
        constraints code.
      
      Thanks to Stephane Eranian who pointed out some problems
      in the original version and suggested improvements.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Lin Ming <ming.m.lin@intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1299119690-13991-2-git-send-email-ming.m.lin@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a7e3ed1e
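
      [ An illustrative sketch, not from the patch: user space passing
        the extra OFFCORE_RESPONSE mask through the new config1 field.
        The raw event encoding and the mask value below are placeholder
        assumptions, not validated numbers: ]

      #include <string.h>
      #include <linux/perf_event.h>

      static void setup_offcore_attr(struct perf_event_attr *attr)
      {
              memset(attr, 0, sizeof(*attr));
              attr->type    = PERF_TYPE_RAW;
              attr->size    = sizeof(*attr);
              attr->config  = 0x01b7;  /* raw OFFCORE_RESPONSE encoding (example) */
              attr->config1 = 0xffff;  /* extra register: response-type mask (example) */
      }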
  13. 16 Feb 2011, 2 commits
    •
      perf: Optimize throttling code · 163ec435
      Committed by Peter Zijlstra
      By pre-computing the maximum number of samples per tick we can avoid a
      multiplication and a conditional since MAX_INTERRUPTS >
      max_samples_per_tick.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      163ec435
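
      [ A self-contained sketch of the idea with simplified names and
        example constants; the kernel code differs in detail: ]

      /* Precomputed once (e.g. when the sysctl changes), not in the
       * interrupt hot path: */
      #define HZ_EXAMPLE               1000
      #define MAX_SAMPLE_RATE_EXAMPLE  100000

      static unsigned int max_samples_per_tick =
              (MAX_SAMPLE_RATE_EXAMPLE + HZ_EXAMPLE - 1) / HZ_EXAMPLE;

      /* Hot path: a single compare replaces a divide; and because
       * MAX_INTERRUPTS is larger than any possible max_samples_per_tick,
       * the separate "interrupts != MAX_INTERRUPTS" test is subsumed. */
      static int should_throttle(unsigned int interrupts)
      {
              return interrupts >= max_samples_per_tick;
      }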
    •
      perf: Add cgroup support · e5d1367f
      Committed by Stephane Eranian
      This kernel patch adds the ability to filter monitoring based on
      container groups (cgroups). This is for use in per-cpu mode only.
      
      The cgroup to monitor is passed as a file descriptor in the pid
      argument to the syscall. The file descriptor must be opened to
      the cgroup name in the cgroup filesystem. For instance, if the
      cgroup name is foo and cgroupfs is mounted in /cgroup, then the
      file descriptor is opened to /cgroup/foo. Cgroup mode is
      activated by passing PERF_FLAG_PID_CGROUP in the flags argument
      to the syscall.
      
      For instance to measure in cgroup foo on CPU1 assuming
      cgroupfs is mounted under /cgroup:
      
      struct perf_event_attr attr;
      int cgroup_fd, fd;

      memset(&attr, 0, sizeof(attr));   /* then fill in the event to count */
      attr.size = sizeof(attr);

      cgroup_fd = open("/cgroup/foo", O_RDONLY);
      fd = perf_event_open(&attr, cgroup_fd, 1, -1, PERF_FLAG_PID_CGROUP);
      close(cgroup_fd);
      Signed-off-by: Stephane Eranian <eranian@google.com>
      [ added perf_cgroup_{exit,attach} ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4d590250.114ddf0a.689e.4482@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e5d1367f
  14. 16 Dec 2010, 2 commits
    •
      perf: Sysfs enumeration · abe43400
      Committed by Peter Zijlstra
      Simple sysfs enumeration of the PMUs.
      
      Use a "event_source" bus, and add PMU devices using their name.
      
      Each PMU device has a type attribute which contains the value needed
      for perf_event_attr::type to identify this PMU.
      
      This is the minimal stub needed to start using this interface;
      we'll consider extending the sysfs usage later.
      
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Greg KH <gregkh@suse.de>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20101117222056.316982569@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      abe43400
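
      [ A sketch of consuming the new interface from user space,
        assuming the sysfs layout described above; the "cpu" PMU name
        in the usage comment is an example: ]

      #include <stdio.h>

      /* Read /sys/bus/event_source/devices/<pmu>/type so the value can
       * be fed into perf_event_attr::type; returns -1 on failure.
       * Usage: read_pmu_type("cpu") */
      static int read_pmu_type(const char *pmu)
      {
              char path[256];
              int type = -1;
              FILE *f;

              snprintf(path, sizeof(path),
                       "/sys/bus/event_source/devices/%s/type", pmu);
              f = fopen(path, "r");
              if (!f)
                      return -1;
              if (fscanf(f, "%d", &type) != 1)
                      type = -1;
              fclose(f);
              return type;
      }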
    •
      perf: Dynamic pmu types · 2e80a82a
      Committed by Peter Zijlstra
      Extend the perf_pmu_register() interface to allow for named and
      dynamic pmu types.
      
      Because we need to support the existing static types we cannot use
      dynamic types for everything, hence provide a type argument.
      
      If we want to enumerate the PMUs, they need a name; provide one.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20101117222056.259707703@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      2e80a82a
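
      [ A kernel-style sketch of the extended interface; the my_pmu
        structure and its name are illustrative assumptions and its
        callbacks are elided: ]

      #include <linux/perf_event.h>

      static struct pmu my_pmu = {
              /* .event_init, .add, .del, .start, .stop, .read ... */
      };

      static int __init my_pmu_init(void)
      {
              /* A negative type requests a dynamically allocated one;
               * passing a PERF_TYPE_* constant keeps a static type. */
              return perf_pmu_register(&my_pmu, "my_pmu", -1);
      }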
  15. 09 Dec 2010, 1 commit
  16. 05 Dec 2010, 2 commits
  17. 01 Dec 2010, 1 commit
  18. 26 Nov 2010, 3 commits
  19. 11 Nov 2010, 1 commit
    •
      perf_events: Fix time tracking in samples · eed01528
      Committed by Stephane Eranian
      This patch corrects time tracking in samples. Without this patch,
      both time_enabled and time_running are bogus when the user asks
      for PERF_SAMPLE_READ.
      
      One uses PERF_SAMPLE_READ to sample the values of other counters
      in each sample. Because of multiplexing, it is necessary to know
      both time_enabled and time_running to be able to scale counts correctly.
      
      In this second version of the patch, we maintain a shadow
      copy of ctx->time which allows us to compute ctx->time without
      calling update_context_time() from NMI context. We avoid the
      issue that update_context_time() must always be called with
      ctx->lock held.
      
      We do not keep shadow copies of the other event timings
      because if the lead event is overflowing then it is active
      and thus has been scheduled in via event_sched_in(), in
      which case neither tstamp_stopped nor tstamp_running can be modified.
      
      This timing logic only applies to samples when PERF_SAMPLE_READ
      is used.
      
      Note that this patch does not address timing issues related
      to sampling inheritance between tasks. This will be addressed
      in a future patch.
      
      With this patch, the libpfm4 example task_smpl now reports
      correct counts (shown on 2.4GHz Core 2):
      
      $ task_smpl -p 2400000000 -e unhalted_core_cycles:u,instructions_retired:u,baclears  noploop 5
      noploop for 5 seconds
      IIP:0x000000004006d6 PID:5596 TID:5596 TIME:466,210,211,430 STREAM_ID:33 PERIOD:2,400,000,000 ENA=1,010,157,814 RUN=1,010,157,814 NR=3
      	2,400,000,254 unhalted_core_cycles:u (33)
      	2,399,273,744 instructions_retired:u (34)
      	53,340 baclears (35)
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4cc6e14b.1e07e30a.256e.5190@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      eed01528
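
      [ A sketch of why both times matter: scaling a multiplexed count
        to the full enabled window. Names are illustrative; a real tool
        would guard the multiplication against overflow: ]

      #include <stdint.h>

      static uint64_t scale_count(uint64_t raw, uint64_t time_enabled,
                                  uint64_t time_running)
      {
              if (time_running == 0)
                      return 0;    /* the event never ran */
              /* Extrapolate: the counter only ran for time_running out
               * of time_enabled nanoseconds. */
              return raw * time_enabled / time_running;
      }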
  20. 19 Oct 2010, 5 commits
  21. 11 Oct 2010, 2 commits
  22. 17 Sep 2010, 2 commits
    •
      perf: Undo the per cpu-context timer stuff · e9d2b064
      Committed by Peter Zijlstra
      Revert the per cpu-context timers because of an unfortunate
      nohz interaction. Fixing that would have been somewhat ugly, so
      go back to driving things from the regular tick. Provide a
      jiffies interval feature for people who want slower rotations.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <20100917093009.519845633@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e9d2b064
    •
      perf: Complete software pmu grouping · b04243ef
      Committed by Peter Zijlstra
      Aside from allowing software events into a !software group,
      allow adding !software events to pure software groups.
      
      Once we've moved the software group and attached the first
      !software event, the group will no longer be a pure software
      group and hence no longer be eligible for movement, at which
      point the straight ctx comparison is correct again.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20100917093009.410784731@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b04243ef