1. 10 Aug 2009, 1 commit
  2. 09 Aug 2009, 2 commits
    • perf_counter: Fix tracepoint sampling to be part of generic sampling · 3a43ce68
      Authored by Frederic Weisbecker
      Based on Peter's comments, make tracepoint sampling generic
      just like all the other sampling bits are. This is a rename
      with no code changes:
      
      - PERF_SAMPLE_TP_RECORD to PERF_SAMPLE_RAW
      - struct perf_tracepoint_record to perf_raw_record
      
      We want the mechanism that transports raw tracepoint sample
      events into the perf ring buffer to be generic and usable by
      any type of counter.
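
      For reference, the renamed bits as they appear after this change
      (a sketch based on the commit description; the sample-format bit
      and the field types shown follow include/linux/perf_counter.h of
      this era and should be treated as illustrative):

       enum perf_counter_sample_format {
               /* ... */
               PERF_SAMPLE_RAW = 1U << 10,     /* was: PERF_SAMPLE_TP_RECORD */
       };

       struct perf_raw_record {                /* was: perf_tracepoint_record */
               u32     size;                   /* size of the binary payload */
               void    *data;                  /* event-specific binary record */
       };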
      
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249698400-5441-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3a43ce68
    • perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Authored by Frederic Weisbecker
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests sampling of ftrace event records. In this case,
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the
      perf counter event buffer, as a sample.
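
      From user space, requesting this boils down to something like the
      following (a hedged sketch: it assumes the raw perf_counter_open
      syscall of this era behind an illustrative sys_perf_counter_open()
      wrapper, and a tracepoint id read from debugfs):

       struct perf_counter_attr attr;

       memset(&attr, 0, sizeof(attr));
       attr.type          = PERF_TYPE_TRACEPOINT;
       attr.config        = tracepoint_id;     /* e.g. from .../events/<ev>/id */
       attr.sample_period = 1;
       attr.sample_type   = PERF_SAMPLE_IP | PERF_SAMPLE_TID |
                            PERF_SAMPLE_PERIOD |
                            PERF_SAMPLE_TP_RECORD;  /* emit the raw record */

       fd = sys_perf_counter_open(&attr, -1 /* all tasks */, cpu, -1, 0);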
      
      Result, after setting the PERF_SAMPLE_TP_RECORD attribute from perf
      record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
       .  0040:  e0 b1 31 81 ff ff ff ff                          .......
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
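
      A decoder for the translation above might look like this (sketch;
      the layout matches struct trace_entry of this kernel, and "raw" is
      assumed to point at offset 0020 of the sample, i.e. the start of
      the ftrace binary record):

       struct trace_entry {
               unsigned short  type;           /* 0x2b = 43 */
               unsigned char   flags;
               unsigned char   preempt_count;
               int             pid;
               int             tgid;
       };

       const struct trace_entry *ent = (const struct trace_entry *)raw;

       printf("type=%u flags=%u pc=%u pid=%d tgid=%d\n",
              ent->type, ent->flags, ent->preempt_count,
              ent->pid, ent->tgid);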
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback has a cost
         even when no such sampling is wanted; this needs to be fixed
         in the future. For that we need instant access to the perf
         counter attributes. This is a matter of a flag to add in
         struct ftrace_event.
      
       - Take care of event recursion! Never try to record a lock
         event, for example: some locking is used in the profiling
         fast path and leads to tracing recursion. That will be fixed
         using raw spinlocks or recursion protection.
      
       - [...]
      
       - Profit! :-)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f413cdb8
  3. 02 Aug 2009, 1 commit
    • perf_counter: Full task tracing · 9f498cc5
      Authored by Peter Zijlstra
      In order to be able to distinguish between no samples due to
      inactivity and no samples due to the task having ended, Arjan
      asked for PERF_EVENT_EXIT events. This is useful to the boot delay
      instrumentation (bootchart) app.
      
      This patch changes the PERF_EVENT_FORK to be emitted on every
      clone, and adds PERF_EVENT_EXIT to be emitted on task exit,
      after the task's counters have been closed.
      
      This task tracing is controlled through the existing
      attr.comm || attr.mmap bits and through the new attr.task field.
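
      In attr terms that is (sketch; the bit-field names are those named
      in this commit):

       struct perf_counter_attr attr;

       memset(&attr, 0, sizeof(attr));
       attr.mmap = 1;  /* PERF_EVENT_MMAP records */
       attr.comm = 1;  /* PERF_EVENT_COMM records */
       attr.task = 1;  /* new: PERF_EVENT_FORK on every clone,
                          PERF_EVENT_EXIT once the task's counters
                          have been closed */
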
      Suggested-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      [ cleaned up perf_counter.h a bit ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9f498cc5
  4. 23 Jul 2009, 1 commit
    • perf_counter: PERF_SAMPLE_ID and inherited counters · 7f453c24
      Authored by Peter Zijlstra
      Anton noted that for inherited counters the counter-id as provided by
      PERF_SAMPLE_ID isn't mappable to the id found through PERF_FORMAT_ID
      because each inherited counter gets its own id.
      
      His suggestion was to always return the parent counter id, since that
      is the primary counter id as exposed. However, these inherited
      counters have a unique identifier so that events like
      PERF_EVENT_PERIOD and PERF_EVENT_THROTTLE can be specific about which
      counter gets modified, which is important when trying to normalize the
      sample streams.
      
      This patch removes PERF_EVENT_PERIOD in favour of PERF_SAMPLE_PERIOD,
      which is more useful anyway, since changing periods became a lot more
      common than initially thought -- rendering PERF_EVENT_PERIOD the less
      useful solution (also, PERF_SAMPLE_PERIOD reports the more accurate
      value, since it reports the value used to trigger the overflow,
      whereas PERF_EVENT_PERIOD simply reports the requested period changed,
      which might only take effect on the next cycle).
      
      This still leaves us PERF_EVENT_THROTTLE to consider, but since that
      _should_ be a rare occurrence, and linking it to a primary id is the
      most useful bit to diagnose the problem, we introduce a
      PERF_SAMPLE_STREAM_ID, for those few cases where the full
      reconstruction is important.
      
      [Does change the ABI a little, but I see no other way out]
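
      A consumer that wants both views would ask for both (sketch):

       attr.sample_type |= PERF_SAMPLE_ID;        /* parent counter id:
                                                     stable across all
                                                     inherited instances */
       attr.sample_type |= PERF_SAMPLE_STREAM_ID; /* per-instance id: matches
                                                     the id carried by e.g.
                                                     PERF_EVENT_THROTTLE */
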
      Suggested-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1248095846.15751.8781.camel@twins>
      7f453c24
  5. 30 Jun 2009, 1 commit
  6. 26 Jun 2009, 5 commits
  7. 23 Jun 2009, 1 commit
  8. 19 Jun 2009, 2 commits
    • perf_counter: Simplify and fix task migration counting · e5289d4a
      Authored by Peter Zijlstra
      The task migrations counter was causing rare and hard to decipher
      memory corruptions under load. After a day of debugging and bisection
      we found that the problem was introduced with:
      
        3f731ca6: perf_counter: Fix cpu migration counter
      
      Turning them off fixes the crashes. Incidentally, the whole
      perf_counter_task_migration() logic can be done more simply as well,
      by injecting a proper sw-counter event.
      
      This cleanup also fixed the crashes. The precise failure mode is
      not completely clear yet, but we are clearly not unhappy about
      having a fix ;-)
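
      The simplification amounts to firing a plain software-counter event
      from the scheduler's migration path rather than keeping bespoke state
      in perf_counter_task_migration() (a sketch, using the
      perf_swcounter_event() helper of this kernel):

       /* task is being migrated to another CPU */
       perf_swcounter_event(PERF_COUNT_SW_CPU_MIGRATIONS, 1, 1, NULL, 0);
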
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e5289d4a
    • perf_counter: Make callchain samples extensible · f9188e02
      Authored by Peter Zijlstra
      Before exposing upstream tools to a callchain-samples ABI, tidy it
      up to make it more extensible in the future:
      
      Use markers in the IP chain to denote context, use (u64)-1..-4095 range
      for these context markers because we use them for ERR_PTR(), so these
      addresses are unlikely to be mapped.
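
      The markers look roughly like this (a sketch of the idea; the exact
      values are an assumption based on the header of this era, not quoted
      from the commit):

       enum perf_callchain_context {
               PERF_CONTEXT_HV     = (__u64)-32,
               PERF_CONTEXT_KERNEL = (__u64)-128,
               PERF_CONTEXT_USER   = (__u64)-512,

               PERF_CONTEXT_MAX    = (__u64)-4095,
       };
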
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f9188e02
  9. 18 Jun 2009, 2 commits
    • perf_counter: Add event overflow handling · 43a21ea8
      Authored by Peter Zijlstra
      Alternative method of mmap() data output handling that provides
      better overflow management and a more reliable data stream.
      
      Unlike the previous method, which had no user->kernel feedback
      and relied on userspace keeping up, this method relies on
      userspace writing its last read position into the control page.
      
      It ensures new output doesn't overwrite not-yet-read events;
      new events for which there is no space left are lost and the
      overflow counter is incremented, providing exact event-loss
      numbers.
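
      A user-space reader cooperating with this scheme would do roughly
      the following (sketch; "consume" and the barrier macro are
      illustrative, while data_head/data_tail are the fields in the
      mmap()ed control page):

       struct perf_counter_mmap_page *pc = control_page;
       __u64 head, tail = pc->data_tail;

       head = pc->data_head;
       rmb();                         /* read head before the data */
       consume(data, tail, head);     /* process bytes in [tail, head) */
       pc->data_tail = head;          /* tell the kernel how far we got */
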
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      43a21ea8
    • perf report: Add validation of call-chain entries · 7522060c
      Authored by Ingo Molnar
      Add boundary checks for call-chain events. In case of corrupted
      entries we could crash otherwise.
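
      The check is essentially a bounds test of the advertised entry
      count against the space actually available in the event
      (illustrative sketch, not the exact tool code):

       static int validate_chain(struct ip_callchain *chain, __u64 avail)
       {
               /* a corrupted nr would make us walk past the event */
               if (chain->nr * sizeof(__u64) > avail)
                       return -1;
               return 0;
       }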
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7522060c
  10. 15 Jun 2009, 1 commit
    • perf_counter: Make set_perf_counter_pending() declaration common · 9974458e
      Authored by Paul Mackerras
      At present, every architecture that supports perf_counters has to
      declare set_perf_counter_pending() in its arch-specific headers.
      This consolidates the declarations into a single declaration in one
      common place, include/linux/perf_counter.h.  On powerpc, we continue
      to provide a static inline definition of set_perf_counter_pending()
      in the powerpc hw_irq.h.
      
      Also, this removes from the x86 perf_counter.h the unused null
      definitions of {test,clear}_perf_counter_pending.
      Reported-by: Mike Frysinger <vapier.adi@gmail.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: benh@kernel.crashing.org
      LKML-Reference: <18998.13388.920691.523227@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9974458e
  11. 12 Jun 2009, 2 commits
    • perf_counter: Add forward/backward attribute ABI compatibility · 974802ea
      Authored by Peter Zijlstra
      Provide for means of extending the perf_counter_attr in a 'natural' way.
      
      We allow growing the structure by appending fields at the end by specifying
      the full structure size inside it.
      
      When a new kernel sees a smaller (old) structure, it will 0 pad the tail.
      When an old kernel sees a larger (new) structure, it will verify the tail
      consists of 0s, otherwise fail.
      
      If we fail due to a size-mismatch, we return -E2BIG and write the kernel's
      native attribute size back into the provided structure.
      
      Furthermore, add some attribute verification, so that we'll fail counter
      creation when unknown bits are present (PERF_SAMPLE, PERF_FORMAT, or in
      the __reserved fields).
      
      (This ABI detail is introduced while keeping the existing syscall ABI.)
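
      This gives user space a simple probe-and-retry idiom (sketch;
      error handling elided, and sys_perf_counter_open() stands in for
      the raw syscall wrapper of this era):

       struct perf_counter_attr attr;

       memset(&attr, 0, sizeof(attr));
       attr.size = sizeof(attr);       /* our, possibly newer, size */
       /* ... fill in type/config/etc ... */

       fd = sys_perf_counter_open(&attr, pid, cpu, group_fd, 0);
       if (fd < 0 && errno == E2BIG) {
               /* attr.size now holds the kernel's native size: zero the
                  tail fields this kernel doesn't know about and retry */
               memset((char *)&attr + attr.size, 0, sizeof(attr) - attr.size);
               fd = sys_perf_counter_open(&attr, pid, cpu, group_fd, 0);
       }
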
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      974802ea
    • perf_counter: PERF_TYPE_HW_CACHE is a hardware counter too · f1a3c979
      Authored by Peter Zijlstra
      is_software_counter() was missing the new HW_CACHE category.
      
      ( This could have caused some counter scheduling artifacts
        with mixed sw and hw counters and counter groups. )
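
      The fix is a one-line addition to the type check; the resulting
      helper reads essentially as follows (sketch):

       static inline int is_software_counter(struct perf_counter *counter)
       {
               return (counter->attr.type != PERF_TYPE_RAW) &&
                      (counter->attr.type != PERF_TYPE_HARDWARE) &&
                      (counter->attr.type != PERF_TYPE_HW_CACHE);
       }
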
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f1a3c979
  12. 11 Jun 2009, 9 commits
  13. 10 Jun 2009, 1 commit
    • perf_counter: More aggressive frequency adjustment · bd2b5b12
      Authored by Peter Zijlstra
      Also employ the overflow handler to adjust the frequency; this results
      in a stable frequency in about 40~50 samples, instead of that many ticks.
      
      This also means we can start sampling at a sample period of 1 without
      running head-first into the throttle.
      
      It relies on sched_clock() to accurately measure the time difference
      between the overflow NMIs.
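
      The underlying arithmetic is plain rate scaling (an illustrative
      sketch, not the exact kernel code): if period P produced the last
      overflow after delta nanoseconds, the counter is running at
      P * NSEC_PER_SEC / delta events per second, so:

       events_per_sec = (P * NSEC_PER_SEC) / delta;     /* measured rate */
       new_period     = events_per_sec / sample_freq;   /* hit the target */
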
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      bd2b5b12
  14. 06 Jun 2009, 5 commits
    • perf_counter: Implement generalized cache event types · 8326f44d
      Authored by Ingo Molnar
      Extend generic event enumeration with the PERF_TYPE_HW_CACHE
      method.
      
      This is a 3-dimensional space:
      
             { L1-D, L1-I, L2, ITLB, DTLB, BPU } x
             { load, store, prefetch } x
             { accesses, misses }
      
      User-space passes in the 3 coordinates and the kernel provides
      a counter. (if the hardware supports that type and if the
      combination makes sense.)
      
      Combinations that make no sense produce a -EINVAL.
      Combinations that are not supported by the hardware produce -ENOTSUP.
      
      Extend the tools to deal with this, and rewrite the event symbol
      parsing code with various popular aliases for the units and
      access methods above. So 'l1-cache-miss' and 'l1d-read-ops' are
      both valid aliases.
      
      ( x86 is supported for now, with the Nehalem event table filled in,
        and with Core2 and Atom having placeholder tables. )
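
      The three coordinates are packed into attr.config one byte apart
      (a sketch following the PERF_TYPE_HW_CACHE encoding; here L1-data
      read misses):

       attr.type   = PERF_TYPE_HW_CACHE;
       attr.config = PERF_COUNT_HW_CACHE_L1D |                /* which unit */
                     (PERF_COUNT_HW_CACHE_OP_READ << 8) |     /* which op */
                     (PERF_COUNT_HW_CACHE_RESULT_MISS << 16); /* which result */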
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8326f44d
    • perf_counter: Separate out attr->type from attr->config · a21ca2ca
      Authored by Ingo Molnar
      Counter type is a frequently used value and we do a lot of
      bit juggling by encoding and decoding it from attr->config.
      
      Clean this up by creating a separate attr->type field.
      
      Also clean up the various similarly complex user-space bits
      all around counter attribute management.
      
      The net improvement is significant, and it will be easier
      to add a new major type (which is what triggered this cleanup).
      
      (This changes the ABI, all tools are adapted.)
      (PowerPC build-tested.)
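
      Schematically, the attribute now carries an explicit type field
      (sketch; surrounding fields elided):

       struct perf_counter_attr {
               __u32   type;    /* PERF_TYPE_HARDWARE, _SOFTWARE,
                                   _TRACEPOINT, ... */
               __u64   config;  /* purely type-specific configuration,
                                   no more type bits encoded inside */
               /* ... */
       };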
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a21ca2ca
    • perf_counter: Fix frequency adjustment for < HZ · 6a24ed6c
      Authored by Peter Zijlstra
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      6a24ed6c
    • perf_counter: Add PERF_SAMPLE_PERIOD · 689802b2
      Authored by Peter Zijlstra
      In order to allow easy tracking of the period, also provide means of
      adding it to the sample data.
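
      On the kernel side this is one more conditional field in the sample
      output (a sketch of the pattern used by the output path; names as
      in this kernel):

       if (sample_type & PERF_SAMPLE_PERIOD)
               perf_output_put(&handle, data->period);
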
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      689802b2
    • perf_counter: Change PERF_SAMPLE_CONFIG into PERF_SAMPLE_ID · ac4bcf88
      Authored by Peter Zijlstra
      The purpose of PERF_SAMPLE_CONFIG was to identify the counters;
      since then we've added counter ids, so use those instead.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ac4bcf88
  15. 05 Jun 2009, 1 commit
  16. 04 Jun 2009, 2 commits
  17. 03 Jun 2009, 3 commits