1. 29 Oct, 2013 1 commit
  2. 04 Oct, 2013 1 commit
    • perf: Add generic transaction flags · fdfbbd07
      Committed by Andi Kleen
      Add a generic qualifier for transaction events, as a new sample
      type that returns a flag word. This is particularly useful
      for qualifying aborts: to distinguish aborts which happen
      due to asynchronous events (like conflicts caused by another
      CPU) versus instructions that lead to an abort.
      
      The tuning strategies are very different for those cases,
      so it's important to distinguish them easily and early.
      
      Since it's inconvenient and inflexible to filter for this
      in the kernel we report all the events out and allow
      some post processing in user space.
      
      The flags are based on the Intel TSX events, but should be fairly
      generic and mostly applicable to other HTM architectures too. In addition
      to various flag words there's also reserved space to report a
      program-supplied abort code. For TSX this is used to distinguish specific
      classes of aborts, like a lock busy abort when doing lock elision.
      
      Flags:
      
      Elision and generic transactions                            (ELISION vs TRANSACTION)
      (HLE vs RTM on TSX; IBM etc. would likely only use TRANSACTION)
      Aborts caused by current thread vs aborts caused by others  (SYNC vs ASYNC)
      Retryable transaction                                       (RETRY)
      Conflicts with other threads                                (CONFLICT)
      Transaction write capacity overflow                         (CAPACITY WRITE)
      Transaction read capacity overflow                          (CAPACITY READ)
      
      Transactions implicitly aborted can also return an abort code.
      This can be used to signal specific events to the profiler. A common
      case is abort on lock busy in an RTM eliding library (code 0xff).
      To handle this case we include the TSX abort code.
      
      Common example aborts in TSX would be:
      
      - Data conflict with another thread on memory read.
                                            Flags: TRANSACTION|ASYNC|CONFLICT
      - Executing a WRMSR in a transaction. Flags: TRANSACTION|SYNC
      - HLE transaction in user space is too large
                                            Flags: ELISION|SYNC|CAPACITY-WRITE
      
      The only flag that is somewhat TSX specific is ELISION.
      
      This adds the perf core glue needed for reporting the new flag word out.
      
      v2: Add MEM/MISC
      v3: Move transaction to the end
      v4: Separate capacity-read/write and remove misc
      v5: Remove _SAMPLE. Move abort flags to 32bit. Rename
          transaction to txn
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1379688044-14173-2-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
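To illustrate the user-space post-processing that this commit message describes, here is a minimal C sketch that renders a transaction flag word in the `A|B|C` notation used in the examples above and extracts the abort code. The `TXN_*` constants, `txn_flags_to_str()` and `txn_abort_code()` are hypothetical names local to this sketch. The bit positions and the placement of the abort code in the upper 32 bits are assumed to match the `PERF_TXN_*` definitions in `linux/perf_event.h`; check them against the installed headers before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Local mirror of the flag bits (assumption: same layout as PERF_TXN_*). */
#define TXN_ELISION        (1ULL << 0)
#define TXN_TRANSACTION    (1ULL << 1)
#define TXN_SYNC           (1ULL << 2)
#define TXN_ASYNC          (1ULL << 3)
#define TXN_RETRY          (1ULL << 4)
#define TXN_CONFLICT       (1ULL << 5)
#define TXN_CAPACITY_WRITE (1ULL << 6)
#define TXN_CAPACITY_READ  (1ULL << 7)
#define TXN_ABORT_SHIFT    32
#define TXN_ABORT_MASK     (0xffffffffULL << TXN_ABORT_SHIFT)

static const struct {
    uint64_t bit;
    const char *name;
} txn_names[] = {
    { TXN_ELISION,        "ELISION" },
    { TXN_TRANSACTION,    "TRANSACTION" },
    { TXN_SYNC,           "SYNC" },
    { TXN_ASYNC,          "ASYNC" },
    { TXN_RETRY,          "RETRY" },
    { TXN_CONFLICT,       "CONFLICT" },
    { TXN_CAPACITY_WRITE, "CAPACITY-WRITE" },
    { TXN_CAPACITY_READ,  "CAPACITY-READ" },
};

/* Write the names of all set flags into buf, separated by '|'.
 * Output is silently cut short if buf is too small. Returns buf. */
static char *txn_flags_to_str(uint64_t txn, char *buf, size_t len)
{
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(txn_names) / sizeof(txn_names[0]); i++) {
        if (!(txn & txn_names[i].bit))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? "|" : "", txn_names[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* no room left for this name */
        used += (size_t)n;
    }
    return buf;
}

/* Extract the program-supplied abort code from the upper 32 bits. */
static uint32_t txn_abort_code(uint64_t txn)
{
    return (uint32_t)((txn & TXN_ABORT_MASK) >> TXN_ABORT_SHIFT);
}
```

A profiler could call `txn_flags_to_str()` on each sample's flag word to group aborts by cause, and use `txn_abort_code()` to spot library-specific codes such as the 0xff lock-busy case mentioned above.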
  3. 20 Sep, 2013 2 commits
    • perf: Fix capabilities bitfield compatibility in 'struct perf_event_mmap_page' · fa731587
      Committed by Peter Zijlstra
      Solve the problems around the broken definition of the
      perf_event_mmap_page::cap_usr_time and cap_usr_rdpmc fields, which
      used to overlap, partially fixed by:
      
        860f085b ("perf: Fix broken union in 'struct perf_event_mmap_page'")
      
      The problem with the fix (merged in v3.12-rc1 and not yet released
      officially), noticed by Vince Weaver, is that the new behavior is
      not detectable by new user-space, and that due to the reuse of the
      field names it's easy to mis-compile a binary if old headers are used
      on a new kernel or new headers are used on an old kernel.
      
      To solve all that, make this change explicit, detectable and self-contained,
      by iterating the ABI the following way:
      
       - Always clear bit 0, and rename it to usrpage->cap_bit0, to at least not
         confuse old user-space binaries. RDPMC will be marked as unavailable
         to old binaries, but that's within the ABI: this is a capability bit.
      
       - Rename bit 1 to ->cap_bit0_is_deprecated and always set it to 1, so new
         libraries can reliably detect that bit 0 is deprecated and perma-zero
         without having to check the kernel version.
      
       - Use bits 2, 3, 4 for the newly defined, correct functionality:
      
      	cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
      	cap_user_time		: 1, /* The time_* fields are used */
      	cap_user_time_zero	: 1, /* The time_zero field is used */
      
       - Rename all the bitfield names in perf_event.h to be different from the
         old names, to make sure it's not possible to mis-compile it
         accidentally with old assumptions.
      
      The 'size' field can then be used in the future to add new fields and it
      will act as a natural ABI version indicator as well.
      
      Also adjust tools/perf/ userspace for the new definitions, noticed by
      Adrian Hunter.
      Reported-by: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Also-Fixed-by: Adrian Hunter <adrian.hunter@intel.com>
      Link: http://lkml.kernel.org/n/tip-zr03yxjrpXesOzzupszqglbv@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
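The detection scheme described in this commit message can be sketched in C as follows. This is an illustration only: it works on a plain 64-bit capabilities word with hypothetical `CAP_*` masks instead of the real `struct perf_event_mmap_page` bitfields, and it assumes the bits are numbered from the least significant bit, as on a little-endian bitfield layout. Treating a word without the deprecation marker as "unknown" is a conservative choice made for this sketch, not something the commit prescribes.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as listed in the commit message (assumed LSB-first). */
#define CAP_BIT0               (1ULL << 0) /* always cleared by new kernels */
#define CAP_BIT0_IS_DEPRECATED (1ULL << 1) /* always set by new kernels */
#define CAP_USER_RDPMC         (1ULL << 2)
#define CAP_USER_TIME          (1ULL << 3)
#define CAP_USER_TIME_ZERO     (1ULL << 4)

enum cap_state {
    CAP_UNKNOWN = -1, /* old layout: bit 0 was ambiguous */
    CAP_NO      = 0,
    CAP_YES     = 1,
};

/* Report whether one of the new capability bits can be trusted and is set. */
static enum cap_state query_cap(uint64_t capabilities, uint64_t cap)
{
    assert(cap == CAP_USER_RDPMC || cap == CAP_USER_TIME ||
           cap == CAP_USER_TIME_ZERO);

    /* Without the marker, the kernel predates the new layout. */
    if (!(capabilities & CAP_BIT0_IS_DEPRECATED))
        return CAP_UNKNOWN;

    return (capabilities & cap) ? CAP_YES : CAP_NO;
}
```

A library built against the new headers would check the marker first and only then consult bits 2 to 4, which avoids any kernel version check.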
    • perf: Update ABI comment · c5ecceef
      Committed by Peter Zijlstra
      For some mysterious reason the sample_id field of PERF_RECORD_MMAP went AWOL.
      Reported-by: Vince Weaver <vince@deater.net>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 18 Sep, 2013 1 commit
  5. 03 Sep, 2013 1 commit
  6. 02 Sep, 2013 2 commits
  7. 30 Aug, 2013 1 commit
    • perf: make events stream always parsable · ff3d527c
      Committed by Adrian Hunter
      The event stream is not always parsable because the format of a sample
      is dependent on the sample_type of the selected event.  When there is
      more than one selected event and the sample_types are not the same then
      parsing becomes problematic.  A sample can be matched to its selected
      event using the ID that is allocated when the event is opened.
      Unfortunately, to get the ID from the sample means first parsing it.
      
      This patch adds a new sample format bit PERF_SAMPLE_IDENTIFIER that puts
      the ID at a fixed position so that the ID can be retrieved without
      parsing the sample.  For sample events, that is the first position
      immediately after the header.  For non-sample events, that is the last
      position.
      
      In this respect parsing samples requires that the sample_type and ID
      values are recorded.  For example, perf tools records struct
      perf_event_attr and the IDs within the perf.data file.  Those must be
      read first before it is possible to parse samples found later in the
      perf.data file.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1377591794-30553-6-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
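The fixed-position lookup described in this commit message can be sketched in C as below. The sketch assumes that every event in the stream was opened with the identifier sample bit set and, for non-sample records, with sample_id_all enabled so that the trailing ID is present. `struct rec_header`, `REC_SAMPLE` and `record_id()` are hypothetical local names; the header is assumed to have the same 8-byte layout as `struct perf_event_header`, and 9 is assumed to be the value of `PERF_RECORD_SAMPLE`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the record header (assumed layout: type, misc, size). */
struct rec_header {
    uint32_t type;
    uint16_t misc;
    uint16_t size; /* total record size in bytes, header included */
};

#define REC_SAMPLE 9 /* assumed value of PERF_RECORD_SAMPLE */

/* Fetch the event ID from a raw record without parsing its body:
 * first u64 after the header for samples, last u64 for everything else. */
static uint64_t record_id(const void *rec)
{
    struct rec_header hdr;
    uint64_t id;
    size_t off;

    assert(rec != NULL);
    memcpy(&hdr, rec, sizeof(hdr));
    assert(hdr.size >= sizeof(hdr) + sizeof(id));

    off = (hdr.type == REC_SAMPLE) ? sizeof(hdr) : hdr.size - sizeof(id);
    memcpy(&id, (const unsigned char *)rec + off, sizeof(id));
    return id;
}
```

With the ID in hand, a reader can look up the matching event's sample_type (recorded earlier, for example from perf.data) and only then parse the rest of the record.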
  8. 08 Aug, 2013 1 commit
  9. 23 Jul, 2013 3 commits
  10. 19 Jun, 2013 1 commit
  11. 08 Apr, 2013 1 commit
  12. 01 Apr, 2013 3 commits
  13. 25 Jan, 2013 1 commit
  14. 13 Oct, 2012 1 commit