1. 09 Mar, 2012 7 commits
    • S
      perf record: Provide default branch stack sampling mode option · a5aabdac
      Committed by Stephane Eranian
      This patch changes the logic of the -b, --branch-stack options
      of perf record.
      
      Based on users' request, the patch provides a default filter
      mode with the -b (or --branch-any) option.  With the option,
      every type of taken branch is sampled.
      
      With -j (or --branch-filter), the user can specify any
      valid combination of branch types and privilege levels
      if supported by the underlying hardware.
      
      The -b (--branch-any) option is a shortcut for: --branch-filter any.
      
       $ perf record -b foo
      
      or:
      
       $ perf record --branch-filter any foo
      
      For more specific filtering:
      
       $ perf record --branch-filter ind_call,u foo
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: asharma@fb.com
      Cc: ravitillo@lbl.gov
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1331246868-19905-2-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a5aabdac
    • S
      perf tools: Make perf able to read files from older ABIs · 114382a0
      Committed by Stephane Eranian
      This patch provides a way to handle legacy perf.data
      files.  Legacy files are those using the older PERFFILE
      signature.
      
      For those, it is still necessary to detect endianness but
      without comparing their header->attr_size with the
      tool's own version as it may be different. Instead, we use
      a reference table for all known sizes from the legacy era.
      
      We try all the combinations for sizes and endianness. If we find
      a match, we proceed, otherwise we return: "incompatible file
      format".
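The search over sizes and byte orders described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the patch: the names `legacy_attr_sizes`, `legacy_swap_u64` and `legacy_detect_abi` are hypothetical, and the table values 64 and 72 are placeholders for "all known sizes from the legacy era".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference table of legacy attr sizes (values assumed). */
static const uint64_t legacy_attr_sizes[] = { 64, 72 };

/* Reverse the byte order of a 64-bit value. */
static uint64_t legacy_swap_u64(uint64_t v)
{
    uint64_t r = 0;

    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}

/*
 * Try every known size in both byte orders.
 * Returns 0 and sets *needs_swap on a match, -1 if nothing matches.
 */
static int legacy_detect_abi(uint64_t file_attr_size, bool *needs_swap)
{
    size_t n = sizeof(legacy_attr_sizes) / sizeof(legacy_attr_sizes[0]);

    for (size_t i = 0; i < n; i++) {
        if (file_attr_size == legacy_attr_sizes[i]) {
            *needs_swap = false;   /* same endianness as the tool */
            return 0;
        }
        if (file_attr_size == legacy_swap_u64(legacy_attr_sizes[i])) {
            *needs_swap = true;    /* file was written with the other byte order */
            return 0;
        }
    }
    return -1;   /* caller would report "incompatible file format" */
}
```

A caller would pass the size field read from the file header and, on success, use `needs_swap` to decide whether later fields must be byte-swapped.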
      
      This is also done for the pipe-mode file format.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: robert.richter@amd.com
      Cc: ming.m.lin@intel.com
      Cc: andi@firstfloor.org
      Cc: asharma@fb.com
      Cc: ravitillo@lbl.gov
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1328826068-11713-19-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      114382a0
    • S
      perf tools: Fix ABI compatibility bug in print_event_desc() · 62db9068
      Committed by Stephane Eranian
      This patch cleans up local variable types for msz and ret.
      They need to be size_t and ssize_t respectively.
      
      It also fixes a bug whereby perf would not read an attr struct
      whose size differs from the one it knows about.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: robert.richter@amd.com
      Cc: ming.m.lin@intel.com
      Cc: andi@firstfloor.org
      Cc: asharma@fb.com
      Cc: ravitillo@lbl.gov
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1328826068-11713-18-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      62db9068
    • S
      perf tools: Enable reading of perf.data files from different ABI rev · 69996df4
      Committed by Stephane Eranian
      This patch allows perf to process perf.data files generated
      using an ABI that has a different perf_event_attr struct size,
      i.e., a different ABI version.
      
      The perf_event_attr can be extended, yet perf needs to cope with
      older perf.data files. Similarly, perf must be able to cope with
      a perf.data file which is using a newer version of the ABI than
      what it knows about.
      
      This patch adds read_attr(), a routine that reads a
      perf_event_attr struct from a file incrementally based on its
      advertised size. If the on-file struct is smaller than what perf
      knows, then the extra fields are zeroed. If the on-file struct
      is bigger, then perf only uses what it knows about and the rest
      is skipped.
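The incremental read can be sketched as below. This is an illustration of the idea only, assuming same-endian data: `struct demo_attr`, `DEMO_ATTR_SIZE_MIN` and `demo_read_attr()` are hypothetical stand-ins and do not reproduce the real perf_event_attr layout or the actual read_attr() routine.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for perf_event_attr; "extra" models a newer field. */
struct demo_attr {
    uint32_t type;
    uint32_t size;     /* size of the struct as written by the producer */
    uint64_t config;
    uint64_t extra;
};

/* Smallest layout this sketch accepts: type + size + config. */
#define DEMO_ATTR_SIZE_MIN 16u

/* Returns 0 on success, -1 on a short read or an invalid size. */
static int demo_read_attr(FILE *fp, struct demo_attr *attr)
{
    size_t known = sizeof(*attr);
    size_t on_file, want;

    memset(attr, 0, known);   /* fields missing from the file stay zero */

    /* Read the minimal layout first; it carries the advertised size. */
    if (fread(attr, DEMO_ATTR_SIZE_MIN, 1, fp) != 1)
        return -1;

    on_file = attr->size;
    if (on_file < DEMO_ATTR_SIZE_MIN)
        return -1;

    /* Read only the part of the remainder that this build understands. */
    want = (on_file < known ? on_file : known) - DEMO_ATTR_SIZE_MIN;
    if (want && fread((char *)attr + DEMO_ATTR_SIZE_MIN, want, 1, fp) != 1)
        return -1;

    /* Skip trailing bytes written by a newer producer. */
    if (on_file > known && fseek(fp, (long)(on_file - known), SEEK_CUR) != 0)
        return -1;

    return 0;
}
```

Zeroing the struct up front means an older, shorter record leaves the newer fields at zero without any per-field handling.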
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: robert.richter@amd.com
      Cc: ming.m.lin@intel.com
      Cc: andi@firstfloor.org
      Cc: asharma@fb.com
      Cc: ravitillo@lbl.gov
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1328826068-11713-17-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      69996df4
    • R
      perf report: Add support for taken branch sampling · b50311dc
      Committed by Roberto Agostino Vitillo
      This patch adds support for taken branch sampling, i.e., the
      PERF_SAMPLE_BRANCH_STACK feature, to perf report. In other
      words, it displays histograms based on taken branches rather
      than on executed instruction addresses.
      
      The new option is called -b and it takes no argument. To
      generate meaningful output, the perf.data must have been
      obtained using perf record -b xxx ... where xxx is a branch
      filter option.
      
      The output shows symbols and modules, sorted by 'who branches
      where' most often. The percentages reported in the first
      column refer to the total number of branches captured and
      not the usual number of samples.
      
      Here is a quick example, where branchy is a simple test
      program that looks as follows:
      
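      /* N is not defined in the original message; this value is an arbitrary placeholder. */
      #define N 100000000UL
      
      void f2(void)
      {}
      void f3(void)
      {}
      void f1(unsigned long n)
      {
        /* Alternate between the two callees on odd and even iterations. */
        if (n & 1UL)
          f2();
        else
          f3();
      }
      int main(void)
      {
        unsigned long i;
      
        for (i = 0; i < N; i++)
          f1(i);
        return 0;
      }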
      
      Here is the output captured on Nehalem, if we are
      only interested in user level function calls.
      
      $ perf record -b any_call,u -e cycles:u branchy
      
      $ perf report -b --sort=symbol
          52.34%  [.] main                   [.] f1
          24.04%  [.] f1                     [.] f3
          23.60%  [.] f1                     [.] f2
           0.01%  [k] _IO_new_file_xsputn    [k] _IO_file_overflow
           0.01%  [k] _IO_vfprintf_internal  [k] _IO_new_file_xsputn
           0.01%  [k] _IO_vfprintf_internal  [k] strchrnul
           0.01%  [k] __printf               [k] _IO_vfprintf_internal
           0.01%  [k] main                   [k] __printf
      
      About half (52%) of the call branches captured are from main()
      -> f1(). The other half (24%+23%) is split into two roughly equal
      shares between f1() -> f2() and f1() -> f3(). The output is as
      expected given the code.
      
      Note that using -b in perf record does not eliminate
      information from the perf.data file. Consequently, a typical
      profile can still be obtained from perf report simply by not
      using its -b option.
      
      It is possible to sort on branch related columns:
      
         - dso_from, symbol_from
         - dso_to, symbol_to
         - mispredict
      Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: robert.richter@amd.com
      Cc: ming.m.lin@intel.com
      Cc: andi@firstfloor.org
      Cc: asharma@fb.com
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1328826068-11713-14-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b50311dc
    • R
      perf record: Add support for sampling taken branch · bdfebd84
      Committed by Roberto Agostino Vitillo
      This patch adds a new option to enable taken branch stack
      sampling, i.e., leverage the PERF_SAMPLE_BRANCH_STACK feature
      of perf_events.
      
      There is a new option to activate this mode: -b.
      It is possible to pass a set of filters to select the type of
      branches to sample.
      
      The following filters are available:
      
       - any : any type of branches
       - any_call : any function call or system call
       - any_ret : any function return or system call return
       - any_ind : any indirect branch
       - u:  only when the branch target is at the user level
       - k: only when the branch target is in the kernel
       - hv: only when the branch target is in the hypervisor
      
      Filters can be combined by passing a comma separated list
      to the option:
      
      $ perf record -b any_call,u -e cycles:u branchy
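Turning such a comma-separated list into a bitmask can be sketched as follows. This is a hedged illustration, not perf's parser: the `DEMO_BR_*` bit values, the `demo_filters` table and `demo_parse_branch_filter()` are hypothetical, and only the filter names are taken from the list above.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative bit values; the real constants belong to the perf_event ABI. */
enum {
    DEMO_BR_USER     = 1u << 0,
    DEMO_BR_KERNEL   = 1u << 1,
    DEMO_BR_HV       = 1u << 2,
    DEMO_BR_ANY      = 1u << 3,
    DEMO_BR_ANY_CALL = 1u << 4,
    DEMO_BR_ANY_RET  = 1u << 5,
    DEMO_BR_ANY_IND  = 1u << 6
};

struct demo_filter {
    const char *name;
    uint32_t    bit;
};

static const struct demo_filter demo_filters[] = {
    { "u",        DEMO_BR_USER     },
    { "k",        DEMO_BR_KERNEL   },
    { "hv",       DEMO_BR_HV       },
    { "any",      DEMO_BR_ANY      },
    { "any_call", DEMO_BR_ANY_CALL },
    { "any_ret",  DEMO_BR_ANY_RET  },
    { "any_ind",  DEMO_BR_ANY_IND  }
};

/* Parse e.g. "any_call,u". Returns 0 on success, -1 on an unknown name. */
static int demo_parse_branch_filter(const char *str, uint32_t *mask)
{
    size_t n = sizeof(demo_filters) / sizeof(demo_filters[0]);
    uint32_t m = 0;
    const char *p = str;

    while (*p) {
        size_t len = strcspn(p, ",");   /* length of the current token */
        size_t i;

        for (i = 0; i < n; i++) {
            if (strlen(demo_filters[i].name) == len &&
                strncmp(p, demo_filters[i].name, len) == 0) {
                m |= demo_filters[i].bit;
                break;
            }
        }
        if (i == n)
            return -1;                  /* token matched no known filter */

        p += len;
        if (*p == ',')
            p++;
    }

    *mask = m;
    return 0;
}
```

Matching on exact token length keeps "any" from being accepted as a prefix of "any_call", and an empty token (as in "u,,k") is rejected.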
      Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: robert.richter@amd.com
      Cc: ming.m.lin@intel.com
      Cc: andi@firstfloor.org
      Cc: asharma@fb.com
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1328826068-11713-13-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      bdfebd84
    • R
      perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK · b5387528
      Committed by Roberto Agostino Vitillo
      This patch adds:
      
       - ability to parse samples with PERF_SAMPLE_BRANCH_STACK
       - sort on branches (dso_from, symbol_from, dso_to, symbol_to, mispredict)
       - build histograms on branches
      Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: robert.richter@amd.com
      Cc: ming.m.lin@intel.com
      Cc: andi@firstfloor.org
      Cc: asharma@fb.com
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1328826068-11713-12-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b5387528
  2. 03 Mar, 2012 3 commits
  3. 01 Mar, 2012 4 commits
  4. 22 Feb, 2012 1 commit
    • S
      perf tools: fix broken perf record -a mode · 6b1bee90
      Committed by Stephane Eranian
      The following commit:
      b52956c9 perf tools: Allow multiple threads or processes in record, stat, top
      
      introduced a bug in the thread_map code which caused perf record -a to
      not set up system-wide monitoring properly.
      
      $ taskset -c 1 noploop 1000 &
      $ perf record -a -C 1 sleep 10
      $ perf report -D | tail -20
      cycles stats:
                 TOTAL events:       4413
                  MMAP events:       4025
                  COMM events:        340
                SAMPLE events:         48
      
      Here I was expecting about 10,000 samples and not 48.
      
      In system-wide mode, the PID passed to perf_event_open() must be -1 and
      it was 0. That caused the kernel to set up a per-process event on PID:0.
      Consequently, the number of samples captured does not correspond to the
      requested measurement.
      
      The following one-liner fixes the problem for me with or without -C.
      
      I would also suggest changing the malloc() to something that matches
      the struct definition. thread_map->map[] is declared as int map[] and
      not pid_t map[]. If map[] can only contain pids, then change the struct
      definition.
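One way to size the allocation from the struct definition, and to store -1 for system-wide mode, is sketched below. This is an assumption-laden illustration rather than the one-liner from the patch: `struct demo_thread_map` and `demo_thread_map_system_wide()` are hypothetical names, and the sketch picks the `pid_t map[]` variant of the two options raised above.

```c
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical variant of the map in which entries are typed as pid_t. */
struct demo_thread_map {
    int   nr;
    pid_t map[];   /* flexible array member */
};

/*
 * Build a one-entry map for system-wide monitoring.
 * The single entry is -1, the pid value that system-wide mode requires.
 */
static struct demo_thread_map *demo_thread_map_system_wide(void)
{
    struct demo_thread_map *tm;

    /* Derive the size from the struct itself, not from a hard-coded type. */
    tm = malloc(sizeof(*tm) + sizeof(tm->map[0]));
    if (tm == NULL)
        return NULL;

    tm->nr = 1;
    tm->map[0] = -1;
    return tm;
}
```

Because the element size comes from `tm->map[0]`, the allocation stays correct if the element type of `map[]` is changed later.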
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120221145424.GA6757@quad
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6b1bee90
  5. 18 Feb, 2012 2 commits
  6. 15 Feb, 2012 2 commits
  7. 14 Feb, 2012 14 commits
  8. 09 Feb, 2012 2 commits
    • D
      perf record: No build id option fails · d3665498
      Committed by David Ahern
      A recent refactoring of perf-record introduced the following:
      
      perf record -a -B
      Couldn't generating buildids. Use --no-buildid to profile anyway.
      sleep: Terminated
      
      I believe the triple negative was meant to be only a double negative.
      :-) While I'm there, fixed the grammar on the error message.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1328567272-13190-1-git-send-email-dsahern@gmail.com
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d3665498
    • S
      perf tools: fix endianness detection in perf.data · 73323f54
      Committed by Stephane Eranian
      The current version of perf detects whether or not the perf.data file is
      written in a different endianness using the attr_size field in the
      header of the file. This field represents sizeof(struct perf_event_attr)
      as known to perf record. If the sizes do not match, then perf tries the
      byte-swapped value. If that matches, then the tool assumes a different
      endianness.
      
      The issue with the approach is that it assumes the size of
      perf_event_attr always has to match between perf record and perf report.
      However, the kernel perf_event ABI is extensible.  New fields can be
      added to struct perf_event_attr. Consequently, it is not possible to use
      attr_size to detect endianness.
      
      This patch takes another approach by using the magic number written at
      the beginning of the perf.data file to detect endianness. The magic
      number is an eight-byte signature.  Its primary purpose is to identify
      a perf.data file, but it can also be used to encode the
      endianness.
      
      The patch introduces a new value for this signature. The key difference
      is that the signature is written differently in the file depending on
      the endianness. Thus, by comparing the signature from the file with the
      tool's own signature it is possible to detect endianness. The new
      signature is "PERFILE2".
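The signature comparison can be sketched as follows. This is a minimal illustration, not the patch's code: `DEMO_MAGIC2`, `demo_swap_u64()` and `demo_check_magic()` are hypothetical names, and the sketch assumes the signature is stored as a native 64-bit integer whose bytes spell "PERFILE2" when written by a little-endian machine.

```c
#include <stdint.h>

/* "PERFILE2" as a 64-bit integer, assuming a little-endian writer. */
#define DEMO_MAGIC2 0x32454c4946524550ULL

enum demo_endian { DEMO_SAME, DEMO_SWAPPED, DEMO_UNKNOWN };

/* Reverse the byte order of a 64-bit value. */
static uint64_t demo_swap_u64(uint64_t v)
{
    uint64_t r = 0;

    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}

/* Compare the value read from the file with the tool's own signature. */
static enum demo_endian demo_check_magic(uint64_t file_magic)
{
    if (file_magic == DEMO_MAGIC2)
        return DEMO_SAME;      /* written with the reader's byte order */
    if (file_magic == demo_swap_u64(DEMO_MAGIC2))
        return DEMO_SWAPPED;   /* written with the opposite byte order */
    return DEMO_UNKNOWN;       /* not this signature; may be a legacy file */
}
```

Because the writer stores the integer in its native byte order, a reader with the opposite byte order sees the byte-reversed value, which is what the second comparison detects.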
      
      Backward compatibility with existing perf.data files is ensured.
      Tested-by: David Ahern <dsahern@gmail.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Roberto Agostino Vitillo <ravitillo@lbl.gov>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Vince Weaver <vweaver1@eecs.utk.edu>
      Link: http://lkml.kernel.org/r/1328187288-24395-15-git-send-email-eranian@google.com
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      73323f54
  9. 07 Feb, 2012 5 commits