1. 30 6月, 2009 1 次提交
  2. 28 6月, 2009 2 次提交
    • J
      perf stat: Micro-optimize the code: memcpy is only required if no event is selected and !null_run · c3043569
      Jaswinder Singh Rajput 提交于
      Set attrs and nr_counters if no event is selected and !null_run.
      
      Setting of attrs should depend on number of counters,
      so we need to memcpy only for sizeof(default_attrs)
      
      Also set nr_counters as ARRAY_SIZE(default_attrs) in place of
      hardcoded value.
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1246126749.32198.16.camel@hpdv5.satnam>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c3043569
    • J
      perf stat: Improve output · 6e750a8f
      Jaswinder Singh Rajput 提交于
      Increase size for event name to handle bigger names like
      'L1-d$-prefetch-misses'
      
      Changed scaled counters from percentage to a multiplicative
      factor because the latter is more expressive.
      
      Also aligned the scaling factor, otherwise sometimes it looks
      like:
      
                  384  iTLB-load-misses           (4.74x scaled)
               452029  branch-loads               (8.00x scaled)
                 5892  branch-load-misses         (20.39x scaled)
               972315  iTLB-loads                 (3.24x scaled)
      
      Before:
               150708  L1-d$-stores          (scaled from 23.57%)
               428804  L1-d$-prefetches      (scaled from 23.47%)
               314446  L1-d$-prefetch-misses  (scaled from 23.42%)
            252626137  L1-i$-loads           (scaled from 23.24%)
              5297550  dTLB-load-misses      (scaled from 23.96%)
            106992392  branch-loads          (scaled from 23.67%)
              5239561  branch-load-misses    (scaled from 23.43%)
      
      After:
              1731713  L1-d$-loads               (  14.25x scaled)
                44241  L1-d$-prefetches          (   3.88x scaled)
                21076  L1-d$-prefetch-misses     (   3.40x scaled)
              5789421  L1-i$-loads               (   3.78x scaled)
                29645  dTLB-load-misses          (   2.95x scaled)
               461474  branch-loads              (   6.52x scaled)
                 7493  branch-load-misses        (  26.57x scaled)
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1246051927.2988.10.camel@hpdv5.satnam>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6e750a8f
  3. 27 6月, 2009 3 次提交
    • I
      perf stat: Fix multi-run stats · 566747e6
      Ingo Molnar 提交于
      In multi-run (-r/--repeat) printouts, print out the noise of
      the wall-clock average as well.
      
      Also, fix a bug in printing out scaled counters: if it was not
      scaled then we should not update the average with -1.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      566747e6
    • I
      perf stat: Add -n/--null option to run without counters · 0cfb7a13
      Ingo Molnar 提交于
      Allow a no-counters run. This can be useful to measure just
      elapsed wall-clock time - or to assess the raw overhead of perf
      stat itself, without running any counters.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0cfb7a13
    • I
      perf_counter tools: Remove dead code · fde953c1
      Ingo Molnar 提交于
      Vince Weaver reported that there's a handful of #ifdef __MINGW32__
      sections in the code.
      
      Remove them as they are in essence dead code - as unlike upstream
      Git, the perf tool is unlikely to be ported to Windows.
      Reported-by: NVince Weaver <vince@deater.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fde953c1
  4. 26 6月, 2009 8 次提交
  5. 25 6月, 2009 4 次提交
    • J
      perf_counter tools: Shorten names for events · e5c59547
      Jaswinder Singh Rajput 提交于
      Added new alias for events.
      
      On AMD box:
      
       $ ./perf stat -e l1d -e l1d-misses -e l1d-write -e l1d-prefetch -e l1d-prefetch-miss -e l1i -e l1i-misses -e l1i-prefetch -e l2 -e l2-misses -e l2-write -e dtlb -e dtlb-misses -e itlb -e itlb-misses -e bpu -e bpu-misses -- ls -lR /usr/include/ > /dev/null
      
      Before :
      
       Performance counter stats for 'ls -lR /usr/include/':
      
            248064467  L1-data-Cache-Load-Referencees  (scaled from 23.27%)
              1001433  L1-data-Cache-Load-Misses  (scaled from 23.34%)
               153691  L1-data-Cache-Store-Referencees  (scaled from 23.34%)
               423248  L1-data-Cache-Prefetch-Referencees  (scaled from 23.33%)
               302138  L1-data-Cache-Prefetch-Misses  (scaled from 23.25%)
            251217546  L1-instruction-Cache-Load-Referencees  (scaled from 23.25%)
              5757005  L1-instruction-Cache-Load-Misses  (scaled from 23.23%)
                93435  L1-instruction-Cache-Prefetch-Referencees  (scaled from 23.24%)
              6496073  L2-Cache-Load-Referencees  (scaled from 23.32%)
               609485  L2-Cache-Load-Misses  (scaled from 23.45%)
              6876991  L2-Cache-Store-Referencees  (scaled from 23.71%)
            248922840  Data-TLB-Cache-Load-Referencees  (scaled from 23.94%)
              5828386  Data-TLB-Cache-Load-Misses  (scaled from 24.17%)
            257613506  Instruction-TLB-Cache-Load-Referencees  (scaled from 24.20%)
                 6833  Instruction-TLB-Cache-Load-Misses  (scaled from 23.88%)
            109043606  Branch-Cache-Load-Referencees  (scaled from 23.64%)
              5552296  Branch-Cache-Load-Misses  (scaled from 23.42%)
      
          0.413702461  seconds time elapsed.
      
      After :
      
       Peformance counter stats for 'ls -lR /usr/include/':
      
            266590464  L1-d$-loads           (scaled from 23.03%)
              1222273  L1-d$-load-misses     (scaled from 23.58%)
               146204  L1-d$-stores          (scaled from 23.83%)
               406344  L1-d$-prefetches      (scaled from 24.09%)
               283748  L1-d$-prefetch-misses (scaled from 24.10%)
            249650965  L1-i$-loads           (scaled from 23.80%)
              3353961  L1-i$-load-misses     (scaled from 23.82%)
               104599  L1-i$-prefetches      (scaled from 23.68%)
              4836405  LLC-loads             (scaled from 23.67%)
               498214  LLC-load-misses       (scaled from 23.66%)
              4953994  LLC-stores            (scaled from 23.64%)
            243354097  dTLB-loads            (scaled from 23.77%)
              6468584  dTLB-load-misses      (scaled from 23.74%)
            249719549  iTLB-loads            (scaled from 23.25%)
                 5060  iTLB-load-misses      (scaled from 23.00%)
            112343016  branch-loads          (scaled from 22.76%)
              5528876  branch-load-misses    (scaled from 22.54%)
      
          0.427154051  seconds time elapsed.
      
      Reported-by : Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1245934522.5308.39.camel@hpdv5.satnam>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e5c59547
    • J
      perf_counter tools: Check for valid cache operations · 06813f6c
      Jaswinder Singh Rajput 提交于
      Made new table for cache operartion stat 'hw_cache_stat' as:
      
       L1I : Read and prefetch only
       ITLB and BPU : Read-only
      
      introduce is_cache_op_valid() for cache operation validity
      
      And checks for valid cache operations.
      
      Reported-by : Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1245930367.5308.33.camel@localhost.localdomain>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      06813f6c
    • J
      perf record: Fix filemap pathname parsing in /proc/pid/maps · 76c64c5e
      Johannes Weiner 提交于
      Looking backward for the first space from the end of a line in
      /proc/pid/maps does not find the start of the pathname of the mapped
      file if it contains a space.
      
      Since the only slashes we have in this file occur in the (absolute!)
      pathname column of file mappings, looking for the first slash in a
      line is a safe method to find the name.
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Stefani Seibold <stefani@seibold.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20090624190835.GA25548@cmpxchg.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      76c64c5e
    • I
      perf_counter tools: Add CREDITS file for Git contributors · 1b173f77
      Ingo Molnar 提交于
      Much of perf's libraries comes from the Git project. I noticed
      that the files (in tools/perf/util/*.[ch] and elsewhere) are
      quite spartan wrt. credits, so lets add a CREDITS file that
      includes an (incomplete!) list of main contributors.
      
      Thanks guys, these libraries are really useful. Special thanks
      go to Johannes Schindelin and Junio C Hamano for coming up with
      this list.
      List-Composed-By: NJohannes Schindelin <Johannes.Schindelin@gmx.de>
      Cc: Junio C Hamano <gitster@pobox.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1b173f77
  6. 24 6月, 2009 3 次提交
  7. 23 6月, 2009 3 次提交
  8. 22 6月, 2009 4 次提交
  9. 21 6月, 2009 1 次提交
    • I
      perf_counter tools: Fix vmlinux fallback when running on a different kernel · c1f47b45
      Ingo Molnar 提交于
      Lucas De Marchi reported that perf report and perf annotate
      displays mismatching profile if a perf.data is analyzed on
      an older kernel - even if the correct vmlinux is specified
      via the -k option.
      
      The reason is the fallback path in util/symbol.c:dso__load_kernel():
      
      int dso__load_kernel(struct dso *self, const char *vmlinux,
                           symbol_filter_t filter, int verbose)
      {
              int err = -1;
      
              if (vmlinux)
                      err = dso__load_vmlinux(self, vmlinux, filter, verbose);
      
              if (err)
                      err = dso__load_kallsyms(self, filter, verbose);
      
              return err;
      }
      
      dso__load_vmlinux() returns negative on error, but on success it
      returns the number of symbols loaded - which confuses the function
      to load the kallsyms.
      
      This is normally harmless, as reporting is usually performed on the
      same kernel that is analyzed - but if there's a mismatch then we
      load the wrong kallsyms and create a non-sensical symbol tree.
      
      The fix is to only fall back to kallsyms on errors.
      Reported-by: NLucas De Marchi <lucas.de.marchi@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c1f47b45
  10. 20 6月, 2009 2 次提交
    • F
      perfcounter: Handle some IO return values · eadc84cc
      Frederic Weisbecker 提交于
      Building perfcounter tools raises the following warnings:
      
       builtin-record.c: In function ‘atexit_header’:
       builtin-record.c:464: erreur: ignoring return value of ‘pwrite’, declared with attribute warn_unused_result
       builtin-record.c: In function ‘__cmd_record’:
       builtin-record.c:503: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result
      
       builtin-report.c: In function ‘__cmd_report’:
       builtin-report.c:1403: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result
      
      This patch handles these IO return values.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1245456100-5477-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      eadc84cc
    • P
      perf_counter tools: Define and use our own u64, s64 etc. definitions · 9cffa8d5
      Paul Mackerras 提交于
      On 64-bit powerpc, __u64 is defined to be unsigned long rather than
      unsigned long long.  This causes compiler warnings every time we
      print a __u64 value with %Lx.
      
      Rather than changing __u64, we define our own u64 to be unsigned long
      long on all architectures, and similarly s64 as signed long long.
      For consistency we also define u32, s32, u16, s16, u8 and s8.  These
      definitions are put in a new header, types.h, because these definitions
      are needed in util/string.h and util/symbol.h.
      
      The main change here is the mechanical change of __[us]{64,32,16,8}
      to remove the "__".  The other changes are:
      
      * Create types.h
      * Include types.h in perf.h, util/string.h and util/symbol.h
      * Add types.h to the LIB_H definition in Makefile
      * Added (u64) casts in process_overflow_event() and print_sym_table()
        to kill two remaining warnings.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: benh@kernel.crashing.org
      LKML-Reference: <19003.33494.495844.956580@cargo.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9cffa8d5
  11. 19 6月, 2009 2 次提交
  12. 18 6月, 2009 7 次提交
    • I
      perf report: Filter to parent set by default · b8e6d829
      Ingo Molnar 提交于
      Make it easier to use parent filtering - default to a filtered
      output. Also add the parent column so that we get collapsing but
      dont display it by default.
      
      add --no-exclude-other to override this.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b8e6d829
    • P
      perf_counter tools: Handle lost events · 9d91a6f7
      Peter Zijlstra 提交于
      Make use of the new ->data_tail mechanism to tell kernel-space
      about user-space draining the data stream. Emit lost events
      (and display them) if they happen.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9d91a6f7
    • P
      perf_counter: tools: Makefile tweaks for 64-bit powerpc · e24a72c4
      Paul Mackerras 提交于
      On 64-bit powerpc, perf needs to be built as a 64-bit executable.
      This arranges to add the -m64 flag to CFLAGS if we are running on
      a 64-bit machine, indicated by the result of uname -m ending in "64".
      This means that we'll use -m64 on x86_64 machines as well.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linuxppc-dev@ozlabs.org
      Cc: benh@kernel.crashing.org
      LKML-Reference: <19000.55666.866148.559620@cargo.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e24a72c4
    • P
      perf_counter tools: Add and use isprint() · a73c7d84
      Peter Zijlstra 提交于
      Introduce isprint() to print out raw event dumps to ASCII, etc.
      
      (This is an extension to upstream Git's ctype.c.)
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      [ removed openssl.h inclusion from util.h - it leaked ctype.h ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a73c7d84
    • I
      perf report: Add validation of call-chain entries · 7522060c
      Ingo Molnar 提交于
      Add boundary checks for call-chain events. In case of corrupted
      entries we could crash otherwise.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7522060c
    • I
      perf report: Tidy up the "--parent <regex>" and "--sort parent" call-chain features · b25bcf2f
      Ingo Molnar 提交于
      Instead of the ambigious 'call' naming use the much more
      specific 'parent' naming:
      
       - rename --call <regex> to --parent <regex>
      
       - rename --sort call to --sort parent
      
       - rename [unmatched] to [other] - to signal that this is not
         an error but the inverse set
      
      Also add pagefaults to the default parent-symbol pattern too,
      as it's a 'syscall overhead category' in a sense.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b25bcf2f
    • P
      perf_counter tools: Replace isprint() with issane() · 5aa75a0f
      Peter Zijlstra 提交于
      The Git utils came with a ctype replacement that doesn't provide
      isprint(). Add a replacement.
      
      Solves a build bug on certain distros.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5aa75a0f