1. 18 Jun, 2009 2 commits
    • perf report: Filter to parent set by default · b8e6d829
      Committed by Ingo Molnar
      Make it easier to use parent filtering: default to a filtered
      output. Also add the parent column so that we get collapsing, but
      don't display it by default.
      
      Add --no-exclude-other to override this.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: tools: Makefile tweaks for 64-bit powerpc · e24a72c4
      Committed by Paul Mackerras
      On 64-bit powerpc, perf needs to be built as a 64-bit executable.
      This arranges to add the -m64 flag to CFLAGS if we are running on
      a 64-bit machine, indicated by the result of uname -m ending in "64".
      This means that we'll use -m64 on x86_64 machines as well.
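The rule described above can be sketched in a few lines. This is only an illustration of the decision, not the Makefile change itself; the function name `extra_cflags` is hypothetical.

```python
import platform


def extra_cflags(machine):
    """Return extra compiler flags for a given `uname -m` string.

    Hypothetical helper: mirrors the rule in the commit message, which
    adds -m64 whenever the machine name ends in "64".
    """
    return ["-m64"] if machine.endswith("64") else []


if __name__ == "__main__":
    # platform.machine() returns the same string as `uname -m` on Linux.
    machine = platform.machine()
    print(machine, extra_cflags(machine))
```

As the message notes, the suffix test matches x86_64 as well as 64-bit powerpc.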
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linuxppc-dev@ozlabs.org
      Cc: benh@kernel.crashing.org
      LKML-Reference: <19000.55666.866148.559620@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 13 Jun, 2009 1 commit
    • perf stat: Enable raw data to be printed · ef281a19
      Committed by Ingo Molnar
      If -vv (very verbose) is specified, print out raw data
      in the following format:
      
      $ perf stat -vv -r 3 ./loop_1b_instructions
      
      [ perf stat: executing run #1 ... ]
      [ perf stat: executing run #2 ... ]
      [ perf stat: executing run #3 ... ]
      
      debug:              runtime[0]: 235871872
      debug:             walltime[0]: 236646752
      debug:       runtime_cycles[0]: 755150182
      debug:            counter/0[0]: 235871872
      debug:            counter/1[0]: 235871872
      debug:            counter/2[0]: 235871872
      debug:               scaled[0]: 0
      debug:            counter/0[1]: 2
      debug:            counter/1[1]: 235870662
      debug:            counter/2[1]: 235870662
      debug:               scaled[1]: 0
      debug:            counter/0[2]: 1
      debug:            counter/1[2]: 235870437
      debug:            counter/2[2]: 235870437
      debug:               scaled[2]: 0
      debug:            counter/0[3]: 140
      debug:            counter/1[3]: 235870298
      debug:            counter/2[3]: 235870298
      debug:               scaled[3]: 0
      debug:            counter/0[4]: 755150182
      debug:            counter/1[4]: 235870145
      debug:            counter/2[4]: 235870145
      debug:               scaled[4]: 0
      debug:            counter/0[5]: 1001411258
      debug:            counter/1[5]: 235868838
      debug:            counter/2[5]: 235868838
      debug:               scaled[5]: 0
      debug:            counter/0[6]: 27897
      debug:            counter/1[6]: 235868560
      debug:            counter/2[6]: 235868560
      debug:               scaled[6]: 0
      debug:            counter/0[7]: 2910
      debug:            counter/1[7]: 235868151
      debug:            counter/2[7]: 235868151
      debug:               scaled[7]: 0
      debug:              runtime[0]: 235980257
      debug:             walltime[0]: 236770942
      debug:       runtime_cycles[0]: 755114546
      debug:            counter/0[0]: 235980257
      debug:            counter/1[0]: 235980257
      debug:            counter/2[0]: 235980257
      debug:               scaled[0]: 0
      debug:            counter/0[1]: 3
      debug:            counter/1[1]: 235980049
      debug:            counter/2[1]: 235980049
      debug:               scaled[1]: 0
      debug:            counter/0[2]: 1
      debug:            counter/1[2]: 235979907
      debug:            counter/2[2]: 235979907
      debug:               scaled[2]: 0
      debug:            counter/0[3]: 135
      debug:            counter/1[3]: 235979780
      debug:            counter/2[3]: 235979780
      debug:               scaled[3]: 0
      debug:            counter/0[4]: 755114546
      debug:            counter/1[4]: 235979652
      debug:            counter/2[4]: 235979652
      debug:               scaled[4]: 0
      debug:            counter/0[5]: 1001439771
      debug:            counter/1[5]: 235979304
      debug:            counter/2[5]: 235979304
      debug:               scaled[5]: 0
      debug:            counter/0[6]: 23723
      debug:            counter/1[6]: 235979050
      debug:            counter/2[6]: 235979050
      debug:               scaled[6]: 0
      debug:            counter/0[7]: 2213
      debug:            counter/1[7]: 235978820
      debug:            counter/2[7]: 235978820
      debug:               scaled[7]: 0
      debug:              runtime[0]: 235888002
      debug:             walltime[0]: 236700533
      debug:       runtime_cycles[0]: 754881504
      debug:            counter/0[0]: 235888002
      debug:            counter/1[0]: 235888002
      debug:            counter/2[0]: 235888002
      debug:               scaled[0]: 0
      debug:            counter/0[1]: 2
      debug:            counter/1[1]: 235887793
      debug:            counter/2[1]: 235887793
      debug:               scaled[1]: 0
      debug:            counter/0[2]: 1
      debug:            counter/1[2]: 235887645
      debug:            counter/2[2]: 235887645
      debug:               scaled[2]: 0
      debug:            counter/0[3]: 135
      debug:            counter/1[3]: 235887499
      debug:            counter/2[3]: 235887499
      debug:               scaled[3]: 0
      debug:            counter/0[4]: 754881504
      debug:            counter/1[4]: 235887368
      debug:            counter/2[4]: 235887368
      debug:               scaled[4]: 0
      debug:            counter/0[5]: 1001401731
      debug:            counter/1[5]: 235887024
      debug:            counter/2[5]: 235887024
      debug:               scaled[5]: 0
      debug:            counter/0[6]: 24212
      debug:            counter/1[6]: 235886786
      debug:            counter/2[6]: 235886786
      debug:               scaled[6]: 0
      debug:            counter/0[7]: 1824
      debug:            counter/1[7]: 235886560
      debug:            counter/2[7]: 235886560
      debug:               scaled[7]: 0
      
       Performance counter stats for '/home/mingo/loop_1b_instructions' (3 runs):
      
           235.913377  task-clock-msecs     #      0.997 CPUs    ( +-   0.011% )
                    2  context-switches     #      0.000 M/sec   ( +-   0.000% )
                    1  CPU-migrations       #      0.000 M/sec   ( +-   0.000% )
                  136  page-faults          #      0.001 M/sec   ( +-   0.730% )
            755048744  cycles               #   3200.534 M/sec   ( +-   0.009% )
           1001417586  instructions         #      1.326 IPC     ( +-   0.001% )
                25277  cache-references     #      0.107 M/sec   ( +-   3.988% )
                 2315  cache-misses         #      0.010 M/sec   ( +-   9.845% )
      
          0.236706075  seconds time elapsed.
      
      This allows the summary stats to be validated.
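As an illustration of that validation, the sketch below recomputes a few summary lines from the raw values printed above. It assumes that runtime[0] and walltime[0] are in nanoseconds and that counter/0[4] and counter/0[5] are the cycle and instruction counts; that mapping is inferred from the matching numbers, not stated in the commit message.

```python
# Raw per-run values copied from the debug output above.
runs = [
    {"runtime": 235871872, "walltime": 236646752,
     "cycles": 755150182, "instructions": 1001411258},
    {"runtime": 235980257, "walltime": 236770942,
     "cycles": 755114546, "instructions": 1001439771},
    {"runtime": 235888002, "walltime": 236700533,
     "cycles": 754881504, "instructions": 1001401731},
]


def mean(key):
    """Arithmetic mean of one raw field across all runs."""
    return sum(run[key] for run in runs) / len(runs)


task_clock_msecs = mean("runtime") / 1e6                # ns -> ms
cpus = mean("runtime") / mean("walltime")               # CPU utilisation
cycles_m_per_sec = mean("cycles") / mean("runtime") * 1e3  # cycles/ns -> M/sec
ipc = mean("instructions") / mean("cycles")             # instructions per cycle

print(f"{task_clock_msecs:.6f} task-clock-msecs  # {cpus:.3f} CPUs")
print(f"{mean('cycles'):.0f} cycles  # {cycles_m_per_sec:.3f} M/sec")
print(f"{ipc:.3f} IPC")
```

The recomputed means can then be compared against the task-clock-msecs, cycles and IPC lines of the summary.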
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 07 Jun, 2009 2 commits
    • perf_counter tools: Move from Documentation/perf_counter/ to tools/perf/ · 86470930
      Committed by Ingo Molnar
      Several people have suggested that 'perf' has become a full-fledged
      tool that should be moved out of Documentation/. Move it to the
      (new) tools/ directory.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Prepare for 'perf annotate' · 8035e428
      Committed by Ingo Molnar
      Prepare for the 'perf annotate' implementation by splitting off
      builtin-annotate.c from builtin-report.c.
      
      ( We keep this commit separate to ease the later librarization
        of the facilities that perf-report and perf-annotate share. )
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 06 Jun, 2009 1 commit
  5. 05 Jun, 2009 1 commit
  6. 04 Jun, 2009 3 commits
    • perf_counter tools: Add color terminal output support · 8fc0321f
      Committed by Ingo Molnar
      Add Git's color printing library to util/color.[ch].
      
      Add it to perf report, with a trivial example to print high-overhead
      entries in red, low-overhead entries in green.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Build with native optimization · af794b94
      Committed by Ingo Molnar
      Build the tools with -march=native by default.
      
      No measurable difference in speed though, compared to the
      default, on a Nehalem testbox.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Optimize harder · 095b3a6a
      Committed by Ingo Molnar
      Use -O6 to build the tools.
      
      Before:
      
          12387507370  instructions         #    3121.653 M/sec
      
      After:
      
           6244894971  instructions         #    3458.437 M/sec
      
      Almost twice as fast!
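The arithmetic behind that claim can be checked directly from the two quoted lines. The sketch below assumes the M/sec column is instructions per second of task runtime, so dividing the count by the rate gives an estimated runtime; that reading is an assumption, not something the commit message states.

```python
# Values copied from the "Before" and "After" lines above.
before_instructions = 12387507370
after_instructions = 6244894971
before_rate_m_per_sec = 3121.653
after_rate_m_per_sec = 3458.437

# Ratio of instructions executed: how much less work the -O6 build does.
instruction_ratio = before_instructions / after_instructions

# Estimated runtimes in seconds (assumption: rate = instructions / runtime).
before_seconds = before_instructions / (before_rate_m_per_sec * 1e6)
after_seconds = after_instructions / (after_rate_m_per_sec * 1e6)
runtime_ratio = before_seconds / after_seconds

print(f"instruction ratio: {instruction_ratio:.2f}x")
print(f"estimated runtime ratio: {runtime_ratio:.2f}x")
```

The instruction count drops by roughly half, which matches "almost twice as fast"; under the stated assumption the runtime ratio comes out slightly higher because the instruction rate also went up.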
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 02 Jun, 2009 2 commits
    • perf_counter tools: Cleanup Makefile · c1079abd
      Committed by Mike Galbraith
      We currently build perf-stat/record etc, only to do nothing
      with them.  We also install the perf binary in two places,
      $prefix/bin and $perfexec_instdir, which appears to be for
      binaries which perf would exec were a command not linked in.
      Correct this, and comment out broken/incomplete targets dist
      and coverage.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Use hex2u64 in more places · a0055ae2
      Committed by Arnaldo Carvalho de Melo
      This also has a nice side effect: tools built on newer systems such as
      Fedora 10 work again on systems with older versions of glibc:
      
      My workstation:
      
      [acme@doppio ~]$ rpm -q glibc.x86_64
      glibc-2.9-3.x86_64
      
      Test machine:
      
      [acme@emilia ~]$ rpm -q glibc.x86_64
      glibc-2.5-24
      
      Before:
      
      [acme@emilia ~]$ perf
      perf: /lib64/libc.so.6: version `GLIBC_2.7' not found (required by perf)
      [acme@emilia ~]$ nm `which perf` | grep GLIBC_2\.7
                       U __isoc99_sscanf@@GLIBC_2.7
      [acme@emilia ~]$
      
      After:
      [acme@emilia ~]$ perf
      usage: perf [--version] [--help] COMMAND [ARGS]
      
      The most commonly used perf commands are:
         record   Run a command and record its profile into perf.data
         report   Read perf.data (created by perf record) and display the
      profile
         stat     Run a command and gather performance counter statistics
         top      Run a command and profile it
      
      See 'perf help COMMAND' for more information on a specific command.
      [acme@emilia ~]$ nm `which perf` | grep GLIBC_2\.7
      [acme@emilia ~]$
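The nm output above shows that the GLIBC_2.7 dependency came from sscanf, and the commit title says it was replaced with hex2u64. A hand-rolled hex parser of that kind might look like the sketch below. This is a Python illustration of the idea only; the return convention (characters consumed plus value) is an assumption, not taken from the commit.

```python
HEX_VALUES = {ch: value for value, ch in enumerate("0123456789abcdef")}
U64_MASK = 0xFFFFFFFFFFFFFFFF


def hex2u64(text):
    """Parse leading hex digits of `text`.

    Returns (chars_consumed, value). Parsing stops at the first character
    that is not a hex digit; the value is truncated to 64 bits.
    """
    value = 0
    consumed = 0
    for ch in text:
        digit = HEX_VALUES.get(ch.lower())
        if digit is None:
            break
        value = ((value << 4) | digit) & U64_MASK
        consumed += 1
    return consumed, value


if __name__ == "__main__":
    # A symbol-table style line: address, type, name (made-up example).
    print(hex2u64("ffffffff8100a000 T _text"))
```

Because it only walks characters and shifts bits, a parser like this needs no libc scanning function at all.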
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20090601205019.GA7805@ghostprotocols.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 30 May, 2009 2 commits
    • perf_counter tools: Generate per command manpages (and pdf/html, etc.) · c1c2365a
      Committed by Ingo Molnar
      Import Git's nice .txt => {man/html/pdf} generation machinery.
      
      Fix various errors in the Documentation/perf*.txt description as well.
      
      Also fix a bug in builtin-help: we'd map 'perf help top' to 'perftop'
      if only the 'perf' binary is in the default PATH, confusing the manpage
      logic. I don't fully understand why Git did it this way, but I suppose
      it's an artifact of their migration from standalone git-xyz
      commands to 'git xyz' commands. The perf tools have always used the
      modern form, so it's not an issue there.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Fix 'make install' · 7fbd5544
      Committed by Ingo Molnar
      'make install' didn't install perf itself - which needs a special
      rule to be copied to bindir.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 29 May, 2009 1 commit
  10. 27 May, 2009 2 commits
    • perf_counter tools: Add built-in pager support · a930d2c0
      Committed by Ingo Molnar
      Add Git's pager.c (and sigchain) code. A command only
      has to call setup_pager() to get paged interactive
      output.
      
      Non-interactive (redirected, command-piped, etc.) uses
      are not affected.
      
      Update perf-report to make use of this.
      
      [ Impact: new feature ]
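The interactive versus non-interactive distinction can be sketched as a small decision function. This is not the imported pager.c logic; the function name `choose_pager`, the use of the PAGER environment variable, the "less" default and the treatment of "cat" as "no pager" are all assumptions made for illustration.

```python
import os
import sys


def choose_pager(stdout_is_tty, environ):
    """Return the pager command to run, or None if output should not be paged.

    Hypothetical helper: redirected or piped output is never paged,
    matching the behaviour described in the commit message.
    """
    if not stdout_is_tty:
        return None
    pager = environ.get("PAGER") or "less"  # assumed default
    if pager == "cat":
        # Paging through cat would be a no-op, so skip it (assumption).
        return None
    return pager


if __name__ == "__main__":
    print(choose_pager(sys.stdout.isatty(), os.environ))
```

A command would make this check once at startup, before producing any output.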
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Introduce stricter C code checking · 16f762a2
      Committed by Ingo Molnar
      Tighten up our C code requirements:
      
       - disallow warnings
       - disallow declarations-mixed-with-statements
       - require proper prototypes
       - require C99 (with gcc extensions)
      
      Fix up a ton of problems these measures unearth:
      
       - unused functions
       - needlessly global functions
       - missing prototypes
       - code mixed with declarations
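One plausible way to express the four requirements as gcc options is sketched below. The flags are real gcc options, but their pairing with each requirement is an assumption; the commit message does not list the exact flags that were added to the Makefile.

```python
# Requirement from the commit message -> gcc flags that would enforce it.
# The mapping is illustrative, not copied from the actual Makefile.
STRICT_CFLAGS = [
    ("disallow warnings", ["-Wall", "-Werror"]),
    ("disallow declarations mixed with statements",
     ["-Wdeclaration-after-statement"]),
    ("require proper prototypes",
     ["-Wstrict-prototypes", "-Wmissing-prototypes"]),
    ("require C99 (with gcc extensions)", ["-std=gnu99"]),
]


def cflags():
    """Flatten the table into a single CFLAGS string."""
    return " ".join(flag for _, flags in STRICT_CFLAGS for flag in flags)


if __name__ == "__main__":
    for requirement, flags in STRICT_CFLAGS:
        print(f"{requirement}: {' '.join(flags)}")
    print("CFLAGS +=", cflags())
```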
      
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20090526222155.GJ4424@ghostprotocols.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 26 May, 2009 5 commits
  12. 02 May, 2009 1 commit
  13. 30 Apr, 2009 1 commit
  14. 27 Apr, 2009 1 commit
  15. 20 Apr, 2009 6 commits
  16. 09 Apr, 2009 1 commit
    • perf_counter: some simple userspace profiling · de9ac07b
      Committed by Peter Zijlstra
      # perf-record make -j4 kernel/
      # perf-report | tail -15
      
        0.39              cc1 [kernel] lock_acquired
        0.42              cc1 [kernel] lock_acquire
        0.51              cc1 [ user ] /lib64/libc-2.8.90.so: _int_free
        0.51               as [kernel] clear_page_c
        0.53              cc1 [ user ] /lib64/libc-2.8.90.so: memcpy
        0.56              cc1 [ user ] /lib64/libc-2.8.90.so: _IO_vfprintf
        0.63              cc1 [kernel] lock_release
        0.67              cc1 [ user ] /lib64/libc-2.8.90.so: strlen
        0.68              cc1 [kernel] debug_smp_processor_id
        1.38              cc1 [ user ] /lib64/libc-2.8.90.so: _int_malloc
        1.55              cc1 [ user ] /lib64/libc-2.8.90.so: memset
        1.77              cc1 [kernel] __lock_acquire
        1.88              cc1 [kernel] clear_page_c
        3.61               as [ user ] /usr/bin/as: <unknown>
       59.16              cc1 [ user ] /usr/libexec/gcc/x86_64-redhat-linux/4.3.2/cc1: <unknown>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      LKML-Reference: <20090408130409.220518450@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. 06 Apr, 2009 3 commits
    • perf_counter tools: remove glib dependency and fix bugs in kerneltop.c · cbe46555
      Committed by Paul Mackerras
      The glib dependency in kerneltop.c is only for a little bit of list
      manipulation, and I find it inconvenient.  This adds a 'next' field to
      struct source_line, which lets us link them together into a list.  The
      code to do the linking ourselves turns out to be no longer or more
      difficult than using glib.
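The hand-rolled linking described here amounts to an intrusive singly linked list. The sketch below shows the idea in Python rather than C; only the 'next' field comes from the commit message, and the `text` field and helper names are hypothetical.

```python
class SourceLine:
    """One source line; `next` links it to the following line."""

    def __init__(self, text):
        self.text = text   # hypothetical payload field
        self.next = None   # the added link field


def build_list(texts):
    """Append each line at the tail, keeping head and tail pointers."""
    head = tail = None
    for text in texts:
        node = SourceLine(text)
        if tail is None:
            head = node        # first element becomes the head
        else:
            tail.next = node   # link the previous tail to the new node
        tail = node
    return head


def walk(head):
    """Yield the text of every line in list order."""
    node = head
    while node is not None:
        yield node.text
        node = node.next


if __name__ == "__main__":
    print(list(walk(build_list(["int main(void)", "{", "}"]))))
```

Keeping a tail pointer makes each append constant-time, which is about as much code as calling a list library.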
      
      This also fixes a few other problems:
      
      - We need to #include <limits.h> to get PATH_MAX on powerpc.
      
      - We need to #include <linux/types.h> rather than have our own
        definitions of __u64 and __s64; on powerpc the installed headers
        define them to be unsigned long and long respectively, and if we
        have our own, different definition here that causes a compile error.
      
      - This takes out the x86 setting of errno from -ret in
        sys_perf_counter_open.  My experiments on x86 indicate that the
        glibc syscall() does this for us already.
      
      - We had two CPU migration counters in the default set, which seems
        unnecessary; I changed one of them to a context switch counter.
      
      - In perfstat mode we were printing CPU cycles and instructions as
        milliseconds, and the cpu clock and task clock counters as events.
        This fixes that.
      
      - In perfstat mode we were still printing a blank line after the first
        counter, which was a holdover from when a task clock counter was
        automatically included as the first counter.  This removes the blank
        line.
      
      - On a test machine here, parse_symbols() and parse_vmlinux() were
        taking long enough (almost 0.5 seconds) for the mmap buffer to
        overflow before we got to the first mmap_read() call, so this moves
        them before we open all the counters.
      
      - The error message if sys_perf_counter_open fails needs to use errno,
        not -fd[i][counter].
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Mike Galbraith <efault@gmx.de>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Orig-LKML-Reference: <18888.29986.340328.540512@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: tidy up in-kernel dependencies · 383c5f8c
      Committed by Ingo Molnar
      Remove now unified perfstat.c and perf_counter.h, and link to the
      in-kernel perf_counter.h.
      
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Orig-LKML-Reference: <20090323172417.677932499@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: add sample user-space to Documentation/perf_counter/ · e0143bad
      Committed by Ingo Molnar
      Initial version of kerneltop.c and perfstat.c.
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>