1. 29 Oct 2009, 5 commits
    • x86: Add Intel FMA instructions to x86 opcode map · 3f7e454a
      Committed by Masami Hiramatsu
      Add Intel FMA (fused multiply-add) instructions to the x86 opcode
      map for the x86 instruction decoder.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      LKML-Reference: <20091027204235.30545.33997.stgit@harusame>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: AVX instruction set decoder support · e0e492e9
      Committed by Masami Hiramatsu
      Add Intel AVX (Advanced Vector Extensions) instruction set
      support to the x86 instruction decoder. This adds an
      insn.vex_prefix field for storing VEX prefixes and introduces
      some new, decoder-specific tags for expressing opcode attributes.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      LKML-Reference: <20091027204226.30545.23451.stgit@harusame>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Add pclmulq to x86 opcode map · 82cb5702
      Committed by Masami Hiramatsu
      Add the pclmulq opcode to the x86 opcode map.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      LKML-Reference: <20091027204219.30545.82039.stgit@harusame>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Merge INAT_REXPFX into INAT_PFX_* · 04d46c1b
      Committed by Masami Hiramatsu
      Merge INAT_REXPFX into the INAT_PFX_* macros and rename it to
      INAT_PFX_REX.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      LKML-Reference: <20091027204211.30545.58090.stgit@harusame>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Fix SSE opcode map bug · 7f387d3f
      Committed by Masami Hiramatsu
      Fix the position of superscripts in the SSE opcode map: some
      superscripts of SSE opcodes were not placed correctly.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      LKML-Reference: <20091027204204.30545.97296.stgit@harusame>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 23 Oct 2009, 6 commits
    • perf probe: Print debug messages using pr_*() · b7cb10e7
      Committed by Arnaldo Carvalho de Melo
      Use the new pr_{err,warning,debug,etc} printout methods, just
      like in the kernel.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1256153646-10097-1-git-send-email-acme@redhat.com>
      [ Split this patch out, to keep perf/probes separate. ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Merge branch 'perf/core' into perf/probes · 43315956
      Committed by Ingo Molnar
      Conflicts:
      	tools/perf/Makefile
      
      Merge reason:
      
       - fix the conflict
       - pick up the pr_*() infrastructure to queue up dependent patch
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Unify debug messages mechanisms · 6beba7ad
      Committed by Arnaldo Carvalho de Melo
      In some places we were using eprintf(), which looks at a global
      'verbose' level, and in others we were passing a 'v' parameter
      to specify the verbosity level. Unify them by introducing
      pr_{err,warning,debug,etc}, just like in the kernel.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1256153646-10097-1-git-send-email-acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
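As a rough illustration of the unification described in this commit, the sketch below routes every message through one helper that checks a global verbosity level. The names `demo_verbose` and `demo_eprintf` are hypothetical, and the level assigned to each macro is an assumption; this is not the code from tools/perf.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical global verbosity level, e.g. bumped once per -v flag. */
static int demo_verbose;

/*
 * Print to stderr only when the global level is at least "level".
 * Returns the number of characters written, or 0 if suppressed.
 */
static int demo_eprintf(int level, const char *fmt, ...)
{
	va_list ap;
	int ret = 0;

	if (demo_verbose >= level) {
		va_start(ap, fmt);
		ret = vfprintf(stderr, fmt, ap);
		va_end(ap);
	}
	return ret;
}

/* Assumed mapping: errors and warnings always print, debug needs -v. */
#define pr_err(...)     demo_eprintf(0, __VA_ARGS__)
#define pr_warning(...) demo_eprintf(0, __VA_ARGS__)
#define pr_debug(...)   demo_eprintf(1, __VA_ARGS__)
```

With this shape, call sites no longer pass a verbosity parameter around; they pick a macro and the helper decides whether to print.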
    • perf tools: Drop asm/types.h wrapper · 802da5f2
      Committed by Frederic Weisbecker
      Wrapping the kernel headers is dangerous when it comes to arch
      headers. Once we wrap asm/types.h, it will also replace the
      glibc asm/types.h, not only the kernel one.
      
      This results in build errors on some machines.
      
      Drop this wrapper and do its work from the linux/types.h wrapper.
      The glibc asm/types.h can already handle most of the type
      definitions it was providing (typedef __u64, __u32, etc...).
      
      Todo: Check the other asm/*.h wrappers to prevent further
      conflicts.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      LKML-Reference: <1256246604-17156-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Bind callchains to the first sort dimension column · a4fb581b
      Committed by Frederic Weisbecker
      Currently, callchains are displayed with a constant left
      margin. Depending on the sort dimension configuration, they
      usually appear attached to the first sort dimension column,
      except when the first sort dimension is comm, because comm
      fields are right-aligned.
      
      This patch binds the callchain to the first letter in the first
      column, whatever type of column it is (dso, comm, symbol).
      Before:
      
           0.80%             perf  [k] __lock_acquire
                   __lock_acquire
                   lock_acquire
                   |
                   |--58.33%-- _spin_lock
                   |          |
                   |          |--28.57%-- inotify_should_send_event
                   |          |          fsnotify
                   |          |          __fsnotify_parent
      
      After:
      
           0.80%             perf  [k] __lock_acquire
                             __lock_acquire
                             lock_acquire
                             |
                             |--58.33%-- _spin_lock
                             |          |
                             |          |--28.57%-- inotify_should_send_event
                             |          |          fsnotify
                             |          |          __fsnotify_parent
      
      Also, for clarity, we no longer print the callchain as is. Instead:
      
      - If we have a top level ancestor in the callchain, start it
        with a first ascii hook.
      
        Before:
      
           0.80%             perf  [kernel]                        [k] __lock_acquire
                             __lock_acquire
                               lock_acquire
                             |
                             |--58.33%-- _spin_lock
                             |          |
                             |          |--28.57%-- inotify_should_send_event
                             |          |          fsnotify
                            [..]       [..]
      
        After:
      
           0.80%             perf  [kernel]                         [k] __lock_acquire
                             |
                             --- __lock_acquire
                                 lock_acquire
                                |
                                |--58.33%-- _spin_lock
                                |          |
                                |          |--28.57%-- inotify_should_send_event
                                |          |          fsnotify
                               [..]       [..]
      
      - Otherwise, if we have several top level ancestors, then
        display these like we did before:
      
             1.69%           Xorg
                             |
                             |--21.21%-- vread_hpet
                             |          0x7fffd85b46fc
                             |          0x7fffd85b494d
                             |          0x7f4fafb4e54d
                             |
                             |--15.15%-- exaOffscreenAlloc
                             |
                             |--9.09%-- I830WaitLpRing
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      LKML-Reference: <1256246604-17156-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Fix missing top level callchain · af0a6fa4
      Committed by Frederic Weisbecker
      While recursively printing the branches of each callchain, we
      forget to display the root. It is never printed.
      
      Say we have:
      
          symbol
          f1
          f2
           |
           -------- f3
           |        f4
           |
           ---------f5
                    f6
      
      But we never see that; instead it displays:
      
          symbol
          |
          --------- f3
          |         f4
          |
          --------- f5
                    f6
      
      However, f1 is always the same as "symbol", and if we are
      sorting by symbols first then "symbol", f1 and f2 will be well
      aligned as in the example above, so displaying f1 looks
      redundant here.
      
      But if we are sorting by something else first (dso, comm,
      etc...), displaying f1 doesn't look redundant but rather
      necessary because the symbol is not well aligned anymore with
      its callchain:
      
           comm     dso        symbol
           f1
           f2
           |
           --------- [...]
      
      And we want the callchain to be obvious.
      So we fix the bug by printing the root branch, but we also
      filter its first entry if we are sorting by symbols first.
      Reported-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1256246604-17156-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 21 Oct 2009, 8 commits
    • x86, instruction decoder: Fix test_get_len build rules · 9bf4e7fb
      Committed by Ingo Molnar
      Also add the kernel source include directory to the include
      search path, to fix this build bug:
      
       In file included from arch/x86/tools/test_get_len.c:28:
         arch/x86/lib/insn.c:21:26: error: linux/string.h: No such file or directory
      
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091020165531.4145.21872.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Use strsep() over strtok_r() for parsing single line · 4e3b799d
      Committed by Steven Rostedt
      The saveptr (third) argument of strtok_r() is not meant to be
      used by the caller, and implementations may store different
      things in it. Currently the function parsing in the perf trace
      code copies data from that argument. This can crash the tool or
      just give unpredictable results.
      
      The correct solution is to use strsep() which has a defined
      result.
      
      I also added a check to see if the result was correct, and will
      break out of the loop in case it fails to parse as expected.
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091020232034.237814877@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
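To illustrate why strsep() has a defined result here: it advances a caller-owned pointer past each delimiter and sets that pointer to NULL once the input is exhausted, so the remaining text is always reachable through documented behavior. The helper name and the ':' delimiter below are illustrative assumptions, not taken from the perf trace parser.

```c
#define _DEFAULT_SOURCE   /* strsep() is a BSD/glibc extension, not ISO C */
#include <stddef.h>
#include <string.h>

/*
 * Split "line" in place on ':' and store up to max_fields pointers in
 * fields[]. Returns the number of fields stored. The buffer is modified:
 * strsep() overwrites each delimiter it consumes with '\0'.
 */
static size_t demo_split_fields(char *line, char **fields, size_t max_fields)
{
	size_t n = 0;
	char *rest = line;   /* strsep() updates this to the unparsed tail */
	char *tok;

	while (n < max_fields && (tok = strsep(&rest, ":")) != NULL)
		fields[n++] = tok;

	return n;
}
```

Unlike strtok_r(), strsep() returns an empty string for adjacent delimiters, so a caller that expects a fixed number of fields can check the returned count and stop on malformed input, in the spirit of the sanity check the commit mentions.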
    • perf tools: Add 'make DEBUG=1' to remove the -O6 cflag · 60d526f7
      Committed by Steven Rostedt
      When using gdb to debug perf, it is practically impossible to
      use when perf is compiled with -O6. For developers, this patch
      adds the DEBUG feature to the make command line so that a
      developer can easily remove the optimization flag.
      
      LKML-Reference: <1255590330.8392.446.camel@twins>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <20091020232033.984323261@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Add AES opcodes to opcode map · 9983d60d
      Committed by Masami Hiramatsu
      Add Intel AES opcodes to the x86 opcode map. These opcodes are
      used in arch/x86/crypto/aesni-intel_asm.S.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091020165531.4145.21872.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Fix group attribute decoding bug · 06ed6ba5
      Committed by Masami Hiramatsu
      Fix a typo in inat_get_group_attribute(), which should refer to
      inat_group_tables, not inat_escape_tables.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091020165524.4145.97333.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf top: Fix symbol annotation · c88e4bf6
      Committed by Arnaldo Carvalho de Melo
      We need to use map->unmap_ip() here too, to translate
      section-relative symbol addresses into the absolute addresses
      that objdump -dS prints.
      Reported-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1256061295-19835-1-git-send-email-acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf annotate: Remove requirement of passing a symbol name · 8f0b0373
      Committed by Arnaldo Carvalho de Melo
      If the user doesn't pass a symbol name to annotate, it will
      annotate all the symbols that have hits, in order, just like
      'perf report -s comm,dso,symbol'.
      
      This is a natural followup patch to the one that uses
      output_hists to find the symbols with hits.
      
      The common case is to annotate the first few entries at the top
      of a perf report, so let's type fewer characters.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1256058509-19678-1-git-send-email-acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf annotate: Use the sym_priv_size area for the histogram · e4204992
      Committed by Arnaldo Carvalho de Melo
      We have this sym_priv_size mechanism for attaching private areas
      to struct symbol entries, but annotate wasn't using it; instead
      it added private areas to struct symbol in addition to a ->priv
      pointer.
      
      Scrap all that and use the sym_priv_size mechanism.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1256055940-19511-1-git-send-email-acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 20 Oct 2009, 7 commits
    • perf tools: Add ->unmap_ip operation to struct map · ed52ce2e
      Committed by Arnaldo Carvalho de Melo
      We need this because we get section-relative addresses when
      reading the symtabs, but when a tool like 'perf annotate' needs
      to match these addresses to what 'objdump -dS' produces, we need
      the address + section back again.
      
      So in annotate now we look at the 'struct hist_entry' instances
      (that weren't really being used) so that we iterate only over
      the symbols that had some hit and get the map where that
      particular hit happened so that we can get the right address to
      match with annotate.
      
      Verified that at least:
      
       perf annotate mmap_read_counter # Uses the ~/bin/perf binary
       perf annotate --vmlinux /home/acme/git/build/perf/vmlinux intel_pmu_enable_all
      
      on a 'perf record perf top' session seems to work.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1255979877-12533-1-git-send-email-acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
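The idea of a map carrying a pair of inverse address translations can be sketched as follows. The struct layout, field names and arithmetic are assumptions made for illustration (a mapping start plus a file offset); the real struct map in tools/perf may differ.

```c
#include <stdint.h>

/* Hypothetical stand-in for a mapping with forward and reverse translation. */
struct demo_map {
	uint64_t start;   /* address where the mapping begins */
	uint64_t pgoff;   /* offset of the mapping inside the object */
	uint64_t (*map_ip)(const struct demo_map *map, uint64_t ip);
	uint64_t (*unmap_ip)(const struct demo_map *map, uint64_t rip);
};

/* Absolute address -> address relative to the object. */
static uint64_t demo_map_ip(const struct demo_map *map, uint64_t ip)
{
	return ip - map->start + map->pgoff;
}

/* Relative address -> absolute address (exact inverse of demo_map_ip). */
static uint64_t demo_unmap_ip(const struct demo_map *map, uint64_t rip)
{
	return rip + map->start - map->pgoff;
}
```

A tool that resolved a symbol through the forward translation can call the reverse one to recover an address comparable with disassembler output.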
    • perf timechart: Add a process filter · bbe2987b
      Committed by Arjan van de Ven
      During the Kernel Summit demo of perf/ftrace/timechart, there
      was a feature request to have a process filter for timechart so
      that you can zoom into one or a few processes that you are
      really interested in.
      
      This patch adds basic support for this feature: the -p
      (--process) option can now select a PID or a process name to be
      shown. Multiple -p options are allowed, and the combined set
      will be included in the output.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091020070939.7d0fb8a7@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Merge branch 'perf/urgent' into perf/core · c258449b
      Committed by Ingo Molnar
      Merge reason: Queue up dependent patch.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf timechart: Improve the visual appearance of scheduler delays · 2e600d01
      Committed by Arjan van de Ven
      [from KS feedback]
      
      Currently, scheduler delays are shown in a mostly transparent,
      light yellow color. This color is rather hard to see on several
      screens, especially projectors.
      
      This patch changes the color of the scheduler delays to a
      much "harder" yellow that survived the kernel summit
      projector.
      Reported-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091020064731.20ae126a@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf timechart: Fix the wakeup-arrows that point to non-visible processes · 3bc2a39c
      Committed by Arjan van de Ven
      The timechart wakeup arrows currently show no process
      information when the waker/wakee are processes that are not
      actually chosen to be shown on the timechart.
      
      This patch fixes this oversight, by looking through all
      processes (after giving preference to visible processes) as well
      as falling back to just showing the PID if no name for the
      process can be resolved.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091020064649.0e4959b2@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Add bunch of missing headers to LIB_H · 79b9ad36
      Committed by Arnaldo Carvalho de Melo
      Build dependencies were not properly mapped out.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1255973491-11626-1-git-send-email-acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Add missing tools/perf/util/include/string.h · 20639c15
      Committed by Arnaldo Carvalho de Melo
      To cure a bunch of:
      
      In file included from util/include/linux/bitmap.h:1,
                       from util/header.h:8,
                       from builtin-trace.c:7:
      util/include/../../../../include/linux/bitmap.h:8:26: error: linux/string.h: No such file or directory
      make: *** [builtin-trace.o] Error 1
      make: *** Waiting for unfinished jobs....
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1255972296-11500-1-git-send-email-acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 19 Oct 2009, 9 commits
    • perf stat: Count branches first · dd86e72a
      Committed by Ingo Molnar
      Count branches first, cache-misses second. The reason is that
      on x86 branches are not counted by all counters on all CPUs.
      
      Before:
      
       Performance counter stats for 'ls':
      
             0.756653  task-clock-msecs         #      0.802 CPUs
                    0  context-switches         #      0.000 M/sec
                    0  CPU-migrations           #      0.000 M/sec
                  250  page-faults              #      0.330 M/sec
              2375725  cycles                   #   3139.781 M/sec
              1628129  instructions             #      0.685 IPC
                19643  cache-references         #     25.960 M/sec
                 4608  cache-misses             #      6.090 M/sec
               342532  branches                 #    452.694 M/sec
        <not counted>  branch-misses
      
          0.000943356  seconds time elapsed
      
      After:
      
       Performance counter stats for 'ls':
      
             1.056734  task-clock-msecs         #      0.859 CPUs
                    0  context-switches         #      0.000 M/sec
                    0  CPU-migrations           #      0.000 M/sec
                  259  page-faults              #      0.245 M/sec
              3345932  cycles                   #   3166.295 M/sec
              3074090  instructions             #      0.919 IPC
               616928  branches                 #    583.806 M/sec
                39279  branch-misses            #      6.367 %
                21312  cache-references         #     20.168 M/sec
                 3661  cache-misses             #      3.464 M/sec
      
          0.001230551  seconds time elapsed
      
      (also prettify the printout of branch misses, in case it's
       getting scaled.)
      
      Cc: Tim Blechmann <tim@klingt.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4ADC3975.8050109@klingt.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ---
       tools/perf/builtin-stat.c |    2 ++
       1 files changed, 2 insertions(+), 0 deletions(-)
      
      diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
      index c373683..95a55ea 100644
      --- a/tools/perf/builtin-stat.c
      +++ b/tools/perf/builtin-stat.c
      @@ -59,6 +59,8 @@ static struct perf_event_attr default_attrs[] = {
         { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS	},
         { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES},
         { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES	},
      +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS},
      +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES	},
      
       };
      ---
       tools/perf/builtin-stat.c |   20 ++++++++++----------
       1 files changed, 10 insertions(+), 10 deletions(-)
      
      diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
      index 95a55ea..90e0a26 100644
      --- a/tools/perf/builtin-stat.c
      +++ b/tools/perf/builtin-stat.c
      @@ -50,17 +50,17 @@
      
       static struct perf_event_attr default_attrs[] = {
      
      -  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK	},
      -  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES},
      -  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS	},
      -  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS	},
      -
      -  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES	},
      -  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS	},
      -  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES},
      -  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES	},
      -  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS},
      -  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES	},
      +  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK		},
      +  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES	},
      +  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS		},
      +  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS		},
      +
      +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES		},
      +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS		},
      +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES	},
      +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES		},
      +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS	},
      +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES		},
      
       };
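The percentage shown next to branch-misses in the "After" output above is consistent with dividing branch misses by branches. A minimal sketch of that arithmetic, with a hypothetical helper name and an assumed guard for a zero branch count:

```c
/*
 * Branch-miss rate as a percentage of all branches.
 * Returns 0.0 when no branches were counted, to avoid dividing by zero.
 */
static double demo_branch_miss_percent(unsigned long long misses,
				       unsigned long long branches)
{
	if (branches == 0)
		return 0.0;
	return 100.0 * (double)misses / (double)branches;
}
```

Feeding in the sample values from the output (39279 misses out of 616928 branches) gives roughly 6.367, matching the printed figure.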
    • perf stat: Re-align the default_attrs[] array · 56aab464
      Committed by Ingo Molnar
      Clean up the array definition to be vertically aligned.
      
      No functional effects.
      
      Cc: Tim Blechmann <tim@klingt.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4ADC3975.8050109@klingt.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ---
       tools/perf/builtin-stat.c |    2 ++
       1 files changed, 2 insertions(+), 0 deletions(-)
      
      diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
      index c373683..95a55ea 100644
      --- a/tools/perf/builtin-stat.c
      +++ b/tools/perf/builtin-stat.c
      @@ -59,6 +59,8 @@ static struct perf_event_attr default_attrs[] = {
         { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS	},
         { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES},
         { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES	},
      +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS},
      +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES	},
      
       };
    • perf stat: Add branch performance events to default output · 12133aff
      Committed by Tim Blechmann
      Add performance event information about branches
      and branch misses to the default output of perf stat.
      Signed-off-by: Tim Blechmann <tim@klingt.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4ADC3975.8050109@klingt.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Display better error messages on missing packages · 1abc7f55
      Committed by Randy Dunlap
      Check for libelf headers and glibc headers separately so that
      the error message correctly identifies which package
      is missing and needs to be installed.
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: paulus@samba.org
      Cc: a.p.zijlstra@chello.nl
      Cc: efault@gmx.de
      Cc: fweisbec@gmail.com
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <4ADBCCE8.3060300@oracle.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1abc7f55
    • T
      perf top: Fix --delay_secs 0 division by zero · dc79959a
      Tim Blechmann committed
      Add a delay_secs sanity check to handle_keypress;
      this fixes a division-by-zero crash.
      Signed-off-by: Tim Blechmann <tim@klingt.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4AD9EBFD.106@klingt.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      dc79959a
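A minimal sketch of the kind of sanity check described in the entry above. It is not the code from the commit: the helper names `clamp_delay_secs` and `samples_per_sec` are hypothetical, and the assumption is simply that a refresh delay of 0 ends up as an integer divisor somewhere in the display path.

```c
#include <assert.h>

/* Hypothetical helper: force the refresh delay to at least one second,
 * so that a user-supplied 0 (or a negative value) can never be used as
 * a divisor later on. */
static int clamp_delay_secs(int requested)
{
	return requested < 1 ? 1 : requested;
}

/* Hypothetical consumer of the delay: an integer division that would
 * trap on a zero divisor if the value were not sanitized first. */
static unsigned long samples_per_sec(unsigned long samples, int delay_secs)
{
	int d = clamp_delay_secs(delay_secs);

	assert(d > 0);
	return samples / (unsigned long)d;
}
```

Clamping at the point where the value is read keeps every later division safe without adding a check at each use.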
    • F
      perf tools: Use DECLARE_BITMAP instead of an open-coded array · db9f11e3
      Frederic Weisbecker committed
      Use DECLARE_BITMAP instead of an open-coded array for our bitmap
      of featured sections.
      
      This makes the array elements unsigned long instead of u64, but
      since we use a 256-bit bitmap, the array size shouldn't vary
      between different boxes.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1255795038-13751-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      db9f11e3
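The size argument in the entry above can be checked with a short sketch. The macros below are simplified stand-ins for the kernel's bitmap macros, written here as an assumption rather than copied from the kernel headers; the struct names and `FEAT_BITS` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's bitmap macros (assumption: the
 * real definitions live in the kernel headers and may differ in detail). */
#define BITS_PER_LONG      (CHAR_BIT * sizeof(unsigned long))
#define BITS_TO_LONGS(nr)  (((nr) + BITS_PER_LONG - 1) / BITS_PER_LONG)
#define DECLARE_BITMAP(name, bits) unsigned long name[BITS_TO_LONGS(bits)]

#define FEAT_BITS 256 /* bitmap width quoted in the commit message */

/* The open-coded form: four 64-bit words. */
struct open_coded_bitmap {
	uint64_t bits[FEAT_BITS / 64];
};

/* The DECLARE_BITMAP form: eight words where long is 32 bits wide,
 * four words where long is 64 bits wide. */
struct declared_bitmap {
	DECLARE_BITMAP(bits, FEAT_BITS);
};

/* 256 is a multiple of both 32 and 64, so both layouts occupy the same
 * number of bytes whichever width unsigned long has. */
static_assert(sizeof(struct open_coded_bitmap) == sizeof(struct declared_bitmap),
	      "bitmap size must not depend on the width of unsigned long");

static size_t feature_bitmap_bytes(void)
{
	return sizeof(struct declared_bitmap);
}
```

The total size matches because 256 divides evenly by either word width. The sketch makes no claim about how bits are ordered within the words on different machines.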
    • F
      perf tools: Introduce bitmask'ed additional headers · 2ba08250
      Frederic Weisbecker committed
      This provides a new set of bitmasked headers. A new field is
      added in the perf headers that implements a bitmap storing
      optional features present in the perf.data file.
      
      The layout can be pictured like this:
      
      (Usual perf headers)(Features bitmap)[Feature 0][Feature n][Feature 255]
      
      If bit n is set, then feature n is used in this file. The
      features are all stored in order. This provides backward and
      forward compatibility.
      
      The trace_info section has moved into these optional features;
      it is the first and only one for now.
      
      This is backward compatible with the .32 file version although
      it doesn't support the previous separate trace.info file.
      
      And finally it doesn't support the current interim development
      version.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1255792354-11304-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      2ba08250
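The layout described in the entry above can be sketched as follows. This is an illustration under stated assumptions, not the perf implementation: `struct feat_header` and the `feat_*` helpers are hypothetical names, and the sketch assumes that "in order" means the optional sections appear in ascending bit order, so the position of feature n is the count of set bits below n.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define FEAT_BITS     256
#define BITS_PER_LONG ((unsigned int)(CHAR_BIT * sizeof(unsigned long)))
#define FEAT_LONGS    ((FEAT_BITS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical stand-in for the features bitmap in the file header. */
struct feat_header {
	unsigned long bits[FEAT_LONGS];
};

/* Mark feature n as present in the file. */
static void feat_set(struct feat_header *h, unsigned int n)
{
	assert(n < FEAT_BITS);
	h->bits[n / BITS_PER_LONG] |= 1UL << (n % BITS_PER_LONG);
}

/* True if bit n is set, i.e. feature n is used in this file. */
static bool feat_has(const struct feat_header *h, unsigned int n)
{
	if (n >= FEAT_BITS)
		return false;
	return (h->bits[n / BITS_PER_LONG] >> (n % BITS_PER_LONG)) & 1UL;
}

/* Position of feature n among the sections actually present, or -1 if
 * the feature is absent. Assumes sections follow ascending bit order. */
static int feat_section_index(const struct feat_header *h, unsigned int n)
{
	int idx = 0;

	if (!feat_has(h, n))
		return -1;
	for (unsigned int i = 0; i < n; i++)
		if (feat_has(h, i))
			idx++;
	return idx;
}
```

A reader that does not know a given feature can skip it, and a writer that lacks a feature leaves its bit clear, which is how a scheme of this shape stays compatible in both directions.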
    • F
      perf tools: Use kernel bitmap library · 5a116dd2
      Frederic Weisbecker committed
      Use the kernel bitmap library for internal perf tools uses.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1255792354-11304-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      5a116dd2
    • A
      perf stat: Add branch performance metric · 11018201
      Anton Blanchard committed
      When we count both branches and branch-misses it is useful to
      print out the percentage of branch-misses:
      
       # perf stat -e branches -e branch-misses /bin/true
      
       Performance counter stats for '/bin/true':
      
               401684  branches                 #      0.000 M/sec
                23301  branch-misses            #      5.801 %
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: paulus@samba.org
      Cc: a.p.zijlstra@chello.nl
      LKML-Reference: <20091018112923.GQ4808@kryten>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      11018201
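The percentage in the sample output of the entry above is the ratio of the two counters. A minimal sketch of that arithmetic follows; the function names are hypothetical and the zero guard is an assumption, not something taken from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Branch misses as a percentage of all branches. Returns 0.0 when no
 * branches were counted, to avoid dividing by zero (an assumption of
 * this sketch). */
static double branch_miss_pct(uint64_t misses, uint64_t branches)
{
	if (branches == 0)
		return 0.0;
	return 100.0 * (double)misses / (double)branches;
}

/* Format the metric with three decimals, as in the sample output. */
static int format_branch_miss(char *buf, size_t len,
			      uint64_t misses, uint64_t branches)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%8.3f %%", branch_miss_pct(misses, branches));
}
```

With the counts from the commit message, 23301 misses out of 401684 branches, the ratio rounds to the 5.801 % shown there.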
  6. 17 Oct, 2009 5 commits