1. 07 12月, 2009 1 次提交
    • U
      perf tools: Optimize parse_subsystem_tracepoint_event() · 180570fd
      Ulrich Drepper 提交于
      Uses of strcat are almost always signs that someone is too lazy
      to think about the code a bit more carefully.  One always has to
      know about the lengths of the strings involved to avoid buffer
      overflows.
      
      This is one case where the size of the object code for me is
      reduced by 38 bytes.  The code should also be faster, especially
      if flags is non-NULL.
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Cc: a.p.zijlstra@chello.nl
      Cc: fweisbec@gmail.com
      Cc: jaswinderrajput@gmail.com
      Cc: paulus@samba.org
      LKML-Reference: <200912061825.nB6IPUa1023306@hs20-bc2-1.build.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      180570fd
  2. 06 12月, 2009 1 次提交
  3. 24 11月, 2009 2 次提交
    • A
      perf tools: Introduce zalloc() for the common calloc(1, N) case · 36479484
      Arnaldo Carvalho de Melo 提交于
      This way we type less characters and it looks more like the
      kzalloc kernel counterpart.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259071517-3242-3-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      36479484
    • F
      perf tools: Add support for breakpoint events in perf tools · 1b290d67
      Frederic Weisbecker 提交于
      Add the breakpoint events support with this new sysnopsis:
      
        mem:addr[:access]
      
      Where addr is a raw addr value in the kernel and access can be
      either [r][w][x]
      
      Example to profile tasklist_lock:
      
      	$ grep tasklist_lock /proc/kallsyms
      	ffffffff8189c000 D tasklist_lock
      
      	$ perf record -e mem:0xffffffff8189c000:rw -a -f -c 1
      	$ perf report
      
      	# Samples: 62
      	#
      	# Overhead          Command  Shared Object  Symbol
      	# ........  ...............  .............  ......
      	#
      	    29.03%          swapper  [kernel]       [k] _raw_read_trylock
      	    29.03%          swapper  [kernel]       [k] _raw_read_unlock
      	    19.35%             init  [kernel]       [k] _raw_read_trylock
      	    19.35%             init  [kernel]       [k] _raw_read_unlock
      	     1.61%         events/0  [kernel]       [k] _raw_read_trylock
      	     1.61%         events/0  [kernel]       [k] _raw_read_unlock
      
      Coming soon:
      
       - Support for symbols in the event definition.
      
       - Default period to 1 for breakpoint events because these are
         not high frequency events. The same thing is needed for trace
         events.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      LKML-Reference: <1258987355-8751-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      1b290d67
  4. 09 11月, 2009 1 次提交
  5. 28 10月, 2009 1 次提交
  6. 27 10月, 2009 2 次提交
  7. 15 10月, 2009 1 次提交
    • L
      perf trace: Add filter Suppport · c171b552
      Li Zefan 提交于
      Add a new option "--filter <filter_str>" to perf record, and
      it should be right after "-e trace_point":
      
       #./perf record -R -f -e irq:irq_handler_entry --filter irq==18
       ^C
       # ./perf trace
                  perf-4303  ... irq_handler_entry: irq=18 handler=eth0
                  init-0     ... irq_handler_entry: irq=18 handler=eth0
                  init-0     ... irq_handler_entry: irq=18 handler=eth0
                  init-0     ... irq_handler_entry: irq=18 handler=eth0
                  init-0     ... irq_handler_entry: irq=18 handler=eth0
      
      See Documentation/trace/events.txt for the syntax of filter
      expressions.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4AD6955F.90602@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c171b552
  8. 13 10月, 2009 1 次提交
  9. 25 9月, 2009 1 次提交
    • E
      perf tools: Dont use openat() · 725b1368
      Eric Dumazet 提交于
      openat() is still a young glibc facility, better to not use it in a
      non performance critical program (perf list)
      
      Many machines have older glibc (RHEL 4 Update 5 -> glibc-2.3.4-2.36
      on my dev machine for example).
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ulrich Drepper <drepper@redhat.com>
      LKML-Reference: <4ABB767D.6080004@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      725b1368
  10. 21 9月, 2009 1 次提交
    • I
      perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar 提交于
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in one, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: NStephane Eranian <eranian@google.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Reviewed-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdd6c482
  11. 19 9月, 2009 1 次提交
  12. 18 9月, 2009 2 次提交
    • L
      perf trace: Sample timestamp and cpu when using record flag · 1281a49b
      Li Zefan 提交于
      Sample timestamp and cpu just like the -R option.
      
      Before:
                  init-0     [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=18 handler=eth0
                  init-0     [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=18 handler=eth0
                  init-0     [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=1 handler=i8042
                  init-0     [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=18 handler=eth0
                  init-0     [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=1 handler=i8042
      
      After:
                  init-0     [001]  7364.568965353: irq_handler_entry: irq=18 handler=eth0
                  init-0     [001]  7365.530226877: irq_handler_entry: irq=1 handler=i8042
                  init-0     [001]  7365.542831563: irq_handler_entry: irq=18 handler=eth0
                  init-0     [001]  7365.644156299: irq_handler_entry: irq=18 handler=eth0
                  init-0     [001]  7365.694556201: irq_handler_entry: irq=18 handler=eth0
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <4AB1F827.8040905@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1281a49b
    • L
      perf tools: Increase MAX_EVENT_LENGTH · 270bbbe8
      Li Zefan 提交于
      The name length of some trace events is longer than 30, like
      sys_enter_sched_get_priority_max and
      ext4_mb_discard_preallocations.
      
      Passing those events to perf-record will fail, try:
      
        # ./perf record -f -e syscalls:sys_enter_sched_get_priority_max -F 1 -a
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <4AB1F4AB.7050205@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      270bbbe8
  13. 13 9月, 2009 2 次提交
    • I
      perf sched: Implement the 'perf sched record' subcommand · 1fc35b29
      Ingo Molnar 提交于
      Implement the 'perf sched record' subcommand that adds a
      default list of events, turns on raw sampling and system-wide
      tracing and passes off the rest of the command to perf record.
      
      This is more convenient than having to specify the events all
      the time.
      
      Before:
      
       $ perf record -a -R -e sched:sched_switch:r -e sched:sched_stat_wait:r -e sched:sched_stat_sleep:r -e sched:sched_stat_iowait:r -e sched:sched_process_exit:r -e sched:sched_process_fork:r -e sched:sched_wakeup:r -e sched:sched_migrate_task:r -c 1 sleep 1
      
      After:
      
       $ perf sched record -f sleep 1
      
      Also fix an assumption in the event string parser that assumed
      that strings passed in can be modified. (In this case they wont
      be as they come from a readonly constant section.)
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1fc35b29
    • F
      perf tools: Allow the specification of all tracepoints at once · bcd3279f
      Frederic Weisbecker 提交于
      Currently, when one wants to activate every tracepoint
      counters of a subsystem from perf record, the current sequence
      is needed:
      
        perf record -e subsys:ev1 -e subsys:ev2 -e subsys:ev3
      
      This may annoy the most patient of us.
      
      Now we can just do:
      
        perf record -e subsys:*
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bcd3279f
  14. 05 9月, 2009 1 次提交
  15. 28 8月, 2009 1 次提交
  16. 16 8月, 2009 1 次提交
    • I
      perf: Enable more compiler warnings · 83a0944f
      Ingo Molnar 提交于
      Related to a shadowed variable bug fix Valdis Kletnieks noticed
      that perf does not get built with -Wshadow, which could have
      helped us avoid the bug.
      
      So enable -Wshadow and also enable the following warnings on
      perf builds, in addition to the already enabled -Wall -Wextra
      -std=gnu99 warnings:
      
       -Wcast-align
       -Wformat=2
       -Wshadow
       -Winit-self
       -Wpacked
       -Wredundant-decls
       -Wstack-protector
       -Wstrict-aliasing=3
       -Wswitch-default
       -Wswitch-enum
       -Wno-system-headers
       -Wundef
       -Wvolatile-register-var
       -Wwrite-strings
       -Wbad-function-cast
       -Wmissing-declarations
       -Wmissing-prototypes
       -Wnested-externs
       -Wold-style-definition
       -Wstrict-prototypes
       -Wdeclaration-after-statement
      
      And change/fix the perf code to build cleanly under GCC 4.3.2.
      
      The list of warnings enablement is rather arbitrary: it's based
      on my (quick) reading of the GCC manpages and trying them on
      perf.
      
      I categorized the warnings based on individually enabling them
      and looking whether they trigger something in the perf build.
      If i liked those warnings (i.e. if they trigger for something
      that arguably could be improved) i enabled the warning.
      
      If the warnings seemed to come from language laywers spamming
      the build with tons of nuisance warnings i generally kept them
      off. Most of the sign conversion related warnings were in
      this category. (A second patch enabling some of the sign
      warnings might be welcome - sign bugs can be nasty.)
      
      I also kept warnings that seem to make sense from their manpage
      description and which produced no actual warnings on our code
      base. These warnings might still be turned off if they end up
      being a nuisance.
      
      I also left out a few warnings that are not supported in older
      compilers.
      
      [ Note that these changes might break the build on older
        compilers i did not test, or on non-x86 architectures that
        produce different warnings, so more testing would be welcome. ]
      
      Reported-by: Valdis.Kletnieks@vt.edu
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      83a0944f
  17. 13 8月, 2009 1 次提交
    • F
      perf tools: Add a per tracepoint counter attribute to get raw sample · 3a9f131f
      Frederic Weisbecker 提交于
      Add a new flag field while opening a tracepoint perf counter:
      
      	-e tracepoint_subsystem:tracepoint_name:flags
      
      This is intended to be generic although for now it only supports the
      r[e[c[o[r[d]]]]] flag:
      
      	./perf record -e workqueue:workqueue_insertion:record
      	./perf record -e workqueue:workqueue_insertion:r
      
      will have the same effect: enabling the raw samples record for
      the given tracepoint counter.
      
      In the future, we may want to support further flags, separated
      by commas.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1250152039-7284-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3a9f131f
  18. 12 8月, 2009 1 次提交
  19. 09 8月, 2009 2 次提交
  20. 23 7月, 2009 2 次提交
  21. 13 7月, 2009 1 次提交
  22. 10 7月, 2009 1 次提交
  23. 01 7月, 2009 3 次提交
    • J
      perf list: Add cache events · 73c24cb8
      Jaswinder Singh Rajput 提交于
      After:
      
      $ ./perf list
      
      List of pre-defined events (to be used in -e):
      
        cpu-cycles OR cycles                     [Hardware event]
        instructions                             [Hardware event]
        cache-references                         [Hardware event]
        cache-misses                             [Hardware event]
        branch-instructions OR branches          [Hardware event]
        branch-misses                            [Hardware event]
        bus-cycles                               [Hardware event]
      
        cpu-clock                                [Software event]
        task-clock                               [Software event]
        page-faults OR faults                    [Software event]
        minor-faults                             [Software event]
        major-faults                             [Software event]
        context-switches OR cs                   [Software event]
        cpu-migrations OR migrations             [Software event]
      
        L1-d$-loads                              [Hardware cache event]
        L1-d$-load-misses                        [Hardware cache event]
        L1-d$-stores                             [Hardware cache event]
        L1-d$-store-misses                       [Hardware cache event]
        L1-d$-prefetches                         [Hardware cache event]
        L1-d$-prefetch-misses                    [Hardware cache event]
        L1-i$-loads                              [Hardware cache event]
        L1-i$-load-misses                        [Hardware cache event]
        L1-i$-prefetches                         [Hardware cache event]
        L1-i$-prefetch-misses                    [Hardware cache event]
        LLC-loads                                [Hardware cache event]
        LLC-load-misses                          [Hardware cache event]
        LLC-stores                               [Hardware cache event]
        LLC-store-misses                         [Hardware cache event]
        LLC-prefetches                           [Hardware cache event]
        LLC-prefetch-misses                      [Hardware cache event]
        dTLB-loads                               [Hardware cache event]
        dTLB-load-misses                         [Hardware cache event]
        dTLB-stores                              [Hardware cache event]
        dTLB-store-misses                        [Hardware cache event]
        dTLB-prefetches                          [Hardware cache event]
        dTLB-prefetch-misses                     [Hardware cache event]
        iTLB-loads                               [Hardware cache event]
        iTLB-load-misses                         [Hardware cache event]
        branch-loads                             [Hardware cache event]
        branch-load-misses                       [Hardware cache event]
      
        rNNN                                     [raw hardware event descriptor]
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1246453578.3072.1.camel@ht.satnam>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      73c24cb8
    • I
      perf_counter tools: Add more warnings and fix/annotate them · f37a291c
      Ingo Molnar 提交于
      Enable -Wextra. This found a few real bugs plus a number
      of signed/unsigned type mismatches/uncleanlinesses. It
      also required a few annotations
      
      All things considered it was still worth it so lets try with
      this enabled for now.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f37a291c
    • P
      perf_counter tools: Rework event string parsing/syntax · 61c45981
      Paul Mackerras 提交于
      This reworks the parser for event descriptors to make it more
      consistent in what it accepts.  It is now structured as a
      recursive descent parser for the following grammar:
      
      events		::= event ( ("," | space) space* event )*
      event		::= ( raw_event | numeric_event | symbolic_event |
      		      generic_hw_event ) [ event_modifier ]
      raw_event	::= "r" hex_number
      numeric_event	::= number ":" number
      number		::= decimal_number | "0x" hex_number | "0" octal_number
      symbolic_event	::= string_from_event_symbols_array
      generic_hw_event::= cache_type ( "-" ( cache_op | cache_result ) )*
      event_modifier	::= ":" ( "u" | "k" | "h" )+
      
      with the extra restriction that you can have at most one
      cache_op and at most one cache_result.
      
      We pass the current string pointer by reference (i.e. as a
      const char **) to the various parsing functions so that they
      can advance the pointer to indicate how much they consumed.
      They return 0 if they didn't recognize the thing at the pointer
      or 1 if they did (and advance the pointer past it).
      
      This also fixes parse_aliases to take the longest matching
      alias from the table, not the first one.  Otherwise "l1-data"
      would match the "l1-d" alias and the "ata" would not be
      consumed.
      
      This allows event modifiers indicating what processor modes to
      count in to be applied to any event, not just numeric events,
      and adds a ":h" modifier to indicate counting in hypervisor
      mode.  Specifying ":u" now sets both exclude_kernel and
      exclude_hv, and so on.  Multiple modes can be specified, e.g.
      ":uk" will count in user or hypervisor mode (i.e. only
      exclude_kernel will be set).
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <19018.53826.843815.189847@cargo.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      61c45981
  24. 26 6月, 2009 1 次提交
  25. 25 6月, 2009 2 次提交
    • J
      perf_counter tools: Shorten names for events · e5c59547
      Jaswinder Singh Rajput 提交于
      Added new alias for events.
      
      On AMD box:
      
       $ ./perf stat -e l1d -e l1d-misses -e l1d-write -e l1d-prefetch -e l1d-prefetch-miss -e l1i -e l1i-misses -e l1i-prefetch -e l2 -e l2-misses -e l2-write -e dtlb -e dtlb-misses -e itlb -e itlb-misses -e bpu -e bpu-misses -- ls -lR /usr/include/ > /dev/null
      
      Before :
      
       Performance counter stats for 'ls -lR /usr/include/':
      
            248064467  L1-data-Cache-Load-Referencees  (scaled from 23.27%)
              1001433  L1-data-Cache-Load-Misses  (scaled from 23.34%)
               153691  L1-data-Cache-Store-Referencees  (scaled from 23.34%)
               423248  L1-data-Cache-Prefetch-Referencees  (scaled from 23.33%)
               302138  L1-data-Cache-Prefetch-Misses  (scaled from 23.25%)
            251217546  L1-instruction-Cache-Load-Referencees  (scaled from 23.25%)
              5757005  L1-instruction-Cache-Load-Misses  (scaled from 23.23%)
                93435  L1-instruction-Cache-Prefetch-Referencees  (scaled from 23.24%)
              6496073  L2-Cache-Load-Referencees  (scaled from 23.32%)
               609485  L2-Cache-Load-Misses  (scaled from 23.45%)
              6876991  L2-Cache-Store-Referencees  (scaled from 23.71%)
            248922840  Data-TLB-Cache-Load-Referencees  (scaled from 23.94%)
              5828386  Data-TLB-Cache-Load-Misses  (scaled from 24.17%)
            257613506  Instruction-TLB-Cache-Load-Referencees  (scaled from 24.20%)
                 6833  Instruction-TLB-Cache-Load-Misses  (scaled from 23.88%)
            109043606  Branch-Cache-Load-Referencees  (scaled from 23.64%)
              5552296  Branch-Cache-Load-Misses  (scaled from 23.42%)
      
          0.413702461  seconds time elapsed.
      
      After :
      
       Peformance counter stats for 'ls -lR /usr/include/':
      
            266590464  L1-d$-loads           (scaled from 23.03%)
              1222273  L1-d$-load-misses     (scaled from 23.58%)
               146204  L1-d$-stores          (scaled from 23.83%)
               406344  L1-d$-prefetches      (scaled from 24.09%)
               283748  L1-d$-prefetch-misses (scaled from 24.10%)
            249650965  L1-i$-loads           (scaled from 23.80%)
              3353961  L1-i$-load-misses     (scaled from 23.82%)
               104599  L1-i$-prefetches      (scaled from 23.68%)
              4836405  LLC-loads             (scaled from 23.67%)
               498214  LLC-load-misses       (scaled from 23.66%)
              4953994  LLC-stores            (scaled from 23.64%)
            243354097  dTLB-loads            (scaled from 23.77%)
              6468584  dTLB-load-misses      (scaled from 23.74%)
            249719549  iTLB-loads            (scaled from 23.25%)
                 5060  iTLB-load-misses      (scaled from 23.00%)
            112343016  branch-loads          (scaled from 22.76%)
              5528876  branch-load-misses    (scaled from 22.54%)
      
          0.427154051  seconds time elapsed.
      
      Reported-by : Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1245934522.5308.39.camel@hpdv5.satnam>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e5c59547
    • J
      perf_counter tools: Check for valid cache operations · 06813f6c
      Jaswinder Singh Rajput 提交于
      Made new table for cache operartion stat 'hw_cache_stat' as:
      
       L1I : Read and prefetch only
       ITLB and BPU : Read-only
      
      introduce is_cache_op_valid() for cache operation validity
      
      And checks for valid cache operations.
      
      Reported-by : Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1245930367.5308.33.camel@localhost.localdomain>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      06813f6c
  26. 22 6月, 2009 3 次提交
  27. 20 6月, 2009 1 次提交
    • P
      perf_counter tools: Define and use our own u64, s64 etc. definitions · 9cffa8d5
      Paul Mackerras 提交于
      On 64-bit powerpc, __u64 is defined to be unsigned long rather than
      unsigned long long.  This causes compiler warnings every time we
      print a __u64 value with %Lx.
      
      Rather than changing __u64, we define our own u64 to be unsigned long
      long on all architectures, and similarly s64 as signed long long.
      For consistency we also define u32, s32, u16, s16, u8 and s8.  These
      definitions are put in a new header, types.h, because these definitions
      are needed in util/string.h and util/symbol.h.
      
      The main change here is the mechanical change of __[us]{64,32,16,8}
      to remove the "__".  The other changes are:
      
      * Create types.h
      * Include types.h in perf.h, util/string.h and util/symbol.h
      * Add types.h to the LIB_H definition in Makefile
      * Added (u64) casts in process_overflow_event() and print_sym_table()
        to kill two remaining warnings.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: benh@kernel.crashing.org
      LKML-Reference: <19003.33494.495844.956580@cargo.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9cffa8d5
  28. 13 6月, 2009 1 次提交
    • I
      perf stat: Reorganize output · 44175b6f
      Ingo Molnar 提交于
       - use IPC for the instruction normalization output
       - CPUs for the CPU utilization factor value.
       - print out time elapsed like the other rows
       - tidy up the task-clocks/cpu-clocks printout
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      44175b6f
  29. 12 6月, 2009 1 次提交