1. 22 5月, 2011 6 次提交
  2. 20 5月, 2011 3 次提交
  3. 19 5月, 2011 19 次提交
    • I
      perf stat: Add more cache-miss percentage printouts · c3305257
      Ingo Molnar 提交于
      Print out the cache-miss percentage as well if the cache refs were
      collected, for all the generic cache event types.
      
      Before:
      
         11,103,723,230 dTLB-loads                #  622.471 M/sec                    ( +-  0.30% )
             87,065,337 dTLB-load-misses          #    4.881 M/sec                    ( +-  0.90% )
      
      After:
      
         11,353,713,242 dTLB-loads                #  626.020 M/sec                    ( +-  0.35% )
            113,393,472 dTLB-load-misses          #    1.00% of all dTLB cache hits   ( +-  0.49% )
      
      Also ASCII color highlight too high percentages, them when it's executed on the console.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-lkhwxsevdbd9a8nymx0vxc3y@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      c3305257
    • I
      perf stat: Add -d -d and -d -d -d options to show more CPU events · 2cba3ffb
      Ingo Molnar 提交于
      Print even more detailed statistics if requested via perf stat -d:
      
             -d:          detailed events, L1 and LLC data cache
          -d -d:     more detailed events, dTLB and iTLB events
       -d -d -d:     very detailed events, adding prefetch events
      
      Full output looks like this now:
      
       Performance counter stats for '/home/mingo/hackbench 10' (5 runs):
      
             1703.674707 task-clock                #    8.709 CPUs utilized            ( +-  4.19% )
                  49,068 context-switches          #    0.029 M/sec                    ( +- 16.66% )
                   8,303 CPU-migrations            #    0.005 M/sec                    ( +- 24.90% )
                  17,397 page-faults               #    0.010 M/sec                    ( +-  0.46% )
           2,345,389,239 cycles                    #    1.377 GHz                      ( +-  4.61% ) [55.90%]
           1,884,503,527 stalled-cycles-frontend   #   80.35% frontend cycles idle     ( +-  5.67% ) [50.39%]
             743,919,737 stalled-cycles-backend    #   31.72% backend  cycles idle     ( +-  8.75% ) [49.91%]
           1,314,416,379 instructions              #    0.56  insns per cycle
                                                   #    1.43  stalled cycles per insn  ( +-  2.53% ) [60.87%]
             272,592,567 branches                  #  160.003 M/sec                    ( +-  1.74% ) [56.56%]
               3,794,846 branch-misses             #    1.39% of all branches          ( +-  6.59% ) [58.50%]
             449,982,778 L1-dcache-loads           #  264.125 M/sec                    ( +-  2.47% ) [49.88%]
              22,404,961 L1-dcache-load-misses     #    4.98% of all L1-dcache hits    ( +-  6.08% ) [55.05%]
               6,204,750 LLC-loads                 #    3.642 M/sec                    ( +-  8.91% ) [43.75%]
               1,837,411 LLC-load-misses           #    1.078 M/sec                    ( +-  7.27% ) [12.07%]
             411,440,421 L1-icache-loads           #  241.502 M/sec                    ( +-  5.60% ) [36.52%]
              27,556,832 L1-icache-load-misses     #   16.175 M/sec                    ( +-  7.46% ) [46.72%]
             464,067,627 dTLB-loads                #  272.392 M/sec                    ( +-  4.46% ) [54.17%]
              10,765,648 dTLB-load-misses          #    6.319 M/sec                    ( +-  3.18% ) [48.68%]
           1,273,080,386 iTLB-loads                #  747.256 M/sec                    ( +-  3.38% ) [47.53%]
                 117,481 iTLB-load-misses          #    0.069 M/sec                    ( +- 14.99% ) [47.01%]
               4,590,653 L1-dcache-prefetches      #    2.695 M/sec                    ( +-  4.49% ) [46.19%]
               1,712,660 L1-dcache-prefetch-misses #    1.005 M/sec                    ( +-  3.75% ) [44.82%]
      
              0.195622057  seconds time elapsed  ( +-  6.84% )
      
      Also clean up the attribute construction code to be appending, and factor
      it out into add_default_attributes().
      
      Tweak the coverage percentage printout a bit, so that it's easier to view it
      alongside the +- sttddev colum.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-to3kgu04449s64062val8b62@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      2cba3ffb
    • M
      ftrace/kbuild: Add recordmcount files to force full build · d6971822
      Michal Marek 提交于
      Modifications to recordmcount must be performed on all object
      files to stay consistent with what the kernel code may expect.
      Add the recordmcount files to the main dependencies to make sure
      any change to them causes a full recompile.
      Signed-off-by: NMichal Marek <mmarek@suse.cz>
      Link: http://lkml.kernel.org/r/20110517133646.GP13293@sepie.suse.czSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d6971822
    • S
      ftrace: Add self-tests for multiple function trace users · 95950c2e
      Steven Rostedt 提交于
      Add some basic sanity tests for multiple users of the function
      tracer at startup.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      95950c2e
    • S
      ftrace: Modify ftrace_set_filter/notrace to take ops · 936e074b
      Steven Rostedt 提交于
      Since users of the function tracer can now pick and choose which
      functions they want to trace agnostically from other users of the
      function tracer, we need to pass the ops struct to the ftrace_set_filter()
      functions.
      
      The functions ftrace_set_global_filter() and ftrace_set_global_notrace()
      is added to keep the old filter functions which are used to modify
      the generic function tracers.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      936e074b
    • S
      ftrace: Allow dynamically allocated function tracers · cdbe61bf
      Steven Rostedt 提交于
      Now that functions may be selected individually, it only makes sense
      that we should allow dynamically allocated trace structures to
      be traced. This will allow perf to allocate a ftrace_ops structure
      at runtime and use it to pick and choose which functions that
      structure will trace.
      
      Note, a dynamically allocated ftrace_ops will always be called
      indirectly instead of being called directly from the mcount in
      entry.S. This is because there's no safe way to prevent mcount
      from being preempted before calling the function, unless we
      modify every entry.S to do so (not likely). Thus, dynamically allocated
      functions will now be called by the ftrace_ops_list_func() that
      loops through the ops that are allocated if there are more than
      one op allocated at a time. This loop is protected with a
      preempt_disable.
      
      To determine if an ftrace_ops structure is allocated or not, a new
      util function was added to the kernel/extable.c called
      core_kernel_data(), which returns 1 if the address is between
      _sdata and _edata.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      cdbe61bf
    • S
      ftrace: Implement separate user function filtering · b848914c
      Steven Rostedt 提交于
      ftrace_ops that are registered to trace functions can now be
      agnostic to each other in respect to what functions they trace.
      Each ops has their own hash of the functions they want to trace
      and a hash to what they do not want to trace. A empty hash for
      the functions they want to trace denotes all functions should
      be traced that are not in the notrace hash.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      b848914c
    • S
      ftrace: Free hash with call_rcu_sched() · 07fd5515
      Steven Rostedt 提交于
      When a hash is modified and might be in use, we need to perform
      a schedule RCU operation on it, as the hashes will soon be used
      directly in the function tracer callback.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      07fd5515
    • S
      ftrace: Have global_ops store the functions that are to be traced · 2b499381
      Steven Rostedt 提交于
      This is a step towards each ops structure defining its own set
      of functions to trace. As the current code with pid's and such
      are specific to the global_ops, it is restructured to be used
      with the global ops.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      2b499381
    • S
      ftrace: Add ops parameter to ftrace_startup/shutdown functions · bd69c30b
      Steven Rostedt 提交于
      In order to allow different ops to enable different functions,
      the ftrace_startup() and ftrace_shutdown() functions need the
      ops parameter passed to them.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      bd69c30b
    • S
      ftrace: Add enabled_functions file · 647bcd03
      Steven Rostedt 提交于
      Add the enabled_functions file that is used to show all the
      functions that have been enabled for tracing as well as their
      ref counts. This helps seeing if any function has been registered
      and what functions are being traced.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      647bcd03
    • S
      ftrace: Use counters to enable functions to trace · ed926f9b
      Steven Rostedt 提交于
      Every function has its own record that stores the instruction
      pointer and flags for the function to be traced. There are only
      two flags: enabled and free. The enabled flag states that tracing
      for the function has been enabled (actively traced), and the free
      flag states that the record no longer points to a function and can
      be used by new functions (loaded modules).
      
      These flags are now moved to the MSB of the flags (actually just
      the top 32bits). The rest of the bits (30 bits) are now used as
      a ref counter. Everytime a tracer register functions to trace,
      those functions will have its counter incremented.
      
      When tracing is enabled, to determine if a function should be traced,
      the counter is examined, and if it is non-zero it is set to trace.
      
      When a ftrace_ops is registered to trace functions, its hashes
      are examined. If the ftrace_ops filter_hash count is zero, then
      all functions are set to be traced, otherwise only the functions
      in the hash are to be traced. The exception to this is if a function
      is also in the ftrace_ops notrace_hash. Then that function's counter
      is not incremented for this ftrace_ops.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      ed926f9b
    • S
      ftrace: Separate hash allocation and assignment · 33dc9b12
      Steven Rostedt 提交于
      When filtering, allocate a hash to insert the function records.
      After the filtering is complete, assign it to the ftrace_ops structure.
      
      This allows the ftrace_ops structure to have a much smaller array of
      hash buckets instead of wasting a lot of memory.
      
      A read only empty_hash is created to be the minimum size that any ftrace_ops
      can point to.
      
      When a new hash is created, it has the following steps:
      
      o Allocate a default hash.
      o Walk the function records assigning the filtered records to the hash
      o Allocate a new hash with the appropriate size buckets
      o Move the entries from the default hash to the new hash.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      33dc9b12
    • S
      ftrace: Create a global_ops to hold the filter and notrace hashes · f45948e8
      Steven Rostedt 提交于
      Combine the filter and notrace hashes to be accessed by a single entity,
      the global_ops. The global_ops is a ftrace_ops structure that is passed
      to different functions that can read or modify the filtering of the
      function tracer.
      
      The ftrace_ops structure was modified to hold a filter and notrace
      hashes so that later patches may allow each ftrace_ops to have its own
      set of rules to what functions may be filtered.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f45948e8
    • S
      ftrace: Use hash instead for FTRACE_FL_FILTER · 1cf41dd7
      Steven Rostedt 提交于
      When multiple users are allowed to have their own set of functions
      to trace, having the FTRACE_FL_FILTER flag will not be enough to
      handle the accounting of those users. Each user will need their own
      set of functions.
      
      Replace the FTRACE_FL_FILTER with a filter_hash instead. This is
      temporary until the rest of the function filtering accounting
      gets in.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      1cf41dd7
    • S
      ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions · b448c4e3
      Steven Rostedt 提交于
      To prepare for the accounting system that will allow multiple users of
      the function tracer, having the FTRACE_FL_NOTRACE as a flag in the
      dyn_trace record does not make sense.
      
      All ftrace_ops will soon have a hash of functions they should trace
      and not trace. By making a global hash of functions not to trace makes
      this easier for the transition.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      b448c4e3
    • I
      perf bench, x86: Add alternatives-asm.h wrapper · b3132072
      Ingo Molnar 提交于
      perf bench needs this to build the kernel's memcpy routine:
      
      In file included from bench/mem-memcpy-x86-64-asm.S:2:0:
      bench/../../../arch/x86/lib/memcpy_64.S:7:33: fatal error: asm/alternative-asm.h: No such file or directory
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-c5d41xibgullk8h2280q4gv0@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      b3132072
    • I
      Merge branch 'x86/mem' into perf/core · 01ed58ab
      Ingo Molnar 提交于
      Merge reason: memcpy_64.S changes an assumption perf bench has, so merge this
                    here so we can fix it.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      01ed58ab
    • I
      Merge branch 'tip/perf/core-2' of... · af2d03d4
      Ingo Molnar 提交于
      Merge branch 'tip/perf/core-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
      af2d03d4
  4. 18 5月, 2011 12 次提交