1. 18 8月, 2009 1 次提交
    • F
      perf tools: Fix spelling mistake in callchain error · 6ede59c4
      Frederic Weisbecker 提交于
      While running perf report -g in a perf.data file that hasn't
      been recorded in callchain mode, the error reported has a
      spelling issue:
      
      	./perf report -g
      	selected -c but no callchain data. Did you call perf record without -g?
      
      Fix it.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1250543271-8383-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6ede59c4
  2. 17 8月, 2009 4 次提交
  3. 16 8月, 2009 1 次提交
    • I
      perf: Enable more compiler warnings · 83a0944f
      Ingo Molnar 提交于
      Related to a shadowed variable bug fix Valdis Kletnieks noticed
      that perf does not get built with -Wshadow, which could have
      helped us avoid the bug.
      
      So enable -Wshadow and also enable the following warnings on
      perf builds, in addition to the already enabled -Wall -Wextra
      -std=gnu99 warnings:
      
       -Wcast-align
       -Wformat=2
       -Wshadow
       -Winit-self
       -Wpacked
       -Wredundant-decls
       -Wstack-protector
       -Wstrict-aliasing=3
       -Wswitch-default
       -Wswitch-enum
       -Wno-system-headers
       -Wundef
       -Wvolatile-register-var
       -Wwrite-strings
       -Wbad-function-cast
       -Wmissing-declarations
       -Wmissing-prototypes
       -Wnested-externs
       -Wold-style-definition
       -Wstrict-prototypes
       -Wdeclaration-after-statement
      
      And change/fix the perf code to build cleanly under GCC 4.3.2.
      
      The list of warnings enablement is rather arbitrary: it's based
      on my (quick) reading of the GCC manpages and trying them on
      perf.
      
      I categorized the warnings based on individually enabling them
      and looking whether they trigger something in the perf build.
      If i liked those warnings (i.e. if they trigger for something
      that arguably could be improved) i enabled the warning.
      
      If the warnings seemed to come from language laywers spamming
      the build with tons of nuisance warnings i generally kept them
      off. Most of the sign conversion related warnings were in
      this category. (A second patch enabling some of the sign
      warnings might be welcome - sign bugs can be nasty.)
      
      I also kept warnings that seem to make sense from their manpage
      description and which produced no actual warnings on our code
      base. These warnings might still be turned off if they end up
      being a nuisance.
      
      I also left out a few warnings that are not supported in older
      compilers.
      
      [ Note that these changes might break the build on older
        compilers i did not test, or on non-x86 architectures that
        produce different warnings, so more testing would be welcome. ]
      
      Reported-by: Valdis.Kletnieks@vt.edu
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      83a0944f
  4. 15 8月, 2009 1 次提交
  5. 13 8月, 2009 1 次提交
  6. 12 8月, 2009 4 次提交
  7. 10 8月, 2009 1 次提交
  8. 09 8月, 2009 6 次提交
    • B
      perf report: Fix and improve the displaying of per-thread event counters · 8d513270
      Brice Goglin 提交于
      Improve and fix the handling of per-thread counter stats
      recorded via perf record -s. Previously we only displayed
      it in debug printouts (-D) and even that output was hard
      to disambiguate.
      
      I moved everything to utils/values.[ch] so that we may reuse
      it in perf stat.
      
      We get something like this now:
      
       #  PID   TID  cache-misses  cache-references
         4658  4659        495581           3238779
         4658  4662        498246           3236823
         4658  4663        499531           3243162
      
      Then it'll be easy to add --pretty=raw to display a single line per thread/event.
      
      By the way, -S was also used for --symbol... So I used -T/--thread here.
      
      perf report: Add -T/--threads to display per-thread counter values
      
       We get something like this now:
       #  PID   TID  cache-misses  cache-references
         4658  4659        495581           3238779
         4658  4662        498246           3236823
         4658  4663        499531           3243162
      
      Per-thread arrays of counter values are managed in utils/values.[ch]
      Signed-off-by: NBrice Goglin <Brice.Goglin@inria.fr>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8d513270
    • F
      perf tools: callchain: Fix sum of percentages to be 100% by displaying amount... · 25446036
      Frederic Weisbecker 提交于
      perf tools: callchain: Fix sum of percentages to be 100% by displaying amount of ignored chains in fractal mode
      
      When we filter the callchains below a given percentage, we
      ignore them and the end result only shows entries that have an
      upper percentage than the filter threshold.
      
      It seems to users then that we have an imbalance in the
      percentage, as if the sum inside a profiled branch doesn't
      reach 100%.
      
      Since in the past there have been real perf report bugs that
      showed the same sypmtom, it would be nice to assure the user
      that the data is perfect and trustable and it all sums up to
      100.00%.
      
      So fix this by displaying the remaining hits that have been
      filtered but without more detail than their amount in each
      branches. Example while filtering below 50%:
      
      7.73%  [k] delay_tsc
                      |
                      |--98.22%-- __const_udelay
                      |          |
                      |          |--86.37%-- ath5k_hw_register_timeout
                      |          |          ath5k_hw_noise_floor_calibration
                      |          |          ath5k_hw_reset
                      |          |          ath5k_reset
                      |          |          ath5k_config
                      |          |          ieee80211_hw_config
                      |          |          |
                      |          |          |--88.53%-- ieee80211_scan_work
                      |          |          |          worker_thread
                      |          |          |          kthread
                      |          |          |          child_rip
                      |          |           --11.47%-- [...]
                      |           --13.63%-- [...]
                       --1.78%-- [...]
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1249690585-9145-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      25446036
    • F
      perf tools: callchain: Fix 'perf report' display to be callchain by default · b1a88349
      Frederic Weisbecker 提交于
      If we recorded with -g option to record the callchain, right now
      we require a -g option to perf report as well - and people reported
      this as unnecessary complication: the user already specified -g
      once, no need to require it a second time.
      
      So if the recording includes call-chains, display the callchain by
      default from perf report.
      
      ( The user can override this default using "-g none" option from
        perf report. )
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1249690585-9145-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b1a88349
    • A
      perf report: Add debug help for the finding of symbol bugs - show the symtab... · 94cb9e38
      Arnaldo Carvalho de Melo 提交于
      perf report: Add debug help for the finding of symbol bugs - show the symtab origin (DSO, build-id, kernel, etc)
      
      Used with perf report --verbose:
      
      [acme@doppio linux-2.6-tip]$ perf report -v | head -16
           5.17%  firefox  /usr/lib64/xulrunner-1.9.1/libxul.so   0x00000000005d8eee f [.] imgContainer::DrawFrameTo(gfxIImageFrame*, gfxIImageFrame*, nsRect&)
           2.56%  firefox  /lib64/libpthread-2.10.1.so            0x0000000000008e02 d [.] __pthread_mutex_lock_internal
           1.94%  firefox  /usr/lib64/xulrunner-1.9.1/libxul.so   0x0000000000d0af8f f [.] SearchTable
           1.75%  firefox  [kernel]                               0xffffffffff60013b k [.] vread_hpet
           1.63%  firefox  /lib64/libpthread-2.10.1.so            0x000000000000a404 d [.] __pthread_mutex_unlock
           1.47%  firefox  /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000482ea f [.] js_Interpret
           1.42%  firefox  /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x000000000003eda3 f [.] JS_CallTracer
           1.24%  firefox  [kernel]                               0xffffffff8102ca4a k [k] read_hpet
           1.16%  firefox  [kernel]                               0xffffffff810f3dd4 k [k] fget_light
           1.11%  firefox  /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000567ff f [.] js_TraceObject
           0.98%  firefox  /usr/lib64/firefox-3.5.2/firefox       0x000000000000dd23 b [.] arena_ralloc
      [acme@doppio linux-2.6-tip]$
      
      The new field is just after the symbol address. To help in
      figuring out symbol resolution bugs.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      94cb9e38
    • P
      perf report: Fix per task mult-counter stat reporting · 8f18aec5
      Peter Zijlstra 提交于
      Brice Goglin reported:
      
      > I can easily sort them by thread id, but I don't know how to match
      > my 4 events with each group of 4 lines.
      
      Also report the counter id and the time running/enabled
      stats (in case the counter got time-shared).
      Reported-by: NBrice Goglin <Brice.Goglin@inria.fr>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Tested-by: NBrice Goglin <Brice.Goglin@inria.fr>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8f18aec5
    • F
      perf tools: Fix call-chain cumul hit based sub-total (fractal mode) · 1953287b
      Frederic Weisbecker 提交于
      The callchain fractal mode builds each new total hits in a new
      branch of profiling by using the parent's hits of the current
      branch plus the hits of the children.
      
      This is wrong, the total hits of a branch should be made of the
      sum of every children hits, we must ignore the parent hits in
      this scope.
      
      This patch also fixes another mistake with the hit counting.
      
      Now the rates are correct.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1953287b
  9. 05 8月, 2009 1 次提交
  10. 04 8月, 2009 1 次提交
  11. 02 8月, 2009 1 次提交
  12. 23 7月, 2009 1 次提交
    • P
      perf_counter: PERF_SAMPLE_ID and inherited counters · 7f453c24
      Peter Zijlstra 提交于
      Anton noted that for inherited counters the counter-id as provided by
      PERF_SAMPLE_ID isn't mappable to the id found through PERF_RECORD_ID
      because each inherited counter gets its own id.
      
      His suggestion was to always return the parent counter id, since that
      is the primary counter id as exposed. However, these inherited
      counters have a unique identifier so that events like
      PERF_EVENT_PERIOD and PERF_EVENT_THROTTLE can be specific about which
      counter gets modified, which is important when trying to normalize the
      sample streams.
      
      This patch removes PERF_EVENT_PERIOD in favour of PERF_SAMPLE_PERIOD,
      which is more useful anyway, since changing periods became a lot more
      common than initially thought -- rendering PERF_EVENT_PERIOD the less
      useful solution (also, PERF_SAMPLE_PERIOD reports the more accurate
      value, since it reports the value used to trigger the overflow,
      whereas PERF_EVENT_PERIOD simply reports the requested period changed,
      which might only take effect on the next cycle).
      
      This still leaves us PERF_EVENT_THROTTLE to consider, but since that
      _should_ be a rare occurrence, and linking it to a primary id is the
      most useful bit to diagnose the problem, we introduce a
      PERF_SAMPLE_STREAM_ID, for those few cases where the full
      reconstruction is important.
      
      [Does change the ABI a little, but I see no other way out]
      Suggested-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1248095846.15751.8781.camel@twins>
      7f453c24
  13. 18 7月, 2009 1 次提交
  14. 12 7月, 2009 3 次提交
    • A
      perf report: Introduce -n/--show-nr-samples · e3d7e183
      Arnaldo Carvalho de Melo 提交于
      [acme@doppio pahole]$ perf report -ns comm,dso,symbol -d /lib64/libc-2.10.1.so -C pahole | head -17
          21.94%      32101  [.] _int_malloc
          20.10%      29402  [.] __GI_strcmp
          16.77%      24533  [.] __tsearch
          12.61%      18450  [.] malloc_consolidate
           6.42%       9394  [.] _int_free
           6.28%       9191  [.] __tfind
           4.56%       6678  [.] __GI___libc_free
           4.46%       6520  [.] _IO_vfprintf_internal
           2.59%       3786  [.] __malloc
           1.17%       1716  [.] __GI_memcpy
      [acme@doppio pahole]$
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1247325517-12272-5-git-send-email-acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e3d7e183
    • A
      perf report: Make the output more compact · 021191b3
      Arnaldo Carvalho de Melo 提交于
      When we filter by column content we may end up with a column
      that has the same value for all the lines. So remove that
      column and tell its unique value on the top, as a comment.
      
      Example:
      
        [acme@doppio pahole]$  perf report --sort comm,dso,symbol -d ./build/libdwarves.so.1.0.0 -C pahole | head -15
        # dso: ./build/libdwarves.so.1.0.0
        # comm: pahole
        # Samples: 58409
        #
        # Overhead  Symbol
        # ........  ......
        #
            20.93%  [.] tag__recode_dwarf_type
            14.94%  [.] namespace__recode_dwarf_types
            10.38%  [.] cu__table_add_tag
             6.69%  [.] __die__process_tag
             5.05%  [.] die__process_function
             4.70%  [.] list__for_all_tags
             3.68%  [.] tag__init
             3.48%  [.] die__create_new_parameter
        [acme@doppio pahole]$
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1247325517-12272-3-git-send-email-acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      021191b3
    • A
      perf report: Tidy up reporting of symbols not found · 60c1baf1
      Arnaldo Carvalho de Melo 提交于
      Always printing the level info about if it is in the kernel,
      hypervisor or userspace as that is in the hist_entry.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1247325517-12272-1-git-send-email-acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      60c1baf1
  15. 11 7月, 2009 1 次提交
    • A
      perf report: Adjust column width to the values sampled · 52d422de
      Arnaldo Carvalho de Melo 提交于
      Auto-adjust column width of perf report output to the
      longest occuring string length.
      
      Example:
      
      [acme@doppio pahole]$  perf report --sort comm,dso,symbol | head -13
      
          12.79%   pahole  /usr/lib64/libdw-0.141.so    [.] __libdw_find_attr
           8.90%   pahole  /lib64/libc-2.10.1.so        [.] _int_malloc
           8.68%   pahole  /usr/lib64/libdw-0.141.so    [.] __libdw_form_val_len
           8.15%   pahole  /lib64/libc-2.10.1.so        [.] __GI_strcmp
           6.80%   pahole  /lib64/libc-2.10.1.so        [.] __tsearch
           5.54%   pahole  ./build/libdwarves.so.1.0.0  [.] tag__recode_dwarf_type
      [acme@doppio pahole]$
      
      [acme@doppio pahole]$  perf report --sort comm,dso,symbol -d /lib64/libc-2.10.1.so | head -10
      
          21.92%   pahole  /lib64/libc-2.10.1.so  [.] _int_malloc
          20.08%   pahole  /lib64/libc-2.10.1.so  [.] __GI_strcmp
          16.75%   pahole  /lib64/libc-2.10.1.so  [.] __tsearch
      [acme@doppio pahole]$
      
      Also add these extra options to control the new behaviour:
      
        -w, --field-width
      
      Force each column width to the provided list, for large terminal
      readability.
      
        -t, --field-separator:
      
      Use a special separator character and don't pad with spaces, replacing
      all occurances of this separator in symbol names (and other output) with
      a '.' character, that thus it's the only non valid separator.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20090711014728.GH3452@ghostprotocols.net>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      52d422de
  16. 05 7月, 2009 4 次提交
    • F
      perf report: Add "Fractal" mode output - support callchains with relative overhead rate · 805d127d
      Frederic Weisbecker 提交于
      The current callchain displays the overhead rates as absolute:
      relative to the total overhead.
      
      This patch provides relative overhead percentage, in which each
      branch of the callchain tree is a independant instrumentated object.
      
      This provides a 'fractal' view of the call-chain profile: each
      sub-graph looks like a profile in itself - relative to its parent.
      
      You can produce such output by using the "fractal" mode
      that you can abbreviate via f, fr, fra, frac, etc...
      
      ./perf report -s sym -c fractal
      
      Example:
      
           8.46%  [k] copy_user_generic_string
                      |
                      |--52.01%-- generic_file_aio_read
                      |          do_sync_read
                      |          vfs_read
                      |          |
                      |          |--97.20%-- sys_pread64
                      |          |          system_call_fastpath
                      |          |          pread64
                      |          |
                      |           --2.81%-- sys_read
                      |                     system_call_fastpath
                      |                     __read
                      |
                      |--39.85%-- generic_file_buffered_write
                      |          __generic_file_aio_write_nolock
                      |          generic_file_aio_write
                      |          do_sync_write
                      |          reiserfs_file_write
                      |          vfs_write
                      |          |
                      |          |--97.05%-- sys_pwrite64
                      |          |          system_call_fastpath
                      |          |          __pwrite64
                      |          |
                      |           --2.95%-- sys_write
                      |                     system_call_fastpath
                      |                     __write_nocancel
      [...]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246772361-9960-5-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      805d127d
    • F
      perf report: Change default callchain parameters · 94a8eb02
      Frederic Weisbecker 提交于
      The default callchain parameters are set to use the flat mode and never
      filter any overhead threshold of backtrace.
      
      But flat mode is boring compared to graph mode.
      Also the number of callchains may be very high if none is
      filtered.
      
      Let's change this to set the graph view and a minimum overhead of 0.5%
      as default parameters.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246772361-9960-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      94a8eb02
    • F
      perf report: Use a modifiable string for default callchain options · be903885
      Frederic Weisbecker 提交于
      If the user doesn't provide options to tune his callchain output
      (ie: if he uses -c without arguments) then the default value passed
      in the OPT_CALLBACK_DEFAULT() macro is used.
      
      But it's parsed later by strtok() which will replace comma separators
      to a zero. This may segfault as we are using a read-only string.
      
      Use a modifiable one instead, and also fix the "100%" default
      minimum threshold value by turning it into a 0 (output every callchains)
      as it was intended in the origin.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246772361-9960-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      be903885
    • F
      perf report: Warn on callchain output request from non-callchain file · 91b4eaea
      Frederic Weisbecker 提交于
      perf report segfaults while trying to handle callchains from a non
      callchain data file.
      
      Instead of a segfault, print a useful message to the user.
      Reported-by: NJens Axboe <jens.axboe@oracle.com>
      Reported-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246772361-9960-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      91b4eaea
  17. 03 7月, 2009 5 次提交
    • I
      perf report: Annotate variable initialization · 029e5b16
      Ingo Molnar 提交于
      Certain versions of GCC dont see the initialization that is done here:
      
        builtin-report.c: In function ‘__cmd_report’:
        builtin-report.c:1038: warning: ‘syms’ may be used uninitialized in this function
      
      So annotate it with a NULL initialization.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      029e5b16
    • F
      perf_counter tools: Display percents of hits in callchain with overhead colors · 24b57c69
      Frederic Weisbecker 提交于
      This adds the use of colors to signal at a glance the important
      overhead thresholds in callchains hit rates.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246558475-10624-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      24b57c69
    • F
      perf_counter tools: Provide helper to print percents color · 1e11fd82
      Frederic Weisbecker 提交于
      Among perf annotate, perf report and perf top, we can find the
      common colored printing of percents according to the following
      rules:
      
          High overhead =  > 5%, colored in red
          Mid overhead =  > 0.5%, colored in green
          Low overhead =  < 0.5%, default color
      
      Factorize these multiple checks in a single function named
      percent_color_fprintf() and also provide a get_percent_color()
      for sites which print percentages and other things at the same
      time.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246558475-10624-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1e11fd82
    • F
      perf_counter tools: Set the minimum percent for callchains to be displayed · c20ab37e
      Frederic Weisbecker 提交于
      Callchains output may become a burden on a trace because even
      rarely hit site are exposed. This can be too much information.
      
      Let the user set a threshold as a minimum percent of hits using
      the new pattern for the -c option:
      
          -c mode,min_percent
      
      Example:
      
      $ perf report -s sym -c flat,4
      
           8.25%  [k] copy_user_generic_string
                   4.19%
                      copy_user_generic_string
                      generic_file_aio_read
                      do_sync_read
                      vfs_read
                      sys_pread64
                      system_call_fastpath
                      pread64
      
           5.39%  [k] search_by_key
           4.63%  0x00000000009e0a
           2.36%  [k] memcpy_c
      [...]
      
      $ perf report -s sym -c graph,2
      
           8.25%  [k] copy_user_generic_string
                      |
                      |--4.31%-- generic_file_aio_read
                      |          do_sync_read
                      |          vfs_read
                      |          |
                      |           --4.19%-- sys_pread64
                      |                     system_call_fastpath
                      |                     pread64
                      |
                       --3.24%-- generic_file_buffered_write
                                 __generic_file_aio_write_nolock
                                 generic_file_aio_write
                                 do_sync_write
                                 reiserfs_file_write
                                 vfs_write
                                 |
                                  --3.14%-- sys_pwrite64
                                            system_call_fastpath
                                            __pwrite64
      
           5.39%  [k] search_by_key
                      |
                       --2.23%-- reiserfs_update_sd_size
      
           4.63%  0x00000000009e0a
      
           2.36%  [k] memcpy_c
      [...]
      
      You can also omit it and it will default to 0.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246558475-10624-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c20ab37e
    • F
      perf report: Add support for callchain graph output · 4eb3e478
      Frederic Weisbecker 提交于
      Currently, the printing of callchains is done in a single
      vertical level, this is the "flat" mode:
      
      8.25%  [k] copy_user_generic_string
                   4.19%
                      copy_user_generic_string
                      generic_file_aio_read
                      do_sync_read
                      vfs_read
                      sys_pread64
                      system_call_fastpath
                      pread64
      
      This patch introduces a new "graph" mode which provides a
      hierarchical output of factorized paths recursively sorted:
      
       8.25%  [k] copy_user_generic_string
                      |
                      |--4.31%-- generic_file_aio_read
                      |          do_sync_read
                      |          vfs_read
                      |          |
                      |          |--4.19%-- sys_pread64
                      |          |          system_call_fastpath
                      |          |          pread64
                      |          |
                      |           --0.12%-- sys_read
                      |                     system_call_fastpath
                      |                     __read
                      |
                      |--3.24%-- generic_file_buffered_write
                      |          __generic_file_aio_write_nolock
                      |          generic_file_aio_write
                      |          do_sync_write
                      |          reiserfs_file_write
                      |          vfs_write
                      |          |
                      |          |--3.14%-- sys_pwrite64
                      |          |          system_call_fastpath
                      |          |          __pwrite64
                      |          |
                      |           --0.10%-- sys_write
      [...]
      
      The command line has then changed.
      
      By providing the -c option, the callchain will output in the
      flat mode by default.
      
      But you can override it:
      
          perf report -c graph
      
      or
      
          perf report -c flat
      
      You can also pass the abreviated mode:
      
          perf report -c g
      
      or
      
          perf report -c gra
      
      will both make use of the graph mode.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246550301-8954-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4eb3e478
  18. 02 7月, 2009 3 次提交