1. 06 3月, 2009 1 次提交
  2. 25 2月, 2009 1 次提交
    • S
      tracing: add event trace infrastructure · b77e38aa
      Steven Rostedt 提交于
      This patch creates the event tracing infrastructure of ftrace.
      It will create the files:
      
       /debug/tracing/available_events
       /debug/tracing/set_event
      
      The available_events will list the trace points that have been
      registered with the event tracer.
      
      set_events will allow the user to enable or disable an event hook.
      
      example:
      
       # echo sched_wakeup > /debug/tracing/set_event
      
      Will enable the sched_wakeup event (if it is registered).
      
       # echo "!sched_wakeup" >> /debug/tracing/set_event
      
      Will disable the sched_wakeup event (and only that event).
      
       # echo > /debug/tracing/set_event
      
      Will disable all events (notice the '>')
      
       # cat /debug/tracing/available_events > /debug/tracing/set_event
      
      Will enable all registered event hooks.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      b77e38aa
  3. 19 2月, 2009 1 次提交
    • S
      tracing: have function trace select kallsyms · 4d7a077c
      Steven Rostedt 提交于
      Impact: fix output of function tracer to be useful
      
      The function tracer is pretty useless if KALLSYMS is not configured.
      Unless you are good at reading hex values, the function tracer should
      select the KALLSYMS configuration.
      
      Also, the dynamic function tracer will fail its self test if KALLSYMS
      is not selected.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      4d7a077c
  4. 16 2月, 2009 1 次提交
    • P
      trace: mmiotrace to the tracer menu in Kconfig · 6bc5c366
      Pekka Paalanen 提交于
      Impact: cosmetic change in Kconfig menu layout
      
      This patch was originally suggested by Peter Zijlstra, but seems it
      was forgotten.
      
      CONFIG_MMIOTRACE and CONFIG_MMIOTRACE_TEST were selectable
      directly under the Kernel hacking / debugging menu in the kernel
      configuration system. They were present only for x86 and x86_64.
      
      Other tracers that use the ftrace tracing framework are in their own
      sub-menu. This patch moves the mmiotrace configuration options there.
      Since the Kconfig file, where the tracer menu is, is not architecture
      specific, HAVE_MMIOTRACE_SUPPORT is introduced and provided only by
      x86/x86_64. CONFIG_MMIOTRACE now depends on it.
      Signed-off-by: NPekka Paalanen <pq@iki.fi>
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6bc5c366
  5. 11 2月, 2009 1 次提交
  6. 09 2月, 2009 2 次提交
  7. 08 2月, 2009 1 次提交
    • S
      ring-buffer: add NMI protection for spinlocks · 78d904b4
      Steven Rostedt 提交于
      Impact: prevent deadlock in NMI
      
      The ring buffers are not yet totally lockless with writing to
      the buffer. When a writer crosses a page, it grabs a per cpu spinlock
      to protect against a reader. The spinlocks taken by a writer are not
      to protect against other writers, since a writer can only write to
      its own per cpu buffer. The spinlocks protect against readers that
      can touch any cpu buffer. The writers are made to be reentrant
      with the spinlocks disabling interrupts.
      
      The problem arises when an NMI writes to the buffer, and that write
      crosses a page boundary. If it grabs a spinlock, it can be racing
      with another writer (since disabling interrupts does not protect
      against NMIs) or with a reader on the same CPU. Luckily, most of the
      users are not reentrant and protects against this issue. But if a
      user of the ring buffer becomes reentrant (which is what the ring
      buffers do allow), if the NMI also writes to the ring buffer then
      we risk the chance of a deadlock.
      
      This patch moves the ftrace_nmi_enter called by nmi_enter() to the
      ring buffer code. It replaces the current ftrace_nmi_enter that is
      used by arch specific code to arch_ftrace_nmi_enter and updates
      the Kconfig to handle it.
      
      When an NMI is called, it will set a per cpu variable in the ring buffer
      code and will clear it when the NMI exits. If a write to the ring buffer
      crosses page boundaries inside an NMI, a trylock is used on the spin
      lock instead. If the spinlock fails to be acquired, then the entry
      is discarded.
      
      This bug appeared in the ftrace work in the RT tree, where event tracing
      is reentrant. This workaround solved the deadlocks that appeared there.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      78d904b4
  8. 03 2月, 2009 1 次提交
  9. 14 1月, 2009 1 次提交
    • F
      tracing: add a new workqueue tracer · e1d8aa9f
      Frederic Weisbecker 提交于
      Impact: new tracer
      
      The workqueue tracer provides some statistical informations
      about each cpu workqueue thread such as the number of the
      works inserted and executed since their creation. It can help
      to evaluate the amount of work each of them have to perform.
      For example it can help a developer to decide whether he should
      choose a per cpu workqueue instead of a singlethreaded one.
      
      It only traces statistical informations for now but it will probably later
      provide event tracing too.
      
      Such a tracer could help too, and be improved, to help rt priority sorted
      workqueue development.
      
      To have a snapshot of the workqueues state at any time, just do
      
      cat /debugfs/tracing/trace_stat/workqueues
      
      Ie:
      
        1    125        125       reiserfs/1
        1      0          0       scsi_tgtd/1
        1      0          0       aio/1
        1      0          0       ata/1
        1    114        114       kblockd/1
        1      0          0       kintegrityd/1
        1   2147       2147       events/1
      
        0      0          0       kpsmoused
        0    105        105       reiserfs/0
        0      0          0       scsi_tgtd/0
        0      0          0       aio/0
        0      0          0       ata_aux
        0      0          0       ata/0
        0      0          0       cqueue
        0      0          0       kacpi_notify
        0      0          0       kacpid
        0    149        149       kblockd/0
        0      0          0       kintegrityd/0
        0   1000       1000       khelper
        0   2270       2270       events/0
      
      Changes in V2:
      
      _ Drop the static array based on NR_CPU and dynamically allocate the stat array
        with num_possible_cpus() and other cpu mask facilities....
      _ Trace workqueue insertion at a bit lower level (insert_work instead of queue_work) to handle
        even the workqueue barriers.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e1d8aa9f
  10. 11 1月, 2009 1 次提交
    • P
      trace: mmiotrace to the tracer menu in Kconfig · fe6f90e5
      Pekka Paalanen 提交于
      Impact: cosmetic change in Kconfig menu layout
      
      This patch was originally suggested by Peter Zijlstra, but seems it
      was forgotten.
      
      CONFIG_MMIOTRACE and CONFIG_MMIOTRACE_TEST were selectable
      directly under the Kernel hacking / debugging menu in the kernel
      configuration system. They were present only for x86 and x86_64.
      
      Other tracers that use the ftrace tracing framework are in their own
      sub-menu. This patch moves the mmiotrace configuration options there.
      Since the Kconfig file, where the tracer menu is, is not architecture
      specific, HAVE_MMIOTRACE_SUPPORT is introduced and provided only by
      x86/x86_64. CONFIG_MMIOTRACE now depends on it.
      Signed-off-by: NPekka Paalanen <pq@iki.fi>
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fe6f90e5
  11. 06 1月, 2009 1 次提交
  12. 30 12月, 2008 2 次提交
    • I
      tracing/kmemtrace: export kmemtrace_mark_alloc_node() / kmemtrace_mark_free() · 3fd4bc01
      Ingo Molnar 提交于
      Impact: build fix
      
      Also fix up Kconfig dependencies and include files.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3fd4bc01
    • F
      tracing/kmemtrace: normalize the raw tracer event to the unified tracing API · 36994e58
      Frederic Weisbecker 提交于
      Impact: new tracer plugin
      
      This patch adapts kmemtrace raw events tracing to the unified tracing API.
      
      To enable and use this tracer, just do the following:
      
       echo kmemtrace > /debugfs/tracing/current_tracer
       cat /debugfs/tracing/trace
      
      You will have the following output:
      
       # tracer: kmemtrace
       #
       #
       # ALLOC  TYPE  REQ   GIVEN  FLAGS           POINTER         NODE    CALLER
       # FREE   |      |     |       |              |   |            |        |
       # |
      
      type_id 1 call_site 18446744071565527833 ptr 18446612134395152256
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 0 call_site 18446744071565636711 ptr 18446612134345164672 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 0 call_site 18446744071565636711 ptr 18446612134345164912 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 0 call_site 18446744071565636711 ptr 18446612134345165152 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
      type_id 0 call_site 18446744071566144042 ptr 18446612134346191680 bytes_req 1304 bytes_alloc 1312 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      
      That was to stay backward compatible with the format output produced in
      inux/tracepoint.h.
      
      This is the default ouput, but note that I tried something else.
      
      If you change an option:
      
      echo kmem_minimalistic > /debugfs/trace_options
      
      and then cat /debugfs/trace, you will have the following output:
      
       # tracer: kmemtrace
       #
       #
       # ALLOC  TYPE  REQ   GIVEN  FLAGS           POINTER         NODE    CALLER
       # FREE   |      |     |       |              |   |            |        |
       # |
      
         -      C                            0xffff88007c088780          file_free_rcu
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         -      C                            0xffff88007cad6000          putname
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         +      K    240    240   000000d0   0xffff8800790dc780     -1   d_alloc
         -      C                            0xffff88007cad6000          putname
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         +      K    240    240   000000d0   0xffff8800790dc870     -1   d_alloc
         -      C                            0xffff88007cad6000          putname
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         +      K    240    240   000000d0   0xffff8800790dc960     -1   d_alloc
         +      K   1304   1312   000000d0   0xffff8800791d7340     -1   reiserfs_alloc_inode
         -      C                            0xffff88007cad6000          putname
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         -      C                            0xffff88007cad6000          putname
         +      K    992   1000   000000d0   0xffff880079045b58     -1   alloc_inode
         +      K    768   1024   000080d0   0xffff88007c096400     -1   alloc_pipe_info
         +      K    240    240   000000d0   0xffff8800790dca50     -1   d_alloc
         +      K    272    320   000080d0   0xffff88007c088780     -1   get_empty_filp
         +      K    272    320   000080d0   0xffff88007c088000     -1   get_empty_filp
      
      Yeah I shall confess kmem_minimalistic should be: kmem_alternative.
      
      Whatever, I find it more readable but this a personal opinion of course.
      We can drop it if you want.
      
      On the ALLOC/FREE column, + means an allocation and - a free.
      
      On the type column, you have K = kmalloc, C = cache, P = page
      
      I would like the flags to be GFP_* strings but that would not be easy to not
      break the column with strings....
      
      About the node...it seems to always be -1. I don't know why but that shouldn't
      be difficult to find.
      
      I moved linux/tracepoint.h to trace/tracepoint.h as well. I think that would
      be more easy to find the tracer headers if they are all in their common
      directory.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      36994e58
  13. 18 12月, 2008 1 次提交
    • S
      trace: add a way to enable or disable the stack tracer · f38f1d2a
      Steven Rostedt 提交于
      Impact: enhancement to stack tracer
      
      The stack tracer currently is either on when configured in or
      off when it is not. It can not be disabled when it is configured on.
      (besides disabling the function tracer that it uses)
      
      This patch adds a way to enable or disable the stack tracer at
      run time. It defaults off on bootup, but a kernel parameter 'stacktrace'
      has been added to enable it on bootup.
      
      A new sysctl has been added "kernel.stack_tracer_enabled" to let
      the user enable or disable the stack tracer at run time.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f38f1d2a
  14. 12 12月, 2008 1 次提交
  15. 03 12月, 2008 1 次提交
  16. 26 11月, 2008 3 次提交
  17. 23 11月, 2008 3 次提交
    • T
      tracing/stack-tracer: introduce CONFIG_USER_STACKTRACE_SUPPORT · 8d26487f
      Török Edwin 提交于
      Impact: cleanup
      
      User stack tracing is just implemented for x86, but it is not x86 specific.
      
      Introduce a generic config flag, that is currently enabled only for x86.
      When other arches implement it, they will have to
      SELECT USER_STACKTRACE_SUPPORT.
      Signed-off-by: NTörök Edwin <edwintorok@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8d26487f
    • S
      trace: profile all if conditionals · 2bcd521a
      Steven Rostedt 提交于
      Impact: feature to profile if statements
      
      This patch adds a branch profiler for all if () statements.
      The results will be found in:
      
        /debugfs/tracing/profile_branch
      
      For example:
      
         miss      hit    %        Function                  File              Line
       ------- ---------  -        --------                  ----              ----
             0        1 100 x86_64_start_reservations      head64.c             127
             0        1 100 copy_bootdata                  head64.c             69
             1        0   0 x86_64_start_kernel            head64.c             111
            32        0   0 set_intr_gate                  desc.h               319
             1        0   0 reserve_ebda_region            head.c               51
             1        0   0 reserve_ebda_region            head.c               47
             0        1 100 reserve_ebda_region            head.c               42
             0        0   X maxcpus                        main.c               165
      
      Miss means the branch was not taken. Hit means the branch was taken.
      The percent is the percentage the branch was taken.
      
      This adds a significant amount of overhead and should only be used
      by those analyzing their system.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2bcd521a
    • S
      trace: consolidate unlikely and likely profiler · 45b79749
      Steven Rostedt 提交于
      Impact: clean up to make one profiler of like and unlikely tracer
      
      The likely and unlikely profiler prints out the file and line numbers
      of the annotated branches that it is profiling. It shows the number
      of times it was correct or incorrect in its guess. Having two
      different files or sections for that matter to tell us if it was a
      likely or unlikely is pretty pointless. We really only care if
      it was correct or not.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      45b79749
  18. 16 11月, 2008 1 次提交
    • F
      tracing/function-return-tracer: support for dynamic ftrace on function return tracer · e7d3737e
      Frederic Weisbecker 提交于
      This patch adds the support for dynamic tracing on the function return tracer.
      The whole difference with normal dynamic function tracing is that we don't need
      to hook on a particular callback. The only pro that we want is to nop or set
      dynamically the calls to ftrace_caller (which is ftrace_return_caller here).
      
      Some security checks ensure that we are not trying to launch dynamic tracing for
      return tracing while normal function tracing is already running.
      
      An example of trace with getnstimeofday set as a filter:
      
      ktime_get_ts+0x22/0x50 -> getnstimeofday (2283 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1396 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1825 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1426 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1464 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1524 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1434 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1464 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1502 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1404 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1397 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1051 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1314 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1344 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1163 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1390 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1374 ns)
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e7d3737e
  19. 13 11月, 2008 1 次提交
  20. 12 11月, 2008 2 次提交
    • S
      tracing: likely/unlikely branch annotation tracer · 52f232cb
      Steven Rostedt 提交于
      Impact: new likely/unlikely branch tracer
      
      This patch adds a way to record the instances of the likely() and unlikely()
      branch condition annotations.
      
      When "unlikely" is set in /debugfs/tracing/iter_ctrl the unlikely conditions
      will be added to any of the ftrace tracers. The change takes effect when
      a new tracer is passed into the current_tracer file.
      
      For example:
      
       bash-3471  [003]   357.014755: [INCORRECT] sched_info_dequeued:sched_stats.h:177
       bash-3471  [003]   357.014756: [correct] update_curr:sched_fair.c:489
       bash-3471  [003]   357.014758: [correct] calc_delta_fair:sched_fair.c:411
       bash-3471  [003]   357.014759: [correct] account_group_exec_runtime:sched_stats.h:356
       bash-3471  [003]   357.014761: [correct] update_curr:sched_fair.c:489
       bash-3471  [003]   357.014763: [INCORRECT] calc_delta_fair:sched_fair.c:411
       bash-3471  [003]   357.014765: [correct] calc_delta_mine:sched.c:1279
      
      Which shows the normal tracer heading, as well as whether the condition was
      correct "[correct]" or was mistaken "[INCORRECT]", followed by the function,
      file name and line number.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      52f232cb
    • S
      tracing: profile likely and unlikely annotations · 1f0d69a9
      Steven Rostedt 提交于
      Impact: new unlikely/likely profiler
      
      Andrew Morton recently suggested having an in-kernel way to profile
      likely and unlikely macros. This patch achieves that goal.
      
      When configured, every(*) likely and unlikely macro gets a counter attached
      to it. When the condition is hit, the hit and misses of that condition
      are recorded. These numbers can later be retrieved by:
      
        /debugfs/tracing/profile_likely    - All likely markers
        /debugfs/tracing/profile_unlikely  - All unlikely markers.
      
      # cat /debug/tracing/profile_unlikely | head
       correct incorrect  %        Function                  File              Line
       ------- ---------  -        --------                  ----              ----
          2167        0   0 do_arch_prctl                  process_64.c         832
             0        0   0 do_arch_prctl                  process_64.c         804
          2670        0   0 IS_ERR                         err.h                34
         71230     5693   7 __switch_to                    process_64.c         673
         76919        0   0 __switch_to                    process_64.c         639
         43184    33743  43 __switch_to                    process_64.c         624
         12740    64181  83 __switch_to                    process_64.c         594
         12740    64174  83 __switch_to                    process_64.c         590
      
      # cat /debug/tracing/profile_unlikely | \
        awk '{ if ($3 > 25) print $0; }' |head -20
         44963    35259  43 __switch_to                    process_64.c         624
         12762    67454  84 __switch_to                    process_64.c         594
         12762    67447  84 __switch_to                    process_64.c         590
          1478      595  28 syscall_get_error              syscall.h            51
             0     2821 100 syscall_trace_leave            ptrace.c             1567
             0        1 100 native_smp_prepare_cpus        smpboot.c            1237
         86338   265881  75 calc_delta_fair                sched_fair.c         408
        210410   108540  34 calc_delta_mine                sched.c              1267
             0    54550 100 sched_info_queued              sched_stats.h        222
         51899    66435  56 pick_next_task_fair            sched_fair.c         1422
             6       10  62 yield_task_fair                sched_fair.c         982
          7325     2692  26 rt_policy                      sched.c              144
             0     1270 100 pre_schedule_rt                sched_rt.c           1261
          1268    48073  97 pick_next_task_rt              sched_rt.c           884
             0    45181 100 sched_info_dequeued            sched_stats.h        177
             0       15 100 sched_move_task                sched.c              8700
             0       15 100 sched_move_task                sched.c              8690
         53167    33217  38 schedule                       sched.c              4457
             0    80208 100 sched_info_switch              sched_stats.h        270
         30585    49631  61 context_switch                 sched.c              2619
      
      # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }'
         39900    36577  47 pick_next_task                 sched.c              4397
         20824    15233  42 switch_mm                      mmu_context_64.h     18
             0        7 100 __cancel_work_timer            workqueue.c          560
           617    66484  99 clocksource_adjust             timekeeping.c        456
             0   346340 100 audit_syscall_exit             auditsc.c            1570
            38   347350  99 audit_get_context              auditsc.c            732
             0   345244 100 audit_syscall_entry            auditsc.c            1541
            38     1017  96 audit_free                     auditsc.c            1446
             0     1090 100 audit_alloc                    auditsc.c            862
          2618     1090  29 audit_alloc                    auditsc.c            858
             0        6 100 move_masked_irq                migration.c          9
             1      198  99 probe_sched_wakeup             trace_sched_switch.c 58
             2        2  50 probe_wakeup                   trace_sched_wakeup.c 227
             0        2 100 probe_wakeup_sched_switch      trace_sched_wakeup.c 144
          4514     2090  31 __grab_cache_page              filemap.c            2149
         12882   228786  94 mapping_unevictable            pagemap.h            50
             4       11  73 __flush_cpu_slab               slub.c               1466
        627757   330451  34 slab_free                      slub.c               1731
          2959    61245  95 dentry_lru_del_init            dcache.c             153
           946     1217  56 load_elf_binary                binfmt_elf.c         904
           102       82  44 disk_put_part                  genhd.h              206
             1        1  50 dst_gc_task                    dst.c                82
             0       19 100 tcp_mss_split_point            tcp_output.c         1126
      
      As you can see by the above, there's a bit of work to do in rethinking
      the use of some unlikelys and likelys. Note: the unlikely case had 71 hits
      that were more than 25%.
      
      Note:  After submitting my first version of this patch, Andrew Morton
        showed me a version written by Daniel Walker, where I picked up
        the following ideas from:
      
        1)  Using __builtin_constant_p to avoid profiling fixed values.
        2)  Using __FILE__ instead of instruction pointers.
        3)  Using the preprocessor to stop all profiling of likely
             annotations from vsyscall_64.c.
      
      Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar
      for their feed back on this patch.
      
      (*) Not ever unlikely is recorded, those that are used by vsyscalls
       (a few of them) had to have profiling disabled.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Theodore Tso <tytso@mit.edu>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1f0d69a9
  21. 11 11月, 2008 1 次提交
    • F
      tracing: add a tracer to catch execution time of kernel functions · 15e6cb36
      Frederic Weisbecker 提交于
      Impact: add new tracing plugin which can trace full (entry+exit) function calls
      
      This tracer uses the low level function return ftrace plugin to
      measure the execution time of the kernel functions.
      
      The first field is the caller of the function, the second is the
      measured function, and the last one is the execution time in
      nanoseconds.
      
      - v3:
      
      - HAVE_FUNCTION_RET_TRACER have been added. Each arch that support ftrace return
        should enable it.
      - ftrace_return_stub becomes ftrace_stub.
      - CONFIG_FUNCTION_RET_TRACER depends now on CONFIG_FUNCTION_TRACER
      - Return traces printing can be used for other tracers on trace.c
      - Adapt to the new tracing API (no more ctrl_update callback)
      - Correct the check of "disabled" during insertion.
      - Minor changes...
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      15e6cb36
  22. 06 11月, 2008 1 次提交
    • S
      ftrace: add quick function trace stop · 60a7ecf4
      Steven Rostedt 提交于
      Impact: quick start and stop of function tracer
      
      This patch adds a way to disable the function tracer quickly without
      the need to run kstop_machine. It adds a new variable called
      function_trace_stop which will stop the calls to functions from mcount
      when set.  This is just an on/off switch and does not handle recursion
      like preempt_disable().
      
      It's main purpose is to help other tracers/debuggers start and stop tracing
      fuctions without the need to call kstop_machine.
      
      The config option HAVE_FUNCTION_TRACE_MCOUNT_TEST is added for archs
      that implement the testing of the function_trace_stop in the mcount
      arch dependent code. Otherwise, the test is done in the C code.
      
      x86 is the only arch at the moment that supports this.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      60a7ecf4
  23. 03 11月, 2008 1 次提交
  24. 30 10月, 2008 1 次提交
    • S
      ftrace: fix trace_nop config select · f3384b28
      Steven Rostedt 提交于
      Impact: build fix on non-function-tracing architectures
      
      The trace_nop is the tracer that is defined when no tracer is set in
      the ftrace infrastructure.
      
      The trace_nop was mistakenly selected by HAVE_FTRACE due to the confusion
      between ftrace infrastructure and the ftrace function tracer (which has
      been solved by renaming the function tracer).
      
      This patch changes the select to the approriate TRACING.
      
      This patch should fix compile errors on architectures that do not define
      the FUNCTION_TRACER.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f3384b28
  25. 27 10月, 2008 1 次提交
    • F
      tracing/ftrace: make boot tracer select the sched_switch tracer · ea31e72d
      Frederic Weisbecker 提交于
      Impact: build fix
      
      If the boot tracer is selected but not the sched_switch,
      there will be a build failure:
      
       kernel/built-in.o: In function `boot_trace_init':
       trace_boot.c:(.text+0x5ee38): undefined reference to `sched_switch_trace'
       kernel/built-in.o: In function `disable_boot_trace':
       (.text+0x5eee1): undefined reference to `tracing_stop_cmdline_record'
       kernel/built-in.o: In function `enable_boot_trace':
       (.text+0x5ef11): undefined reference to `tracing_start_cmdline_record'
      
      This patch fixes it.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ea31e72d
  26. 22 10月, 2008 1 次提交
  27. 21 10月, 2008 2 次提交
  28. 14 10月, 2008 5 次提交
    • I
      tracing/fastboot: improve help text · 98d9c66a
      Ingo Molnar 提交于
      Improve the help text of the boot tracer.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      98d9c66a
    • I
      tracing/stacktrace: improve help text · 4519d9e5
      Ingo Molnar 提交于
      Improve the help text that is displayed for CONFIG_STACK_TRACER.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4519d9e5
    • S
      tracing: unified trace buffer · 7a8e76a3
      Steven Rostedt 提交于
      This is a unified tracing buffer that implements a ring buffer that
      hopefully everyone will eventually be able to use.
      
      The events recorded into the buffer have the following structure:
      
        struct ring_buffer_event {
      	u32 type:2, len:3, time_delta:27;
      	u32 array[];
        };
      
      The minimum size of an event is 8 bytes. All events are 4 byte
      aligned inside the buffer.
      
      There are 4 types (all internal use for the ring buffer, only
      the data type is exported to the interface users).
      
       RINGBUF_TYPE_PADDING: this type is used to note extra space at the end
      	of a buffer page.
      
       RINGBUF_TYPE_TIME_EXTENT: This type is used when the time between events
      	is greater than the 27 bit delta can hold. We add another
      	32 bits, and record that in its own event (8 byte size).
      
       RINGBUF_TYPE_TIME_STAMP: (Not implemented yet). This will hold data to
      	help keep the buffer timestamps in sync.
      
      RINGBUF_TYPE_DATA: The event actually holds user data.
      
      The "len" field is only three bits. Since the data must be
      4 byte aligned, this field is shifted left by 2, giving a
      max length of 28 bytes. If the data load is greater than 28
      bytes, the first array field holds the full length of the
      data load and the len field is set to zero.
      
      Example, data size of 7 bytes:
      
      	type = RINGBUF_TYPE_DATA
      	len = 2
      	time_delta: <time-stamp> - <prev_event-time-stamp>
      	array[0..1]: <7 bytes of data> <1 byte empty>
      
      This event is saved in 12 bytes of the buffer.
      
      An event with 82 bytes of data:
      
      	type = RINGBUF_TYPE_DATA
      	len = 0
      	time_delta: <time-stamp> - <prev_event-time-stamp>
      	array[0]: 84 (Note the alignment)
      	array[1..14]: <82 bytes of data> <2 bytes empty>
      
      The above event is saved in 92 bytes (if my math is correct).
      82 bytes of data, 2 bytes empty, 4 byte header, 4 byte length.
      
      Do not reference the above event struct directly. Use the following
      functions to gain access to the event table, since the
      ring_buffer_event structure may change in the future.
      
      ring_buffer_event_length(event): get the length of the event.
      	This is the size of the memory used to record this
      	event, and not the size of the data pay load.
      
      ring_buffer_time_delta(event): get the time delta of the event
      	This returns the delta time stamp since the last event.
      	Note: Even though this is in the header, there should
      		be no reason to access this directly, accept
      		for debugging.
      
      ring_buffer_event_data(event): get the data from the event
      	This is the function to use to get the actual data
      	from the event. Note, it is only a pointer to the
      	data inside the buffer. This data must be copied to
      	another location otherwise you risk it being written
      	over in the buffer.
      
      ring_buffer_lock: A way to lock the entire buffer.
      ring_buffer_unlock: unlock the buffer.
      
      ring_buffer_alloc: create a new ring buffer. Can choose between
      	overwrite or consumer/producer mode. Overwrite will
      	overwrite old data, where as consumer producer will
      	throw away new data if the consumer catches up with the
      	producer.  The consumer/producer is the default.
      
      ring_buffer_free: free the ring buffer.
      
      ring_buffer_resize: resize the buffer. Changes the size of each cpu
      	buffer. Note, it is up to the caller to provide that
      	the buffer is not being used while this is happening.
      	This requirement may go away but do not count on it.
      
      ring_buffer_lock_reserve: locks the ring buffer and allocates an
      	entry on the buffer to write to.
      ring_buffer_unlock_commit: unlocks the ring buffer and commits it to
      	the buffer.
      
      ring_buffer_write: writes some data into the ring buffer.
      
      ring_buffer_peek: Look at a next item in the cpu buffer.
      ring_buffer_consume: get the next item in the cpu buffer and
      	consume it. That is, this function increments the head
      	pointer.
      
      ring_buffer_read_start: Start an iterator of a cpu buffer.
      	For now, this disables the cpu buffer, until you issue
      	a finish. This is just because we do not want the iterator
      	to be overwritten. This restriction may change in the future.
      	But note, this is used for static reading of a buffer which
      	is usually done "after" a trace. Live readings would want
      	to use the ring_buffer_consume above, which will not
      	disable the ring buffer.
      
      ring_buffer_read_finish: Finishes the read iterator and reenables
      	the ring buffer.
      
      ring_buffer_iter_peek: Look at the next item in the cpu iterator.
      ring_buffer_read: Read the iterator and increment it.
      ring_buffer_iter_reset: Reset the iterator to point to the beginning
      	of the cpu buffer.
      ring_buffer_iter_empty: Returns true if the iterator is at the end
      	of the cpu buffer.
      
      ring_buffer_size: returns the size in bytes of each cpu buffer.
      	Note, the real size is this times the number of CPUs.
      
      ring_buffer_reset_cpu: Sets the cpu buffer to empty
      ring_buffer_reset: sets all cpu buffers to empty
      
      ring_buffer_swap_cpu: swaps a cpu buffer from one buffer with a
      	cpu buffer of another buffer. This is handy when you
      	want to take a snap shot of a running trace on just one
      	cpu. Having a backup buffer, to swap with facilitates this.
      	Ftrace max latencies use this.
      
      ring_buffer_empty: Returns true if the ring buffer is empty.
      ring_buffer_empty_cpu: Returns true if the cpu buffer is empty.
      
      ring_buffer_record_disable: disable all cpu buffers (read only)
      ring_buffer_record_disable_cpu: disable a single cpu buffer (read only)
      ring_buffer_record_enable: enable all cpu buffers.
      ring_buffer_record_enabl_cpu: enable a single cpu buffer.
      
      ring_buffer_entries: The number of entries in a ring buffer.
      ring_buffer_overruns: The number of entries removed due to writing wrap.
      
      ring_buffer_time_stamp: Get the time stamp used by the ring buffer
      ring_buffer_normalize_time_stamp: normalize the ring buffer time stamp
      	into nanosecs.
      
      I still need to implement the GTOD feature. But we need support from
      the cpu frequency infrastructure.  But this can be done at a later
      time without affecting the ring buffer interface.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7a8e76a3
    • F
      ftrace/fastboot: disable tracers self-tests when boot tracer is selected · 3ce2b920
      Frédéric Weisbecker 提交于
      The tracing engine resets the ring buffer and the tracers touch it
      too during self-tests. These self-tests happen during tracers registering
      and work against boot tracing which is logging initcalls.
      
      We have to disable tracing self-tests if the boot-tracer is selected.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3ce2b920
    • F
      tracing/ftrace: give an entry on the config for boot tracer · 1f5c2abb
      Frédéric Weisbecker 提交于
      Bring the entry to choose the boot tracer on the kernel config.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1f5c2abb