1. 24 4月, 2012 2 次提交
    • V
      ring-buffer: Add per_cpu ring buffer control files · 438ced17
      Vaibhav Nagarnaik 提交于
      Add a debugfs entry under per_cpu/ folder for each cpu called
      buffer_size_kb to control the ring buffer size for each CPU
      independently.
      
      If the global file buffer_size_kb is used to set size, the individual
      ring buffers will be adjusted to the given size. The buffer_size_kb will
      report the common size to maintain backward compatibility.
      
      If the buffer_size_kb file under the per_cpu/ directory is used to
      change buffer size for a specific CPU, only the size of the respective
      ring buffer is updated. When tracing/buffer_size_kb is read, it reports
      'X' to indicate that sizes of per_cpu ring buffers are not equivalent.
      
      Link: http://lkml.kernel.org/r/1328212844-11889-1-git-send-email-vnagarnaik@google.com
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Michael Rubin <mrubin@google.com>
      Cc: David Sharp <dhsharp@google.com>
      Cc: Justin Teravest <teravest@google.com>
      Signed-off-by: NVaibhav Nagarnaik <vnagarnaik@google.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      438ced17
    • S
      tracing: Add percpu buffers for trace_printk() · 07d777fe
      Steven Rostedt 提交于
      Currently, trace_printk() uses a single buffer to write into
      to calculate the size and format needed to save the trace. To
      do this safely in an SMP environment, a spin_lock() is taken
      to only allow one writer at a time to the buffer. But this could
      also affect what is being traced, and add synchronization that
      would not be there otherwise.
      
      Ideally, using percpu buffers would be useful, but since trace_printk()
      is only used in development, having per cpu buffers for something
      never used is a waste of space. Thus, the use of the trace_bprintk()
      format section is changed to be used for static fmts as well as dynamic ones.
      Then at boot up, we can check if the section that holds the trace_printk
      formats is non-empty, and if it does contain something, then we
      know a trace_printk() has been added to the kernel. At this time
      the trace_printk per cpu buffers are allocated. A check is also
      done at module load time in case a module is added that contains a
      trace_printk().
      
      Once the buffers are allocated, they are never freed. If you use
      a trace_printk() then you should know what you are doing.
      
      A buffer is made for each type of context:
      
        normal
        softirq
        irq
        nmi
      
      The context is checked and the appropriate buffer is used.
      This allows for totally lockless usage of trace_printk(),
      and they no longer even disable interrupts.
      Requested-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      07d777fe
  2. 14 3月, 2012 1 次提交
  3. 23 2月, 2012 1 次提交
    • S
      tracing/ring-buffer: Only have tracing_on disable tracing buffers · 499e5470
      Steven Rostedt 提交于
      As the ring-buffer code is being used by other facilities in the
      kernel, having tracing_on file disable *all* buffers is not a desired
      affect. It should only disable the ftrace buffers that are being used.
      
      Move the code into the trace.c file and use the buffer disabling
      for tracing_on() and tracing_off(). This way only the ftrace buffers
      will be affected by them and other kernel utilities will not be
      confused to why their output suddenly stopped.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      499e5470
  4. 22 2月, 2012 5 次提交
  5. 04 1月, 2012 1 次提交
  6. 17 11月, 2011 1 次提交
    • S
      tracing: Add irq, preempt-count and need resched info to default trace output · 77271ce4
      Steven Rostedt 提交于
      People keep asking how to get the preempt count, irq, and need resched info
      and we keep telling them to enable the latency format. Some developers think
      that traces without this info is completely useless, and for a lot of tasks
      it is useless.
      
      The first option was to enable the latency trace as the default format, but
      the header for the latency format is pretty useless for most tracers and
      it also does the timestamp in straight microseconds from the time the trace
      started. This is sometimes more difficult to read as the default trace is
      seconds from the start of boot up.
      
      Latency format:
      
       # tracer: nop
       #
       # nop latency trace v1.1.5 on 3.2.0-rc1-test+
       # --------------------------------------------------------------------
       # latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
       #    -----------------
       #    | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
       #    -----------------
       #
       #                  _------=> CPU#
       #                 / _-----=> irqs-off
       #                | / _----=> need-resched
       #                || / _---=> hardirq/softirq
       #                ||| / _--=> preempt-depth
       #                |||| /     delay
       #  cmd     pid   ||||| time  |   caller
       #     \   /      |||||  \    |   /
       migratio-6       0...2 41778231us+: rcu_note_context_switch <-__schedule
       migratio-6       0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch
       migratio-6       0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch
       migratio-6       0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch
       migratio-6       0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch
       migratio-6       0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule
      
      default format:
      
       # tracer: nop
       #
       #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
       #              | |       |          |         |
            migration/0-6     [000]    50.025810: rcu_note_context_switch <-__schedule
            migration/0-6     [000]    50.025812: trace_rcu_utilization <-rcu_note_context_switch
            migration/0-6     [000]    50.025813: rcu_sched_qs <-rcu_note_context_switch
            migration/0-6     [000]    50.025815: rcu_preempt_qs <-rcu_note_context_switch
            migration/0-6     [000]    50.025817: trace_rcu_utilization <-rcu_note_context_switch
            migration/0-6     [000]    50.025818: debug_lockdep_rcu_enabled <-__schedule
            migration/0-6     [000]    50.025820: debug_lockdep_rcu_enabled <-__schedule
      
      The latency format header has latency information that is pretty meaningless
      for most tracers. Although some of the header is useful, and we can add that
      later to the default format as well.
      
      What is really useful with the latency format is the irqs-off, need-resched
      hard/softirq context and the preempt count.
      
      This commit adds the option irq-info which is on by default that adds this
      information:
      
       # tracer: nop
       #
       #                              _-----=> irqs-off
       #                             / _----=> need-resched
       #                            | / _---=> hardirq/softirq
       #                            || / _--=> preempt-depth
       #                            ||| /     delay
       #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
       #              | |       |   ||||       |         |
                 <idle>-0     [000] d..2    49.309305: cpuidle_get_driver <-cpuidle_idle_call
                 <idle>-0     [000] d..2    49.309307: mwait_idle <-cpu_idle
                 <idle>-0     [000] d..2    49.309309: need_resched <-mwait_idle
                 <idle>-0     [000] d..2    49.309310: test_ti_thread_flag <-need_resched
                 <idle>-0     [000] d..2    49.309312: trace_power_start.constprop.13 <-mwait_idle
                 <idle>-0     [000] d..2    49.309313: trace_cpu_idle <-mwait_idle
                 <idle>-0     [000] d..2    49.309315: need_resched <-mwait_idle
      
      If a user wants the old format, they can disable the 'irq-info' option:
      
       # tracer: nop
       #
       #           TASK-PID   CPU#      TIMESTAMP  FUNCTION
       #              | |       |          |         |
                 <idle>-0     [000]     49.309305: cpuidle_get_driver <-cpuidle_idle_call
                 <idle>-0     [000]     49.309307: mwait_idle <-cpu_idle
                 <idle>-0     [000]     49.309309: need_resched <-mwait_idle
                 <idle>-0     [000]     49.309310: test_ti_thread_flag <-need_resched
                 <idle>-0     [000]     49.309312: trace_power_start.constprop.13 <-mwait_idle
                 <idle>-0     [000]     49.309313: trace_cpu_idle <-mwait_idle
                 <idle>-0     [000]     49.309315: need_resched <-mwait_idle
      Requested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      77271ce4
  7. 08 11月, 2011 1 次提交
    • J
      tracing/latency: Fix header output for latency tracers · 7e9a49ef
      Jiri Olsa 提交于
      In case the the graph tracer (CONFIG_FUNCTION_GRAPH_TRACER) or even the
      function tracer (CONFIG_FUNCTION_TRACER) are not set, the latency tracers
      do not display proper latency header.
      
      The involved/fixed latency tracers are:
              wakeup_rt
              wakeup
              preemptirqsoff
              preemptoff
              irqsoff
      
      The patch adds proper handling of tracer configuration options for latency
      tracers, and displaying correct header info accordingly.
      
      * The current output (for wakeup tracer) with both graph and function
        tracers disabled is:
      
        # tracer: wakeup
        #
          <idle>-0       0d.h5    1us+:      0:120:R   + [000]     7:  0:R watchdog/0
          <idle>-0       0d.h5    3us+: ttwu_do_activate.clone.1 <-try_to_wake_up
          ...
      
      * The fixed output is:
      
        # tracer: wakeup
        #
        # wakeup latency trace v1.1.5 on 3.1.0-tip+
        # --------------------------------------------------------------------
        # latency: 55 us, #4/4, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
        #    -----------------
        #    | task: migration/0-6 (uid:0 nice:0 policy:1 rt_prio:99)
        #    -----------------
        #
        #                  _------=> CPU#
        #                 / _-----=> irqs-off
        #                | / _----=> need-resched
        #                || / _---=> hardirq/softirq
        #                ||| / _--=> preempt-depth
        #                |||| /     delay
        #  cmd     pid   ||||| time  |   caller
        #     \   /      |||||  \    |   /
             cat-1129    0d..4    1us :   1129:120:R   + [000]     6:  0:R migration/0
             cat-1129    0d..4    2us+: ttwu_do_activate.clone.1 <-try_to_wake_up
      
      * The current output (for wakeup tracer) with only function
        tracer enabled is:
      
        # tracer: wakeup
        #
             cat-1140    0d..4    1us+:   1140:120:R   + [000]     6:  0:R migration/0
             cat-1140    0d..4    2us : ttwu_do_activate.clone.1 <-try_to_wake_up
      
      * The fixed output is:
        # tracer: wakeup
        #
        # wakeup latency trace v1.1.5 on 3.1.0-tip+
        # --------------------------------------------------------------------
        # latency: 207 us, #109/109, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
        #    -----------------
        #    | task: watchdog/1-12 (uid:0 nice:0 policy:1 rt_prio:99)
        #    -----------------
        #
        #                  _------=> CPU#
        #                 / _-----=> irqs-off
        #                | / _----=> need-resched
        #                || / _---=> hardirq/softirq
        #                ||| / _--=> preempt-depth
        #                |||| /     delay
        #  cmd     pid   ||||| time  |   caller
        #     \   /      |||||  \    |   /
          <idle>-0       1d.h5    1us+:      0:120:R   + [001]    12:  0:R watchdog/1
          <idle>-0       1d.h5    3us : ttwu_do_activate.clone.1 <-try_to_wake_up
      
      Link: http://lkml.kernel.org/r/20111107150849.GE1807@m.brq.redhat.com
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      7e9a49ef
  8. 11 10月, 2011 1 次提交
    • S
      tracing: Warn on output if the function tracer was found corrupted · e0a413f6
      Steven Rostedt 提交于
      As the function tracer is very intrusive, lots of self checks are
      performed on the tracer and if something is found to be strange
      it will shut itself down keeping it from corrupting the rest of the
      kernel. This shutdown may still allow functions to be traced, as the
      tracing only stops new modifications from happening. Trying to stop
      the function tracer itself can cause more harm as it requires code
      modification.
      
      Although a WARN_ON() is executed, a user may not notice it. To help
      the user see that something isn't right with the tracing of the system
      a big warning is added to the output of the tracer that lets the user
      know that their data may be incomplete.
      Reported-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e0a413f6
  9. 20 8月, 2011 2 次提交
  10. 27 7月, 2011 1 次提交
  11. 08 7月, 2011 1 次提交
  12. 07 7月, 2011 1 次提交
    • S
      tracing: Fix bug when reading system filters on module removal · e9dbfae5
      Steven Rostedt 提交于
      The event system is freed when its nr_events is set to zero. This happens
      when a module created an event system and then later the module is
      removed. Modules may share systems, so the system is allocated when
      it is created and freed when the modules are unloaded and all the
      events under the system are removed (nr_events set to zero).
      
      The problem arises when a task opened the "filter" file for the
      system. If the module is unloaded and it removed the last event for
      that system, the system structure is freed. If the task that opened
      the filter file accesses the "filter" file after the system has
      been freed, the system will access an invalid pointer.
      
      By adding a ref_count, and using it to keep track of what
      is using the event system, we can free it after all users
      are finished with the event system.
      
      Cc: <stable@kernel.org>
      Reported-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e9dbfae5
  13. 15 6月, 2011 2 次提交
    • M
      tracing/kprobes: Fix kprobe-tracer to support stack trace · 1fd8df2c
      Masami Hiramatsu 提交于
      Fix to support kernel stack trace correctly on kprobe-tracer.
      Since the execution path of kprobe-based dynamic events is different
      from other tracepoint-based events, normal ftrace_trace_stack() doesn't
      work correctly. To fix that, this introduces ftrace_trace_stack_regs()
      which traces stack via pt_regs instead of current stack register.
      
      e.g.
      
       # echo p schedule+4 > /sys/kernel/debug/tracing/kprobe_events
       # echo 1 > /sys/kernel/debug/tracing/options/stacktrace
       # echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
       # head -n 20 /sys/kernel/debug/tracing/trace
                  bash-2968  [000] 10297.050245: p_schedule_4: (schedule+0x4/0x4ca)
                  bash-2968  [000] 10297.050247: <stack trace>
       => schedule_timeout
       => n_tty_read
       => tty_read
       => vfs_read
       => sys_read
       => system_call_fastpath
           kworker/0:1-2940  [000] 10297.050265: p_schedule_4: (schedule+0x4/0x4ca)
           kworker/0:1-2940  [000] 10297.050266: <stack trace>
       => worker_thread
       => kthread
       => kernel_thread_helper
                  sshd-1132  [000] 10297.050365: p_schedule_4: (schedule+0x4/0x4ca)
                  sshd-1132  [000] 10297.050365: <stack trace>
       => sysret_careful
      
      Note: Even with this fix, the first entry will be skipped
      if the probe is put on the function entry area before
      the frame pointer is set up (usually, that is 4 bytes
       (push %bp; mov %sp %bp) on x86), because stack unwinder
      depends on the frame pointer.
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Link: http://lkml.kernel.org/r/20110608070934.17777.17116.stgit@fedora15Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      1fd8df2c
    • S
      tracing: Add disable_on_free option · cf30cf67
      Steven Rostedt 提交于
      Add a trace option to disable tracing on free. When this option is
      set, a write into the free_buffer file will not only shrink the
      ring buffer down to zero, but it will also disable tracing.
      
      Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      cf30cf67
  14. 26 5月, 2011 1 次提交
    • S
      ftrace: Add internal recursive checks · b1cff0ad
      Steven Rostedt 提交于
      Witold reported a reboot caused by the selftests of the dynamic function
      tracer. He sent me a config and I used ktest to do a config_bisect on it
      (as my config did not cause the crash). It pointed out that the problem
      config was CONFIG_PROVE_RCU.
      
      What happened was that if multiple callbacks are attached to the
      function tracer, we iterate a list of callbacks. Because the list is
      managed by synchronize_sched() and preempt_disable, the access to the
      pointers uses rcu_dereference_raw().
      
      When PROVE_RCU is enabled, the rcu_dereference_raw() calls some
      debugging functions, which happen to be traced. The tracing of the debug
      function would then call rcu_dereference_raw() which would then call the
      debug function and then... well you get the idea.
      
      I first wrote two different patches to solve this bug.
      
      1) add a __rcu_dereference_raw() that would not do any checks.
      2) add notrace to the offending debug functions.
      
      Both of these patches worked.
      
      Talking with Paul McKenney on IRC, he suggested to add recursion
      detection instead. This seemed to be a better solution, so I decided to
      implement it. As the task_struct already has a trace_recursion to detect
      recursion in the ring buffer, and that has a very small number it
      allows, I decided to use that same variable to add flags that can detect
      the recursion inside the infrastructure of the function tracer.
      
      I plan to change it so that the task struct bit can be checked in
      mcount, but as that requires changes to all archs, I will hold that off
      to the next merge window.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1306348063.1465.116.camel@gandalf.stny.rr.comReported-by: NWitold Baryluk <baryluk@smp.if.uj.edu.pl>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      b1cff0ad
  15. 19 5月, 2011 1 次提交
  16. 10 3月, 2011 2 次提交
  17. 08 2月, 2011 7 次提交
    • S
      tracing/filter: Increase the max preds to 2^14 · bf93f9ed
      Steven Rostedt 提交于
      Now that the filter logic does not require to save the pred results
      on the stack, we can increase the max number of preds we allow.
      As the preds are index by a short value, and we use the MSBs as flags
      we can increase the max preds to 2^14 (16384) which should be way
      more than enough.
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      bf93f9ed
    • S
      tracing/filter: Move MAX_FILTER_PRED to local tracing directory · 4a3d27e9
      Steven Rostedt 提交于
      The MAX_FILTER_PRED is only needed by the kernel/trace/*.c files.
      Move it to kernel/trace/trace.h.
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      4a3d27e9
    • S
      tracing/filter: Optimize filter by folding the tree · 43cd4145
      Steven Rostedt 提交于
      There are many cases that a filter will contain multiple ORs or
      ANDs together near the leafs. Walking up and down the tree to get
      to the next compare can be a waste.
      
      If there are several ORs or ANDs together, fold them into a single
      pred and allocate an array of the conditions that they check.
      This will speed up the filter by linearly walking an array
      and can still break out if a short circuit condition is met.
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      43cd4145
    • S
      tracing/filter: Use a tree instead of stack for filter_match_preds() · 61e9dea2
      Steven Rostedt 提交于
      Currently the filter_match_preds() requires a stack to push
      and pop the preds to determine if the filter matches the record or not.
      This has two drawbacks:
      
      1) It requires a stack to store state information. As this is done
         in fast paths we can't allocate the storage for this stack, and
         we can't use a global as it must be re-entrant. The stack is stored
         on the kernel stack and this greatly limits how many preds we
         may allow.
      
      2) All conditions are calculated even when a short circuit exists.
         a || b  will always calculate a and b even though a was determined
         to be true.
      
      Using a tree we can walk a constant structure that will save
      the state as we go. The algorithm is simply:
      
        pred = root;
        do {
      	switch (move) {
      	case MOVE_DOWN:
      		if (OR or AND) {
      			pred = left;
      			continue;
      		}
      		if (pred == root)
      			break;
      		match = pred->fn();
      		pred = pred->parent;
      		move = left child ? MOVE_UP_FROM_LEFT : MOVE_UP_FROM_RIGHT;
      		continue;
      
      	case MOVE_UP_FROM_LEFT:
      		/* Only OR or AND can be a parent */
      		if (match && OR || !match && AND) {
      			/* short circuit */
      			if (pred == root)
      				break;
      			pred = pred->parent;
      			move = left child ?
      				MOVE_UP_FROM_LEFT :
      				MOVE_UP_FROM_RIGHT;
      			continue;
      		}
      		pred = pred->right;
      		move = MOVE_DOWN;
      		continue;
      
      	case MOVE_UP_FROM_RIGHT:
      		if (pred == root)
      			break;
      		pred = pred->parent;
      		move = left child ? MOVE_UP_FROM_LEFT : MOVE_UP_FROM_RIGHT;
      		continue;
      	}
      	done = 1;
        } while (!done);
      
      This way there's no strict limit to how many preds we allow
      and it also will short circuit the logical operations when possible.
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      61e9dea2
    • S
      tracing/filter: Allocate the preds in an array · 74e9e58c
      Steven Rostedt 提交于
      Currently we allocate an array of pointers to filter_preds, and then
      allocate a separate filter_pred for each item in the array.
      This adds slight overhead in the filters as it needs to derefernce
      twice to get to the op condition.
      
      Allocating the preds themselves in a single array removes a dereference
      as well as helps on the cache footprint.
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      74e9e58c
    • S
      tracing/filter: Dynamically allocate preds · c9c53ca0
      Steven Rostedt 提交于
      For every filter that is made, we create predicates to hold every
      operation within the filter. We have a max of 32 predicates that we
      can hold. Currently, we allocate all 32 even if we only need to
      use one.
      
      Part of the reason we do this is that the filter can be used at
      any moment by any event. Fortunately, the filter is only used
      with preemption disabled. By reseting the count of preds used "n_preds"
      to zero, then performing a synchronize_sched(), we can safely
      free and reallocate a new array of preds.
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      c9c53ca0
    • S
      tracing/filter: Move OR and AND logic out of fn() method · 58d9a597
      Steven Rostedt 提交于
      The ops OR and AND act different from the other ops, as they
      are the only ones to take other ops as their arguements.
      These ops als change the logic of the filter_match_preds.
      
      By removing the OR and AND fn's we can also remove the val1 and val2
      that is passed to all other fn's and are unused.
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      58d9a597
  18. 18 10月, 2010 1 次提交
  19. 05 8月, 2010 1 次提交
  20. 21 7月, 2010 2 次提交
    • K
      tracing: Shrink max latency ringbuffer if unnecessary · ef710e10
      KOSAKI Motohiro 提交于
      Documentation/trace/ftrace.txt says
      
        buffer_size_kb:
      
              This sets or displays the number of kilobytes each CPU
              buffer can hold. The tracer buffers are the same size
              for each CPU. The displayed number is the size of the
              CPU buffer and not total size of all buffers. The
              trace buffers are allocated in pages (blocks of memory
              that the kernel uses for allocation, usually 4 KB in size).
              If the last page allocated has room for more bytes
              than requested, the rest of the page will be used,
              making the actual allocation bigger than requested.
              ( Note, the size may not be a multiple of the page size
                due to buffer management overhead. )
      
              This can only be updated when the current_tracer
              is set to "nop".
      
      But it's incorrect. currently total memory consumption is
      'buffer_size_kb x CPUs x 2'.
      
      Why two times difference is there? because ftrace implicitly allocate
      the buffer for max latency too.
      
      That makes sad result when admin want to use large buffer. (If admin
      want full logging and makes detail analysis). example, If admin
      have 24 CPUs machine and write 200MB to buffer_size_kb, the system
      consume ~10GB memory (200MB x 24 x 2). umm.. 5GB memory waste is
      usually unacceptable.
      
      Fortunatelly, almost all users don't use max latency feature.
      The max latency buffer can be disabled easily.
      
      This patch shrink buffer size of the max latency buffer if
      unnecessary.
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      LKML-Reference: <20100701104554.DA2D.A69D9226@jp.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      ef710e10
    • L
      tracing: Allow to disable cmdline recording · e870e9a1
      Li Zefan 提交于
      We found that even enabling a single trace event that will rarely be
      triggered can add big overhead to context switch.
      
      (lmbench context switch test)
       -------------------------------------------------
       2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
       ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
      ------ ------ ------ ------ ------ ------- -------
        2.19   2.3   2.21   2.56   2.13     2.54    2.07
        2.39   2.51  2.35   2.75   2.27     2.81    2.24
      
      The overhead is 6% ~ 11%.
      
      It's because when a trace event is enabled 3 tracepoints (sched_switch,
      sched_wakeup, sched_wakeup_new) will be activated to map pid to cmdname.
      
      We'd like to avoid this overhead, so add a trace option '(no)record-cmd'
      to allow to disable cmdline recording.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4C2D57F4.2050204@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e870e9a1
  21. 20 7月, 2010 2 次提交
  22. 16 7月, 2010 1 次提交
    • F
      tracing: Remove ksym tracer · 5d550467
      Frederic Weisbecker 提交于
      The ksym (breakpoint) ftrace plugin has been superseded by perf
      tools that are much more poweful to use the cpu breakpoints.
      This tracer doesn't bring more feature. It has been deprecated
      for a while now, lets remove it.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      5d550467
  23. 29 6月, 2010 1 次提交
  24. 09 6月, 2010 1 次提交