1. 19 May 2011: 5 commits
    • ftrace: Use counters to enable functions to trace · ed926f9b
      Committed by Steven Rostedt
      Every function has its own record that stores the instruction
      pointer and flags for the function to be traced. There are only
      two flags: enabled and free. The enabled flag states that tracing
      for the function has been enabled (actively traced), and the free
      flag states that the record no longer points to a function and can
      be used by new functions (loaded modules).
      
      These flags are now moved to the MSBs of the flags field (actually just
      the top 32 bits). The rest of the bits (30 bits) are now used as
      a ref counter. Every time a tracer registers functions to trace,
      each of those functions has its counter incremented.
      
      When tracing is enabled, the counter is examined to determine whether a
      function should be traced; if it is non-zero, the function is set to be traced.
      
      When an ftrace_ops is registered to trace functions, its hashes
      are examined. If the ftrace_ops filter_hash count is zero, then
      all functions are set to be traced, otherwise only the functions
      in the hash are to be traced. The exception to this is if a function
      is also in the ftrace_ops notrace_hash. Then that function's counter
      is not incremented for this ftrace_ops.
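
      To make the record layout concrete, here is a minimal user-space sketch of a
      flags word with the tracing flags in the top bits and the low 30 bits used as
      a reference counter. The macro names and exact bit positions are illustrative
      assumptions, not the kernel's actual definitions.

        /*
         * Illustrative sketch only: flag names and bit positions are assumptions,
         * not the actual ftrace record layout.
         */
        #include <stdio.h>

        #define FL_ENABLED (1UL << 31)          /* function is actively traced */
        #define FL_FREE    (1UL << 30)          /* record no longer points to a function */
        #define REF_MASK   ((1UL << 30) - 1)    /* low 30 bits: reference counter */

        static unsigned long rec_flags;

        /* a tracer (ftrace_ops) starts filtering this function */
        static void ref_inc(void) { rec_flags++; }

        /* non-zero counter means at least one tracer wants this function */
        static int should_trace(void) { return (rec_flags & REF_MASK) != 0; }

        int main(void)
        {
                ref_inc();
                if (should_trace())
                        rec_flags |= FL_ENABLED;   /* enable tracing for the function */
                printf("flags=%#lx refs=%lu\n", rec_flags, rec_flags & REF_MASK);
                return 0;
        }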
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      ed926f9b
    • ftrace: Separate hash allocation and assignment · 33dc9b12
      Committed by Steven Rostedt
      When filtering, allocate a hash in which to insert the function records.
      After the filtering is complete, assign it to the ftrace_ops structure.
      
      This allows the ftrace_ops structure to have a much smaller array of
      hash buckets instead of wasting a lot of memory.
      
      A read-only empty_hash is created as the minimum-size hash that any ftrace_ops
      can point to.
      
      Creating a new hash involves the following steps (a sketch follows the list):

      o Allocate a default hash.
      o Walk the function records, assigning the filtered records to the hash.
      o Allocate a new hash with the appropriate number of buckets.
      o Move the entries from the default hash to the new hash.
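
      As a rough illustration of this allocate-then-assign pattern, here is a small
      self-contained user-space sketch. The names (hash_alloc, hash_resize,
      ops->filter_hash) are assumptions for illustration, not the kernel's API.

        #include <stdlib.h>

        struct entry { unsigned long ip; struct entry *next; };
        struct hash  { struct entry **buckets; unsigned int size, count; };

        static struct hash *hash_alloc(unsigned int size)
        {
                struct hash *h = calloc(1, sizeof(*h));
                h->buckets = calloc(size, sizeof(*h->buckets));
                h->size = size;
                return h;
        }

        static void hash_add(struct hash *h, unsigned long ip)
        {
                struct entry *e = malloc(sizeof(*e));
                e->ip = ip;
                e->next = h->buckets[ip % h->size];
                h->buckets[ip % h->size] = e;
                h->count++;
        }

        /* move the entries collected in the default hash into a right-sized hash */
        static struct hash *hash_resize(struct hash *src, unsigned int new_size)
        {
                struct hash *dst = hash_alloc(new_size);
                for (unsigned int i = 0; i < src->size; i++)
                        for (struct entry *e = src->buckets[i]; e; e = e->next)
                                hash_add(dst, e->ip);
                return dst;
        }

        struct ops { struct hash *filter_hash; };

        int main(void)
        {
                struct ops ops = { 0 };
                struct hash *tmp = hash_alloc(16);       /* 1. allocate a default hash   */
                hash_add(tmp, 0xc0ffee);                 /* 2. fill it while filtering   */
                ops.filter_hash = hash_resize(tmp, 64);  /* 3+4. right-sized, then moved */
                return 0;                                /* assignment happens last      */
        }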
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      33dc9b12
    • ftrace: Create a global_ops to hold the filter and notrace hashes · f45948e8
      Committed by Steven Rostedt
      Combine the filter and notrace hashes to be accessed by a single entity,
      the global_ops. The global_ops is an ftrace_ops structure that is passed
      to different functions that can read or modify the filtering of the
      function tracer.
      
      The ftrace_ops structure was modified to hold filter and notrace
      hashes so that later patches may allow each ftrace_ops to have its own
      set of rules for which functions may be filtered.
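
      As a hedged sketch of the shape being described (field names are assumptions
      based on this text, not copied from the kernel header), the structure now
      carries both hashes, and one shared instance backs the existing filter files:

        struct ftrace_hash;   /* opaque here; holds the function entries */

        struct ftrace_ops_sketch {
                void (*func)(unsigned long ip, unsigned long parent_ip);
                struct ftrace_hash *filter_hash;   /* functions this ops traces  */
                struct ftrace_hash *notrace_hash;  /* functions this ops ignores */
        };

        /* single shared instance used by the set_ftrace_filter/notrace files */
        static struct ftrace_ops_sketch global_ops_sketch;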
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      f45948e8
    • ftrace: Use hash instead for FTRACE_FL_FILTER · 1cf41dd7
      Committed by Steven Rostedt
      When multiple users are allowed to have their own set of functions
      to trace, having the FTRACE_FL_FILTER flag will not be enough to
      handle the accounting of those users. Each user will need their own
      set of functions.
      
      Replace the FTRACE_FL_FILTER with a filter_hash instead. This is
      temporary until the rest of the function filtering accounting
      gets in.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      1cf41dd7
    • ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions · b448c4e3
      Committed by Steven Rostedt
      To prepare for the accounting system that will allow multiple users of
      the function tracer, having FTRACE_FL_NOTRACE as a flag in the
      dyn_ftrace record does not make sense.
      
      All ftrace_ops will soon have a hash of functions they should trace
      and not trace. Making a global hash of functions not to trace makes
      this transition easier.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      b448c4e3
  2. 07 May 2011: 1 commit
  3. 30 Apr 2011: 9 commits
  4. 21 Apr 2011: 1 commit
  5. 16 Apr 2011: 1 commit
  6. 12 Apr 2011: 2 commits
  7. 05 Apr 2011: 3 commits
    • tracing: Avoid soft lockup in trace_pipe · ee5e51f5
      Committed by Jiri Olsa
      Running the following commands:
      
        # enable the binary option
        echo 1 > ./options/bin
        # disable context info option
        echo 0 > ./options/context-info
        # tracing only events
        echo 1 > ./events/enable
        cat trace_pipe
      
      while forcing the system to generate many tracing events
      causes a lockup (on non-preemptive kernels) inside the
      tracing_read_pipe function.
      
      The issue is also easily reproduced by running the LTP stress test
      (ftrace_stress_test.sh).
      
      The reasons are:
       - the bin/hex/raw output functions for events are set to the
         trace_nop_print function, which prints nothing and
         returns the TRACE_TYPE_HANDLED value
       - the LOST EVENT trace does not handle trace_seq overflow

      These cause the while loop in the tracing_read_pipe
      function to never break.
      
      The attached patch fixes the handling of the lost event trace and
      changes trace_nop_print to print minimal info, which is needed
      for correct tracing_read_pipe processing.
      
      v2 changes:
       - omit the cond_resched changes, made unnecessary by the trace_nop_print changes
       - change WARN to WARN_ONCE and add info to be able
         to find out the culprit
      
      v3 changes:
       - make more accurate patch comment
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      LKML-Reference: <20110325110518.GC1922@jolsa.brq.redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      ee5e51f5
    • tracing: Print trace_bprintk() formats for modules too · 1813dc37
      Committed by Steven Rostedt
      The file debugfs/tracing/printk_formats maps addresses
      to the formats used by trace_bprintk(), so that userspace
      tools that read the ring buffer directly can decode trace_bprintk
      events back into the format that was recorded.
      
      This is because trace_bprintk() does not store the format in the
      buffer, but only the address of the format, which lives in
      kernel memory.
      
      But currently the file only exports trace_bprintk() formats from the
      kernel core and not from modules. The module formats need to be
      exported as well.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      1813dc37
    • tracing: Convert trace_printk() formats for module to const char * · 0588fa30
      Committed by Steven Rostedt
      The trace_printk() formats for modules do not show up in the
      debugfs/tracing/printk_formats file; only the formats from
      trace_printk() calls in the kernel core do.
      
      To facilitate adding trace_printk() formats from modules
      into that file as well, we need to convert the structure that
      holds the formats from a char fmt[] array into a const char *fmt
      pointer, and allocate the format strings separately.
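
      A hedged sketch of the conversion being described; the structure and field
      names here are assumptions based on this text, not the kernel's definitions:

        struct list_node { struct list_node *next, *prev; };

        /* before: the format string is stored inline, allocated together
         * with the list node, which only works for built-in formats */
        struct bprintk_fmt_old {
                struct list_node list;
                char fmt[];
        };

        /* after: the format is referenced through a pointer, so a module's
         * format strings can be duplicated and freed independently */
        struct bprintk_fmt_new {
                struct list_node list;
                const char *fmt;
        };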
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      0588fa30
  8. 31 Mar 2011: 1 commit
  9. 23 Mar 2011: 1 commit
    • tracing: Fix set_ftrace_filter probe function display · 1106b699
      Committed by Jiri Olsa
      If one or more function probes (like traceon) are enabled,
      and there's no other function filter, the first probe
      func is skipped (which one depends on the position in the hash).
      
      $ echo sys_open:traceon sys_close:traceon > ./set_ftrace_filter
      $ cat set_ftrace_filter
      #### all functions enabled ####
      sys_close:traceon:unlimited
      $
      
      The reason was that, when there is no other function filter,
      func_pos was not properly updated before calling t_hash_start.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      LKML-Reference: <1297874134-7008-1-git-send-email-jolsa@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      1106b699
  10. 18 Mar 2011: 1 commit
  11. 17 Mar 2011: 1 commit
  12. 12 Mar 2011: 1 commit
    • blktrace: Use rq->cmd_flags directly in blk_add_trace_rq. · 805f6b5e
      Committed by Tao Ma
      In blk_add_trace_rq, we only pick the low 2 bits from the
      request's cmd_flags and do a special check for discard,
      so most of the other flags (e.g. REQ_SYNC) are missing.
      For example, with a sync write after blkparse we get:
        8,16   1        1     0.001776503  7509  A  WS 1349632 + 1024 <- (8,17) 1347584
        8,16   1        2     0.001776813  7509  Q  WS 1349632 + 1024 [dd]
        8,16   1        3     0.001780395  7509  G  WS 1349632 + 1024 [dd]
        8,16   1        5     0.001783186  7509  I   W 1349632 + 1024 [dd]
        8,16   1       11     0.001816987  7509  D   W 1349632 + 1024 [dd]
        8,16   0        2     0.006218192     0  C   W 1349632 + 1024 [0]
      
      Now that the flags of both bio and request have been unified,
      it is safe to pass rq->cmd_flags directly to __blk_add_trace.
      
      With this patch, after a sync write we get:
        8,16   1        1     0.001776900  5425  A  WS 1189888 + 1024 <- (8,17) 1187840
        8,16   1        2     0.001777179  5425  Q  WS 1189888 + 1024 [dd]
        8,16   1        3     0.001780797  5425  G  WS 1189888 + 1024 [dd]
        8,16   1        5     0.001783402  5425  I  WS 1189888 + 1024 [dd]
        8,16   1       11     0.001817468  5425  D  WS 1189888 + 1024 [dd]
        8,16   0        2     0.005640709     0  C  WS 1189888 + 1024 [0]
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      805f6b5e
  13. 10 Mar 2011: 9 commits
    • tracing: Fix irqoff selftest expanding max buffer · 4a0b1665
      Committed by Steven Rostedt
      If the kernel command line declares a tracer "ftrace=sometracer" and
      that tracer is either not defined or is enabled after irqsoff,
      then the irqs off selftest will fail with the following error:
      
      Testing tracer irqsoff:
      ------------[ cut here ]------------
      WARNING: at /home/rostedt/work/autotest/nobackup/linux-test.git/kernel/trace/tra
      ce.c:713 update_max_tr_single+0xfa/0x11b()
      Hardware name:
      Modules linked in:
      Pid: 1, comm: swapper Not tainted 2.6.38-rc8-test #1
      Call Trace:
       [<c0441d9d>] ? warn_slowpath_common+0x65/0x7a
       [<c049adb2>] ? update_max_tr_single+0xfa/0x11b
       [<c0441dc1>] ? warn_slowpath_null+0xf/0x13
       [<c049adb2>] ? update_max_tr_single+0xfa/0x11b
       [<c049e454>] ? stop_critical_timing+0x154/0x204
       [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1
       [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1
       [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1
       [<c049e529>] ? time_hardirqs_on+0x25/0x28
       [<c0468bca>] ? trace_hardirqs_on_caller+0x18/0x12f
       [<c0468cec>] ? trace_hardirqs_on+0xb/0xd
       [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1
       [<c049b6b8>] ? register_tracer+0xf8/0x1a3
       [<c14e93fe>] ? init_irqsoff_tracer+0xd/0x11
       [<c040115e>] ? do_one_initcall+0x71/0x121
       [<c14e93f1>] ? init_irqsoff_tracer+0x0/0x11
       [<c14ce3a9>] ? kernel_init+0x13a/0x1b6
       [<c14ce26f>] ? kernel_init+0x0/0x1b6
       [<c0403842>] ? kernel_thread_helper+0x6/0x10
      ---[ end trace e93713a9d40cd06c ]---
      .. no entries found ..FAILED!
      
      What happens is that "ftrace=..." expands the ring buffer to its
      default size (from its minimum size) but does not expand the
      max ring buffer (the ring buffer used to store maximum latencies).
      When the irqsoff test runs, it calls the ring buffer swap routine,
      which checks that the max ring buffer is the same size as the normal
      ring buffer and fails if it is not. This causes the test to fail.
      
      The solution is to expand the max ring buffer before running the self
      test if the max ring buffer is used by that tracer and the normal ring
      buffer is expanded. The max ring buffer should be shrunk again after
      the test is done to save space.
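
      A hedged sketch of the shape of the fix in the selftest path (names such as
      max_tr, trace_buf_size, ring_buffer_expanded and use_max_tr are taken from
      the description and kernel code of that era; this is illustrative, not the
      literal patch):

        /* grow the max buffer only if this tracer uses it and the main
         * buffer has already been expanded */
        if (type->use_max_tr && ring_buffer_expanded)
                ring_buffer_resize(max_tr.buffer, trace_buf_size);

        ret = type->selftest(type, tr);

        /* shrink it again afterwards to save memory */
        if (type->use_max_tr)
                ring_buffer_resize(max_tr.buffer, 1);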
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      4a0b1665
    • tracing: Align 4 byte ints together in struct tracer · 9a24470b
      Committed by Steven Rostedt
      Move elements in struct tracer for better alignment.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      9a24470b
    • tracing: Export trace_set_clr_event() · 56355b83
      Committed by Yuanhan Liu
      Trace events belonging to a module only exist while the module is
      loaded. We can use the trace_set_clr_event function to enable
      trace events from the module's init routine, so that we do not miss
      anything while the module is loading.

      So, export the trace_set_clr_event function so that modules can use it.
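
      A hedged usage sketch for a module's init routine; "mymod" and "my_event"
      are hypothetical names, and the header is assumed to be the one that
      declared trace_set_clr_event() at the time:

        #include <linux/module.h>
        #include <linux/ftrace_event.h>   /* trace_set_clr_event() */

        static int __init mymod_init(void)
        {
                /* enable events/mymod/my_event before the module does any
                 * real work, so its earliest activity is not missed */
                int ret = trace_set_clr_event("mymod", "my_event", 1);

                if (ret)
                        pr_warn("mymod: could not enable trace event (%d)\n", ret);
                return 0;
        }
        module_init(mymod_init);

        MODULE_LICENSE("GPL");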
      Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
      LKML-Reference: <1289196312-25323-1-git-send-email-yuanhan.liu@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      56355b83
    • tracing: Explain about unstable clock on resume with ring buffer warning · 31274d72
      Committed by Jiri Olsa
      The "Delta way too big" warning might appear on a system with a
      unstable shed clock right after the system is resumed and tracing
      was enabled at time of suspend.
      
      Since it's not realy a bug, and the unstable sched clock is working
      fast and reliable otherwise, Steven suggested to keep using the
      sched clock in any case and just to make note in the warning itself.
      
      v2 changes:
      - added #ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      LKML-Reference: <20110218145219.GD2604@jolsa.brq.redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      31274d72
    • tracing: Adjust conditional expression latency formatting. · 10da37a6
      Committed by David Sharp
      Formatting change only to improve code readability. No code changes except to
      introduce intermediate variables.
      Signed-off-by: David Sharp <dhsharp@google.com>
      LKML-Reference: <1291421609-14665-13-git-send-email-dhsharp@google.com>
      
      [ Keep variable declarations and assignment separate ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      10da37a6
    • tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup · 140e4f2d
      Committed by David Sharp
      Signed-off-by: David Sharp <dhsharp@google.com>
      LKML-Reference: <1291421609-14665-6-git-send-email-dhsharp@google.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      140e4f2d
    • tracing: Remove lock_depth from event entry · e6e1e259
      Committed by Steven Rostedt
      The lock_depth field in the event headers was added as a temporary
      data point to help in removing the BKL. Now that the BKL has pretty
      much been removed, we can remove this field.

      This in turn shrinks the header from 12 bytes to 8 bytes,
      removing the 4 bytes of padding that gcc would insert when the first
      field in the data payload was 8 bytes in size.
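
      A small user-space illustration of the padding effect described above on a
      typical 64-bit build; the field names mirror the description but are not
      the actual event layout:

        #include <stdio.h>
        #include <stdint.h>

        struct header12 { uint16_t type; uint8_t flags, preempt; uint32_t pid; uint32_t lock_depth; };
        struct header8  { uint16_t type; uint8_t flags, preempt; uint32_t pid; };

        /* the first payload field is 8 bytes wide */
        struct event12 { struct header12 h; uint64_t payload; };
        struct event8  { struct header8  h; uint64_t payload; };

        int main(void)
        {
                printf("12-byte header event: %zu bytes\n", sizeof(struct event12)); /* 24: 4 bytes padding */
                printf(" 8-byte header event: %zu bytes\n", sizeof(struct event8));  /* 16: no padding      */
                return 0;
        }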
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      e6e1e259
    • ring-buffer: Remove unused #include <linux/trace_irq.h> · de29be5e
      Committed by David Sharp
      Signed-off-by: David Sharp <dhsharp@google.com>
      LKML-Reference: <1291421609-14665-3-git-send-email-dhsharp@google.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      de29be5e
    • tracing: Add an 'overwrite' trace_option. · 750912fa
      Committed by David Sharp
      Add an "overwrite" trace_option for ftrace to control whether the buffer should
      be overwritten on overflow or not. The default remains to overwrite old events
      when the buffer is full. This patch adds the option to instead discard newest
      events when the buffer is full. This is useful to get a snapshot of traces just
      after enabling traces. Dropping the current event is also a simpler code path.
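
      A toy user-space illustration of the two overflow policies described above
      (purely illustrative, not kernel code):

        #include <stdio.h>

        #define SLOTS 4

        static void fill(int buf[SLOTS], int n, int overwrite)
        {
                int head = 0, count = 0;

                for (int ev = 1; ev <= n; ev++) {
                        if (count == SLOTS && !overwrite)
                                continue;              /* full: discard the newest event */
                        buf[head] = ev;                /* otherwise overwrite the oldest */
                        head = (head + 1) % SLOTS;
                        if (count < SLOTS)
                                count++;
                }
        }

        int main(void)
        {
                int keep_new[SLOTS] = { 0 }, keep_old[SLOTS] = { 0 };

                fill(keep_new, 10, 1);   /* default: oldest events are overwritten */
                fill(keep_old, 10, 0);   /* new option: newest events are dropped  */

                for (int i = 0; i < SLOTS; i++)
                        printf("slot %d: overwrite=%d discard=%d\n", i, keep_new[i], keep_old[i]);
                return 0;
        }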
      Signed-off-by: David Sharp <dhsharp@google.com>
      LKML-Reference: <1291844807-15481-1-git-send-email-dhsharp@google.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      750912fa
  14. 03 Mar 2011: 1 commit
    • blktrace: Remove blk_fill_rwbs_rq. · 2d3a8497
      Committed by Tao Ma
      If we enable trace events to trace block actions, we use
      blk_fill_rwbs_rq to analyze the corresponding actions
      in the request's cmd_flags, but we only pick the low 2 bits
      from it, so most of the other flags (e.g. REQ_SYNC) are missing.
      For example, with a sync write we get:
      write_test-2409  [001]   160.013869: block_rq_insert: 3,64 W 0 () 258135 + 8 [write_test]
      
      Now that the flags of both bio and request have been unified,
      it is safe to pass rq->cmd_flags directly to blk_fill_rwbs, and
      blk_fill_rwbs_rq is no longer needed.
      
      With this patch, after a sync write we get:
      write_test-2417  [000]   226.603878: block_rq_insert: 3,64 WS 0 () 258135 + 8 [write_test]
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      2d3a8497
  15. 18 Feb 2011: 1 commit
  16. 14 Feb 2011: 1 commit
  17. 12 Feb 2011: 1 commit
    • ftrace: Fix memory leak with function graph and cpu hotplug · 868baf07
      Committed by Steven Rostedt
      When the function graph tracer starts, it needs to make a special
      stack for each task to save the task's real return addresses.
      All running tasks have this stack created, as well as any new
      tasks.
      
      On CPU hot plug, the new idle task will allocate a stack as well
      when init_idle() is called. The problem is that cpu hotplug does
      not create a new idle_task. Instead it uses the idle task that
      existed when the cpu went down.
      
      ftrace_graph_init_task() will add a new ret_stack to the task
      that is given to it. Because a cloned task starts out with its
      parent's stack pointer, the function does not check whether the
      task's ret_stack is already set. When the CPU hotplug code
      brings a CPU up again, it will allocate a new stack even
      though one already existed for it, leaking the old one.
      
      The solution is to treat the idle task specially. In fact, the
      function_graph code already does, just not at init_idle().
      Instead of using ftrace_graph_init_task() for the idle task,
      since that function expects the task to be a fresh clone, add a
      separate ftrace_graph_init_idle_task(). Also, create a
      per-cpu ret_stack that is used by the idle task. When
      ftrace_graph_init_idle_task() is called, it checks whether the
      idle task's ret_stack is NULL; if it is, it assigns it the
      per-cpu ret_stack.
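
      A hedged sketch of the approach just described; the function name follows
      this text, while the per-cpu variable name and the body are illustrative
      assumptions, not the actual kernel implementation:

        static DEFINE_PER_CPU(struct ftrace_ret_stack *, idle_ret_stack);

        void ftrace_graph_init_idle_task(struct task_struct *t, int cpu)
        {
                /* idle tasks are reused across CPU down/up cycles, so only
                 * hand out a stack if this task does not already have one;
                 * this avoids allocating (and leaking) a fresh ret_stack on
                 * every hotplug */
                if (!t->ret_stack)
                        t->ret_stack = per_cpu(idle_ret_stack, cpu);
        }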
      Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stable Tree <stable@kernel.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      868baf07