1. 30 Apr, 2011 2 commits
  2. 16 Apr, 2011 1 commit
  3. 12 Apr, 2011 2 commits
  4. 05 Apr, 2011 3 commits
    • tracing: Avoid soft lockup in trace_pipe · ee5e51f5
      Jiri Olsa authored
      Running the following commands:
      
        # enable the binary option
        echo 1 > ./options/bin
        # disable context info option
        echo 0 > ./options/context-info
        # tracing only events
        echo 1 > ./events/enable
        cat trace_pipe
      
      plus forcing the system to generate many tracing events,
      causes a lockup (in non-preemptive kernels) inside the
      tracing_read_pipe function.
      
      The issue is also easily reproduced by running ltp stress test.
      (ftrace_stress_test.sh)
      
      The reasons are:
       - bin/hex/raw output functions for events are set to
         trace_nop_print function, which prints nothing and
         returns TRACE_TYPE_HANDLED value
       - LOST EVENT trace does not handle trace_seq overflow
      
      These reasons prevent the while loop in the tracing_read_pipe
      function from ever breaking.
      
      The attached patch fixes handling of the lost event trace, and
      changes trace_nop_print to print minimal info, which is needed
      for correct tracing_read_pipe processing.
      
      v2 changes:
       - omit the cond_resched changes by trace_nop_print changes
       - WARN changed to WARN_ONCE and added info to be able
         to find out the culprit
      
      v3 changes:
       - make the patch comment more accurate
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      LKML-Reference: <20110325110518.GC1922@jolsa.brq.redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
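The lockup described in the trace_pipe commit can be modelled in user space. The sketch below is not the kernel code: `struct seq`, `seq_puts`, `read_pipe` and the 64-byte buffer are hypothetical stand-ins, and `max_events` replaces the endless event stream. It only shows why a print callback that writes nothing but reports success never lets a "stop when the buffer is full" loop terminate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEQ_SIZE 64  /* hypothetical output buffer size */

struct seq {
    char buf[SEQ_SIZE];
    size_t len;
};

/* Append text; return 0 when it does not fit (buffer overflow case). */
static int seq_puts(struct seq *s, const char *text)
{
    size_t n = strlen(text);

    if (s->len + n > SEQ_SIZE)
        return 0;
    memcpy(s->buf + s->len, text, n);
    s->len += n;
    return 1;
}

/* Old behaviour: print nothing, still report the event as handled. */
static int nop_print_old(struct seq *s)
{
    (void)s;
    return 1;
}

/* Fixed behaviour: emit minimal info so the buffer eventually fills. */
static int nop_print_new(struct seq *s)
{
    return seq_puts(s, "ev\n");
}

/*
 * Consume events until the print callback reports a full buffer.
 * max_events caps the model of an endless event stream.
 * Returns the number of events consumed before the loop stopped.
 */
static long read_pipe(int (*print)(struct seq *), long max_events)
{
    struct seq s = { .len = 0 };
    long consumed = 0;

    assert(max_events > 0);
    while (consumed < max_events) {
        if (!print(&s))
            break;  /* buffer full: hand the data to the reader */
        consumed++;
    }
    return consumed;
}
```

With the old callback the loop only stops at the artificial cap; with the new one it stops as soon as the buffer is full.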
    • tracing: Print trace_bprintk() formats for modules too · 1813dc37
      Steven Rostedt authored
      The file debugfs/tracing/printk_formats maps the addresses
      to the formats that are used by trace_bprintk() so that userspace
      tools can read the buffer and be able to decode trace_bprintk events
      to get the format saved when reading the ring buffer directly.
      
      This is because trace_bprintk() does not store the format into the
      buffer, but just the address of the format, which is hidden in
      the kernel memory.
      
      But currently it only exports trace_bprintk()s from the kernel core
      and not for modules. The modules need their formats exported
      as well.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
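As an illustration of why an address-to-format map is needed, here is a minimal user-space sketch of the decode step a tool would perform. The table contents, addresses and the `lookup_fmt` name are invented for the example; the real file layout of printk_formats is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct fmt_entry {
    uintptr_t addr;   /* address stored in the ring buffer event */
    const char *fmt;  /* format string that address refers to */
};

/* Hypothetical exported table: one core entry, one module entry. */
static const struct fmt_entry fmt_table[] = {
    { 0x1000, "core: value=%d\n" },
    { 0x2000, "module: state=%s\n" },
};

/* Return the format for an address, or NULL if it was never exported. */
static const char *lookup_fmt(uintptr_t addr)
{
    size_t i;

    for (i = 0; i < sizeof(fmt_table) / sizeof(fmt_table[0]); i++)
        if (fmt_table[i].addr == addr)
            return fmt_table[i].fmt;
    return NULL;  /* the event cannot be decoded */
}
```

If module formats are missing from the table, the lookup for a module event returns NULL, which is the situation the commit addresses.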
    • tracing: Convert trace_printk() formats for module to const char * · 0588fa30
      Steven Rostedt authored
      The trace_printk() formats for modules do not show up in the
      debugfs/tracing/printk_formats file. Only the formats for
      trace_printk()s in the kernel core are listed.
      
      To facilitate the change to add trace_printk() formats from modules
      into that file as well, we need to convert the structure that
      holds the formats from char fmt[], into const char *fmt,
      and allocate them separately.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
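The structural change from `char fmt[]` to `const char *fmt` can be sketched as follows. The struct and function names are hypothetical and the list handling is reduced to a bare `next` pointer; the point is only that the string becomes a separate allocation that the node refers to.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Before: the format text is stored inside the node itself. */
struct fmt_node_old {
    struct fmt_node_old *next;
    char fmt[];               /* flexible array member */
};

/* After: the node only points at a separately allocated string. */
struct fmt_node_new {
    struct fmt_node_new *next;
    const char *fmt;
};

/* Allocate a node and a private copy of the format text. */
static struct fmt_node_new *node_new(const char *text)
{
    struct fmt_node_new *n = malloc(sizeof(*n));
    char *copy = malloc(strlen(text) + 1);

    if (!n || !copy) {
        free(n);
        free(copy);
        return NULL;
    }
    strcpy(copy, text);
    n->next = NULL;
    n->fmt = copy;
    return n;
}

/* Free both allocations; the cast drops const for free() only. */
static void node_free(struct fmt_node_new *n)
{
    if (n) {
        free((void *)n->fmt);
        free(n);
    }
}
```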
  5. 31 Mar, 2011 1 commit
  6. 23 Mar, 2011 1 commit
    • tracing: Fix set_ftrace_filter probe function display · 1106b699
      Jiri Olsa authored
      If one or more function probes (like traceon) are enabled,
      and there's no other function filter, the first probe
      func is skipped (which one depends on the position in the hash).
      
      $ echo sys_open:traceon sys_close:traceon > ./set_ftrace_filter
      $ cat set_ftrace_filter
      #### all functions enabled ####
      sys_close:traceon:unlimited
      $
      
      The reason was that, in the case of no other function filter,
      func_pos was not properly updated before calling t_hash_start.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      LKML-Reference: <1297874134-7008-1-git-send-email-jolsa@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  7. 18 Mar, 2011 1 commit
  8. 17 Mar, 2011 1 commit
  9. 12 Mar, 2011 1 commit
    • blktrace: Use rq->cmd_flags directly in blk_add_trace_rq. · 805f6b5e
      Tao Ma authored
      In blk_add_trace_rq, we only took the lowest 2 bits of the
      request's cmd_flags and did some checks for discard,
      so most of the other flags (e.g., REQ_SYNC) are missing.
      
      For example, with a sync write after blkparse we get:
        8,16   1        1     0.001776503  7509  A  WS 1349632 + 1024 <- (8,17) 1347584
        8,16   1        2     0.001776813  7509  Q  WS 1349632 + 1024 [dd]
        8,16   1        3     0.001780395  7509  G  WS 1349632 + 1024 [dd]
        8,16   1        5     0.001783186  7509  I   W 1349632 + 1024 [dd]
        8,16   1       11     0.001816987  7509  D   W 1349632 + 1024 [dd]
        8,16   0        2     0.006218192     0  C   W 1349632 + 1024 [0]
      
      Since now we have integrated the flags of both bio and request,
      it is safe to pass rq->cmd_flags directly to __blk_add_trace.
      
      With this patch, after a sync write we get:
        8,16   1        1     0.001776900  5425  A  WS 1189888 + 1024 <- (8,17) 1187840
        8,16   1        2     0.001777179  5425  Q  WS 1189888 + 1024 [dd]
        8,16   1        3     0.001780797  5425  G  WS 1189888 + 1024 [dd]
        8,16   1        5     0.001783402  5425  I  WS 1189888 + 1024 [dd]
        8,16   1       11     0.001817468  5425  D  WS 1189888 + 1024 [dd]
        8,16   0        2     0.005640709     0  C  WS 1189888 + 1024 [0]
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
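The difference between the "W" and "WS" columns in the blkparse output above comes down to masking. The following sketch uses made-up flag bits (`F_WRITE`, `F_SYNC`) and helper names rather than the kernel's REQ_* values; it only shows how keeping the lowest 2 bits discards a sync flag that lives in a higher bit.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag layout; real REQ_* bit positions differ. */
#define F_WRITE (1u << 0)
#define F_SYNC  (1u << 4)

/* Build a short action string such as "W" or "WS" from the flags. */
static void fill_rwbs(char *out, unsigned int flags)
{
    int i = 0;

    out[i++] = (flags & F_WRITE) ? 'W' : 'R';
    if (flags & F_SYNC)
        out[i++] = 'S';
    out[i] = '\0';
}

/* Old path: only the lowest 2 bits survive, so F_SYNC is lost. */
static void trace_old(char *out, unsigned int flags)
{
    fill_rwbs(out, flags & 0x3u);
}

/* New path: the full flag word is passed through. */
static void trace_new(char *out, unsigned int flags)
{
    fill_rwbs(out, flags);
}
```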
  10. 10 Mar, 2011 9 commits
    • tracing: Fix irqoff selftest expanding max buffer · 4a0b1665
      Steven Rostedt authored
      If the kernel command line declares a tracer "ftrace=sometracer" and
      that tracer is either not defined or is enabled after irqsoff,
      then the irqs off selftest will fail with the following error:
      
      Testing tracer irqsoff:
      ------------[ cut here ]------------
      WARNING: at /home/rostedt/work/autotest/nobackup/linux-test.git/kernel/trace/trace.c:713 update_max_tr_single+0xfa/0x11b()
      Hardware name:
      Modules linked in:
      Pid: 1, comm: swapper Not tainted 2.6.38-rc8-test #1
      Call Trace:
       [<c0441d9d>] ? warn_slowpath_common+0x65/0x7a
       [<c049adb2>] ? update_max_tr_single+0xfa/0x11b
       [<c0441dc1>] ? warn_slowpath_null+0xf/0x13
       [<c049adb2>] ? update_max_tr_single+0xfa/0x11b
       [<c049e454>] ? stop_critical_timing+0x154/0x204
       [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1
       [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1
       [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1
       [<c049e529>] ? time_hardirqs_on+0x25/0x28
       [<c0468bca>] ? trace_hardirqs_on_caller+0x18/0x12f
       [<c0468cec>] ? trace_hardirqs_on+0xb/0xd
       [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1
       [<c049b6b8>] ? register_tracer+0xf8/0x1a3
       [<c14e93fe>] ? init_irqsoff_tracer+0xd/0x11
       [<c040115e>] ? do_one_initcall+0x71/0x121
       [<c14e93f1>] ? init_irqsoff_tracer+0x0/0x11
       [<c14ce3a9>] ? kernel_init+0x13a/0x1b6
       [<c14ce26f>] ? kernel_init+0x0/0x1b6
       [<c0403842>] ? kernel_thread_helper+0x6/0x10
      ---[ end trace e93713a9d40cd06c ]---
      .. no entries found ..FAILED!
      
      What happens is the "ftrace=..." will expand the ring buffer to its
      default size (from its minimum size) but it will not expand the
      max ring buffer (the ring buffer to store maximum latencies).
      When the irqsoff test runs, it will call the ring buffer swap routine
      that checks if the max ring buffer is the same size as the normal
      ring buffer, and will fail if it is not. This causes the test to fail.
      
      The solution is to expand the max ring buffer before running the self
      test if the max ring buffer is used by that tracer and the normal ring
      buffer is expanded. The max ring buffer should be shrunk again after
      the test is done to save space.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Align 4 byte ints together in struct tracer · 9a24470b
      Steven Rostedt authored
      Move elements in struct tracer for better alignment.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Export trace_set_clr_event() · 56355b83
      Yuanhan Liu authored
      Trace events belonging to a module only exist when the module is
      loaded. We can use the trace_set_clr_event function to enable some
      trace events in the module init routine, so that we will not miss
      anything while loading the module.
      
      So, export the trace_set_clr_event function so that modules can use it.
      Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
      LKML-Reference: <1289196312-25323-1-git-send-email-yuanhan.liu@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Explain about unstable clock on resume with ring buffer warning · 31274d72
      Jiri Olsa authored
      The "Delta way too big" warning might appear on a system with an
      unstable sched clock right after the system is resumed, if tracing
      was enabled at the time of suspend.
      
      Since it's not really a bug, and the unstable sched clock is
      otherwise fast and reliable, Steven suggested to keep using the
      sched clock in any case and just to make a note in the warning itself.
      
      v2 changes:
      - added #ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      LKML-Reference: <20110218145219.GD2604@jolsa.brq.redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Adjust conditional expression latency formatting. · 10da37a6
      David Sharp authored
      Formatting change only, to improve code readability. No code changes except to
      introduce intermediate variables.
      Signed-off-by: David Sharp <dhsharp@google.com>
      LKML-Reference: <1291421609-14665-13-git-send-email-dhsharp@google.com>
      
      [ Keep variable declarations and assignment separate ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup · 140e4f2d
      David Sharp authored
      Signed-off-by: David Sharp <dhsharp@google.com>
      LKML-Reference: <1291421609-14665-6-git-send-email-dhsharp@google.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Remove lock_depth from event entry · e6e1e259
      Steven Rostedt authored
      The lock_depth field in the event headers was added as a temporary
      data point to help in removing the BKL. Now that the BKL has pretty
      much been removed, we can remove this field.
      
      This in turn changes the header from 12 bytes to 8 bytes,
      removing the 4 byte buffer that gcc would insert if the first field
      in the data load was 8 bytes in size.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
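The 12-byte versus 8-byte header sizes can be checked with a small layout sketch. The field names below are assumptions made for illustration, not a copy of the kernel's struct; fixed-width types are used so the header sizes are the same on common ABIs. The exact offset of an 8-byte first field after the old header depends on the platform's alignment rules, so only a lower bound is stated for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old header: 2 + 1 + 1 + 4 + 4 = 12 bytes. */
struct hdr_old {
    uint16_t type;
    uint8_t  flags;
    uint8_t  preempt_count;
    int32_t  pid;
    int32_t  lock_depth;
};

/* Hypothetical new header without lock_depth: 8 bytes. */
struct hdr_new {
    uint16_t type;
    uint8_t  flags;
    uint8_t  preempt_count;
    int32_t  pid;
};

/* An event whose payload starts with an 8-byte field. */
struct event_old {
    struct hdr_old hdr;
    uint64_t first_field;  /* may be preceded by compiler padding */
};

struct event_new {
    struct hdr_new hdr;
    uint64_t first_field;  /* starts right after the 8-byte header */
};
```

On an ABI that aligns `uint64_t` to 8 bytes, `offsetof(struct event_old, first_field)` is 16 rather than 12, which is the 4-byte gap the commit message refers to.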
    • ring-buffer: Remove unused #include <linux/trace_irq.h> · de29be5e
      David Sharp authored
      Signed-off-by: David Sharp <dhsharp@google.com>
      LKML-Reference: <1291421609-14665-3-git-send-email-dhsharp@google.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Add an 'overwrite' trace_option. · 750912fa
      David Sharp authored
      Add an "overwrite" trace_option for ftrace to control whether the buffer should
      be overwritten on overflow or not. The default remains to overwrite old events
      when the buffer is full. This patch adds the option to instead discard newest
      events when the buffer is full. This is useful to get a snapshot of traces just
      after enabling traces. Dropping the current event is also a simpler code path.
      Signed-off-by: David Sharp <dhsharp@google.com>
      LKML-Reference: <1291844807-15481-1-git-send-email-dhsharp@google.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
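The two overflow policies can be compared with a toy ring buffer. This is a sketch under simplifying assumptions (a fixed capacity of 4 integers, hypothetical names `struct ring`, `ring_write`, `ring_oldest`), not the kernel ring buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CAP 4  /* hypothetical capacity */

struct ring {
    int data[CAP];
    size_t head;     /* index of the oldest event */
    size_t count;    /* number of stored events */
    bool overwrite;  /* true: drop oldest on overflow; false: drop newest */
};

/* Store an event; returns false when the event was discarded. */
static bool ring_write(struct ring *r, int ev)
{
    if (r->count == CAP) {
        if (!r->overwrite)
            return false;               /* keep the old events */
        r->head = (r->head + 1) % CAP;  /* forget the oldest event */
        r->count--;
    }
    r->data[(r->head + r->count) % CAP] = ev;
    r->count++;
    return true;
}

/* Oldest event still held in the buffer. */
static int ring_oldest(const struct ring *r)
{
    assert(r->count > 0);
    return r->data[r->head];
}
```

Writing events 1 through 6 leaves 3..6 in the overwriting buffer and 1..4 in the non-overwriting one, which is the "snapshot just after enabling traces" behaviour the commit describes.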
  11. 03 Mar, 2011 1 commit
    • blktrace: Remove blk_fill_rwbs_rq. · 2d3a8497
      Tao Ma authored
      If we enable trace events to trace block actions, we use
      blk_fill_rwbs_rq to analyze the corresponding actions
      in the request's cmd_flags, but we only take the lowest 2 bits
      from it, so most of the other flags (e.g., REQ_SYNC) are missing.
      For example, with a sync write we get:
      write_test-2409  [001]   160.013869: block_rq_insert: 3,64 W 0 () 258135 + 8 [write_test]
      
      Since now we have integrated the flags of both bio and request,
      it is safe to pass rq->cmd_flags directly to blk_fill_rwbs and
      blk_fill_rwbs_rq isn't needed any more.
      
      With this patch, after a sync write we get:
      write_test-2417  [000]   226.603878: block_rq_insert: 3,64 WS 0 () 258135 + 8 [write_test]
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  12. 18 Feb, 2011 1 commit
  13. 14 Feb, 2011 1 commit
  14. 12 Feb, 2011 1 commit
    • ftrace: Fix memory leak with function graph and cpu hotplug · 868baf07
      Steven Rostedt authored
      When the function graph tracer starts, it needs to make a special
      stack for each task to save the real return values of the tasks.
      All running tasks have this stack created, as well as any new
      tasks.
      
      On CPU hot plug, the new idle task will allocate a stack as well
      when init_idle() is called. The problem is that cpu hotplug does
      not create a new idle_task. Instead it uses the idle task that
      existed when the cpu went down.
      
      ftrace_graph_init_task() will add a new ret_stack to the task
      that is given to it. Because a clone will make the task
      have a stack of its parent it does not check if the task's
      ret_stack is already NULL or not. When the CPU hotplug code
      starts a CPU up again, it will allocate a new stack even
      though one already existed for it.
      
      The solution is to treat the idle_task specially. In fact, the
      function_graph code already does, just not at init_idle().
      Instead of using ftrace_graph_init_task() for the idle task,
      which expects the task to be a clone, have a
      separate ftrace_graph_init_idle_task(). Also, we will create a
      per_cpu ret_stack that is used by the idle task. When we call
      ftrace_graph_init_idle_task() it will check if the idle task's
      ret_stack is NULL, if it is, then it will assign it the per_cpu
      ret_stack.
      Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stable Tree <stable@kernel.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
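The fix amounts to making idle-task initialisation idempotent. The sketch below models that idea in user space; `struct task`, `init_idle_task`, the per-CPU array and the `assignments` counter are all hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 2  /* hypothetical CPU count */
#define DEPTH   8  /* hypothetical return-stack depth */

struct task {
    void *ret_stack;
};

/* One statically reserved return stack per CPU for its idle task. */
static long percpu_ret_stack[NR_CPUS][DEPTH];

/* Counts how often a stack was actually handed out. */
static int assignments;

/*
 * Called each time a CPU comes up. The idle task is reused across
 * hotplug cycles, so only assign a stack when it has none yet.
 */
static void init_idle_task(struct task *t, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (t->ret_stack == NULL) {
        t->ret_stack = percpu_ret_stack[cpu];
        assignments++;
    }
}
```

Calling `init_idle_task()` twice for the same task, as happens when a CPU goes down and comes back, hands out a stack only once, so nothing is leaked.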
  15. 09 Feb, 2011 3 commits
  16. 08 Feb, 2011 11 commits
    • tracing/syscalls: Early terminate search for sys_ni_syscall · ae07f551
      Ian Munsie authored
      Many system calls are unimplemented and mapped to sys_ni_syscall, but at
      boot ftrace would still search through every syscall metadata entry for
      a match which wouldn't be there.
      
      This patch causes the search to terminate early if the system call
      is not mapped.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      LKML-Reference: <1296703645-18718-7-git-send-email-imunsie@au1.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing/syscalls: Allow arch specific syscall symbol matching · b2d55496
      Ian Munsie authored
      Some architectures have unusual symbol names and the generic code to
      match the symbol name with the function name for the syscall metadata
      will fail. For example, symbols on PPC64 start with a period and the
      generic code will fail to match them.
      
      This patch moves the match logic out into a separate function which an
      arch can override by defining ARCH_HAS_SYSCALL_MATCH_SYM_NAME in
      asm/ftrace.h and implementing arch_syscall_match_sym_name.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      LKML-Reference: <1296703645-18718-5-git-send-email-imunsie@au1.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
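To illustrate why a plain name comparison fails for symbols with a leading period, here is a small sketch of a generic matcher next to an architecture-style override. Both function names are hypothetical and the generic version is simplified to an exact string compare; the kernel's actual generic matching rule is not reproduced here.

```c
#include <assert.h>
#include <string.h>

/* Simplified generic rule: the symbol must equal the syscall name. */
static int match_generic(const char *sym, const char *name)
{
    return strcmp(sym, name) == 0;
}

/*
 * Hypothetical override for an architecture whose symbols start
 * with a period: skip that one character before comparing.
 */
static int match_skip_dot(const char *sym, const char *name)
{
    assert(sym != NULL && name != NULL);
    if (sym[0] == '.')
        sym++;
    return strcmp(sym, name) == 0;
}
```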
    • tracing/syscalls: Make arch_syscall_addr weak · c763ba06
      Ian Munsie authored
      Some architectures use non-trivial system call tables and will not work
      with the generic arch_syscall_addr code. For example, PowerPC64 uses a
      table of twin long longs.
      
      This patch makes the generic arch_syscall_addr weak to allow
      architectures with non-trivial system call tables to override it.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      LKML-Reference: <1296703645-18718-4-git-send-email-imunsie@au1.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing/syscalls: Convert redundant syscall_nr checks into WARN_ON · 3773b389
      Ian Munsie authored
      With the ftrace events now checking if the syscall_nr is valid upon
      initialisation it should no longer be possible to register or unregister
      a syscall event without a valid syscall_nr since they should not be
      created. This adds a WARN_ON_ONCE in the register and unregister
      functions to locate potential regressions in the future.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      LKML-Reference: <1296703645-18718-3-git-send-email-imunsie@au1.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing/syscalls: Don't add events for unmapped syscalls · ba976970
      Ian Munsie authored
      FTRACE_SYSCALLS would create events for each and every system call, even
      if it had failed to map the system call's name with its number. This
      resulted in a number of events being created that would not behave as
      expected.
      
      This could happen, for example, on architectures whose symbol names are
      unusual and will not match the system call name. It could also happen
      with system calls which were mapped to sys_ni_syscall.
      
      This patch changes the default system call number in the metadata to -1.
      If the system call name from the metadata is not successfully mapped to
      a system call number during boot, then the event initialisation routine
      will now return an error, preventing the event from being created.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      LKML-Reference: <1296703645-18718-2-git-send-email-imunsie@au1.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
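The "default to -1, refuse to create the event if still unmapped" scheme can be sketched as below. The table contents, `struct meta`, `map_syscalls` and `init_event` are hypothetical, and returning `-ENOSYS` is a choice made for this example rather than something stated in the commit. Skipping `sys_ni_syscall` slots also mirrors the early-termination idea from the earlier commit in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct meta {
    const char *name;
    int syscall_nr;  /* -1 until a table slot is matched */
};

/* Hypothetical syscall table, indexed by syscall number. */
static const char *const sys_table[] = {
    "sys_read", "sys_ni_syscall", "sys_write",
};

/* Boot-time mapping of metadata entries to syscall numbers. */
static void map_syscalls(struct meta *m, size_t n)
{
    size_t i, j;

    for (i = 0; i < sizeof(sys_table) / sizeof(sys_table[0]); i++) {
        /* Unimplemented slot: no metadata can match, skip the search. */
        if (strcmp(sys_table[i], "sys_ni_syscall") == 0)
            continue;
        for (j = 0; j < n; j++)
            if (strcmp(m[j].name, sys_table[i]) == 0)
                m[j].syscall_nr = (int)i;
    }
}

/* Event initialisation fails for metadata that was never mapped. */
static int init_event(const struct meta *m)
{
    return m->syscall_nr < 0 ? -ENOSYS : 0;
}
```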
    • tracing/filter: Remove synchronize_sched() from __alloc_preds() · 4defe682
      Steven Rostedt authored
      Because the filters are processed first and then activated
      (added to the call), we no longer need to worry about the preds
      of the filter in __alloc_preds() being used, as the filter that
      is allocating preds is not activated yet.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing/filter: Swap entire filter of events · 75b8e982
      Steven Rostedt authored
      When creating a new filter, instead of allocating the filter to the
      event call first and then processing the filter, it is easier to
      process a temporary filter and then just swap it with the call filter.
      By doing this, it simplifies the code.
      
      A filter is allocated and processed, when it is done, it is
      swapped with the call filter, synchronize_sched() is called to make
      sure all callers are done with the old filter (filters are called
      with preemption disabled), and then the old filter is freed.
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing/filter: Increase the max preds to 2^14 · bf93f9ed
      Steven Rostedt authored
      Now that the filter logic no longer needs to save the pred results
      on the stack, we can increase the max number of preds we allow.
      As the preds are indexed by a short value, and we use the MSBs as flags,
      we can increase the max preds to 2^14 (16384), which should be way
      more than enough.
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
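The 2^14 limit follows from the bit budget of a 16-bit index. The sketch below assumes exactly two flag bits in the most significant positions, which is inferred from 16 - 14 = 2 rather than stated in the commit; the flag names and helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_BITS  2u                   /* assumed number of MSB flags */
#define INDEX_BITS (16u - FLAG_BITS)    /* 14 bits left for the index */
#define MAX_PREDS  (1u << INDEX_BITS)   /* 2^14 = 16384 */
#define INDEX_MASK (MAX_PREDS - 1u)

#define FLAG_A (1u << 15)  /* hypothetical flag in the top bit */
#define FLAG_B (1u << 14)  /* hypothetical flag in the next bit */

/* Combine an index and flag bits into one 16-bit value. */
static uint16_t pack(uint16_t index, uint16_t flags)
{
    assert(index < MAX_PREDS);
    assert((flags & INDEX_MASK) == 0);
    return (uint16_t)(index | flags);
}

/* Recover the index by masking off the flag bits. */
static uint16_t unpack_index(uint16_t v)
{
    return (uint16_t)(v & INDEX_MASK);
}
```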
    • tracing/filter: Move MAX_FILTER_PRED to local tracing directory · 4a3d27e9
      Steven Rostedt authored
      The MAX_FILTER_PRED is only needed by the kernel/trace/*.c files.
      Move it to kernel/trace/trace.h.
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing/filter: Optimize filter by folding the tree · 43cd4145
      Steven Rostedt authored
      In many cases a filter will contain multiple ORs or
      ANDs together near the leaves. Walking up and down the tree to get
      to the next compare can be a waste.
      
      If there are several ORs or ANDs together, fold them into a single
      pred and allocate an array of the conditions that they check.
      This will speed up the filter by linearly walking an array
      and can still break out if a short circuit condition is met.
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
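Folding a run of ORs or ANDs into one array and walking it linearly with short-circuiting might look like the following sketch. The types and names (`leaf_fn`, `enum fold_op`, `eval_folded`) and the sample leaf conditions are invented for illustration and operate on a plain integer instead of an event record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*leaf_fn)(int value);

enum fold_op { FOLD_OR, FOLD_AND };

/*
 * Evaluate a folded group of leaf conditions in one linear pass.
 * OR stops at the first match, AND stops at the first mismatch.
 */
static bool eval_folded(enum fold_op op, const leaf_fn *leaves,
                        size_t n, int value)
{
    size_t i;

    assert(leaves != NULL || n == 0);
    for (i = 0; i < n; i++) {
        bool match = leaves[i](value);

        if (op == FOLD_OR && match)
            return true;   /* short circuit */
        if (op == FOLD_AND && !match)
            return false;  /* short circuit */
    }
    /* No short circuit hit: AND of all true, or OR of all false. */
    return op == FOLD_AND;
}

/* Sample leaf conditions for the usage example. */
static bool is_one(int v)      { return v == 1; }
static bool is_two(int v)      { return v == 2; }
static bool is_positive(int v) { return v > 0; }
static bool is_even(int v)     { return v % 2 == 0; }
```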
    • tracing/filter: Check the created pred tree · ec126cac
      Steven Rostedt authored
      Since the filter walks a tree to determine if a match is made or not,
      if the tree was incorrectly created, it could cause an infinite loop.
      
      Add a check to walk the entire tree before assigning it as a filter
      to make sure the tree is correct.
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
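One way to validate a tree before installing it is to walk it once and reject any node that is out of range or reached twice. The sketch below uses a visited array and recursion, which is a substitute technique chosen for brevity and not necessarily how the patch performs its check; the index-based `struct pred` layout and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_CHILD  (-1)
#define MAX_NODES 64  /* hypothetical upper bound on tree size */

/* Nodes refer to their children by array index. */
struct pred {
    int left;
    int right;
};

/* Visit a subtree; fail on a bad index or an already visited node. */
static bool visit(const struct pred *p, int n, int idx, bool *seen)
{
    if (idx == NO_CHILD)
        return true;
    if (idx < 0 || idx >= n || seen[idx])
        return false;  /* out of range, or a cycle/shared node */
    seen[idx] = true;
    return visit(p, n, p[idx].left, seen) &&
           visit(p, n, p[idx].right, seen);
}

/* A tree is accepted only if the full walk from root terminates cleanly. */
static bool check_tree(const struct pred *p, int n, int root)
{
    bool seen[MAX_NODES] = { false };

    assert(n > 0 && n <= MAX_NODES);
    return root != NO_CHILD && visit(p, n, root, seen);
}
```

Because every node is marked before its children are visited, the recursion depth is bounded by the node count, so a malformed tree is rejected instead of looping forever.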