1. 24 6月, 2009 6 次提交
  2. 20 6月, 2009 2 次提交
    • F
      tracing/urgent: warn in case of ftrace_start_up inbalance · 9ea1a153
      Frederic Weisbecker 提交于
      Prevent from further ftrace_start_up inbalances so that we avoid
      future nop patching omissions with dynamic ftrace.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      9ea1a153
    • F
      tracing/urgent: fix unbalanced ftrace_start_up · c85a17e2
      Frederic Weisbecker 提交于
      Perfcounter reports the following stats for a wide system
      profiling:
      
       #
       # (2364 samples)
       #
       # Overhead  Symbol
       # ........  ......
       #
          15.40%  [k] mwait_idle_with_hints
           8.29%  [k] read_hpet
           5.75%  [k] ftrace_caller
           3.60%  [k] ftrace_call
           [...]
      
      This snapshot has been taken while neither the function tracer nor
      the function graph tracer was running.
      With dynamic ftrace, such results show a wrong ftrace behaviour
      because all calls to ftrace_caller or ftrace_graph_caller (the patched
      calls to mcount) are supposed to be patched into nop if none of those
      tracers are running.
      
      The problem occurs after the first run of the function tracer. Once we
      launch it a second time, the callsites will never be nopped back,
      unless you set custom filters.
      For example it happens during the self tests at boot time.
      The function tracer selftest runs, and then the dynamic tracing is
      tested too. After that, the callsites are left un-nopped.
      
      This is because the reset callback of the function tracer tries to
      unregister two ftrace callbacks in once: the common function tracer
      and the function tracer with stack backtrace, regardless of which
      one is currently in use.
      It then creates an unbalance on ftrace_start_up value which is expected
      to be zero when the last ftrace callback is unregistered. When it
      reaches zero, the FTRACE_DISABLE_CALLS is set on the next ftrace
      command, triggering the patching into nop. But since it becomes
      unbalanced, ie becomes lower than zero, if the kernel functions
      are patched again (as in every further function tracer runs), they
      won't ever be nopped back.
      
      Note that ftrace_call and ftrace_graph_call are still patched back
      to ftrace_stub in the off case, but not the callers of ftrace_call
      and ftrace_graph_caller. It means that the tracing is well deactivated
      but we waste a useless call into every kernel function.
      
      This patch just unregisters the right ftrace_ops for the function
      tracer on its reset callback and ignores the other one which is
      not registered, fixing the unbalance. The problem also happens
      is .30
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: stable@kernel.org
      c85a17e2
  3. 19 6月, 2009 2 次提交
    • S
      function-graph: add stack frame test · 71e308a2
      Steven Rostedt 提交于
      In case gcc does something funny with the stack frames, or the return
      from function code, we would like to detect that.
      
      An arch may implement passing of a variable that is unique to the
      function and can be saved on entering a function and can be tested
      when exiting the function. Usually the frame pointer can be used for
      this purpose.
      
      This patch also implements this for x86. Where it passes in the stack
      frame of the parent function, and will test that frame on exit.
      
      There was a case in x86_32 with optimize for size (-Os) where, for a
      few functions, gcc would align the stack frame and place a copy of the
      return address into it. The function graph tracer modified the copy and
      not the actual return address. On return from the funtion, it did not go
      to the tracer hook, but returned to the parent. This broke the function
      graph tracer, because the return of the parent (where gcc did not do
      this funky manipulation) returned to the location that the child function
      was suppose to. This caused strange kernel crashes.
      
      This test detected the problem and pointed out where the issue was.
      
      This modifies the parameters of one of the functions that the arch
      specific code calls, so it includes changes to arch code to accommodate
      the new prototype.
      
      Note, I notice that the parsic arch implements its own push_return_trace.
      This is now a generic function and the ftrace_push_return_trace should be
      used instead. This patch does not touch that code.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      71e308a2
    • S
      function-graph: disable when both x86_32 and optimize for size are configured · eb4a0378
      Steven Rostedt 提交于
      On x86_32, when optimize for size is set, gcc may align the frame pointer
      and make a copy of the the return address inside the stack frame.
      The return address that is located in the stack frame may not be
      the one used to return to the calling function. This will break the
      function graph tracer.
      
      The function graph tracer replaces the return address with a jump to a hook
      function that can trace the exit of the function. If it only replaces
      a copy, then the hook will not be called when the function returns.
      Worse yet, when the parent function returns, the function graph tracer
      will return back to the location of the child function which will
      easily crash the kernel with weird results.
      
      To see the problem, when i386 is compiled with -Os we get:
      
      c106be03:       57                      push   %edi
      c106be04:       8d 7c 24 08             lea    0x8(%esp),%edi
      c106be08:       83 e4 e0                and    $0xffffffe0,%esp
      c106be0b:       ff 77 fc                pushl  0xfffffffc(%edi)
      c106be0e:       55                      push   %ebp
      c106be0f:       89 e5                   mov    %esp,%ebp
      c106be11:       57                      push   %edi
      c106be12:       56                      push   %esi
      c106be13:       53                      push   %ebx
      c106be14:       81 ec 8c 00 00 00       sub    $0x8c,%esp
      c106be1a:       e8 f5 57 fb ff          call   c1021614 <mcount>
      
      When it is compiled with -O2 instead we get:
      
      c10896f0:       55                      push   %ebp
      c10896f1:       89 e5                   mov    %esp,%ebp
      c10896f3:       83 ec 28                sub    $0x28,%esp
      c10896f6:       89 5d f4                mov    %ebx,0xfffffff4(%ebp)
      c10896f9:       89 75 f8                mov    %esi,0xfffffff8(%ebp)
      c10896fc:       89 7d fc                mov    %edi,0xfffffffc(%ebp)
      c10896ff:       e8 d0 08 fa ff          call   c1029fd4 <mcount>
      
      The compile with -Os will align the stack pointer then set up the
      frame pointer (%ebp), and it copies the return address back into
      the stack frame. The change to the return address in mcount is done
      to the copy and not the real place holder of the return address.
      
      Then compile with -O2 sets up the frame pointer first, this makes
      the change to the return address by mcount affect where the function
      will jump on exit.
      Reported-by: NJake Edge <jake@lwn.net>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      eb4a0378
  4. 18 6月, 2009 5 次提交
    • S
      ring-buffer: have benchmark test print to trace buffer · 4b221f03
      Steven Rostedt 提交于
      Currently the output of the ring buffer benchmark/test prints to
      the console. This test runs for ten seconds every ten seconds and
      ouputs the result after every iteration. This needlessly fills up
      the logs.
      
      This patch makes the ring buffer benchmark/test print to the ftrace
      buffer using trace_printk. To view the test results, you must examine
      the debug/tracing/trace file.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      4b221f03
    • S
      ring-buffer: do not grab locks in nmi · 8d707e8e
      Steven Rostedt 提交于
      If ftrace_dump_on_oops is set, and an NMI detects a lockup, then it
      will need to read from the ring buffer. But the read side of the
      ring buffer still takes locks. This patch adds a check on the read
      side that if it is in an NMI, then it will disable the ring buffer
      and not take any locks.
      
      Reads can still happen on a disabled ring buffer.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      8d707e8e
    • S
      ring-buffer: add locks around rb_per_cpu_empty · d4788207
      Steven Rostedt 提交于
      The checking of whether the buffer is empty or not needs to be serialized
      among the readers. Add the reader spin lock around it.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d4788207
    • S
      ring-buffer: check for less than two in size allocation · 5f78abee
      Steven Rostedt 提交于
      The ring buffer must have at least two pages allocated for the
      reader page swap to work.
      
      The page count check will miss the case of a zero size passed in.
      Even though a zero size ring buffer would probably fail an allocation,
      making the min size check for less than two instead of equal to one makes
      the code a bit more robust.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      5f78abee
    • S
      ring-buffer: remove useless compile check for buffer_page size · 0dcd4d6c
      Steven Rostedt 提交于
      The original version of the ring buffer had a hack to map the
      page struct that held the pages of the buffer to also be the structure
      that the ring buffer would keep the pages in a link list.
      
      This overlap of the page struct was very dangerous and that hack was
      removed a while ago.
      
      But there was a check to make sure the buffer_page never became bigger
      than the page struct, and would fail the compile if it did. The
      check was only meaningful when we had the hack. Now that we have separate
      allocated descriptors for the buffer pages, we can remove this check.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      0dcd4d6c
  5. 17 6月, 2009 7 次提交
    • S
      ring-buffer: remove useless warn on check · c6a9d7b5
      Steven Rostedt 提交于
      A check if "write > BUF_PAGE_SIZE" is done right after a
      
      	if (write > BUF_PAGE_SIZE)
      		return ...;
      
      Thus the check is actually testing the compiler and not the
      kernel. This is useless, remove it.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      c6a9d7b5
    • S
      ring-buffer: use BUF_PAGE_HDR_SIZE in calculating index · 22f470f8
      Steven Rostedt 提交于
      The index of the event is found by masking PAGE_MASK to it and
      subtracting the header size. Currently the header size is calculate
      by PAGE_SIZE - BUF_PAGE_SIZE, when we already have a macro
      BUF_PAGE_HDR_SIZE to define it.
      
      If we want to change BUF_PAGE_SIZE to something less than filling
      the rest of the page (this is done for debugging), then we break
      the algorithm to find the index.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      22f470f8
    • L
      tracing/filters: fix race between filter setting and module unload · 00e95830
      Li Zefan 提交于
      Module unload is protected by event_mutex, while setting filter is
      protected by filter_mutex. This leads to the race:
      
      echo 'bar == 0 || bar == 10' \    |
      		> sample/filter   |
                                        |  insmod sample.ko
        add_pred("bar == 0")            |
          -> n_preds == 1               |
        add_pred("bar == 100")          |
          -> n_preds == 2               |
                                        |  rmmod sample.ko
                                        |  insmod sample.ko
        add_pred("&&")                  |
          -> n_preds == 1 (should be 3) |
      
      Now event->filter->preds is corrupted. An then when filter_match_preds()
      is called, the WARN_ON() in it will be triggered.
      
      To avoid the race, we remove filter_mutex, and replace it with event_mutex.
      
      [ Impact: prevent corruption of filters by module removing and loading ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A375A4D.6000205@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      00e95830
    • L
      tracing/filters: free filter_string in destroy_preds() · 57be8887
      Li Zefan 提交于
      filter->filter_string is not freed when unloading a module:
      
       # insmod trace-events-sample.ko
       # echo "bar < 100" > /mnt/tracing/events/sample/foo_bar/filter
       # rmmod trace-events-sample.ko
      
      [ Impact: fix memory leak when unloading module ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A375A30.9060802@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      57be8887
    • S
      ring-buffer: use commit counters for commit pointer accounting · fa743953
      Steven Rostedt 提交于
      The ring buffer is made up of three sets of pointers.
      
      The head page pointer, which points to the next page for the reader to
      get.
      
      The commit pointer and commit index, which points to the page and index
      of the last committed write respectively.
      
      The tail pointer and tail index, which points to the page and the index
      of the last reserved data respectively (non committed).
      
      The commit pointer is only moved forward by the outer most writer.
      If a nested writer comes in, it will not move the pointer forward.
      
      The current implementation has a flaw. It assumes that the outer most
      writer successfully reserved data. There's a small race window where
      the outer most writer could find the tail pointer, but a nested
      writer could come in (via interrupt) and move the tail forward, and
      even the commit forward.
      
      The outer writer would not realized the commit moved forward and the
      accounting will break.
      
      This patch changes the design to use counters in the per cpu buffers
      to keep track of commits. The counters are incremented at the start
      of the commit, and decremented at the end. If the end commit counter
      is 1, then it moves the commit pointers. A loop is made to check for
      races between checking and moving the commit pointers. Only the outer
      commit should move the pointers anyway.
      
      The test of knowing if a reserve is equal to the last commit update
      is still needed to know for time keeping. The time code is much less
      racey than the commit updates.
      
      This change not only solves the mentioned race, but also makes the
      code simpler.
      
      [ Impact: fix commit race and simplify code ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      fa743953
    • S
      ring-buffer: remove unused variable · 263294f3
      Steven Rostedt 提交于
      Fix the compiler error:
      
      kernel/trace/ring_buffer.c: In function 'rb_move_tail':
      kernel/trace/ring_buffer.c:1236: warning: unused variable 'event'
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      263294f3
    • S
      ring-buffer: have benchmark test handle discarded events · 9086c7b9
      Steven Rostedt 提交于
      With the addition of commit:
      
        c7b09308
        ring-buffer: prevent adding write in discarded area
      
      The ring buffer may now add discarded events when a write passes
      the end of a buffer page. Before, a discarded event was only added
      when the tracer deliberately created one. The ring buffer benchmark
      test does not handle discarded events when it reads the buffer and
      fails when it encounters one.
      
      Also fix the increment for large data entries (luckily, the test did
      not add any yet).
      
      [ Impact: fix false failure of ring buffer self test ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      9086c7b9
  6. 16 6月, 2009 1 次提交
    • G
      debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem. · 156f5a78
      GeunSik Lim 提交于
      Many developers use "/debug/" or "/debugfs/" or "/sys/kernel/debug/"
      directory name to mount debugfs filesystem for ftrace according to
      ./Documentation/tracers/ftrace.txt file.
      
      And, three directory names(ex:/debug/, /debugfs/, /sys/kernel/debug/) is
      existed in kernel source like ftrace, DRM, Wireless, Documentation,
      Network[sky2]files to mount debugfs filesystem.
      
      debugfs means debug filesystem for debugging easy to use by greg kroah
      hartman. "/sys/kernel/debug/" name is suitable as directory name
      of debugfs filesystem.
      - debugfs related reference: http://lwn.net/Articles/334546/
      
      Fix inconsistency of directory name to mount debugfs filesystem.
      
      * From Steven Rostedt
        - find_debugfs() and tracing_files() in this patch.
      Signed-off-by: NGeunSik Lim <geunsik.lim@samsung.com>
      Acked-by     : Inaky Perez-Gonzalez <inaky@linux.intel.com>
      Reviewed-by  : Steven Rostedt <rostedt@goodmis.org>
      Reviewed-by  : James Smart <james.smart@emulex.com>
      CC: Jiri Kosina <trivial@kernel.org>
      CC: David Airlie <airlied@linux.ie>
      CC: Peter Osterlund <petero2@telia.com>
      CC: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      CC: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      CC: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      156f5a78
  7. 15 6月, 2009 7 次提交
  8. 10 6月, 2009 4 次提交
    • S
      tracing: add protection around module events unload · 110bf2b7
      Steven Rostedt 提交于
      When reading the trace buffer, there is a race that when a module
      is unloaded it removes events that is stilled referenced in the buffers.
      This patch adds the protection around the unloading of the events
      from modules and the reading of the trace buffers.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      110bf2b7
    • S
      tracing: add trace_seq_vprint interface · 725c624a
      Steven Rostedt 提交于
      The code to update the print formats for events requires a vprintf
      format in the trace_seq. This patch adds that interface.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      725c624a
    • L
      tracing/events: convert block trace points to TRACE_EVENT() · 55782138
      Li Zefan 提交于
      TRACE_EVENT is a more generic way to define tracepoints. Doing so adds
      these new capabilities to this tracepoint:
      
        - zero-copy and per-cpu splice() tracing
        - binary tracing without printf overhead
        - structured logging records exposed under /debug/tracing/events
        - trace events embedded in function tracer output and other plugins
        - user-defined, per tracepoint filter expressions
        ...
      
      Cons:
      
        - no dev_t info for the output of plug, unplug_timer and unplug_io events.
          no dev_t info for getrq and sleeprq events if bio == NULL.
          no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.
      
          This is mainly because we can't get the deivce from a request queue.
          But this may change in the future.
      
        - A packet command is converted to a string in TP_assign, not TP_print.
          While blktrace do the convertion just before output.
      
          Since pc requests should be rather rare, this is not a big issue.
      
        - In blktrace, an event can have 2 different print formats, but a TRACE_EVENT
          has a unique format, which means we have some unused data in a trace entry.
      
          The overhead is minimized by using __dynamic_array() instead of __array().
      
      I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:
      
            dd                   dd + ioctl blktrace       dd + TRACE_EVENT (splice)
      1     7.36s, 42.7 MB/s     7.50s, 42.0 MB/s          7.41s, 42.5 MB/s
      2     7.43s, 42.3 MB/s     7.48s, 42.1 MB/s          7.43s, 42.4 MB/s
      3     7.38s, 42.6 MB/s     7.45s, 42.2 MB/s          7.41s, 42.5 MB/s
      
      So the overhead of tracing is very small, and no regression when using
      those trace events vs blktrace.
      
      And the binary output of TRACE_EVENT is much smaller than blktrace:
      
       # ls -l -h
       -rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
       -rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
       -rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out
      
      Following are some comparisons between TRACE_EVENT and blktrace:
      
      plug:
        kjournald-480   [000]   303.084981: block_plug: [kjournald]
        kjournald-480   [000]   303.084981:   8,0    P   N [kjournald]
      
      unplug_io:
        kblockd/0-118   [000]   300.052973: block_unplug_io: [kblockd/0] 1
        kblockd/0-118   [000]   300.052974:   8,0    U   N [kblockd/0] 1
      
      remap:
        kjournald-480   [000]   303.085042: block_remap: 8,0 W 102736992 + 8 <- (8,8) 33384
        kjournald-480   [000]   303.085043:   8,0    A   W 102736992 + 8 <- (8,8) 33384
      
      bio_backmerge:
        kjournald-480   [000]   303.085086: block_bio_backmerge: 8,0 W 102737032 + 8 [kjournald]
        kjournald-480   [000]   303.085086:   8,0    M   W 102737032 + 8 [kjournald]
      
      getrq:
        kjournald-480   [000]   303.084974: block_getrq: 8,0 W 102736984 + 8 [kjournald]
        kjournald-480   [000]   303.084975:   8,0    G   W 102736984 + 8 [kjournald]
      
        bash-2066  [001]  1072.953770:   8,0    G   N [bash]
        bash-2066  [001]  1072.953773: block_getrq: 0,0 N 0 + 0 [bash]
      
      rq_complete:
        konsole-2065  [001]   300.053184: block_rq_complete: 8,0 W () 103669040 + 16 [0]
        konsole-2065  [001]   300.053191:   8,0    C   W 103669040 + 16 [0]
      
        ksoftirqd/1-7   [001]  1072.953811:   8,0    C   N (5a 00 08 00 00 00 00 00 24 00) [0]
        ksoftirqd/1-7   [001]  1072.953813: block_rq_complete: 0,0 N (5a 00 08 00 00 00 00 00 24 00) 0 + 0 [0]
      
      rq_insert:
        kjournald-480   [000]   303.084985: block_rq_insert: 8,0 W 0 () 102736984 + 8 [kjournald]
        kjournald-480   [000]   303.084986:   8,0    I   W 102736984 + 8 [kjournald]
      
      Changelog from v2 -> v3:
      
      - use the newly introduced __dynamic_array().
      
      Changelog from v1 -> v2:
      
      - use __string() instead of __array() to minimize the memory required
        to store hex dump of rq->cmd().
      
      - support large pc requests.
      
      - add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.
      
      - some cleanups.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A2DF669.5070905@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      55782138
    • S
      ring-buffer: fix ret in rb_add_time_stamp · f57a8a19
      Steven Rostedt 提交于
      The update of ret got mistakenly added to the if statement of
      rb_try_to_discard. The variable ret should be 1 on commit and zero
      otherwise.
      
      [ Impact: fix compiler warning and real bug ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f57a8a19
  9. 09 6月, 2009 1 次提交
    • P
      ring-buffer: pass in lockdep class key for reader_lock · 1f8a6a10
      Peter Zijlstra 提交于
      On Sun, 7 Jun 2009, Ingo Molnar wrote:
      > Testing tracer sched_switch: <6>Starting ring buffer hammer
      > PASSED
      > Testing tracer sysprof: PASSED
      > Testing tracer function: PASSED
      > Testing tracer irqsoff:
      > =============================================
      > PASSED
      > Testing tracer preemptoff: PASSED
      > Testing tracer preemptirqsoff: [ INFO: possible recursive locking detected ]
      > PASSED
      > Testing tracer branch: 2.6.30-rc8-tip-01972-ge5b9078-dirty #5760
      > ---------------------------------------------
      > rb_consumer/431 is trying to acquire lock:
      >  (&cpu_buffer->reader_lock){......}, at: [<c109eef7>] ring_buffer_reset_cpu+0x37/0x70
      >
      > but task is already holding lock:
      >  (&cpu_buffer->reader_lock){......}, at: [<c10a019e>] ring_buffer_consume+0x7e/0xc0
      >
      > other info that might help us debug this:
      > 1 lock held by rb_consumer/431:
      >  #0:  (&cpu_buffer->reader_lock){......}, at: [<c10a019e>] ring_buffer_consume+0x7e/0xc0
      
      The ring buffer is a generic structure, and can be used outside of
      ftrace. If ftrace traces within the use of the ring buffer, it can produce
      false positives with lockdep.
      
      This patch passes in a static lock key into the allocation of the ring
      buffer, so that different ring buffers will have their own lock class.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1244477919.13761.9042.camel@twins>
      
      [ store key in ring buffer descriptor ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      1f8a6a10
  10. 03 6月, 2009 5 次提交
    • S
      tracing: add annotation to what type of stack trace is recorded · 563af16c
      Steven Rostedt 提交于
      The current method of printing out a stack trace is to add a new line
      and print out the trace:
      
          yum-updatesd-3120  [002]   573.691303:
       => do_softirq
       => irq_exit
       => smp_apic_timer_interrupt
       => apic_timer_interrupt
      
      This looks a bit awkward, and if we have both stack and user stack traces
      running, it would be nice to have a title to tell them apart, although
      it is easy to tell by the output.
      
      This patch adds an annotation to the start of the stack traces:
      
                  init-1     [003]   929.304979: <stack trace>
       => user_path_at
       => vfs_fstatat
       => vfs_stat
       => sys_newstat
       => system_call_fastpath
      
                   cat-3459  [002]  1016.824040: <user stack trace>
       =>  <0000003aae6c0250>
       =>  <00007ffff4b06ae4>
       =>  <69636172742f6775>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      563af16c
    • S
      tracing: fix multiple use of __print_flags and __print_symbolic · 56d8bd3f
      Steven Whitehouse 提交于
      Here is an updated patch to include the extra call to
      trace_seq_init() as requested. This is vs. the latest
      -tip tree and fixes the use of multiple __print_flags
      and __print_symbolic in a single tracer. Also tested
      to ensure its working now:
      
      mount.gfs2-2534  [000]   235.850587: gfs2_glock_queue: 8.7 glock 1:2 dequeue PR
      mount.gfs2-2534  [000]   235.850591: gfs2_demote_rq: 8.7 glock 1:0 demote EX to NL flags:DI
      mount.gfs2-2534  [000]   235.850591: gfs2_glock_queue: 8.7 glock 1:0 dequeue EX
      glock_workqueue-2529  [000]   235.850666: gfs2_glock_state_change: 8.7 glock 1:0 state EX => NL tgt:NL dmt:NL flags:lDpI
      glock_workqueue-2529  [000]   235.850672: gfs2_glock_put: 8.7 glock 1:0 state NL => IV flags:I
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      LKML-Reference: <1244037123.29604.603.camel@localhost.localdomain>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      56d8bd3f
    • W
      tracing/events: fix output format of user stack · 048dc50c
      walimis 提交于
      According to "events/ftrace/user_stack/format", fix the output of
      user stack.
      
      before fix:
      
        sh-1073  [000]    31.137561:  <b7f274fe> <-  <0804e33c> <-  <080835c1>
      
      after fix:
      
        sh-1072  [000]    37.039329:
       =>  <b7f8a4fe>
       =>  <0804e33c>
       =>  <080835c1>
      Signed-off-by: Nwalimis <walimisdev@gmail.com>
      LKML-Reference: <1244016090-7814-3-git-send-email-walimisdev@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      048dc50c
    • W
      tracing/events: fix output format of kernel stack · f11b3f4e
      walimis 提交于
      According to "events/ftrace/kernel_stack/format", output format of
      kernel stack should use "=>" instead of "<=".
      
      The second problem is that we shouldn't skip the first entry in the stack,
      although it seems to be duplicated when used in the "function" tracer,
      but events also use it. If we skip the first one, we will drop the topmost
      entry of the stack.
      
      The last problem is that if the last entry is ULONG_MAX(0xffffffff), we should
      drop it, otherwise it will print a NULL name line.
      
      before fix:
      
            sh-1072  [000]   26.957239: sched_process_fork: parent sh:1072 child sh:1073
            sh-1072  [000]   26.957262:
       <= syscall_call
       <=
            sh-1072  [000]   26.957744: sched_switch: task sh:1072 [120] (R) ==> sh:1073 [120]
            sh-1072  [000]   26.957752:
       <= preempt_schedule
       <= wake_up_new_task
       <= do_fork
       <= sys_clone
       <= syscall_call
       <=
      
      After fix:
      
            sh-1075  [000]    39.791848: sched_process_fork: parent sh:1075  child sh:1076
            sh-1075  [000]    39.791871:
       => sys_clone
       => syscall_call
            sh-1075  [000]    39.792713: sched_switch: task sh:1075 [120] (R) ==> sh:1076 [120]
            sh-1075  [000]    39.792722:
       => schedule
       => preempt_schedule
       => wake_up_new_task
       => do_fork
       => sys_clone
       => syscall_call
      Signed-off-by: Nwalimis <walimisdev@gmail.com>
      LKML-Reference: <1244016090-7814-2-git-send-email-walimisdev@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f11b3f4e
    • W
      tracing/trace_stack: fix the number of entries in the header · 083a63b4
      walimis 提交于
      The last entry in the stack_dump_trace is ULONG_MAX, which is not
      a valid entry, but max_stack_trace.nr_entries has accounted for it.
      So when printing the header, we should decrease it by one.
      Before fix, print as following, for example:
      
      	Depth    Size   Location    (53 entries)	<--- should be 52
      	-----    ----   --------
        0)     3264     108   update_wall_time+0x4d5/0x9a0
        ...
       51)       80      80   syscall_call+0x7/0xb
       ^^^
         it's correct.
      Signed-off-by: Nwalimis <walimisdev@gmail.com>
      LKML-Reference: <1244016090-7814-1-git-send-email-walimisdev@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      083a63b4