1. 07 4月, 2009 7 次提交
    • Z
      ftrace: Correct a text align for event format output · 1bbe2a83
      Zhaolei 提交于
      If we cat debugfs/tracing/events/ftrace/bprint/format, we'll see:
      name: bprint
      ID: 6
      format:
      	field:unsigned char common_type;	offset:0;	size:1;
      	field:unsigned char common_flags;	offset:1;	size:1;
      	field:unsigned char common_preempt_count;	offset:2;	size:1;
      	field:int common_pid;	offset:4;	size:4;
      	field:int common_tgid;	offset:8;	size:4;
      
      	field:unsigned long ip;	offset:12;	size:4;
      	field:char * fmt;	offset:16;	size:4;
      	field: char buf;	offset:20;	size:0;
      
      print fmt: "%08lx (%d) fmt:%p %s"
      
      There is an inconsistent blank before char buf.
      Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
      LKML-Reference: <49D5E3EE.70201@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1bbe2a83
    • N
      Update /debug/tracing/README · bc2b6871
      Nikanth Karthikesan 提交于
      Some of the tracers have been renamed, which was not updated in the in-kernel
      run-time README file. Update it.
      Signed-off-by: NNikanth Karthikesan <knikanth@suse.de>
      LKML-Reference: <200903231158.32151.knikanth@suse.de>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bc2b6871
    • F
      tracing/ftrace: alloc the started cpumask for the trace file · b0dfa978
      Frederic Weisbecker 提交于
      Impact: fix a crash while cat trace file
      
      Currently we are using a cpumask to remind each cpu where a
      trace occured. It lets us notice the user that a cpu just had
      its first trace.
      
      But on latest -tip we have the following crash once we cat the trace
      file:
      
      IP: [<c0270c4a>] print_trace_fmt+0x45/0xe7
      *pde = 00000000
      Oops: 0000 [#1] PREEMPT SMP
      last sysfs file: /sys/class/net/eth0/carrier
      Pid: 3897, comm: cat Not tainted (2.6.29-tip-02825-g0f22972-dirty #81)
      EIP: 0060:[<c0270c4a>] EFLAGS: 00010297 CPU: 0
      EIP is at print_trace_fmt+0x45/0xe7
      EAX: 00000000 EBX: 00000000 ECX: c12d9e98 EDX: ccdb7010
      ESI: d31f4000 EDI: 00322401 EBP: d31f3f10 ESP: d31f3efc
      DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      Process cat (pid: 3897, ti=d31f2000 task=d3b3cf20 task.ti=d31f2000)
      Stack:
      d31f4080 ccdb7010 d31f4000 d691fe70 ccdb7010 d31f3f24 c0270e5c d31f4000
      d691fe70 d31f4000 d31f3f34 c02718e8 c12d9e98 d691fe70 d31f3f70 c02bfc33
      00001000 09130000 d3b46e00 d691fe98 00000000 00000079 00000001 00000000
      Call Trace:
      [<c0270e5c>] ? print_trace_line+0x170/0x17c
      [<c02718e8>] ? s_show+0xa7/0xbd
      [<c02bfc33>] ? seq_read+0x24a/0x327
      [<c02bf9e9>] ? seq_read+0x0/0x327
      [<c02ab18b>] ? vfs_read+0x86/0xe1
      [<c02ab289>] ? sys_read+0x40/0x65
      [<c0202d8f>] ? sysenter_do_call+0x12/0x3c
      Code: 00 00 00 89 45 ec f7 c7 00 20 00 00 89 55 f0 74 4e f6 86 98 10 00 00 02 74 45 8b 86 8c 10 00 00 8b 9e a8 10 00 00 e8 52 f3 ff ff <0f> a3 03 19 c0 85 c0 75 2b 8b 86 8c 10 00 00 8b 9e a8 10 00 00
      EIP: [<c0270c4a>] print_trace_fmt+0x45/0xe7 SS:ESP 0068:d31f3efc
      CR2: 0000000000000000
      ---[ end trace aa9cf38e5ebed9dd ]---
      
      This is because we alloc the iter->started cpumask on tracing_pipe_open but
      not on tracing_open.
      
      It hadn't been noticed until now because we need to have ring buffer overruns
      to activate the starting of cpu buffer detection.
      
      Also, we need a check to not print the messagge for the first trace on the file.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1238619188-6109-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b0dfa978
    • Z
      ftrace: Add check of sched_stopped for probe_sched_wakeup · 8bcae09b
      Zhaolei 提交于
      The wakeup tracing in sched_switch does not stop when a user
      disables tracing. This is because the probe_sched_wakeup() is missing
      the check to prevent the wakeup from being traced.
      Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
      LKML-Reference: <49D1C543.3010307@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8bcae09b
    • F
      tracing/ftrace: fix missing include string.h · 5f0c6c03
      Frederic Weisbecker 提交于
      Building a kernel with tracing can raise the following warning on
      tip/master:
      
      kernel/trace/trace.c:1249: error: implicit declaration of function 'vbin_printf'
      
      We are missing an include to string.h
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1238160130-7437-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5f0c6c03
    • L
      tracing: fix incorrect return type of ns2usecs() · cf8e3474
      Lai Jiangshan 提交于
      Impact: fix time output bug in 32bits system
      
      ns2usecs() returns 'long', it's incorrect.
      
      (In i386)
      ...
                <idle>-0     [000]   521.442100: _spin_lock <-tick_do_update_jiffies64
                <idle>-0     [000]   521.442101: do_timer <-tick_do_update_jiffies64
                <idle>-0     [000]   521.442102: update_wall_time <-do_timer
                <idle>-0     [000]   521.442102: update_xtime_cache <-update_wall_time
      ....
      (It always print the time less than 2200 seconds besides ...)
      Because 'long' is 32bits in i386. ( (1<<31) useconds is about 2200 seconds)
      
      ...
                <idle>-0     [001] 4154502640.134759: rcu_bh_qsctr_inc <-__do_softirq
                <idle>-0     [001] 4154502640.134760: _local_bh_enable <-__do_softirq
                <idle>-0     [001] 4154502640.134761: idle_cpu <-irq_exit
      ...
      (very large value)
      Because 'long' is a signed type and it is 32bits in i386.
      
      Changes in v2:
      return 'unsigned long long' instead of 'cycle_t'
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <49D05D10.4030009@cn.fujitsu.com>
      Reported-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cf8e3474
    • S
      tracing: remove CALLER_ADDR2 from wakeup tracer · 301fd748
      Steven Rostedt 提交于
      Maneesh Soni was getting a crash when running the wakeup tracer.
      We debugged it down to the recording of the function with the
      CALLER_ADDR2 macro.  This is used to get the location of the caller
      to schedule.
      
      But the problem comes when schedule is called by assmebly. In the case
      that Maneesh had, retint_careful would call schedule. But retint_careful
      does not set up a proper frame pointer. CALLER_ADDR2 is defined as
      __builtin_return_address(2). This produces the following assembly in
      the wakeup tracer code.
      
         mov    0x0(%rbp),%rcx  <--- get the frame pointer of the caller
         mov    %r14d,%r8d
         mov    0xf2de8e(%rip),%rdi
      
         mov    0x8(%rcx),%rsi  <-- this is __builtin_return_address(1)
         mov    0x28(%rdi,%rax,8),%rbx
      
         mov    (%rcx),%rax  <-- get the frame pointer of the caller's caller
         mov    %r12,%rcx
         mov    0x8(%rax),%rdx <-- this is __builtin_return_address(2)
      
      At the reading of 0x8(%rax) Maneesh's machine would take a fault.
      The reason is that retint_careful did not set up the return address
      and the content of %rax here was zero.
      
      To verify this, I sent Maneesh a patch to create a frame pointer
      in retint_careful. He ran the test again but this time he would take
      the same type of fault from sysret_careful. The retint_careful was no
      longer an issue, but there are other callers that still have issues.
      
      Instead of adding frame pointers for all callers to schedule (in possibly
      all archs), it is much safer to simply not use CALLER_ADDR2. This
      loses out on knowing what called schedule, but the function tracer
      will help there if needed.
      Reported-by: NManeesh Soni <maneesh@in.ibm.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      301fd748
  2. 03 4月, 2009 7 次提交
  3. 01 4月, 2009 2 次提交
    • S
      ring-buffer: do not remove reader page from list on ring buffer free · 2e572895
      Steven Rostedt 提交于
      Impact: prevent possible memory leak
      
      The reader page of the ring buffer is special. Although it points
      into the ring buffer, it is not part of the actual buffer. It is
      a page used by the reader to swap with a page in the ring buffer.
      Once the swap is made, the new reader page is again outside the
      buffer.
      
      Even though the reader page points into the buffer, it is really
      pointing to residual data. Note, this data is used by the reader.
      
                    reader page
                        |
                        v
             (prev)   +---+    (next)
           +----------|   |----------+
           |          +---+          |
           v                         v
         +---+        +---+        +---+
      -->|   |------->|   |------->|   |--->
      <--|   |<-------|   |<-------|   |<---
         +---+        +---+        +---+
      
           ^            ^            ^
            \           |            /
             ------- Buffer---------
      
      If we perform a list_del_init() on the reader page we will actually remove
      the last page the reader swapped with and not the reader page itself.
      This will cause that page to not be freed, and thus is a memory leak.
      
      Luckily, the only user of the ring buffer so far is ftrace. And ftrace
      will not free its ring buffer after it allocates it. There is no current
      possible memory leak. But once there are other users, or if ftrace
      dynamically creates and frees its ring buffer, then this would be a
      memory leak.
      
      This patch fixes the leak for future cases.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2e572895
    • S
      function-graph: allow unregistering twice · 2aad1b76
      Steven Rostedt 提交于
      Impact: fix to permanent disabling of function graph tracer
      
      There should be nothing to prevent a tracer from unregistering a
      function graph callback more than once. This can simplify error paths.
      
      But currently, the counter does not account for mulitple unregistering
      of the function graph callback. If it happens, the function graph
      tracer will be permanently disabled.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2aad1b76
  4. 31 3月, 2009 12 次提交
  5. 30 3月, 2009 2 次提交
  6. 26 3月, 2009 5 次提交
  7. 24 3月, 2009 5 次提交
    • L
      tracing: use union for multi-usages field · ee000b7f
      Lai Jiangshan 提交于
      Impact: cleanup
      
      struct dyn_ftrace::ip has different usages in his lifecycle,
      we use union for it. And also for struct dyn_ftrace::flags.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <49C871BE.3080405@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ee000b7f
    • L
      ftrace: show virtual PID · cc59c9e8
      Lai Jiangshan 提交于
      Impact: fix PID output under namespaces
      
      When current namespace is not the global namespace,
      pid read from set_ftrace_pid is no correct.
      
       # ~/newpid_namespace_run bash
       # echo $$
       1
       # echo 1 > set_ftrace_pid
       # cat set_ftrace_pid
       3756
      
      Since we write virtual PID to set_ftrace_pid, we need get
      virtual PID when we read it.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <49C84D65.9050606@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cc59c9e8
    • S
      function-graph: add option for include sleep times · be6f164a
      Steven Rostedt 提交于
      Impact: give user a choice to show times spent while sleeping
      
      The user may want to see the time a function spent sleeping.
      This patch adds the trace option "sleep-time" to allow that.
      The "sleep-time" option is default on.
      
       echo sleep-time > /debug/tracing/trace_options
      
      produces:
      
       ------------------------------------------
       2)  avahi-d-3428  =>    <idle>-0
       ------------------------------------------
      
       2)               |      finish_task_switch() {
       2)   0.621 us    |        _spin_unlock_irq();
       2)   2.202 us    |      }
       2) ! 1002.197 us |    }
       2) ! 1003.521 us |  }
      
      where as,
      
       echo nosleep-time > /debug/tracing/trace_options
      
      produces:
      
       0)    <idle>-0    =>  yum-upd-3416
       ------------------------------------------
      
       0)               |              finish_task_switch() {
       0)   0.643 us    |                _spin_unlock_irq();
       0)   2.342 us    |              }
       0) + 41.302 us   |            }
       0) + 42.453 us   |          }
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      be6f164a
    • S
      function-graph: ignore times across schedule · 8aef2d28
      Steven Rostedt 提交于
      Impact: more accurate timings
      
      The current method of function graph tracing does not take into
      account the time spent when a task is not running. This shows functions
      that call schedule have increased costs:
      
       3) + 18.664 us   |      }
       ------------------------------------------
       3)    <idle>-0    =>  kblockd-123
       ------------------------------------------
      
       3)               |      finish_task_switch() {
       3)   1.441 us    |        _spin_unlock_irq();
       3)   3.966 us    |      }
       3) ! 2959.433 us |    }
       3) ! 2961.465 us |  }
      
      This patch uses the tracepoint in the scheduling context switch to
      account for time that has elapsed while a task is scheduled out.
      Now we see:
      
       ------------------------------------------
       3)    <idle>-0    =>  edac-po-1067
       ------------------------------------------
      
       3)               |      finish_task_switch() {
       3)   0.685 us    |        _spin_unlock_irq();
       3)   2.331 us    |      }
       3) + 41.439 us   |    }
       3) + 42.663 us   |  }
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      8aef2d28
    • S
      function-graph: prevent more than one tracer registering · 05ce5818
      Steven Rostedt 提交于
      Impact: prevent crash due to multiple function graph tracers
      
      The function graph tracer can currently only handle a single tracer
      being registered. If another tracer registers with the function
      graph tracer it can crash the system.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      05ce5818