1. 10 Sep, 2010 2 commits
  2. 08 Sep, 2010 3 commits
  3. 09 Sep, 2010 1 commit
    • tracing: Do not allow llseek to set_ftrace_filter · 9c55cb12
      Steven Rostedt committed
      Reading the file set_ftrace_filter does three things.
      
      1) shows whether or not filters are set for the function tracer
      2) shows what functions are set for the function tracer
      3) shows what triggers are set on any functions
      
      3 is independent from 1 and 2.
      
      The way this file currently works is that it is a state machine,
      and as you read it, it may change state. But this assumption breaks
      when you use lseek() on the file. The state machine gets out of sync
      and t_show() may use the wrong pointer and cause a kernel oops.
      
      Luckily, this will only kill the app that does the lseek, but the app
      dies while holding a mutex. This prevents anyone else from using the
      set_ftrace_filter file (or any other function tracing file for that matter).
      
      A real fix for this is to rewrite the code, but that is too much for
      a -rc release or stable. This patch simply disables llseek on the
      set_ftrace_filter file for now, and we can do the proper fix for the
      next major release.
      Reported-by: Robert Swiecki <swiecki@google.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Tavis Ormandy <taviso@google.com>
      Cc: Eugene Teo <eugene@redhat.com>
      Cc: vendor-sec@lst.de
      Cc: <stable@kernel.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  4. 05 Sep, 2010 1 commit
  5. 01 Sep, 2010 1 commit
    • tracing: Fix a race in function profile · 3aaba20f
      Li Zefan committed
      While we are reading trace_stat/functionX and someone just
      disabled function_profile at that time, we can trigger this:
      
      	divide error: 0000 [#1] PREEMPT SMP
      	...
      	EIP is at function_stat_show+0x90/0x230
      	...
      
      This fix just takes the ftrace_profile_lock and checks if
      rec->counter is 0. If it's 0, we know the profile buffer
      has been reset.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: stable@kernel.org
      LKML-Reference: <4C723644.4040708@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  6. 25 Aug, 2010 1 commit
    • tracing/trace_stack: Fix stack trace on ppc64 · 151772db
      Anton Blanchard committed
      save_stack_trace() stores the instruction pointer, not the
      function descriptor. On ppc64 the trace stack code currently
      dereferences the instruction pointer and shows 8 bytes of
      instructions in our backtraces:
      
       # cat /sys/kernel/debug/tracing/stack_trace
              Depth    Size   Location    (26 entries)
              -----    ----   --------
        0)     5424     112   0x6000000048000004
        1)     5312     160   0x60000000ebad01b0
        2)     5152     160   0x2c23000041c20030
        3)     4992     240   0x600000007c781b79
        4)     4752     160   0xe84100284800000c
        5)     4592     192   0x600000002fa30000
        6)     4400     256   0x7f1800347b7407e0
        7)     4144     208   0xe89f0108f87f0070
        8)     3936     272   0xe84100282fa30000
      
      Since we aren't dealing with function descriptors, use %pS
      instead of %pF to fix it:
      
       # cat /sys/kernel/debug/tracing/stack_trace
              Depth    Size   Location    (26 entries)
              -----    ----   --------
        0)     5424     112   ftrace_call+0x4/0x8
        1)     5312     160   .current_io_context+0x28/0x74
        2)     5152     160   .get_io_context+0x48/0xa0
        3)     4992     240   .cfq_set_request+0x94/0x4c4
        4)     4752     160   .elv_set_request+0x60/0x84
        5)     4592     192   .get_request+0x2d4/0x468
        6)     4400     256   .get_request_wait+0x7c/0x258
        7)     4144     208   .__make_request+0x49c/0x610
        8)     3936     272   .generic_make_request+0x390/0x434
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: rostedt@goodmis.org
      Cc: fweisbec@gmail.com
      LKML-Reference: <20100825013238.GE28360@kryten>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 14 Aug, 2010 1 commit
    • tracing: Sanitize value returned from write(trace_marker, "...", len) · 1aa54bca
      Marcin Slusarz committed
      When userspace code writes a string that is not newline-terminated to the
      trace_marker file, the write handler appends a newline and returns the
      number of bytes written to the trace buffer, so
      write(fd, "abc", 3) will return 4.
      
      That's unexpected and unfortunately it confuses glibc's fprintf function.
      
      Example:
      #include <stdio.h>

      int main(void) {
        fprintf(stderr, "abc");
        return 0;
      }
      
      $ gcc test.c -o test
      $ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
      $ ./test 2>/sys/kernel/debug/tracing/trace_marker
      
      results in an infinite loop:
      write(fd, "abc", 3) = 4
      write(fd, "", 1) = 0
      write(fd, "", 1) = 0
      write(fd, "", 1) = 0
      write(fd, "", 1) = 0
      write(fd, "", 1) = 0
      write(fd, "", 1) = 0
      write(fd, "", 1) = 0
      (...)
      
      ...and a kernel trace buffer full of empty markers.

      Fix it by sanitizing the write return value.
      Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
      LKML-Reference: <20100727231801.GB2826@joi.lan>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  8. 13 Aug, 2010 1 commit
    • tracing/events: Convert format output to seq_file · 2a37a3df
      Steven Rostedt committed
      Two new events were added that broke the current format output.
      
      Both from the SCSI system: scsi_dispatch_cmd_done and scsi_dispatch_cmd_timeout
      
      The reason is that their print_fmt exceeded a page size. Since the output
      of the format used simple_read_from_buffer and trace_seq, it was limited
      to a page size in output.
      
      This patch converts the printing of the format of an event into seq_file,
      which allows greater than a page size to be shown.
      
      I diffed all event formats comparing the output with and without this
      patch. All matched except for the above two, which showed just:
      
        FORMAT TOO BIG
      
      without this patch, but now properly display the output with this patch.
      
      v2: Remove updating *pos in seq start function.
         [ Thanks to Li Zefan for pointing that out ]
      Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Kei Tokunaga <tokunaga.keiich@jp.fujitsu.com>
      Cc: James Bottomley <James.Bottomley@suse.de>
      Cc: Tomohiro Kusumi <kusumi.tomohiro@jp.fujitsu.com>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  9. 12 Aug, 2010 1 commit
  10. 08 Aug, 2010 3 commits
  11. 07 Aug, 2010 2 commits
    • tracing: Fix ring_buffer_read_page reading out of page boundary · 18fab912
      Huang Ying committed
      With the configuration: CONFIG_DEBUG_PAGEALLOC=y and Shaohua's patch:
      
      [PATCH]x86: make spurious_fault check correct pte bit
      
      Function call graph trace with the following will trigger a page fault.
      
      # cd /sys/kernel/debug/tracing/
      # echo function_graph > current_tracer
      # cat per_cpu/cpu1/trace_pipe_raw > /dev/null
      
      BUG: unable to handle kernel paging request at ffff880006e99000
      IP: [<ffffffff81085572>] rb_event_length+0x1/0x3f
      PGD 1b19063 PUD 1b1d063 PMD 3f067 PTE 6e99160
      Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      last sysfs file: /sys/devices/virtual/net/lo/operstate
      CPU 1
      Modules linked in:
      
      Pid: 1982, comm: cat Not tainted 2.6.35-rc6-aes+ #300 /Bochs
      RIP: 0010:[<ffffffff81085572>]  [<ffffffff81085572>] rb_event_length+0x1/0x3f
      RSP: 0018:ffff880006475e38  EFLAGS: 00010006
      RAX: 0000000000000ff0 RBX: ffff88000786c630 RCX: 000000000000001d
      RDX: ffff880006e98000 RSI: 0000000000000ff0 RDI: ffff880006e99000
      RBP: ffff880006475eb8 R08: 000000145d7008bd R09: 0000000000000000
      R10: 0000000000008000 R11: ffffffff815d9336 R12: ffff880006d08000
      R13: ffff880006e605d8 R14: 0000000000000000 R15: 0000000000000018
      FS:  00007f2b83e456f0(0000) GS:ffff880002100000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: ffff880006e99000 CR3: 00000000064a8000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process cat (pid: 1982, threadinfo ffff880006474000, task ffff880006e40770)
      Stack:
       ffff880006475eb8 ffffffff8108730f 0000000000000ff0 000000145d7008bd
      <0> ffff880006e98010 ffff880006d08010 0000000000000296 ffff88000786c640
      <0> ffffffff81002956 0000000000000000 ffff8800071f4680 ffff8800071f4680
      Call Trace:
       [<ffffffff8108730f>] ? ring_buffer_read_page+0x15a/0x24a
       [<ffffffff81002956>] ? return_to_handler+0x15/0x2f
       [<ffffffff8108a575>] tracing_buffers_read+0xb9/0x164
       [<ffffffff810debfe>] vfs_read+0xaf/0x150
       [<ffffffff81002941>] return_to_handler+0x0/0x2f
       [<ffffffff810248b0>] __bad_area_nosemaphore+0x17e/0x1a1
       [<ffffffff81002941>] return_to_handler+0x0/0x2f
       [<ffffffff810248e6>] bad_area_nosemaphore+0x13/0x15
      Code: 80 25 b2 16 b3 00 fe c9 c3 55 48 89 e5 f0 80 0d a4 16 b3 00 02 c9 c3 55 31 c0 48 89 e5 48 83 3d 94 16 b3 00 01 c9 0f 94 c0 c3 55 <8a> 0f 48 89 e5 83 e1 1f b8 08 00 00 00 0f b6 d1 83 fa 1e 74 27
      RIP  [<ffffffff81085572>] rb_event_length+0x1/0x3f
       RSP <ffff880006475e38>
      CR2: ffff880006e99000
      ---[ end trace a6877bb92ccb36bb ]---
      
      The root cause is that ring_buffer_read_page() may read beyond the page
      boundary, because the boundary check is done after reading. The fix is
      to do the boundary check before reading.
      Reported-by: Shaohua Li <shaohua.li@intel.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      LKML-Reference: <1280297641.2771.307.camel@yhuang-dev>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix an unallocated memory access in function_graph · 575570f0
      Shaohua Li committed
      With CONFIG_DEBUG_PAGEALLOC, I observed an unallocated memory access in
      function_graph trace. It appears we find a small size entry in ring buffer,
      but we access it as a big size entry. The access overflows the page size
      and touches an unallocated page.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      LKML-Reference: <1280217994.32400.76.camel@sli10-desk.sh.intel.com>
      [ Added a comment to explain the problem - SDR ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  12. 05 Aug, 2010 2 commits
  13. 04 Aug, 2010 1 commit
  14. 02 Aug, 2010 1 commit
    • perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call · 669336e4
      Frederic Weisbecker committed
      We use synchronize_sched() to ensure a tracepoint won't be called
      while/after we release the perf buffers it references.
      
      But the tracepoint API has its own function for that:
      tracepoint_synchronize_unregister(). Use it instead as it's
      self-explanatory and eases maintenance.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
  15. 27 Jul, 2010 1 commit
  16. 23 Jul, 2010 1 commit
  17. 21 Jul, 2010 4 commits
    • tracing: Shrink max latency ringbuffer if unnecessary · ef710e10
      KOSAKI Motohiro committed
      Documentation/trace/ftrace.txt says
      
        buffer_size_kb:
      
              This sets or displays the number of kilobytes each CPU
              buffer can hold. The tracer buffers are the same size
              for each CPU. The displayed number is the size of the
              CPU buffer and not total size of all buffers. The
              trace buffers are allocated in pages (blocks of memory
              that the kernel uses for allocation, usually 4 KB in size).
              If the last page allocated has room for more bytes
              than requested, the rest of the page will be used,
              making the actual allocation bigger than requested.
              ( Note, the size may not be a multiple of the page size
                due to buffer management overhead. )
      
              This can only be updated when the current_tracer
              is set to "nop".
      
      But this is incorrect. Currently, total memory consumption is
      'buffer_size_kb x CPUs x 2'.

      Why the factor of two? Because ftrace implicitly allocates
      a buffer for max latency too.

      This has a sad result when an admin wants to use a large buffer (for
      example, for full logging and detailed analysis). If an admin
      has a 24-CPU machine and writes 200MB to buffer_size_kb, the system
      consumes ~10GB of memory (200MB x 24 x 2). Wasting 5GB of memory is
      usually unacceptable.

      Fortunately, almost all users don't use the max latency feature.
      The max latency buffer can be disabled easily.

      This patch shrinks the max latency buffer when it is
      unnecessary.
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      LKML-Reference: <20100701104554.DA2D.A69D9226@jp.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Reduce latency and remove percpu trace_seq · bc289ae9
      Lai Jiangshan committed
      __print_flags() and __print_symbolic() use percpu trace_seq:
      
      1) Its memory is allocated at compile time, so it wastes memory if we don't use tracing.
      2) It is percpu data, so it wastes even more memory on multi-CPU systems.
      3) It disables preemption when it executes its core routine
         "trace_seq_printf(s, "%s: ", #call);" and introduces latency.
      
      So we move this trace_seq to struct trace_iterator.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4C078350.7090106@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • trace: Reorder struct ring_buffer_per_cpu to remove padding on 64bit · 985023de
      Richard Kennedy committed
      Reorder the structure to remove 8 bytes of padding on 64-bit builds.
      This shrinks the size to 128 bytes, allowing allocation from a smaller
      slab and needing one fewer cache line.
      Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
      LKML-Reference: <1269516456.2054.8.camel@localhost>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Allow to disable cmdline recording · e870e9a1
      Li Zefan committed
      We found that even enabling a single trace event that will rarely be
      triggered can add big overhead to context switch.
      
      (lmbench context switch test)
       ------ ------ ------ ------ ------ ------- -------
        2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
       ------ ------ ------ ------ ------ ------- -------
         2.19    2.3   2.21   2.56   2.13    2.54    2.07
         2.39   2.51   2.35   2.75   2.27    2.81    2.24
      
      The overhead is 6% ~ 11%.
      
      This is because when a trace event is enabled, 3 tracepoints (sched_switch,
      sched_wakeup, sched_wakeup_new) are activated to map pid to cmdname.

      We'd like to avoid this overhead, so add a trace option '(no)record-cmd'
      that allows cmdline recording to be disabled.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4C2D57F4.2050204@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  18. 20 Jul, 2010 3 commits
  19. 16 Jul, 2010 1 commit
    • tracing: Remove ksym tracer · 5d550467
      Frederic Weisbecker committed
      The ksym (breakpoint) ftrace plugin has been superseded by perf
      tools, which are much more powerful at using the CPU breakpoints.
      This tracer doesn't bring any additional features. It has been
      deprecated for a while now, so let's remove it.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
  20. 09 Jul, 2010 1 commit
  21. 06 Jul, 2010 1 commit
    • tracing/kprobes: Support "string" type · e09c8614
      Masami Hiramatsu committed
      Support string type tracing and printing in kprobe-tracer.

      This allows users to trace string data in the kernel, including __user data.
      Note that __user data sometimes cannot be accessed if it is paged out (sorry,
      but kprobes operations must run in atomic context, so we cannot wait for page-in).

      Committer note: Fixed up conflicts with b7e2ecef.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100519195724.2885.18788.stgit@localhost6.localdomain6>
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  22. 29 Jun, 2010 7 commits