1. 12 Aug 2009, 1 commit
  2. 09 Aug 2009, 1 commit
    • perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Committed by Frederic Weisbecker
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests sampling of ftrace event records. In this case,
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the
      perfcounter event buffer as a sample.
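
      As a rough illustration (not part of the patch), a counter
      attribute requesting this sampling might look as follows, where
      event_id stands for the tracepoint id read from debugfs:

       struct perf_counter_attr attr = {
       	.type		= PERF_TYPE_TRACEPOINT,
       	.config		= event_id,	/* illustrative */
       	.sample_type	= PERF_SAMPLE_IP | PERF_SAMPLE_TP_RECORD,
       	.sample_period	= 1,
       };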
      
      Result, after setting the PERF_SAMPLE_TP_RECORD attribute from
      perf record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H.........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!.......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........even
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1............
       .  0040:  e0 b1 31 81 ff ff ff ff                          ..1.....
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback has a
         cost, even when no such sampling is wanted; this needs to
         be fixed in the future. For that we need instant access to
         the perf counter attribute. This is a matter of adding a
         flag to struct ftrace_event.
      
       - Take care of event recursion! Don't ever try to record
         a lock event, for example; some locking is used in the
         profiling fast path and leads to tracing recursion. That
         will be fixed using raw spinlocks or recursion
         protection.
      
       - [...]
      
       - Profit! :-)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f413cdb8
  3. 06 Aug 2009, 4 commits
  4. 29 Jul 2009, 1 commit
    • tracing: Fix missing function_graph events when we splice_read from trace_pipe · 74e7ff8c
      Committed by Lai Jiangshan
      About half of the events are missing when we splice_read
      from trace_pipe. They are unexpectedly consumed because we ignore
      the TRACE_TYPE_NO_CONSUME return value, which the function graph
      tracer uses when it needs to consume events itself to walk
      the ring buffer.
      
      The same problem appears with ftrace_dump().

      Example of output before this patch:
      
      1)               |      ktime_get_real() {
      1)   2.846 us    |          read_hpet();
      1)   4.558 us    |        }
      1)   6.195 us    |      }
      
      After this patch:
      
      0)               |      ktime_get_real() {
      0)               |        getnstimeofday() {
      0)   1.960 us    |          read_hpet();
      0)   3.597 us    |        }
      0)   5.196 us    |      }
      
      The fix also applies to 2.6.30.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: stable@kernel.org
      LKML-Reference: <4A6EEC52.90704@cn.fujitsu.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      74e7ff8c
  5. 23 Jul 2009, 1 commit
  6. 21 Jul 2009, 1 commit
  7. 18 Jul 2009, 1 commit
  8. 13 Jul 2009, 1 commit
  9. 08 Jul 2009, 1 commit
    • ring-buffer: make lockless · 77ae365e
      Committed by Steven Rostedt
      This patch converts the ring buffers into a completely lockless
      buffer recording system. The read side still takes locks, since
      we still serialize readers. But the writers are the ones that
      must be lockless (writes can happen in NMIs).
      
      The main change is to the "head_page" pointer. We write to the
      tail and read from the head. The "head_page" pointer in the cpu
      buffer is now just a reference to where to look. The real head
      page is now kept in the head_page->list->prev->next pointer.
      That is, we set flags in the list head of the previous page.

      The list pages are allocated with alignment such that the least
      significant bits of the list pointers are always zero. This
      gives us room to store flags in the pointers.
      
      bit 0: set when the page is a head page
      bit 1: set when the writer is moving the page (for overwrite mode)
      
      cmpxchg is used to update the pointer.
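
      As a rough sketch of the technique (the macro and helper names
      below are illustrative, not a verbatim excerpt of the patch),
      the flags live in the low bits of the aligned list pointers and
      are stripped before a pointer is followed:

       #define RB_PAGE_HEAD	1UL	/* bit 0: this is the head page      */
       #define RB_PAGE_UPDATE	2UL	/* bit 1: writer is moving this page */
       #define RB_FLAG_MASK	3UL

       /* strip the flag bits before following the pointer */
       static struct list_head *rb_list_head(struct list_head *list)
       {
       	unsigned long val = (unsigned long)list;

       	return (struct list_head *)(val & ~RB_FLAG_MASK);
       }

       /* atomically flip HEAD -> UPDATE so a concurrent reader backs
        * off; val is the stripped pointer value of prev->next */
       static int rb_head_page_set_update(struct list_head *prev,
       				   unsigned long val)
       {
       	return cmpxchg((unsigned long *)&prev->next,
       		       val | RB_PAGE_HEAD,
       		       val | RB_PAGE_UPDATE) == (val | RB_PAGE_HEAD);
       }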
      
      When the writer wraps the buffer and the tail meets the head,
      in overwrite mode, the writer must move the head page forward.
      It first uses cmpxchg to change the pointer flag from 1 to 2.
      Once this is done, the reader on another CPU will not take the
      page from the buffer.
      
      The writers need to protect against interrupts (we don't bother with
      disabling interrupts because NMIs are allowed to write too).
      
      After the writer sets the pointer flag to 2, it takes care to
      manage interrupts coming in. This is described in detail in the
      comments of the code.
      
       Changes in version 2:
        - Let reader reset entries value of header page.
        - Fix tail page passing commit page on reader page test.
        - Always increment entries and write counter in rb_tail_page_update
        - Add safety check in rb_set_commit_to_write to break out of infinite loop
        - add mask in rb_is_reader_page
      
      [ Impact: lock free writing to the ring buffer ]
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      77ae365e
  10. 01 Jul 2009, 1 commit
  11. 24 Jun 2009, 2 commits
  12. 16 Jun 2009, 1 commit
    • debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem. · 156f5a78
      Committed by GeunSik Lim
      Many developers use "/debug/", "/debugfs/", or "/sys/kernel/debug/"
      as the directory name for mounting the debugfs filesystem for
      ftrace, following ./Documentation/tracers/ftrace.txt.

      All three directory names (/debug/, /debugfs/, and
      /sys/kernel/debug/) appear in the kernel source, e.g. in ftrace,
      DRM, wireless, Documentation, and network [sky2] files that
      mount the debugfs filesystem.

      debugfs is the debug filesystem, written by Greg Kroah-Hartman
      to make debugging easy. "/sys/kernel/debug/" is the appropriate
      directory name for the debugfs filesystem.
      - debugfs reference: http://lwn.net/Articles/334546/

      Fix the inconsistency of the directory name used to mount the
      debugfs filesystem.
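
      For reference, debugfs is mounted at the canonical location
      with the standard mount invocation:

       # mount -t debugfs nodev /sys/kernel/debug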
      
      * From Steven Rostedt
        - find_debugfs() and tracing_files() in this patch.
      Signed-off-by: GeunSik Lim <geunsik.lim@samsung.com>
      Acked-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Reviewed-by: James Smart <james.smart@emulex.com>
      CC: Jiri Kosina <trivial@kernel.org>
      CC: David Airlie <airlied@linux.ie>
      CC: Peter Osterlund <petero2@telia.com>
      CC: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      CC: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      CC: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      156f5a78
  13. 15 Jun 2009, 2 commits
  14. 02 Jun 2009, 1 commit
  15. 28 May 2009, 1 commit
    • trace: disable preemption before taking raw spinlocks · 5b6045a9
      Committed by Heiko Carstens
      The s390 __raw_spin_lock() code uses smp_processor_id(), which
      reveals that a (raw) spinlock is taken without preemption
      disabled. This can potentially deadlock.

      To fix this, explicitly disable and enable preemption.
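
      The fix in trace_find_cmdline() has roughly this shape (a
      sketch, not the verbatim patch):

       	preempt_disable();
       	__raw_spin_lock(&trace_cmdline_lock);
       	/* ... look up the saved comm for this pid ... */
       	__raw_spin_unlock(&trace_cmdline_lock);
       	preempt_enable();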
      
      BUG: using smp_processor_id() in preemptible [00000000] code: cat/2278
      caller is trace_find_cmdline+0x40/0xfc
      CPU: 0 Not tainted 2.6.30-rc7-dirty #39
      Process cat (pid: 2278, task: 000000003faedb68, ksp: 000000003b33b988)
      000000003b33b988 000000003b33bae0 0000000000000002 0000000000000000
             000000003b33bb80 000000003b33baf8 000000003b33baf8 00000000000175d6
             0000000000000001 000000003b33b988 000000003f9b0000 000000000000000b
             000000000000000c 000000003b33bb40 000000003b33bae0 0000000000000000
             0000000000000000 00000000000175d6 000000003b33bae0 000000003b33bb28
      Call Trace:
      ([<00000000000174b2>] show_trace+0x112/0x170)
       [<0000000000017582>] show_stack+0x72/0x100
       [<0000000000441538>] dump_stack+0xc8/0xd8
       [<000000000025c350>] debug_smp_processor_id+0x114/0x130
       [<00000000000bf0e4>] trace_find_cmdline+0x40/0xfc
       [<00000000000c35d4>] trace_print_context+0x58/0xac
       [<00000000000bb676>] print_trace_line+0x416/0x470
       [<00000000000bc8fe>] s_show+0x4e/0x428
       [<000000000013834e>] seq_read+0x36a/0x5d4
       [<0000000000112a78>] vfs_read+0xc8/0x174
       [<0000000000112c58>] SyS_read+0x74/0xc4
       [<000000000002c7ae>] sysc_noemu+0x10/0x16
       [<000002000012436c>] 0x2000012436c
      1 lock held by cat/2278:
       #0:  (&p->lock){+.+.+.}, at: [<0000000000138056>] seq_read+0x72/0x5d4
      
      [ Impact: fix preempt-unsafe raw spinlock ]
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      5b6045a9
  16. 26 May 2009, 1 commit
    • tracing: add trace_event_read_lock() · 4f535968
      Committed by Lai Jiangshan
      I found that there is nothing protecting event_hash in
      ftrace_find_event(). RCU protects the event hashlist,
      but not the event itself while we use it after extracting
      it through ftrace_find_event().

      This lack of proper locking opens a race window between
      any event dereference and module removal.
      
      Eg:
      
      --Task A--
      
      print_trace_line(trace) {
        event = find_ftrace_event(trace)
      
      --Task B--
      
      trace_module_remove_events(mod) {
        list_trace_events_module(ev, mod) {
          unregister_ftrace_event(ev->event) {
            hlist_del(ev->event->node)
              list_del(....)
          }
        }
      }
      |--> module removed, the event has been dropped
      
      --Task A--
      
        event->print(trace); // Dereferencing freed memory
      
      If the event retrieved belongs to a module and this module
      is concurrently removed, we may end up dereferencing data
      from a freed module.

      RCU could solve this, but it would add latency to the kernel
      and forbid tracer output callbacks from calling any sleepable
      code. So this fix converts 'trace_event_mutex' to a read/write
      semaphore and adds trace_event_read_lock() to protect
      ftrace_find_event().
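
      A minimal sketch of the read-side API (the rw-semaphore name
      here is illustrative):

       static DECLARE_RWSEM(trace_event_sem);	/* illustrative name */

       void trace_event_read_lock(void)
       {
       	down_read(&trace_event_sem);
       }

       void trace_event_read_unlock(void)
       {
       	up_read(&trace_event_sem);
       }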
      
      [ Impact: fix possible freed memory dereference in ftrace ]
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <4A114806.7090302@cn.fujitsu.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      4f535968
  17. 16 May 2009, 1 commit
  18. 07 May 2009, 1 commit
    • tracing: reset ring buffer when removing modules with events · 9456f0fa
      Committed by Steven Rostedt
      Li Zefan found that there's a race involving the event ids used
      by events and modules. When a module is loaded, an event id is
      incremented. We only have 16 bits for event ids (65536), and
      there is a possible (but highly unlikely) race where we could
      load and unload a module that registers events so many times
      that the event id counter overflows.
      
      When it overflows, it then restarts and goes looking for available
      ids. An id is available if it was added by a module and released.
      
      The race occurs when one module adds an id and is then removed.
      Another module loaded later can use that same event id. But if
      the old module still had events in the ring buffer, the new
      module's callback would get bogus data. At best (and most
      likely) the output would just be garbage. But if the module for
      some reason used pointers (not recommended), this could
      potentially crash.
      
      The safest thing to do is just reset the ring buffer if a module that
      registered events is removed.
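
      A sketch of where the reset hooks in (the notifier shape follows
      the existing module notifier; the reset helper name here is
      hypothetical):

       static int trace_module_notify(struct notifier_block *self,
       			       unsigned long val, void *data)
       {
       	struct module *mod = data;

       	if (val == MODULE_STATE_GOING) {
       		trace_module_remove_events(mod);
       		/* hypothetical helper: clear the per-cpu ring buffers */
       		tracing_reset_ring_buffers();
       	}
       	return 0;
       }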
      
      [ Impact: prevent unpredictable results of event id overflows ]
      Reported-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <49FEAFD0.30106@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      9456f0fa
  19. 06 May 2009, 2 commits
    • tracing: use proper export symbol for tracing api · 94487d6d
      Committed by Steven Rostedt
      When adding the EXPORT_SYMBOL to some of the tracing API, I
      accidentally used EXPORT_SYMBOL instead of EXPORT_SYMBOL_GPL.
      This patch fixes that mistake.
      
      [ Impact: export the tracing code only for GPL modules ]
      Reported-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      94487d6d
    • tracing: export stats of ring buffers to userspace · c8d77183
      Committed by Steven Rostedt
      This patch adds stats to the ftrace ring buffers:
      
       # cat /debugfs/tracing/per_cpu/cpu0/stats
       entries: 42360
       overrun: 30509326
       commit overrun: 0
       nmi dropped: 0
      
      Where entries is the total number of data entries in the buffer.

      overrun is the number of entries that were not consumed and were
      overwritten by the writer.
      
      commit overrun is the number of entries dropped due to nested writers
      wrapping the buffer before the initial writer finished the commit.
      
      nmi dropped is the number of entries dropped due to the ring buffer
      lock being held when an nmi was going to write to the ring buffer.
      Note, this field will be meaningless and will go away when the ring
      buffer becomes lockless.
      
      [ Impact: let userspace know what is happening in the ring buffers ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      c8d77183
  20. 29 Apr 2009, 4 commits
    • tracing: fix ref count in splice pages · 7267fa68
      Committed by Steven Rostedt
      The pages allocated for the splice binary buffer did not have
      their ref count initialized correctly. This caused pages not to
      be freed, causing a drastic memory leak.
      
      Thanks to logdev I was able to trace the tracer to find where the leak
      was.
      
      [ Impact: stop memory leak when using splice ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7267fa68
    • tracing: have splice only copy full pages · f2957f1f
      Committed by Steven Rostedt
      Splice works with pages; it is much more efficient to use an
      entire page than to copy bits over several pages.
      
      Using logdev to trace the internals of the splice mechanism, I
      was able to see that splice can be very aggressive. When tracing
      is occurring and the reader has caught up to the writer, with
      the writer on the reader page, the reader will copy what is
      there into the splice page. Splice may iterate over several
      pages, and if the writer is still writing to the page, the
      reader will keep copying bits to new pages to pass to userspace.
      
      This patch changes it to only pass data to userspace if the page
      is full (the writer has left the page). This has the small side
      effect that splice can not read a partial page, and must wait
      for the page to fill. This should not be an issue: if tracing
      has stopped, a plain "read" will still read all of the page.
      
      [ Impact: better performance for ring buffer splice code ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      f2957f1f
    • tracing: only add splice page if entries exist · 93459c6c
      Committed by Steven Rostedt
      The splice code allocates a page even when the ring buffer is
      empty. It only detects that the ring buffer is empty when it
      fails to copy anything from the ring buffer into the page.
      
      This patch adds a check to see if there is anything in the ring buffer
      before allocating a page.
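
      A sketch of the idea (the surrounding splice loop is elided, and
      buffer stands for the trace's ring buffer):

       	/* don't allocate a splice page for an empty ring buffer */
       	if (!ring_buffer_entries(buffer))
       		break;

       	page = alloc_page(GFP_KERNEL);
       	if (!page)
       		break;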
      
      Thanks to logdev for letting me trace the tracer to find this.
      
      [ Impact: speed up due to removing unnecessary allocation ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      93459c6c
    • tracing: fix ref count in splice pages · 5beae6ef
      Committed by Steven Rostedt
      The pages allocated for the splice binary buffer did not have
      their ref count initialized correctly. This caused pages not to
      be freed, causing a drastic memory leak.
      
      Thanks to logdev I was able to trace the tracer to find where the leak
      was.
      
      [ Impact: stop memory leak when using splice ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      5beae6ef
  21. 28 Apr 2009, 1 commit
    • tracing: convert ftrace_dump spinlocks to raw · cd891ae0
      Committed by Steven Rostedt
      ftrace_dump is used for printing out the contents of the ftrace
      ring buffer to the console on failure. Currently it uses a
      spinlock to synchronize the output from multiple failures on
      different CPUs. This spinlock is a normal spinlock and can cause
      issues with lockdep and lock tracing.
      
      This patch converts it to raw since it is for error handling only.
      The lock is local to the ftrace_dump and is not used by any other
      infrastructure.
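
      The shape of the conversion (a sketch; the initializer cast
      follows the raw-spinlock idiom of this era):

       -static DEFINE_SPINLOCK(ftrace_dump_lock);
       +static raw_spinlock_t ftrace_dump_lock =
       +	(raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;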
      
      [ Impact: prevent ftrace_dump from locking up by internal tracing ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      cd891ae0
  22. 22 Apr 2009, 1 commit
    • tracing/events: make struct trace_entry->type to be int type · 7a4f453b
      Committed by Li Zefan
      struct trace_entry->type is an unsigned char, while a trace
      event's id is an int; thus for an event with id >= 256, its
      entry->type is truncated to (id % 256), and we can't see the
      trace output of this event.
      
       # insmod trace-events-sample.ko
       # echo foo_bar > /mnt/tracing/set_event
       # cat /debug/tracing/events/trace-events-sample/foo_bar/id
       256
       # cat /mnt/tracing/trace_pipe
                 <...>-3548  [001]   215.091142: Unknown type 0
                 <...>-3548  [001]   216.089207: Unknown type 0
                 <...>-3548  [001]   217.087271: Unknown type 0
                 <...>-3548  [001]   218.085332: Unknown type 0
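
      The shape of the fix (a sketch of the struct change, other
      fields elided):

       struct trace_entry {
       -	unsigned char	type;
       +	int		type;
       	...
       };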
      
      [ Impact: fix output for trace events with id >= 256 ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49EEDB0E.5070207@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7a4f453b
  23. 18 Apr 2009, 3 commits
  24. 17 Apr 2009, 1 commit
  25. 15 Apr 2009, 1 commit
  26. 14 Apr 2009, 4 commits
    • tracing/filters: use ring_buffer_discard_commit() in filter_check_discard() · eb02ce01
      Committed by Tom Zanussi
      This patch changes filter_check_discard() to make use of the new
      ring_buffer_discard_commit() function and modifies the current users to
      call the old commit function in the non-discard case.
      
      It also introduces a version of filter_check_discard() that uses the
      global trace buffer (filter_current_check_discard()) for those cases.
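
      A sketch of the resulting call pattern at an event site (the
      commit arguments here are illustrative):

       	entry = ring_buffer_event_data(event);
       	/* ... fill in the entry's fields ... */

       	if (!filter_check_discard(call, entry, tr->buffer, event))
       		trace_buffer_unlock_commit(tr, event, flags, pc);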
      
      v2 changes:
      
      - fix compile error noticed by Ingo Molnar
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      LKML-Reference: <1239178554.10295.36.camel@tropicana>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      eb02ce01
    • tracing/filters: use ring_buffer_discard_commit for discarded events · 77d9f465
      Committed by Steven Rostedt
      ring_buffer_discard_commit() makes better use of the ring buffer
      when an event has been discarded. It tries to remove the event
      completely if possible.

      This patch converts the trace event filtering to use
      ring_buffer_discard_commit() instead of ring_buffer_event_discard().
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      77d9f465
    • tracing/filters: add TRACE_EVENT_FORMAT_NOFILTER event macro · e45f2e2b
      Committed by Tom Zanussi
      Frederic Weisbecker suggested that the trace_special event shouldn't be
      filterable; this patch adds a TRACE_EVENT_FORMAT_NOFILTER event macro
      that allows an event format to be exported without having a filter
      attached, and removes filtering from the trace_special event.
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e45f2e2b
    • tracing/filters: add run-time field descriptions to TRACE_EVENT_FORMAT events · e1112b4d
      Committed by Tom Zanussi
      This patch adds run-time field descriptions to all the event formats
      exported using TRACE_EVENT_FORMAT.  It also hooks up all the tracers
      that use them (i.e. the tracers in the 'ftrace subsystem') so they can
      also have their output filtered by the event-filtering mechanism.
      
      When I was testing this, there were a couple of things that
      fooled me into thinking the filters weren't working, when
      actually they were. I'll mention them here so others don't make
      the same mistakes (and file bug reports ;-)
      
      One is that some of the tracers trace multiple events, e.g. the
      sched_switch tracer uses the context_switch and wakeup events.
      If you don't set filters on all of the traced events, the
      unfiltered output from the events without filters can make it
      look like the filtering as a whole isn't working properly, when
      actually it is doing what it was asked to do; it just wasn't
      asked to do the right thing.
      
      The other is that for the really high-volume tracers, e.g. the
      function tracer, the volume of filtered events can be so high
      that it pushes the unfiltered events out of the ring buffer
      before they can be read. So, for example, cat'ing the trace file
      repeatedly shows either no output, or once in a while some
      output that isn't there the next time you read the trace, which
      isn't what you normally expect when reading the trace file. If
      you read from the trace_pipe file though, you can catch them
      before they disappear.
      
      Changes from v1:
      
      As suggested by Frederic Weisbecker:
      
      - get rid of externs in functions
      - added unlikely() to filter_check_discard()
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e1112b4d