提交 · 35cf723e99c0e26ddf51f037dffaa4ff2c2c9106 · openeuler / raspberrypi-kernel

06 5月, 2009 16 次提交

tracing: trace_output.c, fix false positive compiler warning · 48dd0fed

由 Jaswinder Singh Rajput 提交于 5月 06, 2009

This compiler warning:

  CC      kernel/trace/trace_output.o
 kernel/trace/trace_output.c: In function ‘register_ftrace_event’:
 kernel/trace/trace_output.c:544: warning: ‘list’ may be used uninitialized in this function

Is wrong as 'list' is always initialized - but GCC (4.3.2) does not
recognize this relationship properly.

Work around the warning by initializing the variable to NULL.

[ Impact: fix false positive compiler warning ]
Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
LKML-Reference: <new-submission>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

48dd0fed

blktrace: from-sector redundant in trace_block_remap · 22a7c31a

由 Alan D. Brunelle 提交于 5月 04, 2009

Remove redundant from-sector parameter: it's /always/ the bio's sector
passed in.

[ Impact: cleanup ]
Signed-off-by: NAlan D. Brunelle <alan.brunelle@hp.com>
Reviewed-by: NLi Zefan <lizf@cn.fujitsu.com>
Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <49FF517C.7000503@hp.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

22a7c31a

blktrace: correct remap names · a42aaa3b

由 Alan D. Brunelle 提交于 5月 04, 2009

This attempts to clarify names utilized during block I/O remap
operations (partition, volume manager). It correctly matches up the
/from/ information for both device & sector. This takes in the concept
from Kosaki Motohiro and extends it to include better naming for the
"device_from" field.

[ Impact: cleanup ]
Signed-off-by: NAlan D. Brunelle <alan.brunelle@hp.com>
Reviewed-by: NLi Zefan <lizf@cn.fujitsu.com>
Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <49FF4FAE.3000301@hp.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

a42aaa3b

tracepoint: trace_sched_migrate_task(): remove parameter · de1d7286

由 Mathieu Desnoyers 提交于 5月 05, 2009

The orig_cpu parameter in trace_sched_migrate_task() is not necessary,
it can be got by using task_cpu(p) in the probe.

[ Impact: micro-optimization ]
Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
[ modified from Mathieu's patch. The original patch is at:
  http://marc.info/?l=linux-kernel&m=123791201716239&w=2 ]
Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: zhaolei@cn.fujitsu.com
Cc: laijs@cn.fujitsu.com
LKML-Reference: <49FFFDB7.1050402@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

de1d7286

tracing/events: fix concurrent access to ftrace_events list · 20c8928a

由 Li Zefan 提交于 5月 06, 2009

A module will add/remove its trace events when it gets loaded/unloaded, so
the ftrace_events list is not "const", and concurrent access needs to be
protected.

This patch thus fixes races between loading/unloding modules and read
'available_events' or read/write 'set_event', etc.

Below shows how to reproduce the race:

 # for ((; ;)) { cat /mnt/tracing/available_events; } > /dev/null &
 # for ((; ;)) { insmod trace-events-sample.ko; rmmod sample; } &

After a while:

BUG: unable to handle kernel paging request at 0010011c
IP: [<c1080f27>] t_next+0x1b/0x2d
...
Call Trace:
 [<c10c90e6>] ? seq_read+0x217/0x30d
 [<c10c8ecf>] ? seq_read+0x0/0x30d
 [<c10b4c19>] ? vfs_read+0x8f/0x136
 [<c10b4fc3>] ? sys_read+0x40/0x65
 [<c1002a68>] ? sysenter_do_call+0x12/0x36

[ Impact: fix races when concurrent accessing ftrace_events list ]
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4A00F709.3080800@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

20c8928a

tracing/events: fix memory leak when unloading module · 2df75e41

由 Li Zefan 提交于 5月 06, 2009

When unloading a module, memory allocated by init_preds() and
trace_define_field() is not freed.

[ Impact: fix memory leak ]
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4A00F6E0.3040503@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

2df75e41

ring-buffer: add benchmark and tester · 5092dbc9

由 Steven Rostedt 提交于 5月 05, 2009

This patch adds code that can benchmark the ring buffer as well as
test it. This code can be compiled into the kernel (not recommended)
or as a module.

A separate ring buffer is used to not interfer with other users, like
ftrace. It creates a producer and a consumer (option to disable creation
of the consumer) and will run for 10 seconds, then sleep for 10 seconds
and then repeat.

While running, the producer will write 10 byte loads into the ring
buffer with just putting in the current CPU number. The reader will
continually try to read the buffer. The reader will alternate from reading
the buffer via event by event, or by full pages.

The output is a pr_info, thus it will fill up the syslogs.

  Starting ring buffer hammer
  End ring buffer hammer
  Time:     9000349 (usecs)
  Overruns: 12578640
  Read:     5358440  (by events)
  Entries:  0
  Total:    17937080
  Missed:   0
  Hit:      17937080
  Entries per millisec: 1993
  501 ns per entry
  Sleeping for 10 secs
  Starting ring buffer hammer
  End ring buffer hammer
  Time:     9936350 (usecs)
  Overruns: 0
  Read:     28146644  (by pages)
  Entries:  74
  Total:    28146718
  Missed:   0
  Hit:      28146718
  Entries per millisec: 2832
  353 ns per entry
  Sleeping for 10 secs

Time:      is the time the test ran
Overruns:  the number of events that were overwritten and not read
Read:      the number of events read (either by pages or events)
Entries:   the number of entries left in the buffer
                 (the by pages will only read full pages)
Total:     Entries + Read + Overruns
Missed:    the number of entries that failed to write
Hit:       the number of entries that were written

The above example shows that it takes ~353 nanosecs per entry when
there is a reader, reading by pages (and no overruns)

The event by event reader slowed the producer down to 501 nanosecs.

[ Impact: see how changes to the ring buffer affect stability and performance ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

5092dbc9

ring-buffer: move big if statement down · aa20ae84

由 Steven Rostedt 提交于 5月 05, 2009

In the hot path of the ring buffer "__rb_reserve_next" there's a big
if statement that does not even return back to the work flow.

	code;

	if (cross to next page) {

		[ lots of code ]

		return;
	}

	more code;

The condition is even the unlikely path, although we do not denote it
with an unlikely because gcc is fine with it. The condition is true when
the write crosses a page boundary, and we need to start at a new page.

Having this if statement makes it hard to read, but calling another
function to do the work is also not appropriate, because we are using a lot
of variables that were set before the if statement, and we do not want to
send them as parameters.

This patch changes it to a goto:

	code;

	if (cross to next page)
		goto next_page;

	more code;

	return;

next_page:

	[ lots of code]

This makes the code easier to understand, and a bit more obvious.

The output from gcc is practically identical. For some reason, gcc decided
to use different registers when I switched it to a goto. But other than that,
the logic is the same.

[ Impact: easier to read code ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

aa20ae84

tracing: use proper export symbol for tracing api · 94487d6d

由 Steven Rostedt 提交于 5月 05, 2009

When adding the EXPORT_SYMBOL to some of the tracing API, I accidently
used EXPORT_SYMBOL instead of EXPORT_SYMBOL_GPL. This patch fixes
that mistake.

[ Impact: export the tracing code only for GPL modules ]
Reported-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

94487d6d

ring-buffer: disable writers when resetting buffers · 41ede23e

由 Steven Rostedt 提交于 5月 01, 2009

As a precaution, it is best to disable writing to the ring buffers
when reseting them.

[ Impact: prevent weird things if write happens during reset ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

41ede23e

ring-buffer: have read page swap increment counter with page entries · afbab76a

由 Steven Rostedt 提交于 5月 01, 2009

In the swap page ring buffer code that is used by the ftrace splice code,
we scan the page to increment the counter of entries read.

With the number of entries already in the page we simply need to add it.

[ Impact: speed up reading page from ring buffer ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

afbab76a

ring-buffer: record page entries in buffer page descriptor · 778c55d4

由 Steven Rostedt 提交于 5月 01, 2009

Currently, when the ring buffer writer overflows the buffer and must
write over non consumed data, we increment the overrun counter by
reading the entries on the page we are about to overwrite. This reads
the entries one by one.

This is not very effecient. This patch adds another entry counter
into each buffer page descriptor that keeps track of the number of
entries on the page. Now on overwrite, the overrun counter simply
needs to add the number of entries that is on the page it is about
to overwrite.

[ Impact: speed up of ring buffer in overwrite mode ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

778c55d4

ring-buffer: convert cpu buffer entries to local_t · e4906eff

由 Steven Rostedt 提交于 4月 30, 2009

The entries counter in cpu buffer is not atomic. It can be updated by
other interrupts or from another CPU (readers).

But making entries into "atomic_t" causes an atomic operation that can
hurt performance. Instead we convert it to a local_t that will increment
a counter with a local CPU atomic operation (if the arch supports it).

Instead of fighting with readers and overwrites that decrement the counter,
I added a "read" counter. Every time a reader reads an entry it is
incremented.

We already have a overrun counter and with that, the entries counter and
the read counter, we can calculate the total number of entries in the
buffer with:

  (entries - overrun) - read

As long as the total number of entries in the ring buffer is less than
the word size, this will work. But since the entries counter was previously
a long, this is no different than what we had before.

Thanks to Andrew Morton for pointing out in the first version that
atomic_t does not replace unsigned long. I switched to atomic_long_t
even though it is signed. A negative count is most likely a bug.

[ Impact: keep accurate count of cpu buffer entries ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

e4906eff

tracing: export stats of ring buffers to userspace · c8d77183

由 Steven Rostedt 提交于 4月 29, 2009

This patch adds stats to the ftrace ring buffers:

 # cat /debugfs/tracing/per_cpu/cpu0/stats
 entries: 42360
 overrun: 30509326
 commit overrun: 0
 nmi dropped: 0

Where entries are the total number of data entries in the buffer.

overrun is the number of entries not consumed and were overwritten by
the writer.

commit overrun is the number of entries dropped due to nested writers
wrapping the buffer before the initial writer finished the commit.

nmi dropped is the number of entries dropped due to the ring buffer
lock being held when an nmi was going to write to the ring buffer.
Note, this field will be meaningless and will go away when the ring
buffer becomes lockless.

[ Impact: let userspace know what is happening in the ring buffers ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

c8d77183

ring-buffer: add counters for commit overrun and nmi dropped entries · f0d2c681

由 Steven Rostedt 提交于 4月 29, 2009

The WARN_ON in the ring buffer when a commit is preempted and the
buffer is filled by preceding writes can happen in normal operations.
The WARN_ON makes it look like a bug, not to mention, because
it does not stop tracing and calls printk which can also recurse, this
is prone to deadlock (the WARN_ON is not in a position to recurse).

This patch removes the WARN_ON and replaces it with a counter that
can be retrieved by a tracer. This counter is called commit_overrun.

While at it, I added a nmi_dropped counter to count any time an NMI entry
is dropped because the NMI could not take the spinlock.

[ Impact: prevent deadlock by printing normal case warning ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

f0d2c681

ring-buffer: export symbols · d6ce96da

由 Steven Rostedt 提交于 5月 05, 2009

I'm adding a module to do a series of tests on the ring buffer as well
as benchmarks. This module needs to have more of the ring buffer API
exported. There's nothing wrong with reading the ring buffer from a
module.

[ Impact: allow modules to read pages from the ring buffer ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

d6ce96da

29 4月, 2009 8 次提交

tracing: fix build failure on s390 · a0e39ed3

由 Heiko Carstens 提交于 4月 29, 2009

"tracing: create automated trace defines" causes this compile error on s390,
as reported by Sachin Sant against linux-next:

 kernel/built-in.o: In function `__do_softirq':
 (.text+0x1c680): undefined reference to `__tracepoint_softirq_entry'

This happens because the definitions of the softirq tracepoints were moved
from kernel/softirq.c to kernel/irq/handle.c. Since s390 doesn't support
generic hardirqs handle.c doesn't get compiled and the definitions are
missing.

So move the tracepoints to softirq.c again.

[ Impact: fix build failure on s390 ]
Reported-by: NSachin Sant <sachinp@in.ibm.com>
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
LKML-Reference: <20090429135139.5fac79b8@osiris.boeblingen.de.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

a0e39ed3

tracing/filters: a better event parser · 8b372562

由 Tom Zanussi 提交于 4月 28, 2009

Replace the current event parser hack with a better one.  Filters are
no longer specified predicate by predicate, but all at once and can
use parens and any of the following operators:

numeric fields:

==, !=, <, <=, >, >=

string fields:

==, !=

predicates can be combined with the logical operators:

&&, ||

examples:

"common_preempt_count > 4" > filter

"((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter

If there was an error, the erroneous string along with an error
message can be seen by looking at the filter e.g.:

((sig >= 10 && sig < 15) || dsig == 17) && comm != bash
^
parse_error: Field not found

Currently the caret for an error always appears at the beginning of
the filter; a real position should be used, but the error message
should be useful even without it.

To clear a filter, '0' can be written to the filter file.

Filters can also be set or cleared for a complete subsystem by writing
the same filter as would be written to an individual event to the
filter file at the root of the subsytem.  Note however, that if any
event in the subsystem lacks a field specified in the filter being
set, the set will fail and all filters in the subsytem are
automatically cleared.  This change from the previous version was made
because using only the fields that happen to exist for a given event
would most likely result in a meaningless filter.

Because the logical operators are now implemented as predicates, the
maximum number of predicates in a filter was increased from 8 to 16.

[ Impact: add new, extended trace-filter implementation ]
Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
Cc: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <1240905899.6416.121.camel@tropicana>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

8b372562

tracing/filters: distinguish between signed and unsigned fields · a118e4d1

由 Tom Zanussi 提交于 4月 28, 2009

The new filter comparison ops need to be able to distinguish between
signed and unsigned field types, so add an is_signed flag/param to the
event field struct/trace_define_fields().  Also define a simple macro,
is_signed_type() to determine the signedness at compile time, used in the
trace macros.  If the is_signed_type() macro won't work with a specific
type, a new slightly modified version of TRACE_FIELD() called
TRACE_FIELD_SIGN(), allows the signedness to be set explicitly.

[ Impact: extend trace-filter code for new feature ]
Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
Cc: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <1240905893.6416.120.camel@tropicana>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

a118e4d1

tracing/filters: move preds into event_filter object · 30e673b2

由 Tom Zanussi 提交于 4月 28, 2009

Create a new event_filter object, and move the pred-related members
out of the call and subsystem objects and into the filter object - the
details of the filter implementation don't need to be exposed in the
call and subsystem in any case, and it will also help make the new
parser implementation a little cleaner.

[ Impact: refactor trace-filter code to prepare for new features ]
Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
Cc: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <1240905887.6416.119.camel@tropicana>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

30e673b2

ring-buffer: fix printk output · 7d7d2b80

由 Steven Rostedt 提交于 4月 27, 2009

The warning output in trace_recursive_lock uses %d for a long when
it should be %ld.

[ Impact: fix compile warning ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

7d7d2b80

tracing: have splice only copy full pages · f2957f1f

由 Steven Rostedt 提交于 4月 29, 2009

Splice works with pages, it is much more effecient to use an entire
page than to copy bits over several pages.

Using logdev to trace the internals of the splice mechanism, I was
able to see that splice can be very aggressive. When tracing is
occurring, and the reader caught up to the writer, and the writer
is on the reader page, the reader will copy what is there into the
splice page. Splice may iterate over several pages and if the
writer is still writing to the page, the reader will keep copying
bits to new pages to pass to userspace.

This patch changes it to only pass data to userspace if the page
is full (the writer has left the page). This has a small side effect
that splice can not read a partial page, and must wait for the
page to fill. This should not be an issue. If tracing has stopped,
then a use of "read" will still read all of the page.

[ Impact: better performance for ring buffer splice code ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

f2957f1f

tracing: only add splice page if entries exist · 93459c6c

由 Steven Rostedt 提交于 4月 29, 2009

The splice code allocates a page even when the ring buffer is empty.
It detects the ring buffer being empty when it it fails to copy
anything from the ring buffer into the page.

This patch adds a check to see if there is anything in the ring buffer
before allocating a page.

Thanks to logdev for letting me trace the tracer to find this.

[ Impact: speed up due to removing unnecessary allocation ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

93459c6c

tracing: fix ref count in splice pages · 5beae6ef

由 Steven Rostedt 提交于 4月 29, 2009

The pages allocated for the splice binary buffer did not initialize
the ref count correctly. This caused pages not to be freed and causes
a drastic memory leak.

Thanks to logdev I was able to trace the tracer to find where the leak
was.

[ Impact: stop memory leak when using splice ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

5beae6ef

28 4月, 2009 1 次提交

tracing: convert ftrace_dump spinlocks to raw · cd891ae0

由 Steven Rostedt 提交于 4月 28, 2009

ftrace_dump is used for printing out the contents of the ftrace ring buffer
to the console on failure. Currently it uses a spinlock to synchronize
the output from multiple failures on different CPUs. This spin lock
currently is a normal spinlock and can cause issues with lockdep and
lock tracing.

This patch converts it to raw since it is for error handling only.
The lock is local to the ftrace_dump and is not used by any other
infrastructure.

[ Impact: prevent ftrace_dump from locking up by internal tracing ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

cd891ae0

26 4月, 2009 1 次提交

tracing/events: make modules have their own file_operations structure · 701970b3

由 Steven Rostedt 提交于 4月 24, 2009

For proper module reference counting, the file_operations that modules use
must have the "owner" field set to the module. Unfortunately, the trace events
use share file_operations. The same file_operations are used by all both
kernel core and all modules.

This patch makes the modules allocate their own file_operations and
copies the functions from the core kernel. This allows those file
operations to be owned by the module.

Care is taken to free this code on module unload.

Thanks to Greg KH for reminding me that file_operations must be owned
by the module to have reference counting take place.

[ Impact: fix modular tracepoints / potential crash ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
Acked-by: NGreg Kroah-Hartman <gregkh@suse.de>

701970b3

25 4月, 2009 1 次提交

tracing/events: reuse trace event ids after overflow · 060fa5c8

由 Steven Rostedt 提交于 4月 24, 2009

With modules being able to add trace events, and the max trace event
counter is 16 bits (65536) we can overflow the counter easily
with a simple while loop adding and removing modules that contain
trace events.

This patch links together the registered trace events and on overflow
searches for available trace event ids. It will still fail if
over 65536 events are registered, but considering that a typical
kernel only has 22000 functions, 65000 events should be sufficient.
Reported-by: NLi Zefan <lizf@cn.fujitsu.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

060fa5c8

24 4月, 2009 4 次提交

ring_buffer: compressed event header · 334d4169

由 Lai Jiangshan 提交于 4月 24, 2009

RB_MAX_SMALL_DATA = 28bytes is too small for most tracers, it wastes
an 'u32' to save the actually length for events which data size > 28.

This fix uses compressed event header and enlarges RB_MAX_SMALL_DATA.

[ Impact: saves about 0%-12.5%(depends on tracer) memory in ring_buffer ]
Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <49F13189.3090000@cn.fujitsu.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

334d4169

tracing: add size checks for exported ftrace internal structures · 75db37d2

由 Steven Rostedt 提交于 3月 26, 2009

The events exported by TRACE_EVENT are automated and are guaranteed
to be correct when used.

The internal ftrace structures on the other hand are more manually
exported. These require the ftrace maintainer to make sure they
are up to date.

This patch adds a size check to help flag when a type changes in
an internal ftrace data structure, and the update needs to be reflected
in the export.

If a export is incorrect, then the only harm is that the user space
tools will not know how to correctly read the internal structures of
ftrace.

[ Impact: help prevent inconsistent ftrace format print outs ]
Signed-off-by: NSteven Rostedt <srostedt@redhat.com>

75db37d2

tracing: increase size of number of possible events · 89ec0dee

由 Steven Rostedt 提交于 3月 26, 2009

With the new event tracing registration, we must increase the number
of events that can be registered. Currently the type field is only
one byte, which leaves us only 256 possible events.

Since we do not save the CPU number in the tracer anymore (it is determined
by the per cpu ring buffer that is used) we have an extra byte to use.

This patch increases the size of type from 1 byte (256 events) to
2 bytes (65,536 events).

It also adds a WARN_ON_ONCE if we exceed that limit.

[ Impact: allow more than 255 events ]
Signed-off-by: NSteven Rostedt <srostedt@redhat.com>

89ec0dee

tracing/wakeup: move access to wakeup_cpu into spinlock · 9be24414

由 Steven Rostedt 提交于 3月 26, 2009

The code had the following outside the lock:

        if (next != wakeup_task)
                return;

        pc = preempt_count();

        /* The task we are waiting for is waking up */
        data = wakeup_trace->data[wakeup_cpu];

On initialization, wakeup_task is NULL and wakeup_cpu -1. This code
is not under a lock. If wakeup_task is set on another CPU as that
task is waking up, we can see the wakeup_task before wakeup_cpu is
set. If we read wakeup_cpu while it is still -1 then we will have
a bad data pointer.

This patch moves the reading of wakeup_cpu within the protection of
the spinlock used to protect the writing of wakeup_cpu and wakeup_task.

[ Impact: remove possible race causing invalid pointer dereference ]
Reported-by: NManeesh Soni <maneesh@in.ibm.com>
Signed-off-by: NSteven Rostedt <srostedt@redhat.com>

9be24414

22 4月, 2009 1 次提交

tracing/events: make struct trace_entry->type to be int type · 7a4f453b

由 Li Zefan 提交于 4月 22, 2009

struct trace_entry->type is unsigned char, while trace event's id is
int type, thus for a event with id >= 256, it's entry->type is cast
to (id % 256), and then we can't see the trace output of this event.

 # insmod trace-events-sample.ko
 # echo foo_bar > /mnt/tracing/set_event
 # cat /debug/tracing/events/trace-events-sample/foo_bar/id
 256
 # cat /mnt/tracing/trace_pipe
           <...>-3548  [001]   215.091142: Unknown type 0
           <...>-3548  [001]   216.089207: Unknown type 0
           <...>-3548  [001]   217.087271: Unknown type 0
           <...>-3548  [001]   218.085332: Unknown type 0

[ Impact: fix output for trace events with id >= 256 ]
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <49EEDB0E.5070207@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7a4f453b

21 4月, 2009 8 次提交

ring-buffer: only warn on wrap if buffer is bigger than two pages · 3554228d

由 Steven Rostedt 提交于 4月 21, 2009

On boot up, to save memory, ftrace allocates the minimum buffer
which is two pages. Ftrace also goes through a series of tests
(when configured) on boot up. These tests can fill up a page within
a single interrupt.

The ring buffer also has a WARN_ON when it detects that the buffer was
completely filled within a single commit (other commits are allowed to
be nested).

Combine the small buffer on start up, with the tests that can fill more
than a single page within an interrupt, this can trigger the WARN_ON.

This patch makes the WARN_ON only happen when the ring buffer consists
of more than two pages.

[ Impact: prevent false WARN_ON in ftrace startup tests ]
Reported-by: NIngo Molnar <mingo@elte.hu>
LKML-Reference: <20090421094616.GA14561@elte.hu>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

3554228d

tracing/filters: allow user-input to be integer-like string · f66578a7

由 Li Zefan 提交于 4月 21, 2009

Suppose we would like to trace all tasks named '123', but this
will fail:

 # echo 'parent_comm == 123' > events/sched/sched_process_fork/filter
 bash: echo: write error: Invalid argument

Don't guess the type of the filter pred in filter_parse(), but instead
we check it in __filter_add_pred().

[ Impact: extend allowed filter field string values ]
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <49ED8DEB.6000700@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f66578a7

tracing/filters: don't remove old filters when failed to write subsys->filter · e8082f3f

由 Li Zefan 提交于 4月 21, 2009

If writing subsys->filter returns EINVAL or ENOSPC, the original
filters in subsys/ and subsys/events/ will be removed. This is
definitely wrong.

[ Impact: fix filter setting semantics on error condition ]
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <49ED8DD2.2070700@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e8082f3f

tracing: use nowakeup version of commit for function event trace tests · cb4764a6

由 Steven Rostedt 提交于 4月 20, 2009

The startup tests for the event tracer also runs with the function
tracer enabled. The "wakeup" version of the trace commit was used
which can grab spinlocks. If a task was preempted by an NMI
that called a function being traced, it could deadlock due to the
function tracer trying to grab the same lock.

Thanks to Frederic Weisbecker for pointing out where the bug was.
Reported-by: NIngo Molnar <mingo@elte.hu>
Reported-by: NFrederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

cb4764a6

tracing: use recursive counter over irq level · aa18efb2

由 Steven Rostedt 提交于 4月 20, 2009

Althought using the irq level (hardirq_count, softirq_count and in_nmi)
was nice to detect bad recursion right away, but since the counters are
not atomically updated with respect to the interrupts, the function tracer
might trigger the test from an interrupt handler before the hardirq_count
is updated. This will trigger a false warning.

This patch converts the recursive detection to a simple counter.
If the depth is greater than 16 then the recursive detection will trigger.
16 is more than enough for any nested interrupts.

[ Impact: fix false positive trace recursion detection ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

aa18efb2

tracing: remove recursive test from ring_buffer_event_discard · e395898e

由 Steven Rostedt 提交于 4月 20, 2009

The ring_buffer_event_discard is not tied to ring_buffer_lock_reserve.
It can be called inside or outside the reserve/commit. Even if it
is called inside the reserve/commit the commit part must also be called.

Only ring_buffer_discard_commit can be used as a replacement for
ring_buffer_unlock_commit.

This patch removes the trace_recursive_unlock from ring_buffer_event_discard
since it would be the wrong place to do so.

[Impact: prevent breakage in trace recursive testing ]

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

e395898e

tracing: fix recursive test level calculation · 17487bfe

由 Steven Rostedt 提交于 4月 20, 2009

The recursive tests to detect same level recursion in the ring buffers
did not account for the hard/softirq_counts to be shifted. Thus the
numbers could be larger than then mask to be tested.

This patch includes the shift for the calculation of the irq depth.

[ Impact: stop false positives in trace recursion detection ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

17487bfe

tracing/events: call the correct event trace selftest init function · 28d20e2d

由 Steven Rostedt 提交于 4月 20, 2009

The late_initcall calls a helper function instead of the proper
init event selftest function.

This update may have been lost due to conflicting merges.

[ Impact: fix compiler warning and call extended event trace self tests ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

28d20e2d