提交 · 5e67b51e3fb22ad43faf9589e9019ad9c6a00413 · OpenHarmony / kernel_linux

31 1月, 2013 2 次提交

tracing: Use sched_clock_cpu for trace_clock_global · 5e67b51e

由 Namhyung Kim 提交于 12月 27, 2012

For systems with an unstable sched_clock, all cpu_clock() does is enable/
disable local irq during the call to sched_clock_cpu(). And for stable
systems they are same.

trace_clock_global() already disables interrupts, so it can call
sched_clock_cpu() directly.

Link: http://lkml.kernel.org/r/1356576585-28782-2-git-send-email-namhyung@kernel.orgSigned-off-by: NNamhyung Kim <namhyung@kernel.org>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

5e67b51e

ring-buffer: Add stats field for amount read from trace ring buffer · ad964704

由 Steven Rostedt (Red Hat) 提交于 1月 29, 2013

Add a stat about the number of events read from the ring buffer:

 #  cat /debug/tracing/per_cpu/cpu0/stats
entries: 39869
overrun: 870512
commit overrun: 0
bytes: 1449912
oldest event ts:  6561.368690
now ts:  6565.246426
dropped events: 0
read events: 112    <-- Added
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

ad964704

30 1月, 2013 1 次提交

tracing/fgraph: Adjust fgraph depth before calling trace return callback · 03274a3f

由 Steven Rostedt (Red Hat) 提交于 1月 29, 2013

While debugging the virtual cputime with the function graph tracer
with a max_depth of 1 (most common use of the max_depth so far),
I found that I was missing kernel execution because of a race condition.

The code for the return side of the function has a slight race:

	ftrace_pop_return_trace(&trace, &ret, frame_pointer);
	trace.rettime = trace_clock_local();
	ftrace_graph_return(&trace);
	barrier();
	current->curr_ret_stack--;

The ftrace_pop_return_trace() initializes the trace structure for
the callback. The ftrace_graph_return() uses the trace structure
for its own use as that structure is on the stack and is local
to this function. Then the curr_ret_stack is decremented which
is what the trace.depth is set to.

If an interrupt comes in after the ftrace_graph_return() but
before the curr_ret_stack, then the called function will get
a depth of 2. If max_depth is set to 1 this function will be
ignored.

The problem is that the trace has already been called, and the
timestamp for that trace will not reflect the time the function
was about to re-enter userspace. Calls to the interrupt will not
be traced because the max_depth has prevented this.

To solve this issue, the ftrace_graph_return() can safely be
moved after the current->curr_ret_stack has been updated.
This way the timestamp for the return callback will reflect
the actual time.

If an interrupt comes in after the curr_ret_stack update and
ftrace_graph_return(), it will be traced. It may look a little
confusing to see it within the other function, but at least
it will not be lost.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

03274a3f

29 1月, 2013 1 次提交

tracing: Remove second iterator initializer · 38dbe0b1

由 Jovi Zhang 提交于 1月 25, 2013

The trace iterator is already initialized by trace_init_global_iter(),
so there is no need to initialize it again.

Link: http://lkml.kernel.org/r/CACV3sb+G1YnO6168JhY3dEadmJi58pA5-2cSZT8E0WVHJNFt9Q@mail.gmail.comSigned-off-by: NJovi Zhang <bookjovi@gmail.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

38dbe0b1

26 1月, 2013 1 次提交

tracing: Use __this_cpu_inc/dec operation instead of __get_cpu_var · 82146529

由 Shan Wei 提交于 11月 19, 2012

__this_cpu_inc_return() or __this_cpu_dec generates a single instruction,
which is faster than __get_cpu_var operation.

Link: http://lkml.kernel.org/r/50A9C1BD.1060308@gmail.comReviewed-by: NChristoph Lameter <cl@linux.com>
Signed-off-by: NShan Wei <davidshan@tencent.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

82146529

25 1月, 2013 1 次提交

tracing: Mark tracing_dentry_percpu() static · b736f48b

由 Josh Triplett 提交于 11月 18, 2012

Nothing outside of kernel/trace/trace.c references tracing_dentry_percpu().

Link: http://lkml.kernel.org/r/1353302917-13995-7-git-send-email-josh@joshtriplett.orgSigned-off-by: NJosh Triplett <josh@joshtriplett.org>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

b736f48b

24 1月, 2013 2 次提交

profiling: Remove unused timer hook · ba6fdda4

由 Frederic Weisbecker 提交于 12月 22, 2012

The last remaining user was oprofile and its use has been
removed a while ago in commit bc078e4e
("oprofile: convert oprofile from timer_hook to hrtimer").

There doesn't seem to be any upstream user of this hook
for about two years now. And I'm not even aware of any out of
tree user.

Let's remove it.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1356191991-2251-1-git-send-email-fweisbec@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

ba6fdda4

tracing: Fix unsigned int compare of zero in recursion check · d41032a8

由 Steven Rostedt 提交于 1月 24, 2013

Dan's smatch found a compare bug with the result of the
trace_test_and_set_recursion() and comparing to less than
zero. If the function fails, it returns -1, but was saved in
an unsigned int, which will never be less than zero and will
ignore the result of the test if a recursion did happen.

Luckily this is the last of the recursion tests, as the
infrastructure of ftrace would catch recursions before it
got here, except for some few exceptions.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

d41032a8

23 1月, 2013 16 次提交

ring-buffer: Remove trace.h from ring_buffer.c · 0b07436d

由 Steven Rostedt 提交于 1月 22, 2013

ring_buffer.c use to require declarations from trace.h, but
these have moved to the generic header files. There's nothing
in trace.h that ring_buffer.c requires.

There's some headers that trace.h included that ring_buffer.c
needs, but it's best that it includes them directly, and not
include trace.h.

Also, some things may use ring_buffer.c without having tracing
configured. This removes the dependency that may come in the
future.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

0b07436d

ring-buffer: User context bit recursion checking · 567cd4da

由 Steven Rostedt 提交于 11月 02, 2012

Using context bit recursion checking, we can help increase the
performance of the ring buffer.

Before this patch:

 # echo function > /debug/tracing/current_tracer
 # for i in `seq 10`; do ./hackbench 50; done
Time: 10.285
Time: 10.407
Time: 10.243
Time: 10.372
Time: 10.380
Time: 10.198
Time: 10.272
Time: 10.354
Time: 10.248
Time: 10.253

(average: 10.3012)

Now we have:

 # echo function > /debug/tracing/current_tracer
 # for i in `seq 10`; do ./hackbench 50; done
Time: 9.712
Time: 9.824
Time: 9.861
Time: 9.827
Time: 9.962
Time: 9.905
Time: 9.886
Time: 10.088
Time: 9.861
Time: 9.834

(average: 9.876)

 a 4% savings!
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

567cd4da

ftrace: Use only the preempt version of function tracing · 897f68a4

由 Steven Rostedt 提交于 11月 02, 2012

The function tracer had two different versions of function tracing.

The disabling of irqs version and the preempt disable version.

As function tracing in very intrusive and can cause nasty recursion
issues, it has its own recursion protection. But the old method to
do this was a flat layer. If it detected that a recursion was happening
then it would just return without recording.

This made the preempt version (much faster than the irq disabling one)
not very useful, because if an interrupt were to occur after the
recursion flag was set, the interrupt would not be traced at all,
because every function that was traced would think it recursed on
itself (due to the context it preempted setting the recursive flag).

Now that we have a recursion flag for every context level, we
no longer need to worry about that. We can disable preemption,
set the current context recursion check bit, and go on. If an
interrupt were to come along, it would check its own context bit
and happily continue to trace.

As the preempt version is faster than the irq disable version,
there's no more reason to keep the preempt version around.
And the irq disable version still had an issue with missing
out on tracing NMI code.

Remove the irq disable function tracer version and have the
preempt disable version be the default (and only version).

Before this patch we had from running:

 # echo function > /debug/tracing/current_tracer
 # for i in `seq 10`; do ./hackbench 50; done
Time: 12.028
Time: 11.945
Time: 11.925
Time: 11.964
Time: 12.002
Time: 11.910
Time: 11.944
Time: 11.929
Time: 11.941
Time: 11.924

(average: 11.9512)

Now we have:

 # echo function > /debug/tracing/current_tracer
 # for i in `seq 10`; do ./hackbench 50; done
Time: 10.285
Time: 10.407
Time: 10.243
Time: 10.372
Time: 10.380
Time: 10.198
Time: 10.272
Time: 10.354
Time: 10.248
Time: 10.253

(average: 10.3012)

 a 13.8% savings!
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

897f68a4

tracing: Avoid unnecessary multiple recursion checks · edc15caf

由 Steven Rostedt 提交于 11月 02, 2012

When function tracing occurs, the following steps are made:
  If arch does not support a ftrace feature:
   call internal function (uses INTERNAL bits) which calls...
  If callback is registered to the "global" list, the list
   function is called and recursion checks the GLOBAL bits.
   then this function calls...
  The function callback, which can use the FTRACE bits to
   check for recursion.

Now if the arch does not suppport a feature, and it calls
the global list function which calls the ftrace callback
all three of these steps will do a recursion protection.
There's no reason to do one if the previous caller already
did. The recursion that we are protecting against will
go through the same steps again.

To prevent the multiple recursion checks, if a recursion
bit is set that is higher than the MAX bit of the current
check, then we know that the check was made by the previous
caller, and we can skip the current check.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

edc15caf

tracing: Make the trace recursion bits into enums · e46cbf75

由 Steven Rostedt 提交于 11月 02, 2012

Convert the bits into enums which makes the code a little easier
to maintain.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

e46cbf75

ftrace: Add context level recursion bit checking · c29f122c

由 Steven Rostedt 提交于 11月 02, 2012

Currently for recursion checking in the function tracer, ftrace
tests a task_struct bit to determine if the function tracer had
recursed or not. If it has, then it will will return without going
further.

But this leads to races. If an interrupt came in after the bit
was set, the functions being traced would see that bit set and
think that the function tracer recursed on itself, and would return.

Instead add a bit for each context (normal, softirq, irq and nmi).

A check of which context the task is in is made before testing the
associated bit. Now if an interrupt preempts the function tracer
after the previous context has been set, the interrupt functions
can still be traced.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

c29f122c

ftrace: Optimize the function tracer list loop · 0a016409

由 Steven Rostedt 提交于 11月 02, 2012

There is lots of places that perform:

       op = rcu_dereference_raw(ftrace_control_list);
       while (op != &ftrace_list_end) {

Add a helper macro to do this, and also optimize for a single
entity. That is, gcc will optimize a loop for either no iterations
or more than one iteration. But usually only a single callback
is registered to the function tracer, thus the optimized case
should be a single pass. to do this we now do:

	op = rcu_dereference_raw(list);
	do {
		[...]
	} while (likely(op = rcu_dereference_raw((op)->next)) &&
	       unlikely((op) != &ftrace_list_end));

An op is always registered (ftrace_list_end when no callbacks is
registered), thus when a single callback is registered, the link
list looks like:

 top => callback => ftrace_list_end => NULL.

The likely(op = op->next) still must be performed due to the race
of removing the callback, where the first op assignment could
equal ftrace_list_end. In that case, the op->next would be NULL.
But this is unlikely (only happens in a race condition when
removing the callback).

But it is very likely that the next op would be ftrace_list_end,
unless more than one callback has been registered. This tells
gcc what the most common case is and makes the fast path with
the least amount of branches.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

0a016409

ftrace: Fix function tracing recursion self test · 9640388b

由 Steven Rostedt 提交于 11月 02, 2012

The function tracing recursion self test should not crash
the machine if the resursion test fails. If it detects that
the function tracing is recursing when it should not be, then
bail, don't go into an infinite recursive loop.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

9640388b

ftrace: Fix global function tracers that are not recursion safe · 63503794

由 Steven Rostedt 提交于 11月 02, 2012

If one of the function tracers set by the global ops is not recursion
safe, it can still be called directly without the added recursion
supplied by the ftrace infrastructure.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

63503794

tracing: Fix selftest function recursion accounting · 05cbbf64

由 Steven Rostedt 提交于 1月 22, 2013

The test that checks function recursion does things differently
if the arch does not support all ftrace features. But that really
doesn't make a difference with how the test runs, and either way
the count variable should be 2 at the end.

Currently the test wrongly fails for archs that don't support all
the ftrace features.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

05cbbf64

tracing: Fix race with max_tr and changing tracers · 34600f0e

由 Steven Rostedt 提交于 1月 22, 2013

There's a race condition between the setting of a new tracer and
the update of the max trace buffers (the swap). When a new tracer
is added, it sets current_trace to nop_trace before disabling
the old tracer. At this moment, if the old tracer uses update_max_tr(),
the update may trigger the warning against !current_trace->use_max-tr,
as nop_trace doesn't have that set.

As update_max_tr() requires that interrupts be disabled, we can
add a check to see if current_trace == nop_trace and bail if it
does. Then when disabling the current_trace, set it to nop_trace
and run synchronize_sched(). This will make sure all calls to
update_max_tr() have completed (it was called with interrupts disabled).

As a clean up, this commit also removes shrinking and recreating
the max_tr buffer if the old and new tracers both have use_max_tr set.
The old way use to always shrink the buffer, and then expand it
for the next tracer. This is a waste of time.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

34600f0e

async: fix __lowest_in_progress() · f56c3196

由 Tejun Heo 提交于 1月 22, 2013

Commit 083b804c ("async: use workqueue for worker pool") made it
possible that async jobs are moved from pending to running out-of-order.
While pending async jobs will be queued and dispatched for execution in
the same order, nothing guarantees they'll enter "1) move self to the
running queue" of async_run_entry_fn() in the same order.

Before the conversion, async implemented its own worker pool. An async
worker, upon being woken up, fetches the first item from the pending
list, which kept the executing lists sorted. The conversion to
workqueue was done by adding work_struct to each async_entry and async
just schedules the work item. The queueing and dispatching of such work
items are still in order but now each worker thread is associated with a
specific async_entry and moves that specific async_entry to the
executing list. So, depending on which worker reaches that point
earlier, which is non-deterministic, we may end up moving an async_entry
with larger cookie before one with smaller one.

This broke __lowest_in_progress(). running->domain may not be properly
sorted and is not guaranteed to contain lower cookies than pending list
when not empty. Fix it by ensuring sort-inserting to the running list
and always looking at both pending and running when trying to determine
the lowest cookie.

Over time, the async synchronization implementation became quite messy.
We better restructure it such that each async_entry is linked to two
lists - one global and one per domain - and not move it when execution
starts. There's no reason to distinguish pending and running. They
behave the same for synchronization purposes.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f56c3196

wake_up_process() should be never used to wakeup a TASK_STOPPED/TRACED task · 9067ac85

由 Oleg Nesterov 提交于 1月 21, 2013

wake_up_process() should never wakeup a TASK_STOPPED/TRACED task.
Change it to use TASK_NORMAL and add the WARN_ON().

TASK_ALL has no other users, probably can be killed.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9067ac85

ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL · 9899d11f

由 Oleg Nesterov 提交于 1月 21, 2013

putreg() assumes that the tracee is not running and pt_regs_access() can
safely play with its stack.  However a killed tracee can return from
ptrace_stop() to the low-level asm code and do RESTORE_REST, this means
that debugger can actually read/modify the kernel stack until the tracee
does SAVE_REST again.

set_task_blockstep() can race with SIGKILL too and in some sense this
race is even worse, the very fact the tracee can be woken up breaks the
logic.

As Linus suggested we can clear TASK_WAKEKILL around the arch_ptrace()
call, this ensures that nobody can ever wakeup the tracee while the
debugger looks at it.  Not only this fixes the mentioned problems, we
can do some cleanups/simplifications in arch_ptrace() paths.

Probably ptrace_unfreeze_traced() needs more callers, for example it
makes sense to make the tracee killable for oom-killer before
access_process_vm().

While at it, add the comment into may_ptrace_stop() to explain why
ptrace_stop() still can't rely on SIGKILL and signal_pending_state().
Reported-by: NSalman Qazi <sqazi@google.com>
Reported-by: NSuleiman Souhlal <suleiman@google.com>
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9899d11f

tracing: Remove trace.h header from trace_clock.c · 0a71e4c6

由 Steven Rostedt 提交于 1月 22, 2013

As trace_clock is used by other things besides tracing, and it
does not require anything from trace.h, it is best not to include
the header file in trace_clock.c.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

0a71e4c6

ptrace: introduce signal_wake_up_state() and ptrace_signal_wake_up() · 910ffdb1

由 Oleg Nesterov 提交于 1月 21, 2013

Cleanup and preparation for the next change.

signal_wake_up(resume => true) is overused. None of ptrace/jctl callers
actually want to wakeup a TASK_WAKEKILL task, but they can't specify the
necessary mask.

Turn signal_wake_up() into signal_wake_up_state(state), reintroduce
signal_wake_up() as a trivial helper, and add ptrace_signal_wake_up()
which adds __TASK_TRACED.

This way ptrace_signal_wake_up() can work "inside" ptrace_request()
even if the tracee doesn't have the TASK_WAKEKILL bit set.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

910ffdb1

22 1月, 2013 11 次提交

tracing: Remove the extra 4 bytes of padding in events · b000c806

由 Steven Rostedt 提交于 1月 18, 2013

Due to a userspace issue with PowerTop v2beta, which hardcoded
the offset of event fields that it was using, it broke when
we removed the Big Kernel Lock counter from the event header.

 (commit e6e1e259 "tracing: Remove lock_depth from event entry")

Because this broke userspace, it was determined that we must
keep those 4 bytes around.

 (commit a3a4a5ac "Regression: partial revert "tracing: Remove lock_depth from event entry"")

This unfortunately wastes space in the ring buffer. 4 bytes per
event, where a lot of events are just 24 bytes. That's 16% of the
buffer wasted. A million events will add 4 megs of white space
into the buffer.

It was later noticed that PowerTop v2beta could not work on systems
where the kernel was 64 bit but the userspace was 32 bits.
The reason was because the offsets are different between the
two and the hard coded offset of one would not work with the other.

With PowerTop v2 final, it implemented the same interface that both
perf and trace-cmd use. That is, it reads the format file of
the event to find the offsets of the fields it needs. This fixes
the problem with running powertop on a 32 bit userspace running
on a 64 bit kernel. It also no longer requires the 4 byte padding.

As PowerTop v2 has been out for a while, and is included in all
major distributions, it is time that we can safely remove the
4 bytes of padding. Users of PowerTop v2beta should upgrade to
PowerTop v2 final.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: NArjan van de Ven <arjan@linux.intel.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

b000c806

kprobes/x86: Move ftrace-based kprobe code into kprobes-ftrace.c · e7dbfe34

由 Masami Hiramatsu 提交于 9月 28, 2012

Split ftrace-based kprobes code from kprobes, and introduce
CONFIG_(HAVE_)KPROBES_ON_FTRACE Kconfig flags.
For the cleanup reason, this also moves kprobe_ftrace check
into skip_singlestep.

Link: http://lkml.kernel.org/r/20120928081520.3560.25624.stgit@ltc138.sdl.hitachi.co.jp

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

e7dbfe34

ftrace: Move ARCH_SUPPORTS_FTRACE_SAVE_REGS in Kconfig · 06aeaaea

由 Masami Hiramatsu 提交于 9月 28, 2012

Move SAVE_REGS support flag into Kconfig and rename
it to CONFIG_DYNAMIC_FTRACE_WITH_REGS. This also introduces
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS which indicates
the architecture depending part of ftrace has a code
that saves full registers.
On the other hand, CONFIG_DYNAMIC_FTRACE_WITH_REGS indicates
the code is enabled.

Link: http://lkml.kernel.org/r/20120928081516.3560.72534.stgit@ltc138.sdl.hitachi.co.jp

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

06aeaaea

tracing/fgraph: Add max_graph_depth to limit function_graph depth · 8741db53

由 Steven Rostedt 提交于 1月 16, 2013

Add the file max_graph_depth to the debug tracing directory that lets
the user define the depth of the function graph.

A very useful operation is to set the depth to 1. Then it traces only
the first function that is called when entering the kernel. This can
be used to determine what system operations interrupt a process.

For example, to work on NOHZ processes (single tasks running without
a timer tick), if any interrupt goes off and preempts that task, this
code will show it happening.

  # cd /sys/kernel/debug/tracing
  # echo 1 > max_graph_depth
  # echo function_graph > current_tracer
  # cat per_cpu/cpu/<cpu-of-process>/trace

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

8741db53

tracing: Remove unneeded check of max_tr->buffer before tracing_reset · 84c6cf0d

由 Steven Rostedt 提交于 12月 20, 2012

There's now a check in tracing_reset_online_cpus() if the buffer is
allocated or NULL. No need to do a check before calling it with max_tr.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

84c6cf0d

tracing: Add checks if tr->buffer is NULL in tracing_reset{_online_cpus} · a5416411

由 Hiraku Toyooka 提交于 12月 19, 2012

max_tr->buffer could be NULL in the tracing_reset{_online_cpus}. In this
case, a NULL pointer dereference happens, so we should return immediately
from these functions.

Note, the current code does not call tracing_reset*() with max_tr when
its buffer is NULL, but future code will. This patch is needed to prevent
the future code from crashing.

Link: http://lkml.kernel.org/r/20121219070234.31200.93863.stgit@liselsiaSigned-off-by: NHiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

a5416411

tracing/syscalls: Make local functions static · 6aea49cb

由 Fengguang Wu 提交于 11月 21, 2012

Some functions in the syscall tracing is used only locally to
the file, but they are labeled global. Convert them to static functions.
Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

6aea49cb

tracing: Verify target file before registering a uprobe event · d24d7dbf

由 Jovi Zhang 提交于 7月 18, 2012

Without this patch, we can register a uprobe event for a directory.
Enabling such a uprobe event would anyway fail.

Example:
$ echo 'p /bin:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events

However dirctories cannot be valid targets for uprobe.
Hence verify if the target is a regular file during the probe
registration.

Link: http://lkml.kernel.org/r/20130103004212.690763002@goodmis.org

Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: NJovi Zhang <bookjovi@gmail.com>
Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
[ cleaned up whitespace and removed redundant IS_DIR() check ]
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

d24d7dbf

tracing: Use this_cpu_ptr per-cpu helper · d8a0349c

由 Shan Wei 提交于 11月 13, 2012

typeof(&buffer) is a pointer to array of 1024 char, or char (*)[1024].
But, typeof(&buffer[0]) is a pointer to char which match the return type of get_trace_buf().
As well-known, the value of &buffer is equal to &buffer[0].
so return this_cpu_ptr(&percpu_buffer->buffer[0]) can avoid type cast.

Link: http://lkml.kernel.org/r/50A1A800.3020102@gmail.comReviewed-by: NChristoph Lameter <cl@linux.com>
Signed-off-by: NShan Wei <davidshan@tencent.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

d8a0349c

ring-buffer: Remove unnecessary recusive call in rb_advance_iter() · 771e0384

由 Steven Rostedt 提交于 11月 30, 2012

The original ring-buffer code had special checks at the start
of rb_advance_iter() and instead of repeating them again at the
end of the function if a certain condition existed, I just did
a recursive call to rb_advance_iter() because the special condition
would cause rb_advance_iter() to return early (after the checks).

But as things have changed, the special checks no longer exist
and the only thing done for the special_condition is to call
rb_inc_iter() and return. Instead of doing a confusing recursive call,
just call rb_inc_iter instead.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

771e0384

ftrace: Be first to run code modification on modules · c1bf08ac

由 Steven Rostedt 提交于 12月 14, 2012

If some other kernel subsystem has a module notifier, and adds a kprobe
to a ftrace mcount point (now that kprobes work on ftrace points),
when the ftrace notifier runs it will fail and disable ftrace, as well
as kprobes that are attached to ftrace points.

Here's the error:

 WARNING: at kernel/trace/ftrace.c:1618 ftrace_bug+0x239/0x280()
 Hardware name: Bochs
 Modules linked in: fat(+) stap_56d28a51b3fe546293ca0700b10bcb29__8059(F) nfsv4 auth_rpcgss nfs dns_resolver fscache xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack lockd sunrpc ppdev parport_pc parport microcode virtio_net i2c_piix4 drm_kms_helper ttm drm i2c_core [last unloaded: bid_shared]
 Pid: 8068, comm: modprobe Tainted: GF            3.7.0-0.rc8.git0.1.fc19.x86_64 #1
 Call Trace:
  [<ffffffff8105e70f>] warn_slowpath_common+0x7f/0xc0
  [<ffffffff81134106>] ? __probe_kernel_read+0x46/0x70
  [<ffffffffa0180000>] ? 0xffffffffa017ffff
  [<ffffffffa0180000>] ? 0xffffffffa017ffff
  [<ffffffff8105e76a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff810fd189>] ftrace_bug+0x239/0x280
  [<ffffffff810fd626>] ftrace_process_locs+0x376/0x520
  [<ffffffff810fefb7>] ftrace_module_notify+0x47/0x50
  [<ffffffff8163912d>] notifier_call_chain+0x4d/0x70
  [<ffffffff810882f8>] __blocking_notifier_call_chain+0x58/0x80
  [<ffffffff81088336>] blocking_notifier_call_chain+0x16/0x20
  [<ffffffff810c2a23>] sys_init_module+0x73/0x220
  [<ffffffff8163d719>] system_call_fastpath+0x16/0x1b
 ---[ end trace 9ef46351e53bbf80 ]---
 ftrace failed to modify [<ffffffffa0180000>] init_once+0x0/0x20 [fat]
  actual: cc:bb:d2:4b:e1

A kprobe was added to the init_once() function in the fat module on load.
But this happened before ftrace could have touched the code. As ftrace
didn't run yet, the kprobe system had no idea it was a ftrace point and
simply added a breakpoint to the code (0xcc in the cc:bb:d2:4b:e1).

Then when ftrace went to modify the location from a call to mcount/fentry
into a nop, it didn't see a call op, but instead it saw the breakpoint op
and not knowing what to do with it, ftrace shut itself down.

The solution is to simply give the ftrace module notifier the max priority.
This should have been done regardless, as the core code ftrace modification
also happens very early on in boot up. This makes the module modification
closer to core modification.

Link: http://lkml.kernel.org/r/20130107140333.593683061@goodmis.org

Cc: stable@vger.kernel.org
Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reported-by: NFrank Ch. Eigler <fche@redhat.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

c1bf08ac

21 1月, 2013 2 次提交

module: fix missing module_mutex unlock · ee61abb3

由 Linus Torvalds 提交于 1月 20, 2013

Commit 1fb9341a ("module: put modules in list much earlier") moved
some of the module initialization code around, and in the process
changed the exit paths too.  But for the duplicate export symbol error
case the change made the ddebug_cleanup path jump to after the module
mutex unlock, even though it happens with the mutex held.

Rusty has some patches to split this function up into some helper
functions, hopefully the mess of complex goto targets will go away
eventually.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ee61abb3

ia64: kill thread_matches(), unexport ptrace_check_attach() · edea0d03

由 Oleg Nesterov 提交于 1月 20, 2013

The ia64 function "thread_matches()" has no users since commit
e868a55c ("[IA64] remove find_thread_for_addr()").  Remove it.

This allows us to make ptrace_check_attach() static to kernel/ptrace.c,
which is good since we'll need to change the semantics of it and fix up
all the callers.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

edea0d03

20 1月, 2013 1 次提交
- A
  sys_clone() needs asmlinkage_protect · b1e0318b
  由 Al Viro 提交于 1月 19, 2013
```
Cc: stable@vger.kernel.org
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  b1e0318b
17 1月, 2013 1 次提交

module, async: async_synchronize_full() on module init iff async is used · 774a1221

由 Tejun Heo 提交于 1月 15, 2013

If the default iosched is built as module, the kernel may deadlock
while trying to load the iosched module on device probe if the probing
was running off async.  This is because async_synchronize_full() at
the end of module init ends up waiting for the async job which
initiated the module loading.

 async A				modprobe

 1. finds a device
 2. registers the block device
 3. request_module(default iosched)
					4. modprobe in userland
					5. load and init module
					6. async_synchronize_full()

Async A waits for modprobe to finish in request_module() and modprobe
waits for async A to finish in async_synchronize_full().

Because there's no easy to track dependency once control goes out to
userland, implementing properly nested flushing is difficult.  For
now, make module init perform async_synchronize_full() iff module init
has queued async jobs as suggested by Linus.

This avoids the described deadlock because iosched module doesn't use
async and thus wouldn't invoke async_synchronize_full().  This is
hacky and incomplete.  It will deadlock if async module loading nests;
however, this works around the known problem case and seems to be the
best of bad options.

For more details, please refer to the following thread.

  http://thread.gmane.org/gmane.linux.kernel/1420814Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NAlex Riesen <raa.lkml@gmail.com>
Tested-by: NMing Lei <ming.lei@canonical.com>
Tested-by: NAlex Riesen <raa.lkml@gmail.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

774a1221

15 1月, 2013 1 次提交

tracing: Fix regression of trace_pipe · 250bfd3d

由 Liu Bo 提交于 1月 14, 2013

Commit 0fb9656d "tracing: Make tracing_enabled be equal to tracing_on"
changes the behaviour of trace_pipe, ie. it makes trace_pipe return if
we've read something and tracing is enabled, and this means that we have
to 'cat trace_pipe' again and again while running tests.

IMO the right way is if tracing is enabled, we always block and wait for
ring buffer, or we may lose what we want since ring buffer's size is limited.

Link: http://lkml.kernel.org/r/1358132051-5410-1-git-send-email-bo.li.liu@oracle.comSigned-off-by: NLiu Bo <bo.li.liu@oracle.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

250bfd3d

OpenHarmony / kernel_linux 上一次同步 4 年多

OpenHarmony / kernel_linux
上一次同步 4 年多