1. 17 Feb 2017, 1 commit
  2. 15 Feb 2017, 6 commits
  3. 11 Feb 2017, 1 commit
  4. 03 Feb 2017, 8 commits
    • ftrace: Have set_graph_function handle multiple functions in one write · e704eff3
      Committed by Steven Rostedt (VMware)
      Currently, only one function can be written to set_graph_function and
      set_graph_notrace. Only the last function in the list is saved, even
      though other functions are added and then removed.
      
      Change the behavior to be the same as set_ftrace_function, allowing
      multiple functions to be written. If any one of them fails, none of them
      are added. The addition of the functions is done at the end, when the
      file is closed (a sketch of this accumulate-then-commit pattern follows
      this entry).
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      e704eff3
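
      A minimal user-space sketch of the accumulate-then-commit behavior
      described above (graph_filter_stage() and graph_filter_commit() are
      hypothetical names; this is not the ftrace code itself). Names parsed
      from each write are only staged; nothing takes effect until the file is
      closed, and a failure discards the whole staged set:

        /* Sketch only: stage each parsed name, apply the whole set at "close". */
        #define _POSIX_C_SOURCE 200809L
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        struct stage {
            char  **names;    /* functions parsed so far, not yet applied */
            size_t  count;
        };

        /* Called once per parsed token; nothing is applied yet. */
        static int graph_filter_stage(struct stage *s, const char *name)
        {
            char **tmp = realloc(s->names, (s->count + 1) * sizeof(*tmp));

            if (!tmp)
                return -1;
            s->names = tmp;
            s->names[s->count] = strdup(name);
            if (!s->names[s->count])
                return -1;                /* caller discards the whole stage */
            s->count++;
            return 0;
        }

        /* Called when the file is closed: only now is the set applied. */
        static void graph_filter_commit(struct stage *s)
        {
            for (size_t i = 0; i < s->count; i++) {
                printf("graph tracing enabled for %s\n", s->names[i]);
                free(s->names[i]);
            }
            free(s->names);
            s->names = NULL;
            s->count = 0;
        }

        int main(void)
        {
            struct stage s = { 0 };

            if (graph_filter_stage(&s, "do_IRQ") ||
                graph_filter_stage(&s, "schedule"))
                return 1;                 /* any failure: nothing is committed */
            graph_filter_commit(&s);
            return 0;
        }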
    • ftrace: Do not hold references of ftrace_graph_{notrace_}hash out of graph_lock · 649b988b
      Committed by Steven Rostedt (VMware)
      The hashes ftrace_graph_hash and ftrace_graph_notrace_hash are modified
      with the graph_lock held. Holding a pointer to them and passing it along
      can lead to use of a stale pointer (fgd->hash). Move the assignment of
      the pointer, and its use, to within the section that holds the lock.
      Note, the data is rcu_sched protected, and other places that reference
      it do so with preemption disabled, but the file manipulation code must
      be protected by the lock.
      
      The fgd->hash pointer is set to NULL when the lock is released (a sketch
      of this lock-scoped access pattern follows this entry).
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      649b988b
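
      A small user-space sketch of the locking rule (a pthread mutex stands in
      for graph_lock and the structures are hypothetical): the shared hash
      pointer is assigned, used, and cleared only while the lock is held.

        #include <pthread.h>
        #include <stddef.h>

        struct hash       { int entries; };
        struct graph_data { struct hash *hash; };    /* stands in for fgd */

        static pthread_mutex_t graph_lock = PTHREAD_MUTEX_INITIALIZER;
        static struct hash current_hash;

        static void use_hash(struct hash *h) { (void)h; /* walk entries... */ }

        static void graph_open(struct graph_data *fgd)
        {
            pthread_mutex_lock(&graph_lock);
            fgd->hash = &current_hash;    /* assign only under the lock...    */
            use_hash(fgd->hash);          /* ...and use it only under it, too */
            pthread_mutex_unlock(&graph_lock);
        }

        static void graph_release(struct graph_data *fgd)
        {
            pthread_mutex_lock(&graph_lock);
            use_hash(fgd->hash);
            fgd->hash = NULL;             /* drop the reference before unlocking */
            pthread_mutex_unlock(&graph_lock);
        }

        int main(void)
        {
            struct graph_data fgd = { 0 };

            graph_open(&fgd);
            graph_release(&fgd);
            return 0;
        }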
    • tracing: Reset parser->buffer to allow multiple "puts" · 0e684b65
      Committed by Steven Rostedt (VMware)
      trace_parser_put() simply frees the allocated parser buffer, but it does
      not reset the freed pointer. This means that if trace_parser_put() is
      called on the same parser more than once, it will corrupt the allocation
      system. Setting parser->buffer to NULL after the free allows it to be
      called more than once without any ill effect (a tiny sketch of the
      pattern follows this entry).
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      0e684b65
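
      A tiny sketch of the free-and-clear pattern (parser_put() is a
      hypothetical stand-in, not the kernel's trace_parser code):

        #include <stdlib.h>

        struct parser {
            char *buffer;
        };

        static void parser_put(struct parser *p)
        {
            free(p->buffer);     /* free(NULL) is a no-op */
            p->buffer = NULL;    /* a second put now does nothing harmful */
        }

        int main(void)
        {
            struct parser p = { .buffer = malloc(128) };

            parser_put(&p);
            parser_put(&p);      /* safe: buffer is already NULL */
            return 0;
        }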
    • ftrace: Have set_graph_functions handle write with RDWR · ae98d27a
      Committed by Steven Rostedt (VMware)
      Reading set_graph_functions uses the seq functions, which set the
      file->private_data pointer to a seq_file descriptor. On writes, the
      ftrace_graph_data descriptor is normally taken from file->private_data.
      But if the file is opened for RDWR, ftrace_graph_write() will
      incorrectly use the file->private_data descriptor instead of the
      ((struct seq_file *)file->private_data)->private pointer, and this can
      crash the kernel (a sketch of the distinction follows this entry).
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      ae98d27a
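
      A kernel-flavored sketch of the distinction (an assumption about the
      shape of the fix, not the literal patch): when the file was opened with
      read access, seq_open() owns file->private_data, so the write path has
      to go one level deeper to reach its own descriptor.

        #include <linux/fs.h>
        #include <linux/seq_file.h>

        struct ftrace_graph_data;                  /* opaque for this sketch */

        static struct ftrace_graph_data *graph_write_data(struct file *file)
        {
            if (file->f_mode & FMODE_READ) {
                struct seq_file *m = file->private_data;

                return m->private;          /* seq_open() stashed it here */
            }
            return file->private_data;      /* write-only open: direct pointer */
        }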
    • ftrace: Reset fgd->hash in ftrace_graph_write() · d4ad9a1c
      Committed by Steven Rostedt (VMware)
      fgd->hash is saved and then freed, but is never reset to either
      ftrace_graph_hash or ftrace_graph_notrace_hash. If multiple writes are
      performed, the freed hash can be accessed again (a sketch of the reset
      follows this entry).
      
       # cd /sys/kernel/debug/tracing
       # head -1000 available_filter_functions > /tmp/funcs
       # cat /tmp/funcs > set_graph_function
      
      Causes:
      
       general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
       Modules linked in:  [...]
       CPU: 2 PID: 1337 Comm: cat Not tainted 4.10.0-rc2-test-00010-g6b052e9 #32
       Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
       task: ffff880113a12200 task.stack: ffffc90001940000
       RIP: 0010:free_ftrace_hash+0x7c/0x160
       RSP: 0018:ffffc90001943db0 EFLAGS: 00010246
       RAX: 6b6b6b6b6b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: 6b6b6b6b6b6b6b6b
       RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffff8800ce1e1d40
       RBP: ffff8800ce1e1d50 R08: 0000000000000000 R09: 0000000000006400
       R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
       R13: ffff8800ce1e1d40 R14: 0000000000004000 R15: 0000000000000001
       FS:  00007f9408a07740(0000) GS:ffff88011e500000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000aee1f0 CR3: 0000000116bb4000 CR4: 00000000001406e0
       Call Trace:
        ? ftrace_graph_write+0x150/0x190
        ? __vfs_write+0x1f6/0x210
        ? __audit_syscall_entry+0x17f/0x200
        ? rw_verify_area+0xdb/0x210
        ? _cond_resched+0x2b/0x50
        ? __sb_start_write+0xb4/0x130
        ? vfs_write+0x1c8/0x330
        ? SyS_write+0x62/0xf0
        ? do_syscall_64+0xa3/0x1b0
        ? entry_SYSCALL64_slow_path+0x25/0x25
       Code: 01 48 85 db 0f 84 92 00 00 00 b8 01 00 00 00 d3 e0 85 c0 7e 3f 83 e8 01 48 8d 6f 10 45 31 e4 4c 8d 34 c5 08 00 00 00 49 8b 45 08 <4a> 8b 34 20 48 85 f6 74 13 48 8b 1e 48 89 ef e8 20 fa ff ff 48
       RIP: free_ftrace_hash+0x7c/0x160 RSP: ffffc90001943db0
       ---[ end trace 999b48216bf4b393 ]---
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      d4ad9a1c
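
      A short sketch of the reset (hypothetical types, and an assumption based
      on the commit title: after the write path has consumed and freed the old
      working copy, the per-open pointer is re-pointed at the live global hash
      so the next write does not touch freed memory):

        #include <stdbool.h>

        struct ftrace_hash_sketch { int dummy; };

        struct fgd_sketch {
            struct ftrace_hash_sketch *hash;    /* per-open working copy */
        };

        static void reset_fgd_hash(struct fgd_sketch *fgd,
                                   struct ftrace_hash_sketch *graph_hash,
                                   struct ftrace_hash_sketch *notrace_hash,
                                   bool is_notrace)
        {
            /* the previous fgd->hash has just been freed by the write path */
            fgd->hash = is_notrace ? notrace_hash : graph_hash;
        }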
    • ftrace: Replace (void *)1 with a meaningful macro name FTRACE_GRAPH_EMPTY · 555fc781
      Committed by Steven Rostedt (VMware)
      When set_graph_function or set_graph_notrace contains no records, a
      banner of either "#### all functions enabled ####" or
      "#### all functions disabled ####" respectively is displayed. To tell
      the seq operations to do this, (void *)1 is passed as a return value.
      Instead of using a hardcoded, meaningless value, define it as a macro
      (see the sketch after this entry).
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      555fc781
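
      The change is simply to give the sentinel a name. A sketch (the macro
      name comes from the commit title; the surrounding start function is
      hypothetical):

        #include <stddef.h>

        /* Sentinel returned by the seq_file iterator to mean "the hash is
         * empty, print the '#### all functions ... ####' banner". */
        #define FTRACE_GRAPH_EMPTY    ((void *)1)

        static void *g_start_sketch(int hash_is_empty)
        {
            if (hash_is_empty)
                return FTRACE_GRAPH_EMPTY;   /* was the bare (void *)1 */
            return NULL;                     /* real code returns the first record */
        }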
    • ftrace: Create a slight optimization on searching the ftrace_hash · 2b2c279c
      Committed by Steven Rostedt (VMware)
      This is a micro-optimization, but as it sits in a fast path of the
      function tracer, such optimizations can be noticed.

      ftrace_lookup_ip() returns true if the given ip is found in the hash. If
      it is not found or the hash is NULL, it returns false. But there are
      cases where a NULL hash counts as true, and in those cases
      ftrace_hash_empty() is tested before calling ftrace_lookup_ip(). Since
      ftrace_lookup_ip() performs that test again, this adds a few extra,
      unneeded instructions.

      A new static "always inlined" function is created that does not perform
      the hash-empty test (see the sketch after this entry). It must only be
      used by callers that do the check first anyway, as an empty or NULL hash
      could cause a crash if a lookup is performed on it.

      Also add kernel doc for the main ftrace_lookup_ip() function.
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      2b2c279c
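
      A toy sketch of the split (the hash here is faked; only the call
      structure mirrors the idea, and in the kernel the inner helper would be
      marked __always_inline): the inner lookup skips the empty/NULL check,
      and the public wrapper keeps it.

        #include <stdbool.h>
        #include <stddef.h>

        struct toy_hash {
            unsigned long *ips;
            size_t         count;
        };

        static inline bool hash_empty(const struct toy_hash *hash)
        {
            return !hash || hash->count == 0;
        }

        /* No empty/NULL check: only for callers that have already done it. */
        static inline bool __lookup_ip_fast(const struct toy_hash *hash,
                                            unsigned long ip)
        {
            for (size_t i = 0; i < hash->count; i++)
                if (hash->ips[i] == ip)
                    return true;
            return false;
        }

        /* Public entry point keeps the safety check. */
        static inline bool lookup_ip(const struct toy_hash *hash, unsigned long ip)
        {
            return hash_empty(hash) ? false : __lookup_ip_fast(hash, ip);
        }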
    • tracing: Add ftrace_hash_key() helper function · 2b0cce0e
      Committed by Steven Rostedt (VMware)
      Replace the couple of places that have a small bit of logic to produce
      the ftrace function key id with a helper function (sketched after this
      entry). No need for duplicate code.
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      2b0cce0e
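
      A kernel-flavored sketch of what such a key helper looks like (the body
      is an assumption; the point is that every caller shares one helper
      instead of open-coding the hashing of the instruction pointer):

        #include <linux/hash.h>
        #include <linux/compiler.h>

        struct hash_sketch {
            int size_bits;       /* log2 of the number of buckets */
        };

        static __always_inline unsigned long
        hash_key_sketch(struct hash_sketch *hash, unsigned long ip)
        {
            if (hash->size_bits)
                return hash_long(ip, hash->size_bits);
            return 0;
        }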
  5. 21 Jan 2017, 3 commits
  6. 19 Jan 2017, 2 commits
  7. 18 Jan 2017, 2 commits
    • tracing: Process constants for (un)likely() profiler · d45ae1f7
      Committed by Steven Rostedt (VMware)
      When running the likely/unlikely profiler, one of the results did not
      look accurate. It noted that the unlikely() in link_path_walk() was 100%
      incorrect. When I added a trace_printk() to see what was happening
      there, it became 80% correct! Looking deeper into what was happening, I
      found that gcc split that if statement into two paths: one where the if
      statement became a constant, the other where it remained a variable. The
      other path had the if statement always hit (making the unlikely there
      always false), but since the #define unlikely() has:
      
        #define unlikely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 0))
      
      Where constants are ignored by the branch profiler, the "constant" path
      made by the compiler was ignored, even though it was hit 80% of the
      time.

      By passing the constant value to the __branch_check__() function and
      tracing it out of line (always reported as correct, since
      likely/unlikely is not a factor for constants), we get back accurate
      readings for the branches that gcc optimized by making part of the
      execution constant (a sketch of the constant-aware check follows this
      entry).
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      d45ae1f7
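
      A hedged, user-space sketch of the direction of the change (not the
      kernel's actual annotated-branch macros): the compile-time constant flag
      is passed down to the checking function so the profiler can record the
      branch out of line instead of silently skipping it.

        #include <stdio.h>

        /* Stand-in for the profiler hook; in the kernel this is done by the
         * instrumented likely()/unlikely() macros. */
        static long __branch_check_sketch(long cond, int expect, int is_constant)
        {
            printf("cond=%ld expect=%d constant=%d\n", cond, expect, is_constant);
            return cond;
        }

        #define unlikely_sketch(x) \
            (__branch_check_sketch(!!(x), 0, __builtin_constant_p(x)))

        int main(void)
        {
            volatile int v = 1;

            if (unlikely_sketch(v))     /* non-constant branch */
                puts("variable path");
            if (unlikely_sketch(0))     /* constant branch: still recorded now */
                puts("never");
            return 0;
        }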
    • uprobe: Find last occurrence of ':' when parsing uprobe PATH:OFFSET · 6496bb72
      Committed by Kenny Yu
      Previously, `create_trace_uprobe` found the *first* occurrence
      of the ':' character when parsing `PATH:OFFSET` for a uprobe.
      However, if the path contains a ':' character, the function
      would parse the path incorrectly. Even worse, if the path does not
      exist, the subsequent call to `kern_path()` would set `ret` to
      `ENOENT`, leading to very cryptic errno values in user space.

      The fix is to find the *last* occurrence of ':' (see the sketch after
      this entry).

      How to reproduce (the write fails with "No such file or directory",
      incorrectly suggesting that the `uprobe_events` file itself does not
      exist):
      
        $ mkdir testing && cd testing
        $ cp /bin/bash .
        $ cp /bin/bash ./bash:with:colon
        $ echo "p:uprobes/p__root_testing_bash_0x6 /root/testing/bash:0x6" > /sys/kernel/debug/tracing/uprobe_events     # this works
        $ echo "p:uprobes/p__root_testing_bash_with_colon_0x6 /root/testing/bash:with:colon:0x6" >> /sys/kernel/debug/tracing/uprobe_events     # this doesn't
        -bash: echo: write error: No such file or directory
      
      With the patch:
      
        $ echo "p:uprobes/p__root_testing_bash_0x6 /root/testing/bash:0x6" > /sys/kernel/debug/tracing/uprobe_events     # this still works
        $ echo "p:uprobes/p__root_testing_bash_with_colon_0x6 /root/testing/bash:with:colon:0x6" >> /sys/kernel/debug/tracing/uprobe_events     # this works now too!
        $ cat /sys/kernel/debug/tracing/uprobe_events
        p:uprobes/p__root_testing_bash_0x6 /root/testing/bash:0x0000000000000006
        p:uprobes/p__root_testing_bash_with_colon_0x6 /root/testing/bash:with:colon:0x0000000000000006
      
      Link: http://lkml.kernel.org/r/20170113165834.4081016-1-kennyyu@fb.com
      Signed-off-by: Kenny Yu <kennyyu@fb.com>
      Reviewed-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      6496bb72
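
      The essence of the fix is a "first ':'" to "last ':'" change, i.e. a
      strchr()-style lookup replaced by a strrchr()-style lookup. A standalone
      sketch of the parsing difference (a hypothetical helper, not the
      kernel's create_trace_uprobe()):

        /* Split "PATH:OFFSET" at the LAST ':' so paths containing ':' work. */
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        static int parse_path_offset(char *arg, char **path, unsigned long *offset)
        {
            char *colon = strrchr(arg, ':');    /* strchr() would stop too early */

            if (!colon)
                return -1;
            *colon = '\0';
            *path = arg;
            *offset = strtoul(colon + 1, NULL, 0);    /* accepts "0x6" */
            return 0;
        }

        int main(void)
        {
            char arg[] = "/root/testing/bash:with:colon:0x6";
            char *path;
            unsigned long off;

            if (parse_path_offset(arg, &path, &off) == 0)
                printf("path=%s offset=0x%lx\n", path, off);
            return 0;
        }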
  8. 25 Dec 2016, 1 commit
  9. 13 Dec 2016, 4 commits
  10. 09 Dec 2016, 7 commits
    • tracing/fgraph: Have wakeup and irqsoff tracers ignore graph functions too · 1a414428
      Committed by Steven Rostedt (Red Hat)
      Currently both the wakeup and irqsoff tracers do not handle
      set_graph_notrace well. The ftrace infrastructure will ignore the return
      paths of all such functions, leaving them hanging without an end:
      
        # echo '*spin*' > set_graph_notrace
        # cat trace
        [...]
                _raw_spin_lock() {
                  preempt_count_add() {
                  do_raw_spin_lock() {
                update_rq_clock();
      
      Where the '*spin*' functions should have looked like this:
      
                _raw_spin_lock() {
                  preempt_count_add();
                  do_raw_spin_lock();
                }
                update_rq_clock();
      
      Instead, have the wakeup and irqsoff tracers ignore the functions that
      are set by set_graph_notrace, like the function_graph tracer does. Move
      the logic from the function_graph tracer into a header to allow the
      wakeup and irqsoff tracers to use it as well (a sketch of such a shared
      helper follows this entry).
      
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      1a414428
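
      A hedged sketch of how a shared header helper might look (the helper
      name, structure, and hash lookup are assumptions; the real logic lives
      in the kernel's tracing headers and consults the notrace hash):

        #include <stdbool.h>

        struct graph_ent_sketch {
            unsigned long func;     /* address of the function being entered */
        };

        /* Stand-in for a lookup in the set_graph_notrace hash. */
        static bool in_notrace_hash(unsigned long func) { (void)func; return false; }

        /* Header-level helper each graph-capable tracer (function_graph,
         * wakeup, irqsoff) can call from its entry callback. */
        static inline bool graph_ignore_func_sketch(struct graph_ent_sketch *trace)
        {
            return in_notrace_hash(trace->func);
        }

        /* Usage in a tracer's graph entry callback:
         *
         *     if (graph_ignore_func_sketch(trace))
         *         return 0;    // do not record this function at all
         */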
    • fgraph: Handle a case where a tracer ignores set_graph_notrace · 794de08a
      Committed by Steven Rostedt (Red Hat)
      Both the wakeup and irqsoff tracers can use the function graph tracer
      when the display-graph option is set. The problem is that they ignore
      the notrace file and record the entry of functions that would be ignored
      by the function_graph tracer. This causes the trace->depth to be
      recorded into the ring buffer. set_graph_notrace uses a trick of adding
      a large negative number to trace->depth when a graph function is to be
      ignored.
      
      On trace output, the graph code uses the depth to index a stack of
      functions. But since the depth is negative, it accesses the array with a
      negative index, causing an out-of-bounds access that can lead to a
      kernel oops or corrupted data.

      Have the print functions handle cases where a tracer still records
      functions even when they are in set_graph_notrace.

      Also add warnings if the depth is below zero before accessing the array
      (see the sketch after this entry).
      
      Note, the function graph logic will still prevent the return of these
      functions from being recorded, which means that they will be left hanging
      without a return. For example:
      
         # echo '*spin*' > set_graph_notrace
         # echo 1 > options/display-graph
         # echo wakeup > current_tracer
         # cat trace
         [...]
            _raw_spin_lock() {
              preempt_count_add() {
              do_raw_spin_lock() {
            update_rq_clock();
      
      Where it should look like:
      
            _raw_spin_lock() {
              preempt_count_add();
              do_raw_spin_lock();
            }
            update_rq_clock();
      
      Cc: stable@vger.kernel.org
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Fixes: 29ad23b0 ("ftrace: Add set_graph_notrace filter")
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      794de08a
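
      A minimal sketch of the defensive check described above (the array and
      field names are hypothetical; the kernel code would use WARN_ON_ONCE()
      rather than a printed message):

        #include <stdio.h>

        #define FUNC_STACK_DEPTH 50

        struct graph_stack_sketch {
            unsigned long funcs[FUNC_STACK_DEPTH];
        };

        /* A negative depth means the entry was recorded for a notrace'd
         * function and must never be used to index the stack. */
        static int record_func(struct graph_stack_sketch *s, int depth,
                               unsigned long func)
        {
            if (depth < 0 || depth >= FUNC_STACK_DEPTH) {
                fprintf(stderr, "bad graph depth %d\n", depth);
                return -1;
            }
            s->funcs[depth] = func;
            return 0;
        }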
    • tracing: Replace kmap with copy_from_user() in trace_marker writing · 656c7f0d
      Committed by Steven Rostedt (Red Hat)
      Instead of using get_user_pages_fast() and kmap_atomic() when writing
      to the trace_marker file, just allocate enough space on the ring buffer
      directly, and write into it via copy_from_user().

      Writing into the trace_marker file used to allocate a temporary buffer
      to perform the copy_from_user(), as we did not want to write into the
      ring buffer if the copy failed. But as a trace_marker write is supposed
      to be extremely fast, and allocating memory causes other tracepoints to
      trigger, Peter Zijlstra suggested using get_user_pages_fast() and
      kmap_atomic() to keep the user space pages in memory and read them
      directly. But Henrik Austad had issues with this because it required
      taking the mm->mmap_sem, causing long delays with the write.

      Instead, just allocate the space in the ring buffer and use
      copy_from_user() directly. If it faults, return -EFAULT and write
      "<faulted>" into the ring buffer (a sketch of this reserve-then-copy
      approach follows this entry).
      
      Link: http://lkml.kernel.org/r/20161208124018.72dd0f86@gandalf.local.home
      
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Henrik Austad <henrik@austad.us>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Updates: d696b58c "tracing: Do not allocate buffer for trace_marker"
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      656c7f0d
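
      A kernel-flavored sketch of the reserve-then-copy idea (simplified and
      hedged: ring_buffer_lock_reserve(), ring_buffer_event_data(),
      ring_buffer_unlock_commit() and copy_from_user() are real kernel APIs of
      that era, but the surrounding function, sizing and error handling are
      assumptions, not the actual tracing_mark_write()):

        #include <linux/ring_buffer.h>
        #include <linux/uaccess.h>
        #include <linux/string.h>
        #include <linux/errno.h>

        #define FAULTED_STR "<faulted>"

        static ssize_t mark_write_sketch(struct ring_buffer *buffer,
                                         const char __user *ubuf, size_t cnt)
        {
            struct ring_buffer_event *event;
            size_t size = cnt + 1;
            ssize_t written = cnt;
            char *entry;

            if (size < sizeof(FAULTED_STR))
                size = sizeof(FAULTED_STR);    /* room for the marker on a fault */

            /* Reserve space in the ring buffer first... */
            event = ring_buffer_lock_reserve(buffer, size);
            if (!event)
                return -EBADF;                 /* ring buffer disabled */

            /* ...then copy the user data straight into it.  (The real code
             * must use a copy variant that is safe with preemption disabled.) */
            entry = ring_buffer_event_data(event);
            if (copy_from_user(entry, ubuf, cnt)) {
                memcpy(entry, FAULTED_STR, sizeof(FAULTED_STR));
                written = -EFAULT;             /* report the fault to the writer */
            } else {
                entry[cnt] = '\0';
            }

            ring_buffer_unlock_commit(buffer, event);
            return written;
        }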
    • tracing: Allow benchmark to be enabled at early_initcall() · 9c1f6bb8
      Committed by Steven Rostedt (Red Hat)
      The trace event start-up selftests fail when the trace benchmark is
      enabled, because the benchmark is disabled during boot. It really only
      needs to be disabled before scheduling is set up, as it creates a
      thread.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      9c1f6bb8
    • tracing: Have system enable return error if one of the events fail · 989a0a3d
      Committed by Steven Rostedt (Red Hat)
      If one of the events within a system fails to enable when "1" is written
      to the system's "enable" file, an error should be returned. Note, some
      events may still be enabled, but the user should know that something did
      go wrong (a sketch of the error propagation follows this entry).
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      989a0a3d
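
      A minimal sketch of the error propagation (a hypothetical loop, not the
      actual trace_events code): keep enabling the remaining events, but
      remember the first failure so the writer of the "enable" file sees it.

        #include <errno.h>

        struct event_sketch { int enabled; int broken; };

        static int enable_one_sketch(struct event_sketch *ev)
        {
            if (ev->broken)
                return -EINVAL;          /* e.g. its reg function failed */
            ev->enabled = 1;
            return 0;
        }

        static int enable_system_sketch(struct event_sketch *events, int nr)
        {
            int ret = 0;

            for (int i = 0; i < nr; i++) {
                int err = enable_one_sketch(&events[i]);

                if (err && !ret)
                    ret = err;           /* remember the first failure */
            }
            return ret;
        }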
    • tracing: Do not start benchmark on boot up · 1dd349ab
      Committed by Steven Rostedt (Red Hat)
      Trace events can be enabled very early on boot up via the boot command
      line parameter. The benchmark tool creates a new thread to perform the
      trace event benchmarking. But at that point in start up it is called
      before scheduling is set up, and because it creates a new thread before
      the init thread is created, this crashes the kernel.

      Have the benchmark fail to register when started via the kernel command
      line.

      Also, since the registering of a tracepoint can now handle failure
      cases, return -ENOMEM instead of warning if the thread cannot be created
      (a sketch of such a reg callback follows this entry).
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      1dd349ab
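
      A kernel-flavored sketch of the reg callback's shape (the flag, names
      and exact error codes are assumptions; kthread_run() and IS_ERR() are
      real kernel APIs):

        #include <linux/kthread.h>
        #include <linux/err.h>
        #include <linux/errno.h>
        #include <linux/types.h>

        static bool benchmark_ok_to_run;     /* set later, once scheduling works */
        static struct task_struct *bm_thread;

        static int benchmark_thread_fn(void *data)
        {
            return 0;                        /* real benchmark loop omitted */
        }

        static int trace_benchmark_reg_sketch(void)
        {
            if (!benchmark_ok_to_run)
                return -EBUSY;               /* refuse when enabled from the boot command line */

            bm_thread = kthread_run(benchmark_thread_fn, NULL, "event_benchmark");
            if (IS_ERR(bm_thread))
                return -ENOMEM;              /* report failure instead of just warning */
            return 0;
        }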
    • tracing: Have the reg function allow to fail · 8cf868af
      Committed by Steven Rostedt (Red Hat)
      Some tracepoints have a registration function that gets called when the
      tracepoint is enabled. There may be cases where the registration
      function must fail (for example, it cannot allocate enough memory). In
      that case, the tracepoint should also fail to register; otherwise the
      user would not know why the tracepoint is not working (a sketch of the
      changed callback shape follows this entry).
      
      Cc: David Howells <dhowells@redhat.com>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      8cf868af
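
      A small sketch of what "the reg function is allowed to fail" means for
      callers (hypothetical structures; the real change threads the return
      value through the tracepoint and trace-event registration paths):

        #include <errno.h>
        #include <stdlib.h>

        struct event_sketch2 {
            void *ctx;                   /* state the reg function must allocate */
        };

        /* Before: the reg callback returned void, so an allocation failure
         * could only be warned about.  After: it returns int, and the error
         * reaches whoever tried to enable the event. */
        static int event_reg_sketch(struct event_sketch2 *ev)
        {
            ev->ctx = malloc(4096);
            if (!ev->ctx)
                return -ENOMEM;
            return 0;
        }

        static int event_enable_sketch(struct event_sketch2 *ev)
        {
            return event_reg_sketch(ev); /* the caller now sees the failure */
        }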
  11. 07 Dec 2016, 1 commit
  12. 02 Dec 2016, 1 commit
  13. 30 Nov 2016, 1 commit
  14. 24 Nov 2016, 2 commits