提交 86236611 编写于 作者: L Linus Torvalds

Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (244 commits)
  Revert "x86, bts: reenable ptrace branch trace support"
  tracing: do not translate event helper macros in print format
  ftrace/documentation: fix typo in function grapher name
  tracing/events: convert block trace points to TRACE_EVENT(), fix !CONFIG_BLOCK
  tracing: add protection around module events unload
  tracing: add trace_seq_vprint interface
  tracing: fix the block trace points print size
  tracing/events: convert block trace points to TRACE_EVENT()
  ring-buffer: fix ret in rb_add_time_stamp
  ring-buffer: pass in lockdep class key for reader_lock
  tracing: add annotation to what type of stack trace is recorded
  tracing: fix multiple use of __print_flags and __print_symbolic
  tracing/events: fix output format of user stack
  tracing/events: fix output format of kernel stack
  tracing/trace_stack: fix the number of entries in the header
  ring-buffer: discard timestamps that are at the start of the buffer
  ring-buffer: try to discard unneeded timestamps
  ring-buffer: fix bug in ring_buffer_discard_commit
  ftrace: do not profile functions when disabled
  tracing: make trace pipe recognize latency format flag
  ...
......@@ -13,7 +13,8 @@ DOCBOOKS := z8530book.xml mcabook.xml device-drivers.xml \
gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \
mac80211.xml debugobjects.xml sh.xml regulator.xml \
alsa-driver-api.xml writing-an-alsa-driver.xml
alsa-driver-api.xml writing-an-alsa-driver.xml \
tracepoint.xml
###
# The build process is as follows (targets):
......
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
<book id="Tracepoints">
<bookinfo>
<title>The Linux Kernel Tracepoint API</title>
<authorgroup>
<author>
<firstname>Jason</firstname>
<surname>Baron</surname>
<affiliation>
<address>
<email>jbaron@redhat.com</email>
</address>
</affiliation>
</author>
</authorgroup>
<legalnotice>
<para>
This documentation is free software; you can redistribute
it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation; either
version 2 of the License, or (at your option) any later
version.
</para>
<para>
This program is distributed in the hope that it will be
useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
</para>
<para>
You should have received a copy of the GNU General Public
License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
MA 02111-1307 USA
</para>
<para>
For more details see the file COPYING in the source
distribution of Linux.
</para>
</legalnotice>
</bookinfo>
<toc></toc>
<chapter id="intro">
<title>Introduction</title>
<para>
Tracepoints are static probe points that are located in strategic points
throughout the kernel. 'Probes' register/unregister with tracepoints
via a callback mechanism. The 'probes' are strictly typed functions that
are passed a unique set of parameters defined by each tracepoint.
</para>
<para>
From this simple callback mechanism, 'probes' can be used to profile, debug,
and understand kernel behavior. There are a number of tools that provide a
framework for using 'probes'. These tools include Systemtap, ftrace, and
LTTng.
</para>
<para>
Tracepoints are defined in a number of header files via various macros. Thus,
the purpose of this document is to provide a clear accounting of the available
tracepoints. The intention is to understand not only what tracepoints are
available but also to understand where future tracepoints might be added.
</para>
<para>
The API presented has functions of the form:
<function>trace_tracepointname(function parameters)</function>. These are the
tracepoints callbacks that are found throughout the code. Registering and
unregistering probes with these callback sites is covered in the
<filename>Documentation/trace/*</filename> directory.
</para>
</chapter>
<chapter id="irq">
<title>IRQ</title>
!Iinclude/trace/events/irq.h
</chapter>
</book>
......@@ -56,7 +56,6 @@ parameter is applicable:
ISAPNP ISA PnP code is enabled.
ISDN Appropriate ISDN support is enabled.
JOY Appropriate joystick support is enabled.
KMEMTRACE kmemtrace is enabled.
LIBATA Libata driver is enabled
LP Printer support is enabled.
LOOP Loopback device support is enabled.
......@@ -754,12 +753,25 @@ and is between 256 and 4096 characters. It is defined in the file
ia64_pal_cache_flush instead of SAL_CACHE_FLUSH.
ftrace=[tracer]
[ftrace] will set and start the specified tracer
[FTRACE] will set and start the specified tracer
as early as possible in order to facilitate early
boot debugging.
ftrace_dump_on_oops
[ftrace] will dump the trace buffers on oops.
[FTRACE] will dump the trace buffers on oops.
ftrace_filter=[function-list]
[FTRACE] Limit the functions traced by the function
tracer at boot up. function-list is a comma separated
list of functions. This list can be changed at run
time by the set_ftrace_filter file in the debugfs
tracing directory.
ftrace_notrace=[function-list]
[FTRACE] Do not trace the functions specified in
function-list. This list can be changed at run time
by the set_ftrace_notrace file in the debugfs
tracing directory.
gamecon.map[2|3]=
[HW,JOY] Multisystem joystick and NES/SNES/PSX pad
......@@ -1056,15 +1068,6 @@ and is between 256 and 4096 characters. It is defined in the file
use the HighMem zone if it exists, and the Normal
zone if it does not.
kmemtrace.enable= [KNL,KMEMTRACE] Format: { yes | no }
Controls whether kmemtrace is enabled
at boot-time.
kmemtrace.subbufs=n [KNL,KMEMTRACE] Overrides the number of
subbufs kmemtrace's relay channel has. Set this
higher than default (KMEMTRACE_N_SUBBUFS in code) if
you experience buffer overruns.
kgdboc= [HW] kgdb over consoles.
Requires a tty driver that supports console polling.
(only serial suported for now)
......
Event Tracing
Documentation written by Theodore Ts'o
Updated by Li Zefan
1. Introduction
===============
Tracepoints (see Documentation/trace/tracepoints.txt) can be used
without creating custom kernel modules to register probe functions
using the event tracing infrastructure.
Not all tracepoints can be traced using the event tracing system;
the kernel developer must provide code snippets which define how the
tracing information is saved into the tracing buffer, and how the
tracing information should be printed.
2. Using Event Tracing
======================
2.1 Via the 'set_event' interface
---------------------------------
The events which are available for tracing can be found in the file
/debug/tracing/available_events.
To enable a particular event, such as 'sched_wakeup', simply echo it
to /debug/tracing/set_event. For example:
# echo sched_wakeup >> /debug/tracing/set_event
[ Note: '>>' is necessary, otherwise it will firstly disable
all the events. ]
To disable an event, echo the event name to the set_event file prefixed
with an exclamation point:
# echo '!sched_wakeup' >> /debug/tracing/set_event
To disable all events, echo an empty line to the set_event file:
# echo > /debug/tracing/set_event
To enable all events, echo '*:*' or '*:' to the set_event file:
# echo *:* > /debug/tracing/set_event
The events are organized into subsystems, such as ext4, irq, sched,
etc., and a full event name looks like this: <subsystem>:<event>. The
subsystem name is optional, but it is displayed in the available_events
file. All of the events in a subsystem can be specified via the syntax
"<subsystem>:*"; for example, to enable all irq events, you can use the
command:
# echo 'irq:*' > /debug/tracing/set_event
2.2 Via the 'enable' toggle
---------------------------
The events available are also listed in /debug/tracing/events/ hierarchy
of directories.
To enable event 'sched_wakeup':
# echo 1 > /debug/tracing/events/sched/sched_wakeup/enable
To disable it:
# echo 0 > /debug/tracing/events/sched/sched_wakeup/enable
To enable all events in sched subsystem:
# echo 1 > /debug/tracing/events/sched/enable
To eanble all events:
# echo 1 > /debug/tracing/events/enable
When reading one of these enable files, there are four results:
0 - all events this file affects are disabled
1 - all events this file affects are enabled
X - there is a mixture of events enabled and disabled
? - this file does not affect any event
3. Defining an event-enabled tracepoint
=======================================
See The example provided in samples/trace_events
......@@ -179,7 +179,7 @@ Here is the list of current tracers that may be configured.
Function call tracer to trace all kernel functions.
"function_graph_tracer"
"function_graph"
Similar to the function tracer except that the
function tracer probes the functions on their entry
......
The power tracer collects detailed information about C-state and P-state
transitions, instead of just looking at the high-level "average"
information.
There is a helper script found in scrips/tracing/power.pl in the kernel
sources which can be used to parse this information and create a
Scalable Vector Graphics (SVG) picture from the trace data.
To use this tracer:
echo 0 > /sys/kernel/debug/tracing/tracing_enabled
echo power > /sys/kernel/debug/tracing/current_tracer
echo 1 > /sys/kernel/debug/tracing/tracing_enabled
sleep 1
echo 0 > /sys/kernel/debug/tracing/tracing_enabled
cat /sys/kernel/debug/tracing/trace | \
perl scripts/tracing/power.pl > out.sv
......@@ -174,6 +174,15 @@ config IOMMU_LEAK
Add a simple leak tracer to the IOMMU code. This is useful when you
are debugging a buggy device driver that leaks IOMMU mappings.
config X86_DS_SELFTEST
bool "DS selftest"
default y
depends on DEBUG_KERNEL
depends on X86_DS
---help---
Perform Debug Store selftests at boot time.
If in doubt, say "N".
config HAVE_MMIOTRACE_SUPPORT
def_bool y
......
......@@ -15,8 +15,8 @@
* - buffer allocation (memory accounting)
*
*
* Copyright (C) 2007-2008 Intel Corporation.
* Markus Metzger <markus.t.metzger@intel.com>, 2007-2008
* Copyright (C) 2007-2009 Intel Corporation.
* Markus Metzger <markus.t.metzger@intel.com>, 2007-2009
*/
#ifndef _ASM_X86_DS_H
......@@ -83,8 +83,10 @@ enum ds_feature {
* The interrupt threshold is independent from the overflow callback
* to allow users to use their own overflow interrupt handling mechanism.
*
* task: the task to request recording for;
* NULL for per-cpu recording on the current cpu
* The function might sleep.
*
* task: the task to request recording for
* cpu: the cpu to request recording for
* base: the base pointer for the (non-pageable) buffer;
* size: the size of the provided buffer in bytes
* ovfl: pointer to a function to be called on buffer overflow;
......@@ -93,19 +95,28 @@ enum ds_feature {
* -1 if no interrupt threshold is requested.
* flags: a bit-mask of the above flags
*/
extern struct bts_tracer *ds_request_bts(struct task_struct *task,
void *base, size_t size,
bts_ovfl_callback_t ovfl,
size_t th, unsigned int flags);
extern struct pebs_tracer *ds_request_pebs(struct task_struct *task,
void *base, size_t size,
pebs_ovfl_callback_t ovfl,
size_t th, unsigned int flags);
extern struct bts_tracer *ds_request_bts_task(struct task_struct *task,
void *base, size_t size,
bts_ovfl_callback_t ovfl,
size_t th, unsigned int flags);
extern struct bts_tracer *ds_request_bts_cpu(int cpu, void *base, size_t size,
bts_ovfl_callback_t ovfl,
size_t th, unsigned int flags);
extern struct pebs_tracer *ds_request_pebs_task(struct task_struct *task,
void *base, size_t size,
pebs_ovfl_callback_t ovfl,
size_t th, unsigned int flags);
extern struct pebs_tracer *ds_request_pebs_cpu(int cpu,
void *base, size_t size,
pebs_ovfl_callback_t ovfl,
size_t th, unsigned int flags);
/*
* Release BTS or PEBS resources
* Suspend and resume BTS or PEBS tracing
*
* Must be called with irq's enabled.
*
* tracer: the tracer handle returned from ds_request_~()
*/
extern void ds_release_bts(struct bts_tracer *tracer);
......@@ -115,6 +126,28 @@ extern void ds_release_pebs(struct pebs_tracer *tracer);
extern void ds_suspend_pebs(struct pebs_tracer *tracer);
extern void ds_resume_pebs(struct pebs_tracer *tracer);
/*
* Release BTS or PEBS resources
* Suspend and resume BTS or PEBS tracing
*
* Cpu tracers must call this on the traced cpu.
* Task tracers must call ds_release_~_noirq() for themselves.
*
* May be called with irq's disabled.
*
* Returns 0 if successful;
* -EPERM if the cpu tracer does not trace the current cpu.
* -EPERM if the task tracer does not trace itself.
*
* tracer: the tracer handle returned from ds_request_~()
*/
extern int ds_release_bts_noirq(struct bts_tracer *tracer);
extern int ds_suspend_bts_noirq(struct bts_tracer *tracer);
extern int ds_resume_bts_noirq(struct bts_tracer *tracer);
extern int ds_release_pebs_noirq(struct pebs_tracer *tracer);
extern int ds_suspend_pebs_noirq(struct pebs_tracer *tracer);
extern int ds_resume_pebs_noirq(struct pebs_tracer *tracer);
/*
* The raw DS buffer state as it is used for BTS and PEBS recording.
......@@ -170,9 +203,9 @@ struct bts_struct {
} lbr;
/* BTS_TASK_ARRIVES or BTS_TASK_DEPARTS */
struct {
__u64 jiffies;
__u64 clock;
pid_t pid;
} timestamp;
} event;
} variant;
};
......@@ -201,8 +234,12 @@ struct bts_trace {
struct pebs_trace {
struct ds_trace ds;
/* the PEBS reset value */
unsigned long long reset_value;
/* the number of valid counters in the below array */
unsigned int counters;
#define MAX_PEBS_COUNTERS 4
/* the counter reset value */
unsigned long long counter_reset[MAX_PEBS_COUNTERS];
};
......@@ -237,9 +274,11 @@ extern int ds_reset_pebs(struct pebs_tracer *tracer);
* Returns 0 on success; -Eerrno on error
*
* tracer: the tracer handle returned from ds_request_pebs()
* counter: the index of the counter
* value: the new counter reset value
*/
extern int ds_set_pebs_reset(struct pebs_tracer *tracer, u64 value);
extern int ds_set_pebs_reset(struct pebs_tracer *tracer,
unsigned int counter, u64 value);
/*
* Initialization
......@@ -252,21 +291,12 @@ extern void __cpuinit ds_init_intel(struct cpuinfo_x86 *);
*/
extern void ds_switch_to(struct task_struct *prev, struct task_struct *next);
/*
* Task clone/init and cleanup work
*/
extern void ds_copy_thread(struct task_struct *tsk, struct task_struct *father);
extern void ds_exit_thread(struct task_struct *tsk);
#else /* CONFIG_X86_DS */
struct cpuinfo_x86;
static inline void __cpuinit ds_init_intel(struct cpuinfo_x86 *ignored) {}
static inline void ds_switch_to(struct task_struct *prev,
struct task_struct *next) {}
static inline void ds_copy_thread(struct task_struct *tsk,
struct task_struct *father) {}
static inline void ds_exit_thread(struct task_struct *tsk) {}
#endif /* CONFIG_X86_DS */
#endif /* _ASM_X86_DS_H */
......@@ -462,14 +462,8 @@ struct thread_struct {
unsigned io_bitmap_max;
/* MSR_IA32_DEBUGCTLMSR value to switch in if TIF_DEBUGCTLMSR is set. */
unsigned long debugctlmsr;
#ifdef CONFIG_X86_DS
/* Debug Store context; see include/asm-x86/ds.h; goes into MSR_IA32_DS_AREA */
/* Debug Store context; see asm/ds.h */
struct ds_context *ds_ctx;
#endif /* CONFIG_X86_DS */
#ifdef CONFIG_X86_PTRACE_BTS
/* the signal to send on a bts buffer overflow */
unsigned int bts_ovfl_signal;
#endif /* CONFIG_X86_PTRACE_BTS */
};
static inline unsigned long native_get_debugreg(int regno)
......@@ -797,6 +791,21 @@ static inline unsigned long get_debugctlmsr(void)
return debugctlmsr;
}
static inline unsigned long get_debugctlmsr_on_cpu(int cpu)
{
u64 debugctlmsr = 0;
u32 val1, val2;
#ifndef CONFIG_X86_DEBUGCTLMSR
if (boot_cpu_data.x86 < 6)
return 0;
#endif
rdmsr_on_cpu(cpu, MSR_IA32_DEBUGCTLMSR, &val1, &val2);
debugctlmsr = val1 | ((u64)val2 << 32);
return debugctlmsr;
}
static inline void update_debugctlmsr(unsigned long debugctlmsr)
{
#ifndef CONFIG_X86_DEBUGCTLMSR
......@@ -806,6 +815,18 @@ static inline void update_debugctlmsr(unsigned long debugctlmsr)
wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctlmsr);
}
static inline void update_debugctlmsr_on_cpu(int cpu,
unsigned long debugctlmsr)
{
#ifndef CONFIG_X86_DEBUGCTLMSR
if (boot_cpu_data.x86 < 6)
return;
#endif
wrmsr_on_cpu(cpu, MSR_IA32_DEBUGCTLMSR,
(u32)((u64)debugctlmsr),
(u32)((u64)debugctlmsr >> 32));
}
/*
* from system description table in BIOS. Mostly for MCA use, but
* others may find it useful:
......
......@@ -236,12 +236,11 @@ extern int do_get_thread_area(struct task_struct *p, int idx,
extern int do_set_thread_area(struct task_struct *p, int idx,
struct user_desc __user *info, int can_allocate);
extern void x86_ptrace_untrace(struct task_struct *);
extern void x86_ptrace_fork(struct task_struct *child,
unsigned long clone_flags);
#ifdef CONFIG_X86_PTRACE_BTS
extern void ptrace_bts_untrace(struct task_struct *tsk);
#define arch_ptrace_untrace(tsk) x86_ptrace_untrace(tsk)
#define arch_ptrace_fork(child, flags) x86_ptrace_fork(child, flags)
#define arch_ptrace_untrace(tsk) ptrace_bts_untrace(tsk)
#endif /* CONFIG_X86_PTRACE_BTS */
#endif /* __KERNEL__ */
......
......@@ -17,7 +17,7 @@
static inline void __native_flush_tlb(void)
{
write_cr3(read_cr3());
native_write_cr3(native_read_cr3());
}
static inline void __native_flush_tlb_global(void)
......@@ -32,11 +32,11 @@ static inline void __native_flush_tlb_global(void)
*/
raw_local_irq_save(flags);
cr4 = read_cr4();
cr4 = native_read_cr4();
/* clear PGE */
write_cr4(cr4 & ~X86_CR4_PGE);
native_write_cr4(cr4 & ~X86_CR4_PGE);
/* write old PGE again and flush TLBs */
write_cr4(cr4);
native_write_cr4(cr4);
raw_local_irq_restore(flags);
}
......
......@@ -44,6 +44,7 @@ obj-y += process.o
obj-y += i387.o xsave.o
obj-y += ptrace.o
obj-$(CONFIG_X86_DS) += ds.o
obj-$(CONFIG_X86_DS_SELFTEST) += ds_selftest.o
obj-$(CONFIG_X86_32) += tls.o
obj-$(CONFIG_IA32_EMULATION) += tls.o
obj-y += step.o
......
此差异已折叠。
/*
* Debug Store support - selftest
*
*
* Copyright (C) 2009 Intel Corporation.
* Markus Metzger <markus.t.metzger@intel.com>, 2009
*/
#include "ds_selftest.h"
#include <linux/kernel.h>
#include <linux/string.h>
#include <linux/smp.h>
#include <linux/cpu.h>
#include <asm/ds.h>
#define BUFFER_SIZE 521 /* Intentionally chose an odd size. */
#define SMALL_BUFFER_SIZE 24 /* A single bts entry. */
struct ds_selftest_bts_conf {
struct bts_tracer *tracer;
int error;
int (*suspend)(struct bts_tracer *);
int (*resume)(struct bts_tracer *);
};
static int ds_selftest_bts_consistency(const struct bts_trace *trace)
{
int error = 0;
if (!trace) {
printk(KERN_CONT "failed to access trace...");
/* Bail out. Other tests are pointless. */
return -1;
}
if (!trace->read) {
printk(KERN_CONT "bts read not available...");
error = -1;
}
/* Do some sanity checks on the trace configuration. */
if (!trace->ds.n) {
printk(KERN_CONT "empty bts buffer...");
error = -1;
}
if (!trace->ds.size) {
printk(KERN_CONT "bad bts trace setup...");
error = -1;
}
if (trace->ds.end !=
(char *)trace->ds.begin + (trace->ds.n * trace->ds.size)) {
printk(KERN_CONT "bad bts buffer setup...");
error = -1;
}
/*
* We allow top in [begin; end], since its not clear when the
* overflow adjustment happens: after the increment or before the
* write.
*/
if ((trace->ds.top < trace->ds.begin) ||
(trace->ds.end < trace->ds.top)) {
printk(KERN_CONT "bts top out of bounds...");
error = -1;
}
return error;
}
static int ds_selftest_bts_read(struct bts_tracer *tracer,
const struct bts_trace *trace,
const void *from, const void *to)
{
const unsigned char *at;
/*
* Check a few things which do not belong to this test.
* They should be covered by other tests.
*/
if (!trace)
return -1;
if (!trace->read)
return -1;
if (to < from)
return -1;
if (from < trace->ds.begin)
return -1;
if (trace->ds.end < to)
return -1;
if (!trace->ds.size)
return -1;
/* Now to the test itself. */
for (at = from; (void *)at < to; at += trace->ds.size) {
struct bts_struct bts;
unsigned long index;
int error;
if (((void *)at - trace->ds.begin) % trace->ds.size) {
printk(KERN_CONT
"read from non-integer index...");
return -1;
}
index = ((void *)at - trace->ds.begin) / trace->ds.size;
memset(&bts, 0, sizeof(bts));
error = trace->read(tracer, at, &bts);
if (error < 0) {
printk(KERN_CONT
"error reading bts trace at [%lu] (0x%p)...",
index, at);
return error;
}
switch (bts.qualifier) {
case BTS_BRANCH:
break;
default:
printk(KERN_CONT
"unexpected bts entry %llu at [%lu] (0x%p)...",
bts.qualifier, index, at);
return -1;
}
}
return 0;
}
static void ds_selftest_bts_cpu(void *arg)
{
struct ds_selftest_bts_conf *conf = arg;
const struct bts_trace *trace;
void *top;
if (IS_ERR(conf->tracer)) {
conf->error = PTR_ERR(conf->tracer);
conf->tracer = NULL;
printk(KERN_CONT
"initialization failed (err: %d)...", conf->error);
return;
}
/* We should meanwhile have enough trace. */
conf->error = conf->suspend(conf->tracer);
if (conf->error < 0)
return;
/* Let's see if we can access the trace. */
trace = ds_read_bts(conf->tracer);
conf->error = ds_selftest_bts_consistency(trace);
if (conf->error < 0)
return;
/* If everything went well, we should have a few trace entries. */
if (trace->ds.top == trace->ds.begin) {
/*
* It is possible but highly unlikely that we got a
* buffer overflow and end up at exactly the same
* position we started from.
* Let's issue a warning, but continue.
*/
printk(KERN_CONT "no trace/overflow...");
}
/* Let's try to read the trace we collected. */
conf->error =
ds_selftest_bts_read(conf->tracer, trace,
trace->ds.begin, trace->ds.top);
if (conf->error < 0)
return;
/*
* Let's read the trace again.
* Since we suspended tracing, we should get the same result.
*/
top = trace->ds.top;
trace = ds_read_bts(conf->tracer);
conf->error = ds_selftest_bts_consistency(trace);
if (conf->error < 0)
return;
if (top != trace->ds.top) {
printk(KERN_CONT "suspend not working...");
conf->error = -1;
return;
}
/* Let's collect some more trace - see if resume is working. */
conf->error = conf->resume(conf->tracer);
if (conf->error < 0)
return;
conf->error = conf->suspend(conf->tracer);
if (conf->error < 0)
return;
trace = ds_read_bts(conf->tracer);
conf->error = ds_selftest_bts_consistency(trace);
if (conf->error < 0)
return;
if (trace->ds.top == top) {
/*
* It is possible but highly unlikely that we got a
* buffer overflow and end up at exactly the same
* position we started from.
* Let's issue a warning and check the full trace.
*/
printk(KERN_CONT
"no resume progress/overflow...");
conf->error =
ds_selftest_bts_read(conf->tracer, trace,
trace->ds.begin, trace->ds.end);
} else if (trace->ds.top < top) {
/*
* We had a buffer overflow - the entire buffer should
* contain trace records.
*/
conf->error =
ds_selftest_bts_read(conf->tracer, trace,
trace->ds.begin, trace->ds.end);
} else {
/*
* It is quite likely that the buffer did not overflow.
* Let's just check the delta trace.
*/
conf->error =
ds_selftest_bts_read(conf->tracer, trace, top,
trace->ds.top);
}
if (conf->error < 0)
return;
conf->error = 0;
}
static int ds_suspend_bts_wrap(struct bts_tracer *tracer)
{
ds_suspend_bts(tracer);
return 0;
}
static int ds_resume_bts_wrap(struct bts_tracer *tracer)
{
ds_resume_bts(tracer);
return 0;
}
static void ds_release_bts_noirq_wrap(void *tracer)
{
(void)ds_release_bts_noirq(tracer);
}
static int ds_selftest_bts_bad_release_noirq(int cpu,
struct bts_tracer *tracer)
{
int error = -EPERM;
/* Try to release the tracer on the wrong cpu. */
get_cpu();
if (cpu != smp_processor_id()) {
error = ds_release_bts_noirq(tracer);
if (error != -EPERM)
printk(KERN_CONT "release on wrong cpu...");
}
put_cpu();
return error ? 0 : -1;
}
static int ds_selftest_bts_bad_request_cpu(int cpu, void *buffer)
{
struct bts_tracer *tracer;
int error;
/* Try to request cpu tracing while task tracing is active. */
tracer = ds_request_bts_cpu(cpu, buffer, BUFFER_SIZE, NULL,
(size_t)-1, BTS_KERNEL);
error = PTR_ERR(tracer);
if (!IS_ERR(tracer)) {
ds_release_bts(tracer);
error = 0;
}
if (error != -EPERM)
printk(KERN_CONT "cpu/task tracing overlap...");
return error ? 0 : -1;
}
static int ds_selftest_bts_bad_request_task(void *buffer)
{
struct bts_tracer *tracer;
int error;
/* Try to request cpu tracing while task tracing is active. */
tracer = ds_request_bts_task(current, buffer, BUFFER_SIZE, NULL,
(size_t)-1, BTS_KERNEL);
error = PTR_ERR(tracer);
if (!IS_ERR(tracer)) {
error = 0;
ds_release_bts(tracer);
}
if (error != -EPERM)
printk(KERN_CONT "task/cpu tracing overlap...");
return error ? 0 : -1;
}
int ds_selftest_bts(void)
{
struct ds_selftest_bts_conf conf;
unsigned char buffer[BUFFER_SIZE], *small_buffer;
unsigned long irq;
int cpu;
printk(KERN_INFO "[ds] bts selftest...");
conf.error = 0;
small_buffer = (unsigned char *)ALIGN((unsigned long)buffer, 8) + 8;
get_online_cpus();
for_each_online_cpu(cpu) {
conf.suspend = ds_suspend_bts_wrap;
conf.resume = ds_resume_bts_wrap;
conf.tracer =
ds_request_bts_cpu(cpu, buffer, BUFFER_SIZE,
NULL, (size_t)-1, BTS_KERNEL);
ds_selftest_bts_cpu(&conf);
if (conf.error >= 0)
conf.error = ds_selftest_bts_bad_request_task(buffer);
ds_release_bts(conf.tracer);
if (conf.error < 0)
goto out;
conf.suspend = ds_suspend_bts_noirq;
conf.resume = ds_resume_bts_noirq;
conf.tracer =
ds_request_bts_cpu(cpu, buffer, BUFFER_SIZE,
NULL, (size_t)-1, BTS_KERNEL);
smp_call_function_single(cpu, ds_selftest_bts_cpu, &conf, 1);
if (conf.error >= 0) {
conf.error =
ds_selftest_bts_bad_release_noirq(cpu,
conf.tracer);
/* We must not release the tracer twice. */
if (conf.error < 0)
conf.tracer = NULL;
}
if (conf.error >= 0)
conf.error = ds_selftest_bts_bad_request_task(buffer);
smp_call_function_single(cpu, ds_release_bts_noirq_wrap,
conf.tracer, 1);
if (conf.error < 0)
goto out;
}
conf.suspend = ds_suspend_bts_wrap;
conf.resume = ds_resume_bts_wrap;
conf.tracer =
ds_request_bts_task(current, buffer, BUFFER_SIZE,
NULL, (size_t)-1, BTS_KERNEL);
ds_selftest_bts_cpu(&conf);
if (conf.error >= 0)
conf.error = ds_selftest_bts_bad_request_cpu(0, buffer);
ds_release_bts(conf.tracer);
if (conf.error < 0)
goto out;
conf.suspend = ds_suspend_bts_noirq;
conf.resume = ds_resume_bts_noirq;
conf.tracer =
ds_request_bts_task(current, small_buffer, SMALL_BUFFER_SIZE,
NULL, (size_t)-1, BTS_KERNEL);
local_irq_save(irq);
ds_selftest_bts_cpu(&conf);
if (conf.error >= 0)
conf.error = ds_selftest_bts_bad_request_cpu(0, buffer);
ds_release_bts_noirq(conf.tracer);
local_irq_restore(irq);
if (conf.error < 0)
goto out;
conf.error = 0;
out:
put_online_cpus();
printk(KERN_CONT "%s.\n", (conf.error ? "failed" : "passed"));
return conf.error;
}
int ds_selftest_pebs(void)
{
return 0;
}
/*
* Debug Store support - selftest
*
*
* Copyright (C) 2009 Intel Corporation.
* Markus Metzger <markus.t.metzger@intel.com>, 2009
*/
#ifdef CONFIG_X86_DS_SELFTEST
extern int ds_selftest_bts(void);
extern int ds_selftest_pebs(void);
#else
static inline int ds_selftest_bts(void) { return 0; }
static inline int ds_selftest_pebs(void) { return 0; }
#endif
......@@ -147,27 +147,14 @@ END(ftrace_graph_caller)
GLOBAL(return_to_handler)
subq $80, %rsp
/* Save the return values */
movq %rax, (%rsp)
movq %rcx, 8(%rsp)
movq %rdx, 16(%rsp)
movq %rsi, 24(%rsp)
movq %rdi, 32(%rsp)
movq %r8, 40(%rsp)
movq %r9, 48(%rsp)
movq %r10, 56(%rsp)
movq %r11, 64(%rsp)
movq %rdx, 8(%rsp)
call ftrace_return_to_handler
movq %rax, 72(%rsp)
movq 64(%rsp), %r11
movq 56(%rsp), %r10
movq 48(%rsp), %r9
movq 40(%rsp), %r8
movq 32(%rsp), %rdi
movq 24(%rsp), %rsi
movq 16(%rsp), %rdx
movq 8(%rsp), %rcx
movq 8(%rsp), %rdx
movq (%rsp), %rax
addq $72, %rsp
retq
......
......@@ -16,6 +16,7 @@
#include <asm/idle.h>
#include <asm/uaccess.h>
#include <asm/i387.h>
#include <asm/ds.h>
unsigned long idle_halt;
EXPORT_SYMBOL(idle_halt);
......@@ -47,6 +48,8 @@ void free_thread_xstate(struct task_struct *tsk)
kmem_cache_free(task_xstate_cachep, tsk->thread.xstate);
tsk->thread.xstate = NULL;
}
WARN(tsk->thread.ds_ctx, "leaking DS context\n");
}
void free_thread_info(struct thread_info *ti)
......@@ -85,8 +88,6 @@ void exit_thread(void)
put_cpu();
kfree(bp);
}
ds_exit_thread(current);
}
void flush_thread(void)
......
......@@ -287,7 +287,8 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
p->thread.io_bitmap_max = 0;
}
ds_copy_thread(p, current);
clear_tsk_thread_flag(p, TIF_DS_AREA_MSR);
p->thread.ds_ctx = NULL;
clear_tsk_thread_flag(p, TIF_DEBUGCTLMSR);
p->thread.debugctlmsr = 0;
......
......@@ -332,7 +332,8 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
goto out;
}
ds_copy_thread(p, me);
clear_tsk_thread_flag(p, TIF_DS_AREA_MSR);
p->thread.ds_ctx = NULL;
clear_tsk_thread_flag(p, TIF_DEBUGCTLMSR);
p->thread.debugctlmsr = 0;
......
......@@ -21,6 +21,7 @@
#include <linux/audit.h>
#include <linux/seccomp.h>
#include <linux/signal.h>
#include <linux/workqueue.h>
#include <asm/uaccess.h>
#include <asm/pgtable.h>
......@@ -578,17 +579,130 @@ static int ioperm_get(struct task_struct *target,
}
#ifdef CONFIG_X86_PTRACE_BTS
/*
* A branch trace store context.
*
* Contexts may only be installed by ptrace_bts_config() and only for
* ptraced tasks.
*
* Contexts are destroyed when the tracee is detached from the tracer.
* The actual destruction work requires interrupts enabled, so the
* work is deferred and will be scheduled during __ptrace_unlink().
*
* Contexts hold an additional task_struct reference on the traced
* task, as well as a reference on the tracer's mm.
*
* Ptrace already holds a task_struct for the duration of ptrace operations,
* but since destruction is deferred, it may be executed after both
* tracer and tracee exited.
*/
struct bts_context {
/* The branch trace handle. */
struct bts_tracer *tracer;
/* The buffer used to store the branch trace and its size. */
void *buffer;
unsigned int size;
/* The mm that paid for the above buffer. */
struct mm_struct *mm;
/* The task this context belongs to. */
struct task_struct *task;
/* The signal to send on a bts buffer overflow. */
unsigned int bts_ovfl_signal;
/* The work struct to destroy a context. */
struct work_struct work;
};
static int alloc_bts_buffer(struct bts_context *context, unsigned int size)
{
void *buffer = NULL;
int err = -ENOMEM;
err = account_locked_memory(current->mm, current->signal->rlim, size);
if (err < 0)
return err;
buffer = kzalloc(size, GFP_KERNEL);
if (!buffer)
goto out_refund;
context->buffer = buffer;
context->size = size;
context->mm = get_task_mm(current);
return 0;
out_refund:
refund_locked_memory(current->mm, size);
return err;
}
static inline void free_bts_buffer(struct bts_context *context)
{
if (!context->buffer)
return;
kfree(context->buffer);
context->buffer = NULL;
refund_locked_memory(context->mm, context->size);
context->size = 0;
mmput(context->mm);
context->mm = NULL;
}
static void free_bts_context_work(struct work_struct *w)
{
struct bts_context *context;
context = container_of(w, struct bts_context, work);
ds_release_bts(context->tracer);
put_task_struct(context->task);
free_bts_buffer(context);
kfree(context);
}
static inline void free_bts_context(struct bts_context *context)
{
INIT_WORK(&context->work, free_bts_context_work);
schedule_work(&context->work);
}
static inline struct bts_context *alloc_bts_context(struct task_struct *task)
{
struct bts_context *context = kzalloc(sizeof(*context), GFP_KERNEL);
if (context) {
context->task = task;
task->bts = context;
get_task_struct(task);
}
return context;
}
static int ptrace_bts_read_record(struct task_struct *child, size_t index,
struct bts_struct __user *out)
{
struct bts_context *context;
const struct bts_trace *trace;
struct bts_struct bts;
const unsigned char *at;
int error;
trace = ds_read_bts(child->bts);
context = child->bts;
if (!context)
return -ESRCH;
trace = ds_read_bts(context->tracer);
if (!trace)
return -EPERM;
return -ESRCH;
at = trace->ds.top - ((index + 1) * trace->ds.size);
if ((void *)at < trace->ds.begin)
......@@ -597,7 +711,7 @@ static int ptrace_bts_read_record(struct task_struct *child, size_t index,
if (!trace->read)
return -EOPNOTSUPP;
error = trace->read(child->bts, at, &bts);
error = trace->read(context->tracer, at, &bts);
if (error < 0)
return error;
......@@ -611,13 +725,18 @@ static int ptrace_bts_drain(struct task_struct *child,
long size,
struct bts_struct __user *out)
{
struct bts_context *context;
const struct bts_trace *trace;
const unsigned char *at;
int error, drained = 0;
trace = ds_read_bts(child->bts);
context = child->bts;
if (!context)
return -ESRCH;
trace = ds_read_bts(context->tracer);
if (!trace)
return -EPERM;
return -ESRCH;
if (!trace->read)
return -EOPNOTSUPP;
......@@ -628,9 +747,8 @@ static int ptrace_bts_drain(struct task_struct *child,
for (at = trace->ds.begin; (void *)at < trace->ds.top;
out++, drained++, at += trace->ds.size) {
struct bts_struct bts;
int error;
error = trace->read(child->bts, at, &bts);
error = trace->read(context->tracer, at, &bts);
if (error < 0)
return error;
......@@ -640,35 +758,18 @@ static int ptrace_bts_drain(struct task_struct *child,
memset(trace->ds.begin, 0, trace->ds.n * trace->ds.size);
error = ds_reset_bts(child->bts);
error = ds_reset_bts(context->tracer);
if (error < 0)
return error;
return drained;
}
static int ptrace_bts_allocate_buffer(struct task_struct *child, size_t size)
{
child->bts_buffer = alloc_locked_buffer(size);
if (!child->bts_buffer)
return -ENOMEM;
child->bts_size = size;
return 0;
}
static void ptrace_bts_free_buffer(struct task_struct *child)
{
free_locked_buffer(child->bts_buffer, child->bts_size);
child->bts_buffer = NULL;
child->bts_size = 0;
}
static int ptrace_bts_config(struct task_struct *child,
long cfg_size,
const struct ptrace_bts_config __user *ucfg)
{
struct bts_context *context;
struct ptrace_bts_config cfg;
unsigned int flags = 0;
......@@ -678,28 +779,33 @@ static int ptrace_bts_config(struct task_struct *child,
if (copy_from_user(&cfg, ucfg, sizeof(cfg)))
return -EFAULT;
if (child->bts) {
ds_release_bts(child->bts);
child->bts = NULL;
}
context = child->bts;
if (!context)
context = alloc_bts_context(child);
if (!context)
return -ENOMEM;
if (cfg.flags & PTRACE_BTS_O_SIGNAL) {
if (!cfg.signal)
return -EINVAL;
child->thread.bts_ovfl_signal = cfg.signal;
return -EOPNOTSUPP;
context->bts_ovfl_signal = cfg.signal;
}
if ((cfg.flags & PTRACE_BTS_O_ALLOC) &&
(cfg.size != child->bts_size)) {
int error;
ds_release_bts(context->tracer);
context->tracer = NULL;
ptrace_bts_free_buffer(child);
if ((cfg.flags & PTRACE_BTS_O_ALLOC) && (cfg.size != context->size)) {
int err;
error = ptrace_bts_allocate_buffer(child, cfg.size);
if (error < 0)
return error;
free_bts_buffer(context);
if (!cfg.size)
return 0;
err = alloc_bts_buffer(context, cfg.size);
if (err < 0)
return err;
}
if (cfg.flags & PTRACE_BTS_O_TRACE)
......@@ -708,15 +814,14 @@ static int ptrace_bts_config(struct task_struct *child,
if (cfg.flags & PTRACE_BTS_O_SCHED)
flags |= BTS_TIMESTAMPS;
child->bts = ds_request_bts(child, child->bts_buffer, child->bts_size,
/* ovfl = */ NULL, /* th = */ (size_t)-1,
flags);
if (IS_ERR(child->bts)) {
int error = PTR_ERR(child->bts);
ptrace_bts_free_buffer(child);
child->bts = NULL;
context->tracer =
ds_request_bts_task(child, context->buffer, context->size,
NULL, (size_t)-1, flags);
if (unlikely(IS_ERR(context->tracer))) {
int error = PTR_ERR(context->tracer);
free_bts_buffer(context);
context->tracer = NULL;
return error;
}
......@@ -727,20 +832,25 @@ static int ptrace_bts_status(struct task_struct *child,
long cfg_size,
struct ptrace_bts_config __user *ucfg)
{
struct bts_context *context;
const struct bts_trace *trace;
struct ptrace_bts_config cfg;
context = child->bts;
if (!context)
return -ESRCH;
if (cfg_size < sizeof(cfg))
return -EIO;
trace = ds_read_bts(child->bts);
trace = ds_read_bts(context->tracer);
if (!trace)
return -EPERM;
return -ESRCH;
memset(&cfg, 0, sizeof(cfg));
cfg.size = trace->ds.end - trace->ds.begin;
cfg.signal = child->thread.bts_ovfl_signal;
cfg.bts_size = sizeof(struct bts_struct);
cfg.size = trace->ds.end - trace->ds.begin;
cfg.signal = context->bts_ovfl_signal;
cfg.bts_size = sizeof(struct bts_struct);
if (cfg.signal)
cfg.flags |= PTRACE_BTS_O_SIGNAL;
......@@ -759,80 +869,51 @@ static int ptrace_bts_status(struct task_struct *child,
static int ptrace_bts_clear(struct task_struct *child)
{
struct bts_context *context;
const struct bts_trace *trace;
trace = ds_read_bts(child->bts);
context = child->bts;
if (!context)
return -ESRCH;
trace = ds_read_bts(context->tracer);
if (!trace)
return -EPERM;
return -ESRCH;
memset(trace->ds.begin, 0, trace->ds.n * trace->ds.size);
return ds_reset_bts(child->bts);
return ds_reset_bts(context->tracer);
}
static int ptrace_bts_size(struct task_struct *child)
{
struct bts_context *context;
const struct bts_trace *trace;
trace = ds_read_bts(child->bts);
context = child->bts;
if (!context)
return -ESRCH;
trace = ds_read_bts(context->tracer);
if (!trace)
return -EPERM;
return -ESRCH;
return (trace->ds.top - trace->ds.begin) / trace->ds.size;
}
static void ptrace_bts_fork(struct task_struct *tsk)
{
tsk->bts = NULL;
tsk->bts_buffer = NULL;
tsk->bts_size = 0;
tsk->thread.bts_ovfl_signal = 0;
}
static void ptrace_bts_untrace(struct task_struct *child)
/*
* Called from __ptrace_unlink() after the child has been moved back
* to its original parent.
*/
void ptrace_bts_untrace(struct task_struct *child)
{
if (unlikely(child->bts)) {
ds_release_bts(child->bts);
free_bts_context(child->bts);
child->bts = NULL;
/* We cannot update total_vm and locked_vm since
child's mm is already gone. But we can reclaim the
memory. */
kfree(child->bts_buffer);
child->bts_buffer = NULL;
child->bts_size = 0;
}
}
static void ptrace_bts_detach(struct task_struct *child)
{
/*
* Ptrace_detach() races with ptrace_untrace() in case
* the child dies and is reaped by another thread.
*
* We only do the memory accounting at this point and
* leave the buffer deallocation and the bts tracer
* release to ptrace_bts_untrace() which will be called
* later on with tasklist_lock held.
*/
release_locked_buffer(child->bts_buffer, child->bts_size);
}
#else
static inline void ptrace_bts_fork(struct task_struct *tsk) {}
static inline void ptrace_bts_detach(struct task_struct *child) {}
static inline void ptrace_bts_untrace(struct task_struct *child) {}
#endif /* CONFIG_X86_PTRACE_BTS */
void x86_ptrace_fork(struct task_struct *child, unsigned long clone_flags)
{
ptrace_bts_fork(child);
}
void x86_ptrace_untrace(struct task_struct *child)
{
ptrace_bts_untrace(child);
}
/*
* Called by kernel/ptrace.c when detaching..
*
......@@ -844,7 +925,6 @@ void ptrace_disable(struct task_struct *child)
#ifdef TIF_SYSCALL_EMU
clear_tsk_thread_flag(child, TIF_SYSCALL_EMU);
#endif
ptrace_bts_detach(child);
}
#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
......
......@@ -20,7 +20,7 @@ save_stack_warning_symbol(void *data, char *msg, unsigned long symbol)
static int save_stack_stack(void *data, char *name)
{
return -1;
return 0;
}
static void save_stack_address(void *data, unsigned long addr, int reliable)
......
......@@ -32,7 +32,7 @@ struct kmmio_fault_page {
struct list_head list;
struct kmmio_fault_page *release_next;
unsigned long page; /* location of the fault page */
bool old_presence; /* page presence prior to arming */
pteval_t old_presence; /* page presence prior to arming */
bool armed;
/*
......@@ -97,60 +97,62 @@ static struct kmmio_probe *get_kmmio_probe(unsigned long addr)
static struct kmmio_fault_page *get_kmmio_fault_page(unsigned long page)
{
struct list_head *head;
struct kmmio_fault_page *p;
struct kmmio_fault_page *f;
page &= PAGE_MASK;
head = kmmio_page_list(page);
list_for_each_entry_rcu(p, head, list) {
if (p->page == page)
return p;
list_for_each_entry_rcu(f, head, list) {
if (f->page == page)
return f;
}
return NULL;
}
static void set_pmd_presence(pmd_t *pmd, bool present, bool *old)
static void clear_pmd_presence(pmd_t *pmd, bool clear, pmdval_t *old)
{
pmdval_t v = pmd_val(*pmd);
*old = !!(v & _PAGE_PRESENT);
v &= ~_PAGE_PRESENT;
if (present)
v |= _PAGE_PRESENT;
if (clear) {
*old = v & _PAGE_PRESENT;
v &= ~_PAGE_PRESENT;
} else /* presume this has been called with clear==true previously */
v |= *old;
set_pmd(pmd, __pmd(v));
}
static void set_pte_presence(pte_t *pte, bool present, bool *old)
static void clear_pte_presence(pte_t *pte, bool clear, pteval_t *old)
{
pteval_t v = pte_val(*pte);
*old = !!(v & _PAGE_PRESENT);
v &= ~_PAGE_PRESENT;
if (present)
v |= _PAGE_PRESENT;
if (clear) {
*old = v & _PAGE_PRESENT;
v &= ~_PAGE_PRESENT;
} else /* presume this has been called with clear==true previously */
v |= *old;
set_pte_atomic(pte, __pte(v));
}
static int set_page_presence(unsigned long addr, bool present, bool *old)
static int clear_page_presence(struct kmmio_fault_page *f, bool clear)
{
unsigned int level;
pte_t *pte = lookup_address(addr, &level);
pte_t *pte = lookup_address(f->page, &level);
if (!pte) {
pr_err("kmmio: no pte for page 0x%08lx\n", addr);
pr_err("kmmio: no pte for page 0x%08lx\n", f->page);
return -1;
}
switch (level) {
case PG_LEVEL_2M:
set_pmd_presence((pmd_t *)pte, present, old);
clear_pmd_presence((pmd_t *)pte, clear, &f->old_presence);
break;
case PG_LEVEL_4K:
set_pte_presence(pte, present, old);
clear_pte_presence(pte, clear, &f->old_presence);
break;
default:
pr_err("kmmio: unexpected page level 0x%x.\n", level);
return -1;
}
__flush_tlb_one(addr);
__flush_tlb_one(f->page);
return 0;
}
......@@ -171,9 +173,9 @@ static int arm_kmmio_fault_page(struct kmmio_fault_page *f)
WARN_ONCE(f->armed, KERN_ERR "kmmio page already armed.\n");
if (f->armed) {
pr_warning("kmmio double-arm: page 0x%08lx, ref %d, old %d\n",
f->page, f->count, f->old_presence);
f->page, f->count, !!f->old_presence);
}
ret = set_page_presence(f->page, false, &f->old_presence);
ret = clear_page_presence(f, true);
WARN_ONCE(ret < 0, KERN_ERR "kmmio arming 0x%08lx failed.\n", f->page);
f->armed = true;
return ret;
......@@ -182,8 +184,7 @@ static int arm_kmmio_fault_page(struct kmmio_fault_page *f)
/** Restore the given page to saved presence state. */
static void disarm_kmmio_fault_page(struct kmmio_fault_page *f)
{
bool tmp;
int ret = set_page_presence(f->page, f->old_presence, &tmp);
int ret = clear_page_presence(f, false);
WARN_ONCE(ret < 0,
KERN_ERR "kmmio disarming 0x%08lx failed.\n", f->page);
f->armed = false;
......@@ -310,7 +311,12 @@ static int post_kmmio_handler(unsigned long condition, struct pt_regs *regs)
struct kmmio_context *ctx = &get_cpu_var(kmmio_ctx);
if (!ctx->active) {
pr_debug("kmmio: spurious debug trap on CPU %d.\n",
/*
* debug traps without an active context are due to either
* something external causing them (f.e. using a debugger while
* mmio tracing enabled), or erroneous behaviour
*/
pr_warning("kmmio: unexpected debug trap on CPU %d.\n",
smp_processor_id());
goto out;
}
......@@ -439,12 +445,12 @@ static void rcu_free_kmmio_fault_pages(struct rcu_head *head)
head,
struct kmmio_delayed_release,
rcu);
struct kmmio_fault_page *p = dr->release_list;
while (p) {
struct kmmio_fault_page *next = p->release_next;
BUG_ON(p->count);
kfree(p);
p = next;
struct kmmio_fault_page *f = dr->release_list;
while (f) {
struct kmmio_fault_page *next = f->release_next;
BUG_ON(f->count);
kfree(f);
f = next;
}
kfree(dr);
}
......@@ -453,19 +459,19 @@ static void remove_kmmio_fault_pages(struct rcu_head *head)
{
struct kmmio_delayed_release *dr =
container_of(head, struct kmmio_delayed_release, rcu);
struct kmmio_fault_page *p = dr->release_list;
struct kmmio_fault_page *f = dr->release_list;
struct kmmio_fault_page **prevp = &dr->release_list;
unsigned long flags;
spin_lock_irqsave(&kmmio_lock, flags);
while (p) {
if (!p->count) {
list_del_rcu(&p->list);
prevp = &p->release_next;
while (f) {
if (!f->count) {
list_del_rcu(&f->list);
prevp = &f->release_next;
} else {
*prevp = p->release_next;
*prevp = f->release_next;
}
p = p->release_next;
f = f->release_next;
}
spin_unlock_irqrestore(&kmmio_lock, flags);
......@@ -528,8 +534,8 @@ void unregister_kmmio_probe(struct kmmio_probe *p)
}
EXPORT_SYMBOL(unregister_kmmio_probe);
static int kmmio_die_notifier(struct notifier_block *nb, unsigned long val,
void *args)
static int
kmmio_die_notifier(struct notifier_block *nb, unsigned long val, void *args)
{
struct die_args *arg = args;
......@@ -544,11 +550,23 @@ static struct notifier_block nb_die = {
.notifier_call = kmmio_die_notifier
};
static int __init init_kmmio(void)
int kmmio_init(void)
{
int i;
for (i = 0; i < KMMIO_PAGE_TABLE_SIZE; i++)
INIT_LIST_HEAD(&kmmio_page_table[i]);
return register_die_notifier(&nb_die);
}
fs_initcall(init_kmmio); /* should be before device_initcall() */
void kmmio_cleanup(void)
{
int i;
unregister_die_notifier(&nb_die);
for (i = 0; i < KMMIO_PAGE_TABLE_SIZE; i++) {
WARN_ONCE(!list_empty(&kmmio_page_table[i]),
KERN_ERR "kmmio_page_table not empty at cleanup, any further tracing will leak memory.\n");
}
}
......@@ -451,6 +451,7 @@ void enable_mmiotrace(void)
if (nommiotrace)
pr_info(NAME "MMIO tracing disabled.\n");
kmmio_init();
enter_uniprocessor();
spin_lock_irq(&trace_lock);
atomic_inc(&mmiotrace_enabled);
......@@ -473,6 +474,7 @@ void disable_mmiotrace(void)
clear_trace_list(); /* guarantees: no more kmmio callbacks */
leave_uniprocessor();
kmmio_cleanup();
pr_info(NAME "disabled.\n");
out:
mutex_unlock(&mmiotrace_mutex);
......
......@@ -28,22 +28,14 @@
#include <linux/task_io_accounting_ops.h>
#include <linux/blktrace_api.h>
#include <linux/fault-inject.h>
#include <trace/block.h>
#define CREATE_TRACE_POINTS
#include <trace/events/block.h>
#include "blk.h"
DEFINE_TRACE(block_plug);
DEFINE_TRACE(block_unplug_io);
DEFINE_TRACE(block_unplug_timer);
DEFINE_TRACE(block_getrq);
DEFINE_TRACE(block_sleeprq);
DEFINE_TRACE(block_rq_requeue);
DEFINE_TRACE(block_bio_backmerge);
DEFINE_TRACE(block_bio_frontmerge);
DEFINE_TRACE(block_bio_queue);
DEFINE_TRACE(block_rq_complete);
DEFINE_TRACE(block_remap); /* Also used in drivers/md/dm.c */
EXPORT_TRACEPOINT_SYMBOL_GPL(block_remap);
EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_complete);
static int __make_request(struct request_queue *q, struct bio *bio);
......@@ -1277,7 +1269,7 @@ static inline void blk_partition_remap(struct bio *bio)
bio->bi_bdev = bdev->bd_contains;
trace_block_remap(bdev_get_queue(bio->bi_bdev), bio,
bdev->bd_dev, bio->bi_sector,
bdev->bd_dev,
bio->bi_sector - p->start_sect);
}
}
......@@ -1446,8 +1438,7 @@ static inline void __generic_make_request(struct bio *bio)
goto end_io;
if (old_sector != -1)
trace_block_remap(q, bio, old_dev, bio->bi_sector,
old_sector);
trace_block_remap(q, bio, old_dev, old_sector);
trace_block_bio_queue(q, bio);
......
......@@ -383,16 +383,21 @@ struct kobj_type blk_queue_ktype = {
int blk_register_queue(struct gendisk *disk)
{
int ret;
struct device *dev = disk_to_dev(disk);
struct request_queue *q = disk->queue;
if (WARN_ON(!q))
return -ENXIO;
ret = blk_trace_init_sysfs(dev);
if (ret)
return ret;
if (!q->request_fn)
return 0;
ret = kobject_add(&q->kobj, kobject_get(&disk_to_dev(disk)->kobj),
ret = kobject_add(&q->kobj, kobject_get(&dev->kobj),
"%s", "queue");
if (ret < 0)
return ret;
......
......@@ -568,7 +568,7 @@ static int compat_blk_trace_setup(struct block_device *bdev, char __user *arg)
memcpy(&buts.name, &cbuts.name, 32);
mutex_lock(&bdev->bd_mutex);
ret = do_blk_trace_setup(q, b, bdev->bd_dev, &buts);
ret = do_blk_trace_setup(q, b, bdev->bd_dev, bdev, &buts);
mutex_unlock(&bdev->bd_mutex);
if (ret)
return ret;
......
......@@ -33,17 +33,16 @@
#include <linux/compiler.h>
#include <linux/delay.h>
#include <linux/blktrace_api.h>
#include <trace/block.h>
#include <linux/hash.h>
#include <linux/uaccess.h>
#include <trace/events/block.h>
#include "blk.h"
static DEFINE_SPINLOCK(elv_list_lock);
static LIST_HEAD(elv_list);
DEFINE_TRACE(block_rq_abort);
/*
* Merge hash stuff.
*/
......@@ -55,9 +54,6 @@ static const int elv_hash_shift = 6;
#define rq_hash_key(rq) ((rq)->sector + (rq)->nr_sectors)
#define ELV_ON_HASH(rq) (!hlist_unhashed(&(rq)->hash))
DEFINE_TRACE(block_rq_insert);
DEFINE_TRACE(block_rq_issue);
/*
* Query io scheduler to see if the current process issuing bio may be
* merged with rq.
......
......@@ -20,7 +20,8 @@
#include <linux/idr.h>
#include <linux/hdreg.h>
#include <linux/blktrace_api.h>
#include <trace/block.h>
#include <trace/events/block.h>
#define DM_MSG_PREFIX "core"
......@@ -53,8 +54,6 @@ struct dm_target_io {
union map_info info;
};
DEFINE_TRACE(block_bio_complete);
/*
* For request-based dm.
* One of these is allocated per request.
......@@ -656,8 +655,7 @@ static void __map_bio(struct dm_target *ti, struct bio *clone,
/* the bio has been remapped so dispatch it */
trace_block_remap(bdev_get_queue(clone->bi_bdev), clone,
tio->io->bio->bi_bdev->bd_dev,
clone->bi_sector, sector);
tio->io->bio->bi_bdev->bd_dev, sector);
generic_make_request(clone);
} else if (r < 0 || r == DM_MAPIO_REQUEUE) {
......
......@@ -1065,6 +1065,7 @@ sg_ioctl(struct inode *inode, struct file *filp,
return blk_trace_setup(sdp->device->request_queue,
sdp->disk->disk_name,
MKDEV(SCSI_GENERIC_MAJOR, sdp->index),
NULL,
(char *)arg);
case BLKTRACESTART:
return blk_trace_startstop(sdp->device->request_queue, 1);
......
......@@ -26,10 +26,9 @@
#include <linux/mempool.h>
#include <linux/workqueue.h>
#include <linux/blktrace_api.h>
#include <trace/block.h>
#include <scsi/sg.h> /* for struct sg_iovec */
DEFINE_TRACE(block_split);
#include <trace/events/block.h>
/*
* Test patch to inline a certain number of bi_io_vec's inside the bio
......
......@@ -63,7 +63,7 @@
#define BRANCH_PROFILE()
#endif
#ifdef CONFIG_EVENT_TRACER
#ifdef CONFIG_EVENT_TRACING
#define FTRACE_EVENTS() VMLINUX_SYMBOL(__start_ftrace_events) = .; \
*(_ftrace_events) \
VMLINUX_SYMBOL(__stop_ftrace_events) = .;
......
......@@ -116,9 +116,9 @@ struct blk_io_trace {
* The remap event
*/
struct blk_io_trace_remap {
__be32 device;
__be32 device_from;
__be64 sector;
__be32 device_to;
__be64 sector_from;
};
enum {
......@@ -165,8 +165,9 @@ struct blk_trace {
extern int blk_trace_ioctl(struct block_device *, unsigned, char __user *);
extern void blk_trace_shutdown(struct request_queue *);
extern int do_blk_trace_setup(struct request_queue *q,
char *name, dev_t dev, struct blk_user_trace_setup *buts);
extern int do_blk_trace_setup(struct request_queue *q, char *name,
dev_t dev, struct block_device *bdev,
struct blk_user_trace_setup *buts);
extern void __trace_note_message(struct blk_trace *, const char *fmt, ...);
/**
......@@ -193,22 +194,42 @@ extern void __trace_note_message(struct blk_trace *, const char *fmt, ...);
extern void blk_add_driver_data(struct request_queue *q, struct request *rq,
void *data, size_t len);
extern int blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
struct block_device *bdev,
char __user *arg);
extern int blk_trace_startstop(struct request_queue *q, int start);
extern int blk_trace_remove(struct request_queue *q);
extern int blk_trace_init_sysfs(struct device *dev);
extern struct attribute_group blk_trace_attr_group;
#else /* !CONFIG_BLK_DEV_IO_TRACE */
#define blk_trace_ioctl(bdev, cmd, arg) (-ENOTTY)
#define blk_trace_shutdown(q) do { } while (0)
#define do_blk_trace_setup(q, name, dev, buts) (-ENOTTY)
#define blk_add_driver_data(q, rq, data, len) do {} while (0)
#define blk_trace_setup(q, name, dev, arg) (-ENOTTY)
#define blk_trace_startstop(q, start) (-ENOTTY)
#define blk_trace_remove(q) (-ENOTTY)
#define blk_add_trace_msg(q, fmt, ...) do { } while (0)
# define blk_trace_ioctl(bdev, cmd, arg) (-ENOTTY)
# define blk_trace_shutdown(q) do { } while (0)
# define do_blk_trace_setup(q, name, dev, bdev, buts) (-ENOTTY)
# define blk_add_driver_data(q, rq, data, len) do {} while (0)
# define blk_trace_setup(q, name, dev, bdev, arg) (-ENOTTY)
# define blk_trace_startstop(q, start) (-ENOTTY)
# define blk_trace_remove(q) (-ENOTTY)
# define blk_add_trace_msg(q, fmt, ...) do { } while (0)
static inline int blk_trace_init_sysfs(struct device *dev)
{
return 0;
}
#endif /* CONFIG_BLK_DEV_IO_TRACE */
#if defined(CONFIG_EVENT_TRACING) && defined(CONFIG_BLOCK)
static inline int blk_cmd_buf_len(struct request *rq)
{
return blk_pc_request(rq) ? rq->cmd_len * 3 : 1;
}
extern void blk_dump_cmd(char *buf, struct request *rq);
extern void blk_fill_rwbs(char *rwbs, u32 rw, int bytes);
extern void blk_fill_rwbs_rq(char *rwbs, struct request *rq);
#endif /* CONFIG_EVENT_TRACING && CONFIG_BLOCK */
#endif /* __KERNEL__ */
#endif
......@@ -233,8 +233,6 @@ extern int ftrace_arch_read_dyn_info(char *buf, int size);
extern int skip_trace(unsigned long ip);
extern void ftrace_release(void *start, unsigned long size);
extern void ftrace_disable_daemon(void);
extern void ftrace_enable_daemon(void);
#else
......@@ -325,13 +323,8 @@ static inline void __ftrace_enabled_restore(int enabled)
#ifdef CONFIG_FTRACE_MCOUNT_RECORD
extern void ftrace_init(void);
extern void ftrace_init_module(struct module *mod,
unsigned long *start, unsigned long *end);
#else
static inline void ftrace_init(void) { }
static inline void
ftrace_init_module(struct module *mod,
unsigned long *start, unsigned long *end) { }
#endif
/*
......@@ -368,6 +361,7 @@ struct ftrace_ret_stack {
unsigned long ret;
unsigned long func;
unsigned long long calltime;
unsigned long long subtime;
};
/*
......@@ -379,8 +373,6 @@ extern void return_to_handler(void);
extern int
ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth);
extern void
ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret);
/*
* Sometimes we don't want to trace a function with the function
......@@ -496,8 +488,15 @@ static inline int test_tsk_trace_graph(struct task_struct *tsk)
extern int ftrace_dump_on_oops;
#ifdef CONFIG_PREEMPT
#define INIT_TRACE_RECURSION .trace_recursion = 0,
#endif
#endif /* CONFIG_TRACING */
#ifndef INIT_TRACE_RECURSION
#define INIT_TRACE_RECURSION
#endif
#ifdef CONFIG_HW_BRANCH_TRACER
......
#ifndef _LINUX_FTRACE_EVENT_H
#define _LINUX_FTRACE_EVENT_H
#include <linux/trace_seq.h>
#include <linux/ring_buffer.h>
#include <linux/percpu.h>
struct trace_array;
struct tracer;
struct dentry;
DECLARE_PER_CPU(struct trace_seq, ftrace_event_seq);
struct trace_print_flags {
unsigned long mask;
const char *name;
};
const char *ftrace_print_flags_seq(struct trace_seq *p, const char *delim,
unsigned long flags,
const struct trace_print_flags *flag_array);
const char *ftrace_print_symbols_seq(struct trace_seq *p, unsigned long val,
const struct trace_print_flags *symbol_array);
/*
* The trace entry - the most basic unit of tracing. This is what
* is printed in the end as a single line in the trace output, such as:
*
* bash-15816 [01] 235.197585: idle_cpu <- irq_enter
*/
struct trace_entry {
unsigned short type;
unsigned char flags;
unsigned char preempt_count;
int pid;
int tgid;
};
#define FTRACE_MAX_EVENT \
((1 << (sizeof(((struct trace_entry *)0)->type) * 8)) - 1)
/*
* Trace iterator - used by printout routines who present trace
* results to users and which routines might sleep, etc:
*/
struct trace_iterator {
struct trace_array *tr;
struct tracer *trace;
void *private;
int cpu_file;
struct mutex mutex;
struct ring_buffer_iter *buffer_iter[NR_CPUS];
unsigned long iter_flags;
/* The below is zeroed out in pipe_read */
struct trace_seq seq;
struct trace_entry *ent;
int cpu;
u64 ts;
loff_t pos;
long idx;
cpumask_var_t started;
};
typedef enum print_line_t (*trace_print_func)(struct trace_iterator *iter,
int flags);
struct trace_event {
struct hlist_node node;
struct list_head list;
int type;
trace_print_func trace;
trace_print_func raw;
trace_print_func hex;
trace_print_func binary;
};
extern int register_ftrace_event(struct trace_event *event);
extern int unregister_ftrace_event(struct trace_event *event);
/* Return values for print_line callback */
enum print_line_t {
TRACE_TYPE_PARTIAL_LINE = 0, /* Retry after flushing the seq */
TRACE_TYPE_HANDLED = 1,
TRACE_TYPE_UNHANDLED = 2, /* Relay to other output functions */
TRACE_TYPE_NO_CONSUME = 3 /* Handled but ask to not consume */
};
struct ring_buffer_event *
trace_current_buffer_lock_reserve(int type, unsigned long len,
unsigned long flags, int pc);
void trace_current_buffer_unlock_commit(struct ring_buffer_event *event,
unsigned long flags, int pc);
void trace_nowake_buffer_unlock_commit(struct ring_buffer_event *event,
unsigned long flags, int pc);
void trace_current_buffer_discard_commit(struct ring_buffer_event *event);
void tracing_record_cmdline(struct task_struct *tsk);
struct ftrace_event_call {
struct list_head list;
char *name;
char *system;
struct dentry *dir;
struct trace_event *event;
int enabled;
int (*regfunc)(void);
void (*unregfunc)(void);
int id;
int (*raw_init)(void);
int (*show_format)(struct trace_seq *s);
int (*define_fields)(void);
struct list_head fields;
int filter_active;
void *filter;
void *mod;
#ifdef CONFIG_EVENT_PROFILE
atomic_t profile_count;
int (*profile_enable)(struct ftrace_event_call *);
void (*profile_disable)(struct ftrace_event_call *);
#endif
};
#define MAX_FILTER_PRED 32
#define MAX_FILTER_STR_VAL 128
extern int init_preds(struct ftrace_event_call *call);
extern void destroy_preds(struct ftrace_event_call *call);
extern int filter_match_preds(struct ftrace_event_call *call, void *rec);
extern int filter_current_check_discard(struct ftrace_event_call *call,
void *rec,
struct ring_buffer_event *event);
extern int trace_define_field(struct ftrace_event_call *call, char *type,
char *name, int offset, int size, int is_signed);
#define is_signed_type(type) (((type)(-1)) < 0)
int trace_set_clr_event(const char *system, const char *event, int set);
/*
* The double __builtin_constant_p is because gcc will give us an error
* if we try to allocate the static variable to fmt if it is not a
* constant. Even with the outer if statement optimizing out.
*/
#define event_trace_printk(ip, fmt, args...) \
do { \
__trace_printk_check_format(fmt, ##args); \
tracing_record_cmdline(current); \
if (__builtin_constant_p(fmt)) { \
static const char *trace_printk_fmt \
__attribute__((section("__trace_printk_fmt"))) = \
__builtin_constant_p(fmt) ? fmt : NULL; \
\
__trace_bprintk(ip, trace_printk_fmt, ##args); \
} else \
__trace_printk(ip, fmt, ##args); \
} while (0)
#define __common_field(type, item, is_signed) \
ret = trace_define_field(event_call, #type, "common_" #item, \
offsetof(typeof(field.ent), item), \
sizeof(field.ent.item), is_signed); \
if (ret) \
return ret;
#endif /* _LINUX_FTRACE_EVENT_H */
......@@ -174,6 +174,7 @@ extern struct cred init_cred;
INIT_TRACE_IRQFLAGS \
INIT_LOCKDEP \
INIT_FTRACE_GRAPH \
INIT_TRACE_RECURSION \
}
......
/*
* Copyright (C) 2008 Eduard - Gabriel Munteanu
*
* This file is released under GPL version 2.
*/
#ifndef _LINUX_KMEMTRACE_H
#define _LINUX_KMEMTRACE_H
#ifdef __KERNEL__
#include <trace/events/kmem.h>
#ifdef CONFIG_KMEMTRACE
extern void kmemtrace_init(void);
#else
static inline void kmemtrace_init(void)
{
}
#endif
#endif /* __KERNEL__ */
#endif /* _LINUX_KMEMTRACE_H */
......@@ -19,6 +19,7 @@ struct anon_vma;
struct file_ra_state;
struct user_struct;
struct writeback_control;
struct rlimit;
#ifndef CONFIG_DISCONTIGMEM /* Don't use mapnrs, do it properly */
extern unsigned long max_mapnr;
......@@ -1317,8 +1318,8 @@ int vmemmap_populate_basepages(struct page *start_page,
int vmemmap_populate(struct page *start_page, unsigned long pages, int node);
void vmemmap_populate_print_last(void);
extern void *alloc_locked_buffer(size_t size);
extern void free_locked_buffer(void *buffer, size_t size);
extern void release_locked_buffer(void *buffer, size_t size);
extern int account_locked_memory(struct mm_struct *mm, struct rlimit *rlim,
size_t size);
extern void refund_locked_memory(struct mm_struct *mm, size_t size);
#endif /* __KERNEL__ */
#endif /* _LINUX_MM_H */
......@@ -30,6 +30,8 @@ extern unsigned int kmmio_count;
extern int register_kmmio_probe(struct kmmio_probe *p);
extern void unregister_kmmio_probe(struct kmmio_probe *p);
extern int kmmio_init(void);
extern void kmmio_cleanup(void);
#ifdef CONFIG_MMIOTRACE
/* kmmio is active by some kmmio_probes? */
......
......@@ -337,6 +337,14 @@ struct module
const char **trace_bprintk_fmt_start;
unsigned int num_trace_bprintk_fmt;
#endif
#ifdef CONFIG_EVENT_TRACING
struct ftrace_event_call *trace_events;
unsigned int num_trace_events;
#endif
#ifdef CONFIG_FTRACE_MCOUNT_RECORD
unsigned long *ftrace_callsites;
unsigned int num_ftrace_callsites;
#endif
#ifdef CONFIG_MODULE_UNLOAD
/* What modules depend on me? */
......
......@@ -95,7 +95,6 @@ extern void __ptrace_link(struct task_struct *child,
struct task_struct *new_parent);
extern void __ptrace_unlink(struct task_struct *child);
extern void exit_ptrace(struct task_struct *tracer);
extern void ptrace_fork(struct task_struct *task, unsigned long clone_flags);
#define PTRACE_MODE_READ 1
#define PTRACE_MODE_ATTACH 2
/* Returns 0 on success, -errno on denial. */
......@@ -327,15 +326,6 @@ static inline void user_enable_block_step(struct task_struct *task)
#define arch_ptrace_untrace(task) do { } while (0)
#endif
#ifndef arch_ptrace_fork
/*
* Do machine-specific work to initialize a new task.
*
* This is called from copy_process().
*/
#define arch_ptrace_fork(child, clone_flags) do { } while (0)
#endif
extern int task_current_syscall(struct task_struct *target, long *callno,
unsigned long args[6], unsigned int maxargs,
unsigned long *sp, unsigned long *pc);
......
......@@ -11,7 +11,7 @@ struct ring_buffer_iter;
* Don't refer to this struct directly, use functions below.
*/
struct ring_buffer_event {
u32 type:2, len:3, time_delta:27;
u32 type_len:5, time_delta:27;
u32 array[];
};
......@@ -24,7 +24,8 @@ struct ring_buffer_event {
* size is variable depending on how much
* padding is needed
* If time_delta is non zero:
* everything else same as RINGBUF_TYPE_DATA
* array[0] holds the actual length
* size = 4 + length (bytes)
*
* @RINGBUF_TYPE_TIME_EXTEND: Extend the time delta
* array[0] = time delta (28 .. 59)
......@@ -35,22 +36,23 @@ struct ring_buffer_event {
* array[1..2] = tv_sec
* size = 16 bytes
*
* @RINGBUF_TYPE_DATA: Data record
* If len is zero:
* <= @RINGBUF_TYPE_DATA_TYPE_LEN_MAX:
* Data record
* If type_len is zero:
* array[0] holds the actual length
* array[1..(length+3)/4] holds data
* size = 4 + 4 + length (bytes)
* size = 4 + length (bytes)
* else
* length = len << 2
* length = type_len << 2
* array[0..(length+3)/4-1] holds data
* size = 4 + length (bytes)
*/
enum ring_buffer_type {
RINGBUF_TYPE_DATA_TYPE_LEN_MAX = 28,
RINGBUF_TYPE_PADDING,
RINGBUF_TYPE_TIME_EXTEND,
/* FIXME: RINGBUF_TYPE_TIME_STAMP not implemented */
RINGBUF_TYPE_TIME_STAMP,
RINGBUF_TYPE_DATA,
};
unsigned ring_buffer_event_length(struct ring_buffer_event *event);
......@@ -68,13 +70,54 @@ ring_buffer_event_time_delta(struct ring_buffer_event *event)
return event->time_delta;
}
/*
* ring_buffer_event_discard can discard any event in the ring buffer.
* it is up to the caller to protect against a reader from
* consuming it or a writer from wrapping and replacing it.
*
* No external protection is needed if this is called before
* the event is commited. But in that case it would be better to
* use ring_buffer_discard_commit.
*
* Note, if an event that has not been committed is discarded
* with ring_buffer_event_discard, it must still be committed.
*/
void ring_buffer_event_discard(struct ring_buffer_event *event);
/*
* ring_buffer_discard_commit will remove an event that has not
* ben committed yet. If this is used, then ring_buffer_unlock_commit
* must not be called on the discarded event. This function
* will try to remove the event from the ring buffer completely
* if another event has not been written after it.
*
* Example use:
*
* if (some_condition)
* ring_buffer_discard_commit(buffer, event);
* else
* ring_buffer_unlock_commit(buffer, event);
*/
void ring_buffer_discard_commit(struct ring_buffer *buffer,
struct ring_buffer_event *event);
/*
* size is in bytes for each per CPU buffer.
*/
struct ring_buffer *
ring_buffer_alloc(unsigned long size, unsigned flags);
__ring_buffer_alloc(unsigned long size, unsigned flags, struct lock_class_key *key);
/*
* Because the ring buffer is generic, if other users of the ring buffer get
* traced by ftrace, it can produce lockdep warnings. We need to keep each
* ring buffer's lock class separate.
*/
#define ring_buffer_alloc(size, flags) \
({ \
static struct lock_class_key __key; \
__ring_buffer_alloc((size), (flags), &__key); \
})
void ring_buffer_free(struct ring_buffer *buffer);
int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size);
......@@ -122,6 +165,8 @@ unsigned long ring_buffer_entries(struct ring_buffer *buffer);
unsigned long ring_buffer_overruns(struct ring_buffer *buffer);
unsigned long ring_buffer_entries_cpu(struct ring_buffer *buffer, int cpu);
unsigned long ring_buffer_overrun_cpu(struct ring_buffer *buffer, int cpu);
unsigned long ring_buffer_commit_overrun_cpu(struct ring_buffer *buffer, int cpu);
unsigned long ring_buffer_nmi_dropped_cpu(struct ring_buffer *buffer, int cpu);
u64 ring_buffer_time_stamp(struct ring_buffer *buffer, int cpu);
void ring_buffer_normalize_time_stamp(struct ring_buffer *buffer,
......@@ -137,6 +182,11 @@ void ring_buffer_free_read_page(struct ring_buffer *buffer, void *data);
int ring_buffer_read_page(struct ring_buffer *buffer, void **data_page,
size_t len, int cpu, int full);
struct trace_seq;
int ring_buffer_print_entry_header(struct trace_seq *s);
int ring_buffer_print_page_header(struct trace_seq *s);
enum ring_buffer_flags {
RB_FL_OVERWRITE = 1 << 0,
};
......
......@@ -97,8 +97,8 @@ struct exec_domain;
struct futex_pi_state;
struct robust_list_head;
struct bio;
struct bts_tracer;
struct fs_struct;
struct bts_context;
/*
* List of flags we want to share for kernel threads,
......@@ -1230,18 +1230,11 @@ struct task_struct {
struct list_head ptraced;
struct list_head ptrace_entry;
#ifdef CONFIG_X86_PTRACE_BTS
/*
* This is the tracer handle for the ptrace BTS extension.
* This field actually belongs to the ptracer task.
*/
struct bts_tracer *bts;
/*
* The buffer to hold the BTS data.
*/
void *bts_buffer;
size_t bts_size;
#endif /* CONFIG_X86_PTRACE_BTS */
struct bts_context *bts;
/* PID/PID hash table linkage. */
struct pid_link pids[PIDTYPE_MAX];
......@@ -1449,7 +1442,9 @@ struct task_struct {
#ifdef CONFIG_TRACING
/* state flags for use by tracers */
unsigned long trace;
#endif
/* bitmask of trace recursion */
unsigned long trace_recursion;
#endif /* CONFIG_TRACING */
};
/* Future-safe accessor for struct task_struct's cpus_allowed. */
......@@ -2022,8 +2017,10 @@ extern void set_task_comm(struct task_struct *tsk, char *from);
extern char *get_task_comm(char *to, struct task_struct *tsk);
#ifdef CONFIG_SMP
extern void wait_task_context_switch(struct task_struct *p);
extern unsigned long wait_task_inactive(struct task_struct *, long match_state);
#else
static inline void wait_task_context_switch(struct task_struct *p) {}
static inline unsigned long wait_task_inactive(struct task_struct *p,
long match_state)
{
......
......@@ -14,7 +14,7 @@
#include <asm/page.h> /* kmalloc_sizes.h needs PAGE_SIZE */
#include <asm/cache.h> /* kmalloc_sizes.h needs L1_CACHE_BYTES */
#include <linux/compiler.h>
#include <trace/kmemtrace.h>
#include <linux/kmemtrace.h>
/* Size description struct for general caches. */
struct cache_sizes {
......
......@@ -10,7 +10,7 @@
#include <linux/gfp.h>
#include <linux/workqueue.h>
#include <linux/kobject.h>
#include <trace/kmemtrace.h>
#include <linux/kmemtrace.h>
enum stat_item {
ALLOC_FASTPATH, /* Allocation from cpu slab */
......
#ifndef _LINUX_TRACE_SEQ_H
#define _LINUX_TRACE_SEQ_H
#include <linux/fs.h>
/*
* Trace sequences are used to allow a function to call several other functions
* to create a string of data to use (up to a max of PAGE_SIZE.
*/
struct trace_seq {
unsigned char buffer[PAGE_SIZE];
unsigned int len;
unsigned int readpos;
};
static inline void
trace_seq_init(struct trace_seq *s)
{
s->len = 0;
s->readpos = 0;
}
/*
* Currently only defined when tracing is enabled.
*/
#ifdef CONFIG_TRACING
extern int trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
__attribute__ ((format (printf, 2, 3)));
extern int trace_seq_vprintf(struct trace_seq *s, const char *fmt, va_list args)
__attribute__ ((format (printf, 2, 0)));
extern int
trace_seq_bprintf(struct trace_seq *s, const char *fmt, const u32 *binary);
extern void trace_print_seq(struct seq_file *m, struct trace_seq *s);
extern ssize_t trace_seq_to_user(struct trace_seq *s, char __user *ubuf,
size_t cnt);
extern int trace_seq_puts(struct trace_seq *s, const char *str);
extern int trace_seq_putc(struct trace_seq *s, unsigned char c);
extern int trace_seq_putmem(struct trace_seq *s, const void *mem, size_t len);
extern int trace_seq_putmem_hex(struct trace_seq *s, const void *mem,
size_t len);
extern void *trace_seq_reserve(struct trace_seq *s, size_t len);
extern int trace_seq_path(struct trace_seq *s, struct path *path);
#else /* CONFIG_TRACING */
static inline int trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
{
return 0;
}
static inline int
trace_seq_bprintf(struct trace_seq *s, const char *fmt, const u32 *binary)
{
return 0;
}
static inline void trace_print_seq(struct seq_file *m, struct trace_seq *s)
{
}
static inline ssize_t trace_seq_to_user(struct trace_seq *s, char __user *ubuf,
size_t cnt)
{
return 0;
}
static inline int trace_seq_puts(struct trace_seq *s, const char *str)
{
return 0;
}
static inline int trace_seq_putc(struct trace_seq *s, unsigned char c)
{
return 0;
}
static inline int
trace_seq_putmem(struct trace_seq *s, const void *mem, size_t len)
{
return 0;
}
static inline int trace_seq_putmem_hex(struct trace_seq *s, const void *mem,
size_t len)
{
return 0;
}
static inline void *trace_seq_reserve(struct trace_seq *s, size_t len)
{
return NULL;
}
static inline int trace_seq_path(struct trace_seq *s, struct path *path)
{
return 0;
}
#endif /* CONFIG_TRACING */
#endif /* _LINUX_TRACE_SEQ_H */
......@@ -31,6 +31,8 @@ struct tracepoint {
* Keep in sync with vmlinux.lds.h.
*/
#ifndef DECLARE_TRACE
#define TP_PROTO(args...) args
#define TP_ARGS(args...) args
......@@ -114,6 +116,7 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
struct tracepoint *end)
{ }
#endif /* CONFIG_TRACEPOINTS */
#endif /* DECLARE_TRACE */
/*
* Connect a probe to a tracepoint.
......@@ -154,10 +157,8 @@ static inline void tracepoint_synchronize_unregister(void)
}
#define PARAMS(args...) args
#define TRACE_FORMAT(name, proto, args, fmt) \
DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
#ifndef TRACE_EVENT
/*
* For use with the TRACE_EVENT macro:
*
......@@ -262,5 +263,6 @@ static inline void tracepoint_synchronize_unregister(void)
#define TRACE_EVENT(name, proto, args, struct, assign, print) \
DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
#endif
#endif
#ifndef _TRACE_BLOCK_H
#define _TRACE_BLOCK_H
#include <linux/blkdev.h>
#include <linux/tracepoint.h>
DECLARE_TRACE(block_rq_abort,
TP_PROTO(struct request_queue *q, struct request *rq),
TP_ARGS(q, rq));
DECLARE_TRACE(block_rq_insert,
TP_PROTO(struct request_queue *q, struct request *rq),
TP_ARGS(q, rq));
DECLARE_TRACE(block_rq_issue,
TP_PROTO(struct request_queue *q, struct request *rq),
TP_ARGS(q, rq));
DECLARE_TRACE(block_rq_requeue,
TP_PROTO(struct request_queue *q, struct request *rq),
TP_ARGS(q, rq));
DECLARE_TRACE(block_rq_complete,
TP_PROTO(struct request_queue *q, struct request *rq),
TP_ARGS(q, rq));
DECLARE_TRACE(block_bio_bounce,
TP_PROTO(struct request_queue *q, struct bio *bio),
TP_ARGS(q, bio));
DECLARE_TRACE(block_bio_complete,
TP_PROTO(struct request_queue *q, struct bio *bio),
TP_ARGS(q, bio));
DECLARE_TRACE(block_bio_backmerge,
TP_PROTO(struct request_queue *q, struct bio *bio),
TP_ARGS(q, bio));
DECLARE_TRACE(block_bio_frontmerge,
TP_PROTO(struct request_queue *q, struct bio *bio),
TP_ARGS(q, bio));
DECLARE_TRACE(block_bio_queue,
TP_PROTO(struct request_queue *q, struct bio *bio),
TP_ARGS(q, bio));
DECLARE_TRACE(block_getrq,
TP_PROTO(struct request_queue *q, struct bio *bio, int rw),
TP_ARGS(q, bio, rw));
DECLARE_TRACE(block_sleeprq,
TP_PROTO(struct request_queue *q, struct bio *bio, int rw),
TP_ARGS(q, bio, rw));
DECLARE_TRACE(block_plug,
TP_PROTO(struct request_queue *q),
TP_ARGS(q));
DECLARE_TRACE(block_unplug_timer,
TP_PROTO(struct request_queue *q),
TP_ARGS(q));
DECLARE_TRACE(block_unplug_io,
TP_PROTO(struct request_queue *q),
TP_ARGS(q));
DECLARE_TRACE(block_split,
TP_PROTO(struct request_queue *q, struct bio *bio, unsigned int pdu),
TP_ARGS(q, bio, pdu));
DECLARE_TRACE(block_remap,
TP_PROTO(struct request_queue *q, struct bio *bio, dev_t dev,
sector_t from, sector_t to),
TP_ARGS(q, bio, dev, from, to));
#endif
/*
* Trace files that want to automate creationg of all tracepoints defined
* in their file should include this file. The following are macros that the
* trace file may define:
*
* TRACE_SYSTEM defines the system the tracepoint is for
*
* TRACE_INCLUDE_FILE if the file name is something other than TRACE_SYSTEM.h
* This macro may be defined to tell define_trace.h what file to include.
* Note, leave off the ".h".
*
* TRACE_INCLUDE_PATH if the path is something other than core kernel include/trace
* then this macro can define the path to use. Note, the path is relative to
* define_trace.h, not the file including it. Full path names for out of tree
* modules must be used.
*/
#ifdef CREATE_TRACE_POINTS
/* Prevent recursion */
#undef CREATE_TRACE_POINTS
#include <linux/stringify.h>
#undef TRACE_EVENT
#define TRACE_EVENT(name, proto, args, tstruct, assign, print) \
DEFINE_TRACE(name)
#undef DECLARE_TRACE
#define DECLARE_TRACE(name, proto, args) \
DEFINE_TRACE(name)
#undef TRACE_INCLUDE
#undef __TRACE_INCLUDE
#ifndef TRACE_INCLUDE_FILE
# define TRACE_INCLUDE_FILE TRACE_SYSTEM
# define UNDEF_TRACE_INCLUDE_FILE
#endif
#ifndef TRACE_INCLUDE_PATH
# define __TRACE_INCLUDE(system) <trace/events/system.h>
# define UNDEF_TRACE_INCLUDE_PATH
#else
# define __TRACE_INCLUDE(system) __stringify(TRACE_INCLUDE_PATH/system.h)
#endif
# define TRACE_INCLUDE(system) __TRACE_INCLUDE(system)
/* Let the trace headers be reread */
#define TRACE_HEADER_MULTI_READ
#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
#ifdef CONFIG_EVENT_TRACING
#include <trace/ftrace.h>
#endif
#undef TRACE_HEADER_MULTI_READ
/* Only undef what we defined in this file */
#ifdef UNDEF_TRACE_INCLUDE_FILE
# undef TRACE_INCLUDE_FILE
# undef UNDEF_TRACE_INCLUDE_FILE
#endif
#ifdef UNDEF_TRACE_INCLUDE_PATH
# undef TRACE_INCLUDE_PATH
# undef UNDEF_TRACE_INCLUDE_PATH
#endif
/* We may be processing more files */
#define CREATE_TRACE_POINTS
#endif /* CREATE_TRACE_POINTS */
#if !defined(_TRACE_BLOCK_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_BLOCK_H
#include <linux/blktrace_api.h>
#include <linux/blkdev.h>
#include <linux/tracepoint.h>
#undef TRACE_SYSTEM
#define TRACE_SYSTEM block
TRACE_EVENT(block_rq_abort,
TP_PROTO(struct request_queue *q, struct request *rq),
TP_ARGS(q, rq),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned int, nr_sector )
__field( int, errors )
__array( char, rwbs, 6 )
__dynamic_array( char, cmd, blk_cmd_buf_len(rq) )
),
TP_fast_assign(
__entry->dev = rq->rq_disk ? disk_devt(rq->rq_disk) : 0;
__entry->sector = blk_pc_request(rq) ? 0 : rq->hard_sector;
__entry->nr_sector = blk_pc_request(rq) ?
0 : rq->hard_nr_sectors;
__entry->errors = rq->errors;
blk_fill_rwbs_rq(__entry->rwbs, rq);
blk_dump_cmd(__get_str(cmd), rq);
),
TP_printk("%d,%d %s (%s) %llu + %u [%d]",
MAJOR(__entry->dev), MINOR(__entry->dev),
__entry->rwbs, __get_str(cmd),
(unsigned long long)__entry->sector,
__entry->nr_sector, __entry->errors)
);
TRACE_EVENT(block_rq_insert,
TP_PROTO(struct request_queue *q, struct request *rq),
TP_ARGS(q, rq),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned int, nr_sector )
__field( unsigned int, bytes )
__array( char, rwbs, 6 )
__array( char, comm, TASK_COMM_LEN )
__dynamic_array( char, cmd, blk_cmd_buf_len(rq) )
),
TP_fast_assign(
__entry->dev = rq->rq_disk ? disk_devt(rq->rq_disk) : 0;
__entry->sector = blk_pc_request(rq) ? 0 : rq->hard_sector;
__entry->nr_sector = blk_pc_request(rq) ?
0 : rq->hard_nr_sectors;
__entry->bytes = blk_pc_request(rq) ? rq->data_len : 0;
blk_fill_rwbs_rq(__entry->rwbs, rq);
blk_dump_cmd(__get_str(cmd), rq);
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
TP_printk("%d,%d %s %u (%s) %llu + %u [%s]",
MAJOR(__entry->dev), MINOR(__entry->dev),
__entry->rwbs, __entry->bytes, __get_str(cmd),
(unsigned long long)__entry->sector,
__entry->nr_sector, __entry->comm)
);
TRACE_EVENT(block_rq_issue,
TP_PROTO(struct request_queue *q, struct request *rq),
TP_ARGS(q, rq),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned int, nr_sector )
__field( unsigned int, bytes )
__array( char, rwbs, 6 )
__array( char, comm, TASK_COMM_LEN )
__dynamic_array( char, cmd, blk_cmd_buf_len(rq) )
),
TP_fast_assign(
__entry->dev = rq->rq_disk ? disk_devt(rq->rq_disk) : 0;
__entry->sector = blk_pc_request(rq) ? 0 : rq->hard_sector;
__entry->nr_sector = blk_pc_request(rq) ?
0 : rq->hard_nr_sectors;
__entry->bytes = blk_pc_request(rq) ? rq->data_len : 0;
blk_fill_rwbs_rq(__entry->rwbs, rq);
blk_dump_cmd(__get_str(cmd), rq);
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
TP_printk("%d,%d %s %u (%s) %llu + %u [%s]",
MAJOR(__entry->dev), MINOR(__entry->dev),
__entry->rwbs, __entry->bytes, __get_str(cmd),
(unsigned long long)__entry->sector,
__entry->nr_sector, __entry->comm)
);
TRACE_EVENT(block_rq_requeue,
TP_PROTO(struct request_queue *q, struct request *rq),
TP_ARGS(q, rq),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned int, nr_sector )
__field( int, errors )
__array( char, rwbs, 6 )
__dynamic_array( char, cmd, blk_cmd_buf_len(rq) )
),
TP_fast_assign(
__entry->dev = rq->rq_disk ? disk_devt(rq->rq_disk) : 0;
__entry->sector = blk_pc_request(rq) ? 0 : rq->hard_sector;
__entry->nr_sector = blk_pc_request(rq) ?
0 : rq->hard_nr_sectors;
__entry->errors = rq->errors;
blk_fill_rwbs_rq(__entry->rwbs, rq);
blk_dump_cmd(__get_str(cmd), rq);
),
TP_printk("%d,%d %s (%s) %llu + %u [%d]",
MAJOR(__entry->dev), MINOR(__entry->dev),
__entry->rwbs, __get_str(cmd),
(unsigned long long)__entry->sector,
__entry->nr_sector, __entry->errors)
);
TRACE_EVENT(block_rq_complete,
TP_PROTO(struct request_queue *q, struct request *rq),
TP_ARGS(q, rq),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned int, nr_sector )
__field( int, errors )
__array( char, rwbs, 6 )
__dynamic_array( char, cmd, blk_cmd_buf_len(rq) )
),
TP_fast_assign(
__entry->dev = rq->rq_disk ? disk_devt(rq->rq_disk) : 0;
__entry->sector = blk_pc_request(rq) ? 0 : rq->hard_sector;
__entry->nr_sector = blk_pc_request(rq) ?
0 : rq->hard_nr_sectors;
__entry->errors = rq->errors;
blk_fill_rwbs_rq(__entry->rwbs, rq);
blk_dump_cmd(__get_str(cmd), rq);
),
TP_printk("%d,%d %s (%s) %llu + %u [%d]",
MAJOR(__entry->dev), MINOR(__entry->dev),
__entry->rwbs, __get_str(cmd),
(unsigned long long)__entry->sector,
__entry->nr_sector, __entry->errors)
);
TRACE_EVENT(block_bio_bounce,
TP_PROTO(struct request_queue *q, struct bio *bio),
TP_ARGS(q, bio),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned int, nr_sector )
__array( char, rwbs, 6 )
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
__entry->dev = bio->bi_bdev->bd_dev;
__entry->sector = bio->bi_sector;
__entry->nr_sector = bio->bi_size >> 9;
blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_size);
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
TP_printk("%d,%d %s %llu + %u [%s]",
MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
(unsigned long long)__entry->sector,
__entry->nr_sector, __entry->comm)
);
TRACE_EVENT(block_bio_complete,
TP_PROTO(struct request_queue *q, struct bio *bio),
TP_ARGS(q, bio),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned, nr_sector )
__field( int, error )
__array( char, rwbs, 6 )
),
TP_fast_assign(
__entry->dev = bio->bi_bdev->bd_dev;
__entry->sector = bio->bi_sector;
__entry->nr_sector = bio->bi_size >> 9;
blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_size);
),
TP_printk("%d,%d %s %llu + %u [%d]",
MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
(unsigned long long)__entry->sector,
__entry->nr_sector, __entry->error)
);
TRACE_EVENT(block_bio_backmerge,
TP_PROTO(struct request_queue *q, struct bio *bio),
TP_ARGS(q, bio),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned int, nr_sector )
__array( char, rwbs, 6 )
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
__entry->dev = bio->bi_bdev->bd_dev;
__entry->sector = bio->bi_sector;
__entry->nr_sector = bio->bi_size >> 9;
blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_size);
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
TP_printk("%d,%d %s %llu + %u [%s]",
MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
(unsigned long long)__entry->sector,
__entry->nr_sector, __entry->comm)
);
TRACE_EVENT(block_bio_frontmerge,
TP_PROTO(struct request_queue *q, struct bio *bio),
TP_ARGS(q, bio),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned, nr_sector )
__array( char, rwbs, 6 )
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
__entry->dev = bio->bi_bdev->bd_dev;
__entry->sector = bio->bi_sector;
__entry->nr_sector = bio->bi_size >> 9;
blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_size);
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
TP_printk("%d,%d %s %llu + %u [%s]",
MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
(unsigned long long)__entry->sector,
__entry->nr_sector, __entry->comm)
);
TRACE_EVENT(block_bio_queue,
TP_PROTO(struct request_queue *q, struct bio *bio),
TP_ARGS(q, bio),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned int, nr_sector )
__array( char, rwbs, 6 )
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
__entry->dev = bio->bi_bdev->bd_dev;
__entry->sector = bio->bi_sector;
__entry->nr_sector = bio->bi_size >> 9;
blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_size);
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
TP_printk("%d,%d %s %llu + %u [%s]",
MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
(unsigned long long)__entry->sector,
__entry->nr_sector, __entry->comm)
);
TRACE_EVENT(block_getrq,
TP_PROTO(struct request_queue *q, struct bio *bio, int rw),
TP_ARGS(q, bio, rw),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned int, nr_sector )
__array( char, rwbs, 6 )
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
__entry->dev = bio ? bio->bi_bdev->bd_dev : 0;
__entry->sector = bio ? bio->bi_sector : 0;
__entry->nr_sector = bio ? bio->bi_size >> 9 : 0;
blk_fill_rwbs(__entry->rwbs,
bio ? bio->bi_rw : 0, __entry->nr_sector);
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
TP_printk("%d,%d %s %llu + %u [%s]",
MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
(unsigned long long)__entry->sector,
__entry->nr_sector, __entry->comm)
);
TRACE_EVENT(block_sleeprq,
TP_PROTO(struct request_queue *q, struct bio *bio, int rw),
TP_ARGS(q, bio, rw),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned int, nr_sector )
__array( char, rwbs, 6 )
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
__entry->dev = bio ? bio->bi_bdev->bd_dev : 0;
__entry->sector = bio ? bio->bi_sector : 0;
__entry->nr_sector = bio ? bio->bi_size >> 9 : 0;
blk_fill_rwbs(__entry->rwbs,
bio ? bio->bi_rw : 0, __entry->nr_sector);
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
TP_printk("%d,%d %s %llu + %u [%s]",
MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
(unsigned long long)__entry->sector,
__entry->nr_sector, __entry->comm)
);
TRACE_EVENT(block_plug,
TP_PROTO(struct request_queue *q),
TP_ARGS(q),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
TP_printk("[%s]", __entry->comm)
);
TRACE_EVENT(block_unplug_timer,
TP_PROTO(struct request_queue *q),
TP_ARGS(q),
TP_STRUCT__entry(
__field( int, nr_rq )
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
__entry->nr_rq = q->rq.count[READ] + q->rq.count[WRITE];
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
TP_printk("[%s] %d", __entry->comm, __entry->nr_rq)
);
TRACE_EVENT(block_unplug_io,
TP_PROTO(struct request_queue *q),
TP_ARGS(q),
TP_STRUCT__entry(
__field( int, nr_rq )
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
__entry->nr_rq = q->rq.count[READ] + q->rq.count[WRITE];
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
TP_printk("[%s] %d", __entry->comm, __entry->nr_rq)
);
TRACE_EVENT(block_split,
TP_PROTO(struct request_queue *q, struct bio *bio,
unsigned int new_sector),
TP_ARGS(q, bio, new_sector),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( sector_t, new_sector )
__array( char, rwbs, 6 )
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
__entry->dev = bio->bi_bdev->bd_dev;
__entry->sector = bio->bi_sector;
__entry->new_sector = new_sector;
blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_size);
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
TP_printk("%d,%d %s %llu / %llu [%s]",
MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
(unsigned long long)__entry->sector,
(unsigned long long)__entry->new_sector,
__entry->comm)
);
TRACE_EVENT(block_remap,
TP_PROTO(struct request_queue *q, struct bio *bio, dev_t dev,
sector_t from),
TP_ARGS(q, bio, dev, from),
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned int, nr_sector )
__field( dev_t, old_dev )
__field( sector_t, old_sector )
__array( char, rwbs, 6 )
),
TP_fast_assign(
__entry->dev = bio->bi_bdev->bd_dev;
__entry->sector = bio->bi_sector;
__entry->nr_sector = bio->bi_size >> 9;
__entry->old_dev = dev;
__entry->old_sector = from;
blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_size);
),
TP_printk("%d,%d %s %llu + %u <- (%d,%d) %llu",
MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
(unsigned long long)__entry->sector,
__entry->nr_sector,
MAJOR(__entry->old_dev), MINOR(__entry->old_dev),
(unsigned long long)__entry->old_sector)
);
#endif /* _TRACE_BLOCK_H */
/* This part must be outside protection */
#include <trace/define_trace.h>
#if !defined(_TRACE_IRQ_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_IRQ_H
#include <linux/tracepoint.h>
#include <linux/interrupt.h>
#undef TRACE_SYSTEM
#define TRACE_SYSTEM irq
#define softirq_name(sirq) { sirq##_SOFTIRQ, #sirq }
#define show_softirq_name(val) \
__print_symbolic(val, \
softirq_name(HI), \
softirq_name(TIMER), \
softirq_name(NET_TX), \
softirq_name(NET_RX), \
softirq_name(BLOCK), \
softirq_name(TASKLET), \
softirq_name(SCHED), \
softirq_name(HRTIMER), \
softirq_name(RCU))
/**
* irq_handler_entry - called immediately before the irq action handler
* @irq: irq number
* @action: pointer to struct irqaction
*
* The struct irqaction pointed to by @action contains various
* information about the handler, including the device name,
* @action->name, and the device id, @action->dev_id. When used in
* conjunction with the irq_handler_exit tracepoint, we can figure
* out irq handler latencies.
*/
TRACE_EVENT(irq_handler_entry,
TP_PROTO(int irq, struct irqaction *action),
TP_ARGS(irq, action),
TP_STRUCT__entry(
__field( int, irq )
__string( name, action->name )
),
TP_fast_assign(
__entry->irq = irq;
__assign_str(name, action->name);
),
TP_printk("irq=%d handler=%s", __entry->irq, __get_str(name))
);
/**
* irq_handler_exit - called immediately after the irq action handler returns
* @irq: irq number
* @action: pointer to struct irqaction
* @ret: return value
*
* If the @ret value is set to IRQ_HANDLED, then we know that the corresponding
* @action->handler scuccessully handled this irq. Otherwise, the irq might be
* a shared irq line, or the irq was not handled successfully. Can be used in
* conjunction with the irq_handler_entry to understand irq handler latencies.
*/
TRACE_EVENT(irq_handler_exit,
TP_PROTO(int irq, struct irqaction *action, int ret),
TP_ARGS(irq, action, ret),
TP_STRUCT__entry(
__field( int, irq )
__field( int, ret )
),
TP_fast_assign(
__entry->irq = irq;
__entry->ret = ret;
),
TP_printk("irq=%d return=%s",
__entry->irq, __entry->ret ? "handled" : "unhandled")
);
/**
* softirq_entry - called immediately before the softirq handler
* @h: pointer to struct softirq_action
* @vec: pointer to first struct softirq_action in softirq_vec array
*
* The @h parameter, contains a pointer to the struct softirq_action
* which has a pointer to the action handler that is called. By subtracting
* the @vec pointer from the @h pointer, we can determine the softirq
* number. Also, when used in combination with the softirq_exit tracepoint
* we can determine the softirq latency.
*/
TRACE_EVENT(softirq_entry,
TP_PROTO(struct softirq_action *h, struct softirq_action *vec),
TP_ARGS(h, vec),
TP_STRUCT__entry(
__field( int, vec )
),
TP_fast_assign(
__entry->vec = (int)(h - vec);
),
TP_printk("softirq=%d action=%s", __entry->vec,
show_softirq_name(__entry->vec))
);
/**
* softirq_exit - called immediately after the softirq handler returns
* @h: pointer to struct softirq_action
* @vec: pointer to first struct softirq_action in softirq_vec array
*
* The @h parameter contains a pointer to the struct softirq_action
* that has handled the softirq. By subtracting the @vec pointer from
* the @h pointer, we can determine the softirq number. Also, when used in
* combination with the softirq_entry tracepoint we can determine the softirq
* latency.
*/
TRACE_EVENT(softirq_exit,
TP_PROTO(struct softirq_action *h, struct softirq_action *vec),
TP_ARGS(h, vec),
TP_STRUCT__entry(
__field( int, vec )
),
TP_fast_assign(
__entry->vec = (int)(h - vec);
),
TP_printk("softirq=%d action=%s", __entry->vec,
show_softirq_name(__entry->vec))
);
#endif /* _TRACE_IRQ_H */
/* This part must be outside protection */
#include <trace/define_trace.h>
#if !defined(_TRACE_KMEM_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_KMEM_H
#include <linux/types.h>
#include <linux/tracepoint.h>
#undef TRACE_SYSTEM
#define TRACE_SYSTEM kmem
/*
* The order of these masks is important. Matching masks will be seen
* first and the left over flags will end up showing by themselves.
*
* For example, if we have GFP_KERNEL before GFP_USER we wil get:
*
* GFP_KERNEL|GFP_HARDWALL
*
* Thus most bits set go first.
*/
#define show_gfp_flags(flags) \
(flags) ? __print_flags(flags, "|", \
{(unsigned long)GFP_HIGHUSER_MOVABLE, "GFP_HIGHUSER_MOVABLE"}, \
{(unsigned long)GFP_HIGHUSER, "GFP_HIGHUSER"}, \
{(unsigned long)GFP_USER, "GFP_USER"}, \
{(unsigned long)GFP_TEMPORARY, "GFP_TEMPORARY"}, \
{(unsigned long)GFP_KERNEL, "GFP_KERNEL"}, \
{(unsigned long)GFP_NOFS, "GFP_NOFS"}, \
{(unsigned long)GFP_ATOMIC, "GFP_ATOMIC"}, \
{(unsigned long)GFP_NOIO, "GFP_NOIO"}, \
{(unsigned long)__GFP_HIGH, "GFP_HIGH"}, \
{(unsigned long)__GFP_WAIT, "GFP_WAIT"}, \
{(unsigned long)__GFP_IO, "GFP_IO"}, \
{(unsigned long)__GFP_COLD, "GFP_COLD"}, \
{(unsigned long)__GFP_NOWARN, "GFP_NOWARN"}, \
{(unsigned long)__GFP_REPEAT, "GFP_REPEAT"}, \
{(unsigned long)__GFP_NOFAIL, "GFP_NOFAIL"}, \
{(unsigned long)__GFP_NORETRY, "GFP_NORETRY"}, \
{(unsigned long)__GFP_COMP, "GFP_COMP"}, \
{(unsigned long)__GFP_ZERO, "GFP_ZERO"}, \
{(unsigned long)__GFP_NOMEMALLOC, "GFP_NOMEMALLOC"}, \
{(unsigned long)__GFP_HARDWALL, "GFP_HARDWALL"}, \
{(unsigned long)__GFP_THISNODE, "GFP_THISNODE"}, \
{(unsigned long)__GFP_RECLAIMABLE, "GFP_RECLAIMABLE"}, \
{(unsigned long)__GFP_MOVABLE, "GFP_MOVABLE"} \
) : "GFP_NOWAIT"
TRACE_EVENT(kmalloc,
TP_PROTO(unsigned long call_site,
const void *ptr,
size_t bytes_req,
size_t bytes_alloc,
gfp_t gfp_flags),
TP_ARGS(call_site, ptr, bytes_req, bytes_alloc, gfp_flags),
TP_STRUCT__entry(
__field( unsigned long, call_site )
__field( const void *, ptr )
__field( size_t, bytes_req )
__field( size_t, bytes_alloc )
__field( gfp_t, gfp_flags )
),
TP_fast_assign(
__entry->call_site = call_site;
__entry->ptr = ptr;
__entry->bytes_req = bytes_req;
__entry->bytes_alloc = bytes_alloc;
__entry->gfp_flags = gfp_flags;
),
TP_printk("call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s",
__entry->call_site,
__entry->ptr,
__entry->bytes_req,
__entry->bytes_alloc,
show_gfp_flags(__entry->gfp_flags))
);
TRACE_EVENT(kmem_cache_alloc,
TP_PROTO(unsigned long call_site,
const void *ptr,
size_t bytes_req,
size_t bytes_alloc,
gfp_t gfp_flags),
TP_ARGS(call_site, ptr, bytes_req, bytes_alloc, gfp_flags),
TP_STRUCT__entry(
__field( unsigned long, call_site )
__field( const void *, ptr )
__field( size_t, bytes_req )
__field( size_t, bytes_alloc )
__field( gfp_t, gfp_flags )
),
TP_fast_assign(
__entry->call_site = call_site;
__entry->ptr = ptr;
__entry->bytes_req = bytes_req;
__entry->bytes_alloc = bytes_alloc;
__entry->gfp_flags = gfp_flags;
),
TP_printk("call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s",
__entry->call_site,
__entry->ptr,
__entry->bytes_req,
__entry->bytes_alloc,
show_gfp_flags(__entry->gfp_flags))
);
TRACE_EVENT(kmalloc_node,
TP_PROTO(unsigned long call_site,
const void *ptr,
size_t bytes_req,
size_t bytes_alloc,
gfp_t gfp_flags,
int node),
TP_ARGS(call_site, ptr, bytes_req, bytes_alloc, gfp_flags, node),
TP_STRUCT__entry(
__field( unsigned long, call_site )
__field( const void *, ptr )
__field( size_t, bytes_req )
__field( size_t, bytes_alloc )
__field( gfp_t, gfp_flags )
__field( int, node )
),
TP_fast_assign(
__entry->call_site = call_site;
__entry->ptr = ptr;
__entry->bytes_req = bytes_req;
__entry->bytes_alloc = bytes_alloc;
__entry->gfp_flags = gfp_flags;
__entry->node = node;
),
TP_printk("call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s node=%d",
__entry->call_site,
__entry->ptr,
__entry->bytes_req,
__entry->bytes_alloc,
show_gfp_flags(__entry->gfp_flags),
__entry->node)
);
TRACE_EVENT(kmem_cache_alloc_node,
TP_PROTO(unsigned long call_site,
const void *ptr,
size_t bytes_req,
size_t bytes_alloc,
gfp_t gfp_flags,
int node),
TP_ARGS(call_site, ptr, bytes_req, bytes_alloc, gfp_flags, node),
TP_STRUCT__entry(
__field( unsigned long, call_site )
__field( const void *, ptr )
__field( size_t, bytes_req )
__field( size_t, bytes_alloc )
__field( gfp_t, gfp_flags )
__field( int, node )
),
TP_fast_assign(
__entry->call_site = call_site;
__entry->ptr = ptr;
__entry->bytes_req = bytes_req;
__entry->bytes_alloc = bytes_alloc;
__entry->gfp_flags = gfp_flags;
__entry->node = node;
),
TP_printk("call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s node=%d",
__entry->call_site,
__entry->ptr,
__entry->bytes_req,
__entry->bytes_alloc,
show_gfp_flags(__entry->gfp_flags),
__entry->node)
);
TRACE_EVENT(kfree,
TP_PROTO(unsigned long call_site, const void *ptr),
TP_ARGS(call_site, ptr),
TP_STRUCT__entry(
__field( unsigned long, call_site )
__field( const void *, ptr )
),
TP_fast_assign(
__entry->call_site = call_site;
__entry->ptr = ptr;
),
TP_printk("call_site=%lx ptr=%p", __entry->call_site, __entry->ptr)
);
TRACE_EVENT(kmem_cache_free,
TP_PROTO(unsigned long call_site, const void *ptr),
TP_ARGS(call_site, ptr),
TP_STRUCT__entry(
__field( unsigned long, call_site )
__field( const void *, ptr )
),
TP_fast_assign(
__entry->call_site = call_site;
__entry->ptr = ptr;
),
TP_printk("call_site=%lx ptr=%p", __entry->call_site, __entry->ptr)
);
#endif /* _TRACE_KMEM_H */
/* This part must be outside protection */
#include <trace/define_trace.h>
此差异已折叠。
#if !defined(_TRACE_SCHED_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_SCHED_H
/* use <trace/sched.h> instead */
#ifndef TRACE_EVENT
# error Do not include this file directly.
# error Unless you know what you are doing.
#endif
#include <linux/sched.h>
#include <linux/tracepoint.h>
#undef TRACE_SYSTEM
#define TRACE_SYSTEM sched
......@@ -157,6 +156,7 @@ TRACE_EVENT(sched_switch,
__array( char, prev_comm, TASK_COMM_LEN )
__field( pid_t, prev_pid )
__field( int, prev_prio )
__field( long, prev_state )
__array( char, next_comm, TASK_COMM_LEN )
__field( pid_t, next_pid )
__field( int, next_prio )
......@@ -166,13 +166,19 @@ TRACE_EVENT(sched_switch,
memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
__entry->prev_pid = prev->pid;
__entry->prev_prio = prev->prio;
__entry->prev_state = prev->state;
memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
__entry->next_pid = next->pid;
__entry->next_prio = next->prio;
),
TP_printk("task %s:%d [%d] ==> %s:%d [%d]",
TP_printk("task %s:%d [%d] (%s) ==> %s:%d [%d]",
__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
__entry->prev_state ?
__print_flags(__entry->prev_state, "|",
{ 1, "S"} , { 2, "D" }, { 4, "T" }, { 8, "t" },
{ 16, "Z" }, { 32, "X" }, { 64, "x" },
{ 128, "W" }) : "R",
__entry->next_comm, __entry->next_pid, __entry->next_prio)
);
......@@ -181,9 +187,9 @@ TRACE_EVENT(sched_switch,
*/
TRACE_EVENT(sched_migrate_task,
TP_PROTO(struct task_struct *p, int orig_cpu, int dest_cpu),
TP_PROTO(struct task_struct *p, int dest_cpu),
TP_ARGS(p, orig_cpu, dest_cpu),
TP_ARGS(p, dest_cpu),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
......@@ -197,7 +203,7 @@ TRACE_EVENT(sched_migrate_task,
memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
__entry->pid = p->pid;
__entry->prio = p->prio;
__entry->orig_cpu = orig_cpu;
__entry->orig_cpu = task_cpu(p);
__entry->dest_cpu = dest_cpu;
),
......@@ -334,4 +340,7 @@ TRACE_EVENT(sched_signal_send,
__entry->sig, __entry->comm, __entry->pid)
);
#undef TRACE_SYSTEM
#endif /* _TRACE_SCHED_H */
/* This part must be outside protection */
#include <trace/define_trace.h>
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
/* trace/<type>_event_types.h here */
#include <trace/sched_event_types.h>
#include <trace/irq_event_types.h>
#include <trace/lockdep_event_types.h>
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册