1. 28 Feb, 2009 7 commits
    • tracing: add raw trace point recording infrastructure · c32e827b
      Authored by Steven Rostedt
      Impact: lower overhead tracing
      
      The current event tracer can automatically pick up trace points
      that are registered with the TRACE_FORMAT macro. But it requires
      a printf format string and parsing. Although this adds the ability
      to get guaranteed information like task names and such, it takes
      a hit in processing overhead. This processing can add about 500-1000
      nanoseconds of overhead, and in some cases even that is considered
      too much, so we want to shave off as much of this overhead as
      possible.
      
      Tom Zanussi recently posted tracing patches to lkml that are based
      on a nice idea about capturing the data via C structs using
      STRUCT_ENTER, STRUCT_EXIT type of macros.
      
      I liked that method very much, but did not like the implementation
      that required a developer to add data/code in several disjoint
      locations.
      
      This patch extends the event_tracer macros to take a "raw C"
      approach similar to Tom Zanussi's. But instead of developers
      needing to tweak a bunch of code all over the place, they can do it
      all in one macro - preferably placed near the code that it is
      tracing. That makes it much more likely that tracepoints will be
      maintained on an ongoing basis along with the code they trace.
      
      The new macro TRACE_EVENT_FORMAT is created for this approach. (Note,
      a developer may still utilize the more low level DECLARE_TRACE macros
      if they don't care about getting their traces automatically in the event
      tracer.)
      
      They can also use the existing TRACE_FORMAT if they don't need to code
      the tracepoint in C, but just want to use the convenience of printf.
      
      So if the developer wants to "hardwire" a tracepoint in the fastest
      possible way, and wants to acquire their data via a user space utility
      in a raw binary format, or wants to see it in the trace output but not
      sacrifice any performance, then they can implement the faster but
      more complex TRACE_EVENT_FORMAT macro.
      
      Here's what usage looks like:
      
        TRACE_EVENT_FORMAT(name,
      	TPPROTO(proto),
      	TPARGS(args),
      	TPFMT(fmt, fmt_args),
      	TRACE_STRUCT(
      		TRACE_FIELD(type1, item1, assign1)
      		TRACE_FIELD(type2, item2, assign2)
      			[...]
      	),
      	TPRAWFMT(raw_fmt)
      	);
      
      Note that name, proto, args, and fmt are all identical to what
      TRACE_FORMAT uses.
      
       name: the unique identifier of the trace point
       proto: the prototype that the trace point uses
       args: the args in the prototype
       fmt: printf format to use with the event printf tracer
       fmt_args: the printf arguments to match fmt
      
       TRACE_STRUCT starts the ability to create a structure.
       Each item in the structure is defined with a TRACE_FIELD
      
        TRACE_FIELD(type, item, assign)
      
       type: the C type of item.
       item: the name of the item in the structure
       assign: what to assign to the item in the trace point callback
      
       raw_fmt is a way to pretty print the struct. It must match
        the order in which the items are added in TRACE_STRUCT
      
       An example of this would be:
      
       TRACE_EVENT_FORMAT(sched_wakeup,
      	TPPROTO(struct rq *rq, struct task_struct *p, int success),
      	TPARGS(rq, p, success),
      	TPFMT("task %s:%d %s",
      	      p->comm, p->pid, success?"succeeded":"failed"),
      	TRACE_STRUCT(
      		TRACE_FIELD(pid_t, pid, p->pid)
      		TRACE_FIELD(int, success, success)
      	),
      	TPRAWFMT("task %d success=%d")
      	);
      
       This creates a unique struct:
      
       struct {
      	pid_t		pid;
      	int		success;
       };
      
       And the callback would assign these values like this:
      
      	entry->pid = p->pid;
      	entry->success = success;
      
      The nice part about this is that the creation of the assignment is done
      via macro magic in the event tracer.  Once the TRACE_EVENT_FORMAT is
      created, the developer will then have a faster method to record
      into the ring buffer. They do not need to worry about the tracer itself.
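
      The "macro magic" can be illustrated in plain user space C. The
      following is only a sketch, not the event tracer's code: it writes
      the field list once and expands it twice, first into struct members
      and then into assignments. The names struct task,
      SCHED_WAKEUP_FIELDS, AS_MEMBER, AS_ASSIGN and record_sched_wakeup
      are hypothetical and exist only for this illustration.

```c
#include <sys/types.h>   /* pid_t */

/* Hypothetical stand-in for the kernel's task_struct; only the
 * field used here is modelled. */
struct task {
	pid_t pid;
};

/* The field list is written once, in the style of
 * TRACE_FIELD(type, item, assign). */
#define SCHED_WAKEUP_FIELDS(FIELD)		\
	FIELD(pid_t, pid, p->pid)		\
	FIELD(int, success, success)

/* Expansion 1: each field becomes a struct member. */
#define AS_MEMBER(type, item, assign)	type item;
struct sched_wakeup_entry {
	SCHED_WAKEUP_FIELDS(AS_MEMBER)
};

/* Expansion 2: each field becomes an assignment into the entry. */
#define AS_ASSIGN(type, item, assign)	entry->item = (assign);
static void record_sched_wakeup(struct sched_wakeup_entry *entry,
				const struct task *p, int success)
{
	SCHED_WAKEUP_FIELDS(AS_ASSIGN)
}
```

      Because both expansions come from the same list, the struct layout
      and the assignments cannot drift apart when a field is added.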
      
      The developer would only need to touch the files in include/trace/*.h
      
      Again, I would like to give special thanks to Tom Zanussi for this
      nice idea.
      
      Idea-from: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: add interface to write into current tracer buffer · ef5580d0
      Authored by Steven Rostedt
      Right now all tracers must manage their own trace buffers. This was
      to force tracers to be independent in case we finally decide to
      allow each tracer to have its own trace buffer.
      
      But now we are adding event tracing that writes to the current tracer's
      buffer. This adds an interface to allow events to write to the current
      tracer buffer without having to manage its own, since event tracing
      has no "tracer" and is just a way to hook into any other tracer.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: add subsystem sched for sched events · 3d7ba938
      Authored by Steven Rostedt
      Add the TRACE_SYSTEM sched for the sched events.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: add subsystem irq for irq events · 0ec2ef15
      Authored by Steven Rostedt
      Add the TRACE_SYSTEM irq for the irq events.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: make the set_event and available_events subsystem aware · b628b3e6
      Authored by Steven Rostedt
      This patch makes the event files, set_event and available_events
      aware of the subsystem.
      
      Now you can enable an entire subsystem with:
      
        echo 'irq:*' > set_event
      
      Note: the '*' is not needed.
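
      As a rough illustration of why the '*' is optional, here is a
      user space sketch of splitting a "system:event" token. It is not
      the kernel's parser; struct event_filter and parse_event_filter()
      are hypothetical names, and treating an empty event part as
      "every event in the subsystem" is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of splitting one set_event token. */
struct event_filter {
	char system[32];   /* empty string: no subsystem given */
	char event[32];    /* empty string: every event in the subsystem */
};

/*
 * Split "system:event" into its two parts. A missing event or a lone
 * '*' after the colon selects the whole subsystem, which is why the
 * '*' is optional. Returns false if a part does not fit its buffer.
 */
static bool parse_event_filter(const char *token, struct event_filter *out)
{
	const char *colon = strchr(token, ':');
	const char *event = colon ? colon + 1 : token;
	size_t sys_len = colon ? (size_t)(colon - token) : 0;

	if (sys_len >= sizeof(out->system) ||
	    strlen(event) >= sizeof(out->event))
		return false;

	memcpy(out->system, token, sys_len);
	out->system[sys_len] = '\0';

	if (strcmp(event, "*") == 0)
		event = "";
	strcpy(out->event, event);
	return true;
}
```

      With this sketch, 'irq:*' and 'irq:' both yield system "irq" and
      an empty event name.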
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: add subsystem level to trace events · 6ecc2d1c
      Authored by Steven Rostedt
      If a trace point header defines TRACE_SYSTEM, then it will add the
      following trace points into that event system.
      
      If include/trace/irq_event_types.h has:
      
       #define TRACE_SYSTEM irq
      
      at the top and
      
       #undef TRACE_SYSTEM
      
      at the bottom, then a directory "irq" will be created in the
      /debug/tracing/events directory. That directory will contain the
      two trace points that are defined in include/trace/irq_event_types.h.
      
      Only adding the above to irq and not to sched, we get:
      
       # ls /debug/tracing/events/
      irq                     sched_process_exit  sched_signal_send  sched_wakeup_new
      sched_kthread_stop      sched_process_fork  sched_switch
      sched_kthread_stop_ret  sched_process_free  sched_wait_task
      sched_migrate_task      sched_process_wait  sched_wakeup
      
       # ls /debug/tracing/events/irq
      irq_handler_entry  irq_handler_exit
      
      If we add #define TRACE_SYSTEM sched to the trace/sched_event_types.h
      then the rest of the trace events will be put in a sched directory
      within the events directory.
      
      I've been playing with this idea of the subsystem for a while, but
      recently Tom Zanussi posted some patches to lkml that included this
      method. Tom's approach was clean and got me to finally put some effort
      into cleaning up the event trace points.
      
      Thanks to Tom Zanussi for demonstrating how nice the subsystem
      method is.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: move trace point formats to files in include/trace directory · eb594e45
      Authored by Steven Rostedt
      Impact: clean up
      
      To further facilitate the ease of adding trace points for developers, this
      patch creates include/trace/trace_events.h and
      include/trace/trace_event_types.h.
      
      The former file will hold the trace/<type>.h files and the latter will hold
      the trace/<type>_event_types.h files.
      
      To create new tracepoints and to have them automatically
      appear in the event tracer, a developer makes the trace/<type>.h file
      which includes <linux/tracepoint.h> and the trace/<type>_event_types.h file.
      
      The trace/<type>_event_types.h file will hold the TRACE_FORMAT
      macros.
      
      Then add the trace/<type>.h file to trace/trace_events.h,
      and add the trace/<type>_event_types.h to the trace_event_types.h file.
      
      No need to modify files elsewhere.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
  2. 27 Feb, 2009 12 commits
  3. 26 Feb, 2009 5 commits
  4. 25 Feb, 2009 16 commits
    • tracing/core: make the read callbacks reentrants · d7350c3f
      Authored by Frederic Weisbecker
      Now that several per-cpu files can be read or spliced at the
      same time, we want the read/splice callbacks for tracing files to
      be reentrant.
      
      Until now, a single global mutex (trace_types_lock) serialized
      the access to tracing_read_pipe(), tracing_splice_read_pipe(),
      and the seq helpers.
      
      I.e., if a user tries to read trace_pipe0 and trace_pipe1 at the
      same time, the access to the function tracing_read_pipe() is
      contended and one reader must wait for the other to finish its
      read call.
      
      The trace_types_lock mutex is mostly here to serialize the access
      to the global current tracer (current_trace), which can be
      changed concurrently. Although the iter struct keeps a private
      pointer to this tracer, its callbacks can be changed by another
      function.
      
      The method used here is to no longer keep a private reference to
      the tracer inside the iterator but to make a copy of it inside
      the iterator. Then it checks on subsequent read calls whether the
      tracer has changed. This is not costly because the current
      tracer is not expected to be changed often, so we use a branch
      prediction hint for that.
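
      The copy-and-check idea can be sketched in user space C as below.
      This is a simplified illustration, not the patch itself: struct
      tracer is reduced to two fields, the copied_from member and
      iter_refresh_tracer() are hypothetical, and all locking around
      current_trace is left out.

```c
#include <stdbool.h>

/* Hypothetical, much reduced tracer: a name and one callback. */
struct tracer {
	const char *name;
	void (*print_line)(void);
};

/* Global current tracer; it may be switched by another task. */
static struct tracer *current_trace;

struct trace_iterator {
	struct tracer trace;                /* private copy, not a pointer */
	const struct tracer *copied_from;   /* which tracer the copy mirrors */
};

/*
 * Called at the start of each read. Re-copies the tracer only when the
 * global one has changed, which is expected to be rare, hence the
 * branch hint (a GCC/Clang builtin).
 */
static bool iter_refresh_tracer(struct trace_iterator *iter)
{
	if (__builtin_expect(iter->copied_from != current_trace, 0)) {
		iter->trace = *current_trace;
		iter->copied_from = current_trace;
		return true;   /* tracer was switched since the last read */
	}
	return false;
}
```

      Since the iterator works on its own copy, a reader does not follow
      a pointer whose callbacks may be swapped underneath it.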
      
      Moreover, we add a private mutex to the iterator (there is one
      iterator per file descriptor) to serialize the accesses in case
      of multiple consumers per file descriptor (which would be a
      silly idea from the user). Note that this is not to protect the
      ring buffer, since the ring buffer already serializes reader
      accesses. This is to prevent weird trace output in case of
      concurrent consumers. But these mutexes can be dropped
      anyway, that would not result in any crash. Just tell me what
      you think about it.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/core: introduce per cpu tracing files · b04cc6b1
      Authored by Frederic Weisbecker
      Impact: split up tracing output per cpu
      
      Currently, in the tracing debugfs directory, three files are
      available to let the user extract the trace output:
      
      - trace is an iterator through the ring-buffer. It's a reader
        but not a consumer. It doesn't block when no more traces are
        available.
      
      - latency_trace is pretty similar to the former, except that it
        adds more information such as preempt count, irq flag, ...
      
      - trace_pipe is a reader and a consumer, it will also block
        waiting for traces if necessary (heh, yes it's a pipe).
      
      The traces coming from different cpus are currently mixed up
      inside these files. Sometimes it messes up the information,
      sometimes it's useful, depending on what the tracer captures.
      
      The tracing_cpumask file is useful to filter the output and
      select only the traces captured on a custom-defined set of cpus.
      But it is still not powerful enough to extract one trace buffer
      per cpu at the same time.
      
      So this patch creates a new directory: /debug/tracing/per_cpu/.
      
      Inside this directory, you will now find one trace_pipe file and
      one trace file per cpu.
      
      Which means if you have two cpus, you will have:
      
       trace0
       trace1
       trace_pipe0
       trace_pipe1
      
      And of course, reading these files will have the same effect
      as with the usual tracing files, except that you will only see
      the traces from the given cpu.
      
      The original all-in-one cpu trace files are still available in
      their original place.
      
      Until now, only one consumer was allowed on trace_pipe to avoid
      racy consuming on the ring-buffer. Now the approach has changed a
      bit: you can have only one consumer per cpu.
      
      Which means you are allowed to read trace_pipe0 and trace_pipe1
      concurrently, but you can't have two readers on trace_pipe0 or
      on trace_pipe1.
      
      Following the same logic, if there is one reader on the common
      trace_pipe, you cannot have another reader on trace_pipe0 or
      trace_pipe1 at the same time, because trace_pipe is in essence
      already a consumer of all cpu buffers.
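
      These rules can be modelled with one bit per cpu. The sketch below
      is only an illustration of the exclusion logic in user space, not
      the kernel implementation; NR_CPUS is fixed at two here as an
      assumption, and claim_consumer() and release_consumer() are
      hypothetical helpers.

```c
#include <stdbool.h>

#define NR_CPUS   2                       /* assumption: two-cpu system */
#define ALL_CPUS  ((1u << NR_CPUS) - 1)   /* what the common trace_pipe consumes */

static unsigned int busy_cpus;   /* bit n set: cpu n's buffer has a consumer */

/* Try to become the consumer of the buffers in mask; fail if any is taken. */
static bool claim_consumer(unsigned int mask)
{
	if (busy_cpus & mask)
		return false;
	busy_cpus |= mask;
	return true;
}

/* Give the buffers in mask back, e.g. when the file is closed. */
static void release_consumer(unsigned int mask)
{
	busy_cpus &= ~mask;
}
```

      Opening trace_pipe0 corresponds to claim_consumer(1u << 0), while
      the common trace_pipe corresponds to claim_consumer(ALL_CPUS), so
      it conflicts with any per-cpu reader.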
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Merge branch 'tip/tracing/ftrace' of... · 2b1b858f
      Authored by Ingo Molnar
      Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
    • tracing: remove /debug/tracing/latency_trace · 886b5b73
      Authored by Ingo Molnar
      Impact: remove old debug/tracing API
      
      /debug/tracing/latency_trace is an old legacy format we kept from
      the old latency tracer. Remove the file for now. If there's any
      useful bit missing then we'll propagate any useful output bits into
      the /debug/tracing/trace output.
      Reported-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/hw-branch-tracing: convert bts-tracer mutex to a spinlock · 2d542cf3
      Authored by Ingo Molnar
      Impact: fix CPU hotplug lockup
      
      bts_hotcpu_handler() is called with irqs disabled, so using mutex_lock()
      is a no-no.
      
      All the BTS codepaths here are atomic (they do not schedule), so using
      a spinlock is the right solution.
      
      Cc: Markus Metzger <markus.t.metzger@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: make event directory structure · 1473e441
      Authored by Steven Rostedt
      This patch adds the directory /debug/tracing/events/ that will contain
      all the registered trace points.
      
       # ls /debug/tracing/events/
      sched_kthread_stop      sched_process_fork  sched_switch
      sched_kthread_stop_ret  sched_process_free  sched_wait_task
      sched_migrate_task      sched_process_wait  sched_wakeup
      sched_process_exit      sched_signal_send   sched_wakeup_new
      
       # ls /debug/tracing/events/sched_switch/
      enable
      
       # cat /debug/tracing/events/sched_switch/enable
      1
      
       # cat /debug/tracing/set_event
      sched_switch
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: add schedule events to event trace · f3fe8e4a
      Authored by Steven Rostedt
      This patch changes the trace/sched.h to use the DECLARE_TRACE_FMT
      such that they are automatically registered with the event tracer.
      
      And it also adds the tracing sched headers to kernel/trace/events.c
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: add event trace infrastructure · b77e38aa
      Authored by Steven Rostedt
      This patch creates the event tracing infrastructure of ftrace.
      It will create the files:
      
       /debug/tracing/available_events
       /debug/tracing/set_event
      
      The available_events will list the trace points that have been
      registered with the event tracer.
      
      set_event will allow the user to enable or disable an event hook.
      
      example:
      
       # echo sched_wakeup > /debug/tracing/set_event
      
      Will enable the sched_wakeup event (if it is registered).
      
       # echo "!sched_wakeup" >> /debug/tracing/set_event
      
      Will disable the sched_wakeup event (and only that event).
      
       # echo > /debug/tracing/set_event
      
      Will disable all events (notice the '>')
      
       # cat /debug/tracing/available_events > /debug/tracing/set_event
      
      Will enable all registered event hooks.
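
      The enable/disable syntax can be sketched as follows. This is a
      user space illustration under stated assumptions, not the ftrace
      code: the events table and apply_set_event_token() are
      hypothetical, and the "write with '>' clears everything first"
      behaviour is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of registered events. */
struct event {
	const char *name;
	bool enabled;
};

static struct event events[] = {
	{ "sched_switch", false },
	{ "sched_wakeup", false },
};

/*
 * Apply one token written to set_event: "name" enables the event,
 * "!name" disables it. Returns false if no registered event has
 * that name.
 */
static bool apply_set_event_token(const char *token)
{
	bool enable = true;
	size_t i;

	if (token[0] == '!') {
		enable = false;
		token++;
	}
	for (i = 0; i < sizeof(events) / sizeof(events[0]); i++) {
		if (strcmp(events[i].name, token) == 0) {
			events[i].enabled = enable;
			return true;
		}
	}
	return false;
}
```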
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: add DEFINE_TRACE_FMT to tracepoint.h · 7c37730c
      Authored by Steven Rostedt
      This patch creates a DEFINE_TRACE_FMT to map to DECLARE_TRACE.
      This allows developers to place format strings and
      args in with their tracepoint declaration. A tracer may now
      override the DEFINE_TRACE_FMT macro and use it to record
      a default format.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc · 694593e3
      Authored by Linus Torvalds
      * 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc:
        proc: fix PG_locked reporting in /proc/kpageflags
    • Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6 · 21209b61
      Authored by Linus Torvalds
      * 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
        Add i2c_board_info for RiscPC PCF8583
        i2c: Make sure i2c_algo_bit_data.timeout is HZ-independent
        i2c-dev: Clarify the unit of ioctl I2C_TIMEOUT
        i2c: Timeouts reach -1
        i2c: Fix misplaced parentheses
    • Merge branch 'firedtv-merge' of... · a792cd12
      Authored by Linus Torvalds
      Merge branch 'firedtv-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
      
      * 'firedtv-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
        firedtv: dvb_frontend_info for FireDTV S2, fix "frequency limits undefined" error
        firedtv: massive refactoring
        firedtv: rename files, variables, functions from firesat to firedtv
        firedtv: Use DEFINE_SPINLOCK
        firedtv: fix registration - adapter number could only be zero
        firedtv: use length_field() of PMT as length
        firedtv: fix returned struct for ca_info
        firedtv: cleanups and minor fixes
        ieee1394: remove superfluous assertions
        ieee1394: inherit ud vendor_id from node vendor_id
        ieee1394: add hpsb_node_read() and hpsb_node_lock()
        ieee1394: use correct barrier types between accesses of nodeid and generation
        firesat: copyrights, rename to firedtv, API conversions, fix remote control input
        firesat: avc resend
        firesat: update isochronous interface, add CI support
        firesat: add DVB-S support for DVB-S2 devices
        firesat: fix DVB-S2 device recognition
        DVB: add firesat driver
    • Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 4daa0682
      Authored by Linus Torvalds
      * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: Fix deadlock in ext4_write_begin() and ext4_da_write_begin()
        ext4: Add fallback for find_group_flex
    • Add i2c_board_info for RiscPC PCF8583 · 531660ef
      Authored by Russell King
      Add the necessary i2c_board_info structure to fix the lack of PCF8583
      RTC on RiscPC.
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
    • i2c: Make sure i2c_algo_bit_data.timeout is HZ-independent · 082a4cf8
      Authored by Jean Delvare
      i2c_algo_bit_data.timeout is supposed to be in jiffies, so drivers
      should set this value in terms of HZ.
      
      Ultimately I think this field should be discarded in favor of
      i2c_adapter.timeout, but that's left for a future patch.
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Acked-by: Lennert Buytenhek <kernel@wantstofly.org>
      Acked-by: Len Sorensen <lsorense@csclub.uwaterloo.ca>
    • i2c-dev: Clarify the unit of ioctl I2C_TIMEOUT · cd97f39b
      Authored by Jean Delvare
      The unit in which user-space can set the bus timeout value is jiffies
      for historical reasons (back when HZ was always 100.) This is however
      not good because user-space doesn't know how long a jiffy lasts. The
      timeout value should instead be set in a fixed time unit. Given the
      original value of HZ, this unit should be 10 ms, for compatibility.
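
      The arithmetic behind the compatibility argument can be sketched
      as below. This is a user space illustration only;
      i2c_timeout_to_jiffies() is a hypothetical helper, rounding up is
      a choice of this sketch, and overflow for very large arguments is
      not handled.

```c
/*
 * Convert an I2C_TIMEOUT argument, given in units of 10 ms, into
 * jiffies for a tick rate of hz ticks per second. The result is
 * rounded up so that a non-zero timeout never becomes zero jiffies.
 */
static unsigned long i2c_timeout_to_jiffies(unsigned long arg,
					    unsigned long hz)
{
	unsigned long ms = arg * 10;

	return (ms * hz + 999) / 1000;
}
```

      With hz = 100, one unit of 10 ms is exactly one jiffy, so existing
      callers that assumed jiffies at HZ=100 keep their timing.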
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Acked-by: Wolfram Sang <w.sang@pengutronix.de>