1. 07 May, 2009 6 commits
    • tracing: reset ring buffer when removing modules with events · 9456f0fa
      Authored by Steven Rostedt
      Li Zefan found that there's a race using the event ids of events and
      modules. When a module is loaded, an event id is incremented. We only
      have 16 bits for event ids (65536) and there is a possible (but highly
      unlikely) race that we could load and unload a module that registers
      events so many times that the event id counter overflows.
      
      When it overflows, it then restarts and goes looking for available
      ids. An id is available if it was added by a module and released.
      
      The race occurs when one module adds an id and is then removed.
      Another module loaded can use that same event id. But if the old module
      still had events in the ring buffer, the new module's call back would
      get bogus data.  At best (and most likely) the output would just be
      garbage. But if the module for some reason used pointers (not recommended)
      then this could potentially crash.
      
      The safest thing to do is just reset the ring buffer if a module that
      registered events is removed.
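
      The id reuse described above can be sketched in userspace C. This is
      only an illustration under assumptions: the names event_ids,
      event_id_alloc() and event_id_free() are hypothetical and are not the
      kernel's actual allocator. It shows how an id released by one module is
      handed to the next one once the 16-bit space has been exhausted.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_EVENT_IDS 65536u	/* 16-bit id space, as in the commit message */

/* Hypothetical allocator state (not the kernel's data structure). */
struct event_ids {
	bool in_use[MAX_EVENT_IDS];
	uint32_t next;	/* next never-used id; MAX_EVENT_IDS once exhausted */
};

/* Return a free id, or -1 if every id is currently taken. */
static int event_id_alloc(struct event_ids *ids)
{
	uint32_t i;

	if (ids->next < MAX_EVENT_IDS) {
		ids->in_use[ids->next] = true;
		return (int)ids->next++;
	}

	/* Counter overflowed: look for an id released by an unloaded module. */
	for (i = 0; i < MAX_EVENT_IDS; i++) {
		if (!ids->in_use[i]) {
			ids->in_use[i] = true;
			return (int)i;
		}
	}
	return -1;
}

/* Release an id when the module that registered it is removed. */
static void event_id_free(struct event_ids *ids, int id)
{
	ids->in_use[id] = false;
}
```

      If records tagged with the released id are still in the ring buffer, the
      new owner of that id would be asked to decode them, which is why the
      buffer is reset on module removal.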
      
      [ Impact: prevent unpredictable results of event id overflows ]
      Reported-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <49FEAFD0.30106@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: update sample with TRACE_INCLUDE_FILE · 71e1c8ac
      Authored by Steven Rostedt
      When creating trace events for ftrace, the header file with the TRACE_EVENT
      macros must also have a macro called TRACE_SYSTEM. This macro describes
      the name of the system the TRACE_EVENTS are defined for. It also doubles
      as a way for the define_trace.h file to include the file that included
      it.
      
      For example:
      
      in irq.h
      
       #define TRACE_SYSTEM irq
      
      [...]
      
       #include <trace/define_trace.h>
      
      The define_trace will use TRACE_SYSTEM to include irq.h. But if the name
      of the trace system does not match the name of the trace header file,
      one can override it with:

       #define TRACE_INCLUDE_FILE foo_trace

      Which will change define_trace.h to include foo_trace.h instead of foo.h.
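
      The fallback from TRACE_SYSTEM to TRACE_INCLUDE_FILE can be mimicked in
      plain C. This is a userspace sketch, not the contents of define_trace.h;
      the STRINGIFY helpers and the trace_include_name() and
      trace_include_is() functions are hypothetical names used only for
      illustration.

```c
#include <string.h>

/* Hypothetical trace system "foo" whose header is named foo_trace.h. */
#define TRACE_SYSTEM foo
#define TRACE_INCLUDE_FILE foo_trace

/* Sketch of the fallback: without an override, use the system name. */
#ifndef TRACE_INCLUDE_FILE
#define TRACE_INCLUDE_FILE TRACE_SYSTEM
#endif

/* Two levels so the macro argument is expanded before it is quoted. */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x) STRINGIFY_1(x)

/* Name of the header that would be included again for this system. */
static const char *trace_include_name(void)
{
	return STRINGIFY(TRACE_INCLUDE_FILE) ".h";
}

/* Return 1 if the selected header name equals the expected one. */
static int trace_include_is(const char *expected)
{
	return strcmp(trace_include_name(), expected) == 0;
}
```

      Removing the TRACE_INCLUDE_FILE define in this sketch makes the selected
      name fall back to foo.h.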
      
      The sample comments this, but people that use the sample code will more
      likely use the code and not read the comments. This patch changes the
      sample code to use the TRACE_INCLUDE_FILE to better show developers how to
      use it.
      
      [ Impact: make sample less confusing to developers ]
      Reported-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: change test to be more latency friendly · 3e07a4f6
      Authored by Steven Rostedt
      The ring buffer benchmark/test runs a producer for 10 seconds.
      This is done with preemption and interrupts enabled. But if the kernel
      is not compiled with CONFIG_PREEMPT, it basically stops everything
      but interrupts for 10 seconds.
      
      Although this is just a test and is not for production, this attribute
      can be quite annoying. It can also spawn badness elsewhere.
      
      This patch solves the issues by calling "cond_resched" when the system
      is not compiled with CONFIG_PREEMPT. It also keeps track of the time
      spent to call cond_resched such that it does not go against the
      time calculations. That is, if the task schedules away, the time scheduled
      out is removed from the test data. Note, this only works for non PREEMPT
      because we do not know when the task is scheduled out if we have PREEMPT
      enabled.
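
      The time accounting can be illustrated with a small userspace sketch. It
      assumes injectable clock and yield hooks; bench_clock, bench_time,
      bench_yield() and the fake clock are hypothetical names, and the real
      test calls cond_resched() in the kernel instead.

```c
#include <stdint.h>

/* Hypothetical hooks so the sketch can run without a scheduler. */
struct bench_clock {
	uint64_t (*now_usecs)(void);
	void (*yield)(void);
};

struct bench_time {
	uint64_t start;		/* when the run started */
	uint64_t scheduled_out;	/* total time spent inside yield() */
};

/* Yield the CPU and remember how long we were away. */
static void bench_yield(struct bench_time *t, const struct bench_clock *clk)
{
	uint64_t before = clk->now_usecs();

	clk->yield();
	t->scheduled_out += clk->now_usecs() - before;
}

/* Run time with the scheduled-out time removed. */
static uint64_t bench_elapsed(const struct bench_time *t,
			      const struct bench_clock *clk)
{
	return clk->now_usecs() - t->start - t->scheduled_out;
}

/* Fake clock for demonstration: every yield "costs" 300 usecs. */
static uint64_t fake_now;

static uint64_t fake_now_usecs(void)
{
	return fake_now;
}

static void fake_yield(void)
{
	fake_now += 300;
}
```

      With the fake clock, 700 usecs of work interrupted by one yield is
      reported as 700 usecs, not 1000.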
      
      [ Impact: prevent test from stopping the world for 10 seconds ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: make moving the tail page a separate function · 6634ff26
      Authored by Steven Rostedt
      Ingo Molnar thought the code would be cleaner if we used a function call
      instead of a goto for moving the tail page. After implementing this,
      it seems that gcc still inlines the result and the output is pretty much
      the same. Since this is considered a cleaner approach, might as well
      implement it.
      
      [ Impact: code clean up ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: check for failed allocation in ring buffer benchmark · 00c81a58
      Authored by Steven Rostedt
      The ring buffer benchmark allocates a ring buffer read page but does
      not check the return value to see if a page was actually allocated.
      This patch fixes that.
      
      [ Impact: avoid NULL dereference ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: remove unneeded conditional in rb_reserve_next · 8e7abf1c
      Authored by Steven Rostedt
      The code in __rb_reserve_next checks on page overflow if it is the
      original committer and then resets the page back to the original
      setting.  Although this is fine, and the code is correct, it is
      a bit fragile. Some experimental work I did breaks it easily.

      The better and more robust solution is to have all committers that
      overflow the page, simply subtract what they added.
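
      A minimal userspace sketch of "subtract what you added", assuming C11
      atomics in place of the kernel's own counters. PAGE_BYTES, page_state
      and reserve() are hypothetical names, and the page is deliberately tiny.

```c
#include <stdatomic.h>

#define PAGE_BYTES 64	/* deliberately tiny page for illustration */

struct page_state {
	atomic_long write;	/* bytes reserved on this page so far */
};

/* Reserve len bytes; return the start offset, or -1 on page overflow. */
static long reserve(struct page_state *page, long len)
{
	long tail = atomic_fetch_add(&page->write, len) + len;

	if (tail > PAGE_BYTES) {
		/* Every overflowing writer just undoes its own addition. */
		atomic_fetch_sub(&page->write, len);
		return -1;
	}
	return tail - len;
}
```

      No writer needs to know whether it was the first to overflow, so the
      counter ends up correct regardless of how the writers interleave.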
      
      [ Impact: more robust ring buffer account management ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 06 May, 2009 20 commits
    • tracing: small trace_events sample Makefile cleanup · 35cf723e
      Authored by Christoph Hellwig
      Use -I$(src) to add the current directory to the include path.
      
      [ Impact: cleanup ]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: trace_output.c, fix false positive compiler warning · 48dd0fed
      Authored by Jaswinder Singh Rajput
      This compiler warning:
      
        CC      kernel/trace/trace_output.o
       kernel/trace/trace_output.c: In function ‘register_ftrace_event’:
       kernel/trace/trace_output.c:544: warning: ‘list’ may be used uninitialized in this function
      
      Is wrong as 'list' is always initialized - but GCC (4.3.2) does not
      recognize this relationship properly.
      
      Work around the warning by initializing the variable to NULL.
      
      [ Impact: fix false positive compiler warning ]
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • blktrace: from-sector redundant in trace_block_remap · 22a7c31a
      Authored by Alan D. Brunelle
      Remove redundant from-sector parameter: it's /always/ the bio's sector
      passed in.
      
      [ Impact: cleanup ]
      Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
      Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <49FF517C.7000503@hp.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • blktrace: correct remap names · a42aaa3b
      Authored by Alan D. Brunelle
      This attempts to clarify names utilized during block I/O remap
      operations (partition, volume manager). It correctly matches up the
      /from/ information for both device & sector. This takes in the concept
      from Kosaki Motohiro and extends it to include better naming for the
      "device_from" field.
      
      [ Impact: cleanup ]
      Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
      Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <49FF4FAE.3000301@hp.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracepoint: trace_sched_migrate_task(): remove parameter · de1d7286
      Authored by Mathieu Desnoyers
      The orig_cpu parameter in trace_sched_migrate_task() is not necessary;
      it can be obtained by calling task_cpu(p) in the probe.
      
      [ Impact: micro-optimization ]
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      [ modified from Mathieu's patch. The original patch is at:
        http://marc.info/?l=linux-kernel&m=123791201716239&w=2 ]
      Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: fweisbec@gmail.com
      Cc: rostedt@goodmis.org
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: zhaolei@cn.fujitsu.com
      Cc: laijs@cn.fujitsu.com
      LKML-Reference: <49FFFDB7.1050402@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/events: fix concurrent access to ftrace_events list · 20c8928a
      Authored by Li Zefan
      A module will add/remove its trace events when it gets loaded/unloaded, so
      the ftrace_events list is not "const", and concurrent access needs to be
      protected.
      
      This patch thus fixes races between loading/unloading modules and reading
      'available_events' or reading/writing 'set_event', etc.
      
      Below shows how to reproduce the race:
      
       # for ((; ;)) { cat /mnt/tracing/available_events; } > /dev/null &
       # for ((; ;)) { insmod trace-events-sample.ko; rmmod sample; } &
      
      After a while:
      
      BUG: unable to handle kernel paging request at 0010011c
      IP: [<c1080f27>] t_next+0x1b/0x2d
      ...
      Call Trace:
       [<c10c90e6>] ? seq_read+0x217/0x30d
       [<c10c8ecf>] ? seq_read+0x0/0x30d
       [<c10b4c19>] ? vfs_read+0x8f/0x136
       [<c10b4fc3>] ? sys_read+0x40/0x65
       [<c1002a68>] ? sysenter_do_call+0x12/0x36
      
      [ Impact: fix races when concurrently accessing ftrace_events list ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4A00F709.3080800@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/events: fix memory leak when unloading module · 2df75e41
      Authored by Li Zefan
      When unloading a module, memory allocated by init_preds() and
      trace_define_field() is not freed.
      
      [ Impact: fix memory leak ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4A00F6E0.3040503@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/events: make SAMPLE_TRACE_EVENTS default to n · 96d17980
      Authored by Li Zefan
      Normally a config option should default to n. This patch also makes the
      sample module-only, like SAMPLE_MARKERS and SAMPLE_TRACEPOINTS.
      
      [ Impact: don't build trace event sample by default ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A00F6C0.8090803@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/events: don't say hi when loading the trace event sample · fd6da10a
      Authored by Li Zefan
      The sample is useful for testing, and I'm using it. But after
      loading the module, it keeps saying hi every 10 seconds, which may
      be disturbing.
      
      Also Steven said commenting out the "hi" helped in causing races. :)
      
      [ Impact: make testing a bit easier ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A00F6AD.2070008@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring-buffer: add benchmark and tester · 5092dbc9
      Authored by Steven Rostedt
      This patch adds code that can benchmark the ring buffer as well as
      test it. This code can be compiled into the kernel (not recommended)
      or as a module.
      
      A separate ring buffer is used so as not to interfere with other users, like
      ftrace. It creates a producer and a consumer (option to disable creation
      of the consumer) and will run for 10 seconds, then sleep for 10 seconds
      and then repeat.
      
      While running, the producer will write 10 byte loads into the ring
      buffer, putting in just the current CPU number. The reader will
      continually try to read the buffer. The reader will alternate between
      reading the buffer event by event and reading it by full pages.
      
      The output is a pr_info, thus it will fill up the syslogs.
      
        Starting ring buffer hammer
        End ring buffer hammer
        Time:     9000349 (usecs)
        Overruns: 12578640
        Read:     5358440  (by events)
        Entries:  0
        Total:    17937080
        Missed:   0
        Hit:      17937080
        Entries per millisec: 1993
        501 ns per entry
        Sleeping for 10 secs
        Starting ring buffer hammer
        End ring buffer hammer
        Time:     9936350 (usecs)
        Overruns: 0
        Read:     28146644  (by pages)
        Entries:  74
        Total:    28146718
        Missed:   0
        Hit:      28146718
        Entries per millisec: 2832
        353 ns per entry
        Sleeping for 10 secs
      
      Time:      is the time the test ran
      Overruns:  the number of events that were overwritten and not read
      Read:      the number of events read (either by pages or events)
      Entries:   the number of entries left in the buffer
                       (the by-pages reader only reads full pages)
      Total:     Entries + Read + Overruns
      Missed:    the number of entries that failed to write
      Hit:       the number of entries that were written
      
      The above example shows that it takes ~353 nanosecs per entry when
      there is a reader, reading by pages (and no overruns)
      
      The event by event reader slowed the producer down to 501 nanosecs.
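
      The derived figures in the sample output can be reproduced with integer
      arithmetic. The sketch below is an assumption about the order of
      operations (time truncated to whole milliseconds first, then two integer
      divisions); it matches the numbers shown above, but the function and
      struct names are hypothetical.

```c
#include <stdint.h>

/* The raw counters printed by one benchmark run. */
struct bench_result {
	uint64_t time_usecs;	/* "Time" */
	uint64_t overruns;	/* "Overruns" */
	uint64_t read;		/* "Read" */
	uint64_t entries;	/* "Entries" */
};

/* "Total" is Entries + Read + Overruns. */
static uint64_t bench_total(const struct bench_result *r)
{
	return r->entries + r->read + r->overruns;
}

/*
 * "Entries per millisec": the time is truncated to whole milliseconds
 * before dividing. Assumes the run lasted at least one millisecond.
 */
static uint64_t entries_per_msec(uint64_t hit, uint64_t time_usecs)
{
	return hit / (time_usecs / 1000);
}

/* "ns per entry": one millisecond in nanoseconds over the rate above. */
static uint64_t nsecs_per_entry(uint64_t per_msec)
{
	return 1000000 / per_msec;
}
```

      For the first run, 17937080 hits over 9000 whole milliseconds gives 1993
      entries per millisecond and 501 ns per entry; the second run gives 2832
      and 353.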
      
      [ Impact: see how changes to the ring buffer affect stability and performance ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: move big if statement down · aa20ae84
      Authored by Steven Rostedt
      In the hot path of the ring buffer "__rb_reserve_next" there's a big
      if statement that does not even return back to the work flow.
      
      	code;
      
      	if (cross to next page) {
      
      		[ lots of code ]
      
      		return;
      	}
      
      	more code;
      
      The condition is even the unlikely path, although we do not denote it
      with an unlikely because gcc is fine with it. The condition is true when
      the write crosses a page boundary, and we need to start at a new page.
      
      Having this if statement makes it hard to read, but calling another
      function to do the work is also not appropriate, because we are using a lot
      of variables that were set before the if statement, and we do not want to
      send them as parameters.
      
      This patch changes it to a goto:
      
      	code;
      
      	if (cross to next page)
      		goto next_page;
      
      	more code;
      
      	return;
      
      next_page:
      
      	[ lots of code]
      
      This makes the code easier to understand, and a bit more obvious.
      
      The output from gcc is practically identical. For some reason, gcc decided
      to use different registers when I switched it to a goto. But other than that,
      the logic is the same.
      
      [ Impact: easier to read code ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: use proper export symbol for tracing api · 94487d6d
      Authored by Steven Rostedt
      When adding the EXPORT_SYMBOL to some of the tracing API, I accidentally
      used EXPORT_SYMBOL instead of EXPORT_SYMBOL_GPL. This patch fixes
      that mistake.
      
      [ Impact: export the tracing code only for GPL modules ]
      Reported-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: use .sched.text, not .text.sched in recordmcount.pl · 31b6e76e
      Authored by Tim Abbott
      The only references in the kernel to the .text.sched section are in
      recordmcount.pl.  Since the code it has is intended to be example code
      it should refer to real kernel sections.  So change it to .sched.text
      instead.
      
      [ Impact: consistency in comments ]
      Signed-off-by: Tim Abbott <tabbott@mit.edu>
      LKML-Reference: <1241136371-10768-1-git-send-email-tabbott@mit.edu>
      Acked-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: disable writers when resetting buffers · 41ede23e
      Authored by Steven Rostedt
      As a precaution, it is best to disable writing to the ring buffers
      when resetting them.
      
      [ Impact: prevent weird things if write happens during reset ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: have read page swap increment counter with page entries · afbab76a
      Authored by Steven Rostedt
      In the swap page ring buffer code that is used by the ftrace splice code,
      we scan the page to increment the counter of entries read.
      
      With the number of entries already in the page we simply need to add it.
      
      [ Impact: speed up reading page from ring buffer ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: record page entries in buffer page descriptor · 778c55d4
      Authored by Steven Rostedt
      Currently, when the ring buffer writer overflows the buffer and must
      write over non consumed data, we increment the overrun counter by
      reading the entries on the page we are about to overwrite. This reads
      the entries one by one.
      
      This is not very efficient. This patch adds another entry counter
      into each buffer page descriptor that keeps track of the number of
      entries on the page. Now on overwrite, the overrun counter simply
      needs to add the number of entries that is on the page it is about
      to overwrite.
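
      A simplified sketch of the per-page counter, assuming hypothetical
      buffer_page and cpu_buffer structures that hold only the fields needed
      here (the kernel's descriptors carry much more state).

```c
/* Hypothetical page descriptor with just the new counter. */
struct buffer_page {
	unsigned long entries;	/* number of events currently on this page */
};

/* Hypothetical per-cpu state with just the overrun counter. */
struct cpu_buffer {
	unsigned long overrun;	/* events overwritten before being read */
};

/* Called when the writer places one more event on the page. */
static void page_add_event(struct buffer_page *page)
{
	page->entries++;
}

/* Called when the writer is about to reuse a page that was never read. */
static void overwrite_page(struct cpu_buffer *cpu, struct buffer_page *page)
{
	/* One addition replaces walking the page event by event. */
	cpu->overrun += page->entries;
	page->entries = 0;
}
```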
      
      [ Impact: speed up of ring buffer in overwrite mode ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: convert cpu buffer entries to local_t · e4906eff
      Authored by Steven Rostedt
      The entries counter in cpu buffer is not atomic. It can be updated by
      other interrupts or from another CPU (readers).
      
      But making entries into "atomic_t" causes an atomic operation that can
      hurt performance. Instead we convert it to a local_t that will increment
      a counter with a local CPU atomic operation (if the arch supports it).
      
      Instead of fighting with readers and overwrites that decrement the counter,
      I added a "read" counter. Every time a reader reads an entry it is
      incremented.
      
      We already have an overrun counter and with that, the entries counter and
      the read counter, we can calculate the total number of entries in the
      buffer with:
      
        (entries - overrun) - read
      
      As long as the total number of entries in the ring buffer is less than
      the word size, this will work. But since the entries counter was previously
      a long, this is no different than what we had before.
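
      The formula can be written as a small helper. This is a sketch with a
      hypothetical cpu_counters struct, using plain unsigned long fields where
      the patch uses per-cpu counter types.

```c
/* Hypothetical container for the three monotonically growing counters. */
struct cpu_counters {
	unsigned long entries;	/* events ever written */
	unsigned long overrun;	/* events overwritten before being read */
	unsigned long read;	/* events consumed by readers */
};

/* Events currently in the buffer: (entries - overrun) - read. */
static unsigned long entries_in_buffer(const struct cpu_counters *c)
{
	return (c->entries - c->overrun) - c->read;
}
```

      Because unsigned arithmetic is modular, the result stays correct even
      after the entries counter wraps, as long as the true number of entries
      in the buffer fits in an unsigned long.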
      
      Thanks to Andrew Morton for pointing out in the first version that
      atomic_t does not replace unsigned long. I switched to atomic_long_t
      even though it is signed. A negative count is most likely a bug.
      
      [ Impact: keep accurate count of cpu buffer entries ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: export stats of ring buffers to userspace · c8d77183
      Authored by Steven Rostedt
      This patch adds stats to the ftrace ring buffers:
      
       # cat /debugfs/tracing/per_cpu/cpu0/stats
       entries: 42360
       overrun: 30509326
       commit overrun: 0
       nmi dropped: 0
      
      Where entries are the total number of data entries in the buffer.
      
      overrun is the number of entries that were not consumed and were
      overwritten by the writer.
      
      commit overrun is the number of entries dropped due to nested writers
      wrapping the buffer before the initial writer finished the commit.
      
      nmi dropped is the number of entries dropped due to the ring buffer
      lock being held when an nmi was going to write to the ring buffer.
      Note, this field will be meaningless and will go away when the ring
      buffer becomes lockless.
      
      [ Impact: let userspace know what is happening in the ring buffers ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: add counters for commit overrun and nmi dropped entries · f0d2c681
      Authored by Steven Rostedt
      The WARN_ON in the ring buffer when a commit is preempted and the
      buffer is filled by preceding writes can happen in normal operations.
      The WARN_ON makes it look like a bug, not to mention, because
      it does not stop tracing and calls printk which can also recurse, this
      is prone to deadlock (the WARN_ON is not in a position to recurse).
      
      This patch removes the WARN_ON and replaces it with a counter that
      can be retrieved by a tracer. This counter is called commit_overrun.
      
      While at it, I added a nmi_dropped counter to count any time an NMI entry
      is dropped because the NMI could not take the spinlock.
      
      [ Impact: prevent deadlock caused by printing a warning in the normal case ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: export symbols · d6ce96da
      Authored by Steven Rostedt
      I'm adding a module to do a series of tests on the ring buffer as well
      as benchmarks. This module needs to have more of the ring buffer API
      exported. There's nothing wrong with reading the ring buffer from a
      module.
      
      [ Impact: allow modules to read pages from the ring buffer ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  3. 01 May, 2009 3 commits
  4. 29 Apr, 2009 11 commits
    • tracing: fix build failure on s390 · a0e39ed3
      Authored by Heiko Carstens
      "tracing: create automated trace defines" causes this compile error on s390,
      as reported by Sachin Sant against linux-next:
      
       kernel/built-in.o: In function `__do_softirq':
       (.text+0x1c680): undefined reference to `__tracepoint_softirq_entry'
      
      This happens because the definitions of the softirq tracepoints were moved
      from kernel/softirq.c to kernel/irq/handle.c. Since s390 doesn't support
      generic hardirqs, handle.c doesn't get compiled and the definitions are
      missing.
      
      So move the tracepoints to softirq.c again.
      
      [ Impact: fix build failure on s390 ]
      Reported-by: Sachin Sant <sachinp@in.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      LKML-Reference: <20090429135139.5fac79b8@osiris.boeblingen.de.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/filters: a better event parser · 8b372562
      Authored by Tom Zanussi
      Replace the current event parser hack with a better one.  Filters are
      no longer specified predicate by predicate, but all at once and can
      use parens and any of the following operators:
      
      numeric fields:
      
      ==, !=, <, <=, >, >=
      
      string fields:
      
      ==, !=
      
      predicates can be combined with the logical operators:
      
      &&, ||
      
      examples:
      
      "common_preempt_count > 4" > filter
      
      "((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter
      
      If there was an error, the erroneous string along with an error
      message can be seen by looking at the filter e.g.:
      
      ((sig >= 10 && sig < 15) || dsig == 17) && comm != bash
      ^
      parse_error: Field not found
      
      Currently the caret for an error always appears at the beginning of
      the filter; a real position should be used, but the error message
      should be useful even without it.
      
      To clear a filter, '0' can be written to the filter file.
      
      Filters can also be set or cleared for a complete subsystem by writing
      the same filter as would be written to an individual event to the
      filter file at the root of the subsystem.  Note however, that if any
      event in the subsystem lacks a field specified in the filter being
      set, the set will fail and all filters in the subsystem are
      automatically cleared.  This change from the previous version was made
      because using only the fields that happen to exist for a given event
      would most likely result in a meaningless filter.
      
      Because the logical operators are now implemented as predicates, the
      maximum number of predicates in a filter was increased from 8 to 16.
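
      To make the semantics of the second example filter concrete, here it is
      written out as an ordinary C predicate. This is not how the parser
      evaluates filters internally; sig_event and filter_match() are
      hypothetical names used only to show which events would pass.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical event record with only the two fields the filter uses. */
struct sig_event {
	int sig;
	const char *comm;
};

/* ((sig >= 10 && sig < 15) || sig == 17) && comm != bash */
static bool filter_match(const struct sig_event *ev)
{
	/* Numeric field: range or equality comparisons. */
	bool sig_ok = (ev->sig >= 10 && ev->sig < 15) || ev->sig == 17;
	/* String field: only == and != are available. */
	bool comm_ok = strcmp(ev->comm, "bash") != 0;

	return sig_ok && comm_ok;
}
```

      An event with sig 12 from "sshd" passes, while sig 15 from "sshd" and
      sig 12 from "bash" are both filtered out.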
      
      [ Impact: add new, extended trace-filter implementation ]
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905899.6416.121.camel@tropicana>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/filters: distinguish between signed and unsigned fields · a118e4d1
      Authored by Tom Zanussi
      The new filter comparison ops need to be able to distinguish between
      signed and unsigned field types, so add an is_signed flag/param to the
      event field struct/trace_define_fields().  Also define a simple macro,
      is_signed_type() to determine the signedness at compile time, used in the
      trace macros.  If the is_signed_type() macro won't work with a specific
      type, a new slightly modified version of TRACE_FIELD() called
      TRACE_FIELD_SIGN(), allows the signedness to be set explicitly.
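
      One common way to write such a compile-time signedness test is shown
      below. The exact definition in the patch may differ; the field_info
      struct and FIELD_INFO() macro are hypothetical stand-ins for the event
      field bookkeeping.

```c
/*
 * Cast -1 to the type: it stays below 1 for signed types and becomes
 * the largest value for unsigned types.
 */
#define is_signed_type(type) (((type)(-1)) < (type)1)

/* Hypothetical field description carrying the signedness flag. */
struct field_info {
	const char *name;
	int is_signed;
};

/* Record a field name together with the signedness of its type. */
#define FIELD_INFO(type, item) { #item, is_signed_type(type) }

static const struct field_info example_fields[] = {
	FIELD_INFO(int, sig),
	FIELD_INFO(unsigned long, ip),
};
```

      The macro is a constant expression, so it can be used in static
      initializers such as the table above.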
      
      [ Impact: extend trace-filter code for new feature ]
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905893.6416.120.camel@tropicana>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/filters: move preds into event_filter object · 30e673b2
      Authored by Tom Zanussi
      Create a new event_filter object, and move the pred-related members
      out of the call and subsystem objects and into the filter object - the
      details of the filter implementation don't need to be exposed in the
      call and subsystem in any case, and it will also help make the new
      parser implementation a little cleaner.
      
      [ Impact: refactor trace-filter code to prepare for new features ]
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905887.6416.119.camel@tropicana>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: x86, mmiotrace: only register for die notifier when tracer active · 0f9a623d
      Authored by Stuart Bennett
      Follow up to afcfe024 in Linus' tree
      ("x86: mmiotrace: quieten spurious warning message")
      Signed-off-by: Stuart Bennett <stuart@freedesktop.org>
      Acked-by: Pekka Paalanen <pq@iki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1240946271-7083-5-git-send-email-stuart@freedesktop.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: x86, mmiotrace: refactor clearing/restore of page presence · 46e91d00
      Authored by Stuart Bennett
      * change function names to clear_* from set_*: in reality we only clear
        and restore page presence, and never unconditionally set present.
        Using clear_*({true, false}, ...) is therefore more honest than
        set_*({false, true}, ...)
      
      * upgrade presence storage to pteval_t: doing user-space tracing will
        require saving and manipulation of the _PAGE_PROTNONE bit, in addition
        to the existing _PAGE_PRESENT changes, and having multiple bools stored
        and passed around does not seem optimal
      
      [ Impact: refactor, clean up mmiotrace code ]
      Signed-off-by: Stuart Bennett <stuart@freedesktop.org>
      Acked-by: Pekka Paalanen <pq@iki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1240946271-7083-4-git-send-email-stuart@freedesktop.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: x86, mmiotrace: code consistency/legibility improvement · 0492e1bb
      Authored by Stuart Bennett
      kmmio_probe being *p and kmmio_fault_page being sometimes *f and
      sometimes *p is not helpful.
      
      [ Impact: cleanup ]
      Signed-off-by: Stuart Bennett <stuart@freedesktop.org>
      Acked-by: Pekka Paalanen <pq@iki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1240946271-7083-3-git-send-email-stuart@freedesktop.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring-buffer: fix printk output · 7d7d2b80
      Authored by Steven Rostedt
      The warning output in trace_recursive_lock uses %d for a long when
      it should be %ld.
      
      [ Impact: fix compile warning ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: have splice only copy full pages · f2957f1f
      Authored by Steven Rostedt
      Splice works with pages, so it is much more efficient to use an entire
      page than to copy bits over several pages.
      
      Using logdev to trace the internals of the splice mechanism, I was
      able to see that splice can be very aggressive. When tracing is
      occurring, and the reader caught up to the writer, and the writer
      is on the reader page, the reader will copy what is there into the
      splice page. Splice may iterate over several pages and if the
      writer is still writing to the page, the reader will keep copying
      bits to new pages to pass to userspace.
      
      This patch changes it to only pass data to userspace if the page
      is full (the writer has left the page). This has a small side effect
      that splice can not read a partial page, and must wait for the
      page to fill. This should not be an issue. If tracing has stopped,
      then a use of "read" will still read all of the page.
      
      [ Impact: better performance for ring buffer splice code ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: only add splice page if entries exist · 93459c6c
      Authored by Steven Rostedt
      The splice code allocates a page even when the ring buffer is empty.
      It detects the ring buffer being empty when it fails to copy
      anything from the ring buffer into the page.
      
      This patch adds a check to see if there is anything in the ring buffer
      before allocating a page.
      
      Thanks to logdev for letting me trace the tracer to find this.
      
      [ Impact: speed up due to removing unnecessary allocation ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: fix ref count in splice pages · 5beae6ef
      Authored by Steven Rostedt
      The pages allocated for the splice binary buffer did not initialize
      the ref count correctly. This caused pages not to be freed, resulting in
      a drastic memory leak.
      
      Thanks to logdev I was able to trace the tracer to find where the leak
      was.
      
      [ Impact: stop memory leak when using splice ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>