1. 09 8月, 2009 1 次提交
    • F
      perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Frederic Weisbecker 提交于
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests ftrace events record sampling. In this case
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the
      perfcounter event buffer, as a sample.
      
      Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
      record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
       .  0040:  e0 b1 31 81 ff ff ff ff                          .......
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback brings
         some costs however if someone wants no such sampling to
         occur, and needs to be fixed in the future. For that we need
         to have an instant access to the perf counter attribute.
         This is a matter of a flag to add in the struct ftrace_event.
      
       - Take care of the events recursivity! Don't ever try to record
         a lock event for example, it seems some locking is used in
         the profiling fast path and lead to a tracing recursivity.
         That will be fixed using raw spinlock or recursivity
         protection.
      
       - [...]
      
       - Profit! :-)
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f413cdb8
  2. 29 7月, 2009 1 次提交
    • L
      tracing: Fix missing function_graph events when we splice_read from trace_pipe · 74e7ff8c
      Lai Jiangshan 提交于
      About a half events are missing when we splice_read
      from trace_pipe. They are unexpectedly consumed because we ignore
      the TRACE_TYPE_NO_CONSUME return value used by the function graph
      tracer when it needs to consume the events by itself to walk on
      the ring buffer.
      
      The same problem appears with ftrace_dump()
      
      Example of an output before this patch:
      
      1)               |      ktime_get_real() {
      1)   2.846 us    |          read_hpet();
      1)   4.558 us    |        }
      1)   6.195 us    |      }
      
      After this patch:
      
      0)               |      ktime_get_real() {
      0)               |        getnstimeofday() {
      0)   1.960 us    |          read_hpet();
      0)   3.597 us    |        }
      0)   5.196 us    |      }
      
      The fix also applies on 2.6.30
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: stable@kernel.org
      LKML-Reference: <4A6EEC52.90704@cn.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      74e7ff8c
  3. 23 7月, 2009 1 次提交
  4. 13 7月, 2009 1 次提交
  5. 24 6月, 2009 2 次提交
  6. 16 6月, 2009 1 次提交
    • G
      debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem. · 156f5a78
      GeunSik Lim 提交于
      Many developers use "/debug/" or "/debugfs/" or "/sys/kernel/debug/"
      directory name to mount debugfs filesystem for ftrace according to
      ./Documentation/tracers/ftrace.txt file.
      
      And, three directory names(ex:/debug/, /debugfs/, /sys/kernel/debug/) is
      existed in kernel source like ftrace, DRM, Wireless, Documentation,
      Network[sky2]files to mount debugfs filesystem.
      
      debugfs means debug filesystem for debugging easy to use by greg kroah
      hartman. "/sys/kernel/debug/" name is suitable as directory name
      of debugfs filesystem.
      - debugfs related reference: http://lwn.net/Articles/334546/
      
      Fix inconsistency of directory name to mount debugfs filesystem.
      
      * From Steven Rostedt
        - find_debugfs() and tracing_files() in this patch.
      Signed-off-by: NGeunSik Lim <geunsik.lim@samsung.com>
      Acked-by     : Inaky Perez-Gonzalez <inaky@linux.intel.com>
      Reviewed-by  : Steven Rostedt <rostedt@goodmis.org>
      Reviewed-by  : James Smart <james.smart@emulex.com>
      CC: Jiri Kosina <trivial@kernel.org>
      CC: David Airlie <airlied@linux.ie>
      CC: Peter Osterlund <petero2@telia.com>
      CC: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      CC: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      CC: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      156f5a78
  7. 15 6月, 2009 2 次提交
  8. 02 6月, 2009 1 次提交
  9. 28 5月, 2009 1 次提交
    • H
      trace: disable preemption before taking raw spinlocks · 5b6045a9
      Heiko Carstens 提交于
      s390 code uses smp_processor_id() in __raw_spin_lock() code which
      reveals that a (raw) spinlock is taken without preemption disabled.
      This can potentially deadlock.
      
      To fix this explicitly disable and enable preemption.
      
      BUG: using smp_processor_id() in preemptible [00000000] code: cat/2278
      caller is trace_find_cmdline+0x40/0xfc
      CPU: 0 Not tainted 2.6.30-rc7-dirty #39
      Process cat (pid: 2278, task: 000000003faedb68, ksp: 000000003b33b988)
      000000003b33b988 000000003b33bae0 0000000000000002 0000000000000000
             000000003b33bb80 000000003b33baf8 000000003b33baf8 00000000000175d6
             0000000000000001 000000003b33b988 000000003f9b0000 000000000000000b
             000000000000000c 000000003b33bb40 000000003b33bae0 0000000000000000
             0000000000000000 00000000000175d6 000000003b33bae0 000000003b33bb28
      Call Trace:
      ([<00000000000174b2>] show_trace+0x112/0x170)
       [<0000000000017582>] show_stack+0x72/0x100
       [<0000000000441538>] dump_stack+0xc8/0xd8
       [<000000000025c350>] debug_smp_processor_id+0x114/0x130
       [<00000000000bf0e4>] trace_find_cmdline+0x40/0xfc
       [<00000000000c35d4>] trace_print_context+0x58/0xac
       [<00000000000bb676>] print_trace_line+0x416/0x470
       [<00000000000bc8fe>] s_show+0x4e/0x428
       [<000000000013834e>] seq_read+0x36a/0x5d4
       [<0000000000112a78>] vfs_read+0xc8/0x174
       [<0000000000112c58>] SyS_read+0x74/0xc4
       [<000000000002c7ae>] sysc_noemu+0x10/0x16
       [<000002000012436c>] 0x2000012436c
      1 lock held by cat/2278:
       #0:  (&p->lock){+.+.+.}, at: [<0000000000138056>] seq_read+0x72/0x5d4
      
      [ Impact: fix preempt-unsafe raw spinlock ]
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      5b6045a9
  10. 26 5月, 2009 1 次提交
    • L
      tracing: add trace_event_read_lock() · 4f535968
      Lai Jiangshan 提交于
      I found that there is nothing to protect event_hash in
      ftrace_find_event(). Rcu protects the event hashlist
      but not the event itself while we use it after its extraction
      through ftrace_find_event().
      
      This lack of a proper locking in this spot opens a race
      window between any event dereferencing and module removal.
      
      Eg:
      
      --Task A--
      
      print_trace_line(trace) {
        event = find_ftrace_event(trace)
      
      --Task B--
      
      trace_module_remove_events(mod) {
        list_trace_events_module(ev, mod) {
          unregister_ftrace_event(ev->event) {
            hlist_del(ev->event->node)
              list_del(....)
          }
        }
      }
      |--> module removed, the event has been dropped
      
      --Task A--
      
        event->print(trace); // Dereferencing freed memory
      
      If the event retrieved belongs to a module and this module
      is concurrently removed, we may end up dereferencing a data
      from a freed module.
      
      RCU could solve this, but it would add latency to the kernel and
      forbid tracers output callbacks to call any sleepable code.
      So this fix converts 'trace_event_mutex' to a read/write semaphore,
      and adds trace_event_read_lock() to protect ftrace_find_event().
      
      [ Impact: fix possible freed memory dereference in ftrace ]
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <4A114806.7090302@cn.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      4f535968
  11. 16 5月, 2009 1 次提交
  12. 07 5月, 2009 1 次提交
    • S
      tracing: reset ring buffer when removing modules with events · 9456f0fa
      Steven Rostedt 提交于
      Li Zefan found that there's a race using the event ids of events and
      modules. When a module is loaded, an event id is incremented. We only
      have 16 bits for event ids (65536) and there is a possible (but highly
      unlikely) race that we could load and unload a module that registers
      events so many times that the event id counter overflows.
      
      When it overflows, it then restarts and goes looking for available
      ids. An id is available if it was added by a module and released.
      
      The race is if you have one module add an id, and then is removed.
      Another module loaded can use that same event id. But if the old module
      still had events in the ring buffer, the new module's call back would
      get bogus data.  At best (and most likely) the output would just be
      garbage. But if the module for some reason used pointers (not recommended)
      then this could potentially crash.
      
      The safest thing to do is just reset the ring buffer if a module that
      registered events is removed.
      
      [ Impact: prevent unpredictable results of event id overflows ]
      Reported-by: NLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <49FEAFD0.30106@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      9456f0fa
  13. 06 5月, 2009 2 次提交
    • S
      tracing: use proper export symbol for tracing api · 94487d6d
      Steven Rostedt 提交于
      When adding the EXPORT_SYMBOL to some of the tracing API, I accidently
      used EXPORT_SYMBOL instead of EXPORT_SYMBOL_GPL. This patch fixes
      that mistake.
      
      [ Impact: export the tracing code only for GPL modules ]
      Reported-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      94487d6d
    • S
      tracing: export stats of ring buffers to userspace · c8d77183
      Steven Rostedt 提交于
      This patch adds stats to the ftrace ring buffers:
      
       # cat /debugfs/tracing/per_cpu/cpu0/stats
       entries: 42360
       overrun: 30509326
       commit overrun: 0
       nmi dropped: 0
      
      Where entries are the total number of data entries in the buffer.
      
      overrun is the number of entries not consumed and were overwritten by
      the writer.
      
      commit overrun is the number of entries dropped due to nested writers
      wrapping the buffer before the initial writer finished the commit.
      
      nmi dropped is the number of entries dropped due to the ring buffer
      lock being held when an nmi was going to write to the ring buffer.
      Note, this field will be meaningless and will go away when the ring
      buffer becomes lockless.
      
      [ Impact: let userspace know what is happening in the ring buffers ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      c8d77183
  14. 29 4月, 2009 4 次提交
    • S
      tracing: fix ref count in splice pages · 7267fa68
      Steven Rostedt 提交于
      The pages allocated for the splice binary buffer did not initialize
      the ref count correctly. This caused pages not to be freed and causes
      a drastic memory leak.
      
      Thanks to logdev I was able to trace the tracer to find where the leak
      was.
      
      [ Impact: stop memory leak when using splice ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7267fa68
    • S
      tracing: have splice only copy full pages · f2957f1f
      Steven Rostedt 提交于
      Splice works with pages, it is much more effecient to use an entire
      page than to copy bits over several pages.
      
      Using logdev to trace the internals of the splice mechanism, I was
      able to see that splice can be very aggressive. When tracing is
      occurring, and the reader caught up to the writer, and the writer
      is on the reader page, the reader will copy what is there into the
      splice page. Splice may iterate over several pages and if the
      writer is still writing to the page, the reader will keep copying
      bits to new pages to pass to userspace.
      
      This patch changes it to only pass data to userspace if the page
      is full (the writer has left the page). This has a small side effect
      that splice can not read a partial page, and must wait for the
      page to fill. This should not be an issue. If tracing has stopped,
      then a use of "read" will still read all of the page.
      
      [ Impact: better performance for ring buffer splice code ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f2957f1f
    • S
      tracing: only add splice page if entries exist · 93459c6c
      Steven Rostedt 提交于
      The splice code allocates a page even when the ring buffer is empty.
      It detects the ring buffer being empty when it it fails to copy
      anything from the ring buffer into the page.
      
      This patch adds a check to see if there is anything in the ring buffer
      before allocating a page.
      
      Thanks to logdev for letting me trace the tracer to find this.
      
      [ Impact: speed up due to removing unnecessary allocation ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      93459c6c
    • S
      tracing: fix ref count in splice pages · 5beae6ef
      Steven Rostedt 提交于
      The pages allocated for the splice binary buffer did not initialize
      the ref count correctly. This caused pages not to be freed and causes
      a drastic memory leak.
      
      Thanks to logdev I was able to trace the tracer to find where the leak
      was.
      
      [ Impact: stop memory leak when using splice ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      5beae6ef
  15. 28 4月, 2009 1 次提交
    • S
      tracing: convert ftrace_dump spinlocks to raw · cd891ae0
      Steven Rostedt 提交于
      ftrace_dump is used for printing out the contents of the ftrace ring buffer
      to the console on failure. Currently it uses a spinlock to synchronize
      the output from multiple failures on different CPUs. This spin lock
      currently is a normal spinlock and can cause issues with lockdep and
      lock tracing.
      
      This patch converts it to raw since it is for error handling only.
      The lock is local to the ftrace_dump and is not used by any other
      infrastructure.
      
      [ Impact: prevent ftrace_dump from locking up by internal tracing ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      cd891ae0
  16. 22 4月, 2009 1 次提交
    • L
      tracing/events: make struct trace_entry->type to be int type · 7a4f453b
      Li Zefan 提交于
      struct trace_entry->type is unsigned char, while trace event's id is
      int type, thus for a event with id >= 256, it's entry->type is cast
      to (id % 256), and then we can't see the trace output of this event.
      
       # insmod trace-events-sample.ko
       # echo foo_bar > /mnt/tracing/set_event
       # cat /debug/tracing/events/trace-events-sample/foo_bar/id
       256
       # cat /mnt/tracing/trace_pipe
                 <...>-3548  [001]   215.091142: Unknown type 0
                 <...>-3548  [001]   216.089207: Unknown type 0
                 <...>-3548  [001]   217.087271: Unknown type 0
                 <...>-3548  [001]   218.085332: Unknown type 0
      
      [ Impact: fix output for trace events with id >= 256 ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49EEDB0E.5070207@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7a4f453b
  17. 18 4月, 2009 3 次提交
  18. 17 4月, 2009 1 次提交
  19. 15 4月, 2009 1 次提交
  20. 14 4月, 2009 4 次提交
    • T
      tracing/filters: use ring_buffer_discard_commit() in filter_check_discard() · eb02ce01
      Tom Zanussi 提交于
      This patch changes filter_check_discard() to make use of the new
      ring_buffer_discard_commit() function and modifies the current users to
      call the old commit function in the non-discard case.
      
      It also introduces a version of filter_check_discard() that uses the
      global trace buffer (filter_current_check_discard()) for those cases.
      
      v2 changes:
      
      - fix compile error noticed by Ingo Molnar
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      LKML-Reference: <1239178554.10295.36.camel@tropicana>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      eb02ce01
    • S
      tracing/filters: use ring_buffer_discard_commit for discarded events · 77d9f465
      Steven Rostedt 提交于
      The ring_buffer_discard_commit makes better usage of the ring_buffer
      when an event has been discarded. It tries to remove it completely if
      possible.
      
      This patch converts the trace event filtering to use
      ring_buffer_discard_commit instead of the ring_buffer_event_discard.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      77d9f465
    • T
      tracing/filters: add TRACE_EVENT_FORMAT_NOFILTER event macro · e45f2e2b
      Tom Zanussi 提交于
      Frederic Weisbecker suggested that the trace_special event shouldn't be
      filterable; this patch adds a TRACE_EVENT_FORMAT_NOFILTER event macro
      that allows an event format to be exported without having a filter
      attached, and removes filtering from the trace_special event.
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e45f2e2b
    • T
      tracing/filters: add run-time field descriptions to TRACE_EVENT_FORMAT events · e1112b4d
      Tom Zanussi 提交于
      This patch adds run-time field descriptions to all the event formats
      exported using TRACE_EVENT_FORMAT.  It also hooks up all the tracers
      that use them (i.e. the tracers in the 'ftrace subsystem') so they can
      also have their output filtered by the event-filtering mechanism.
      
      When I was testing this, there were a couple of things that fooled me
      into thinking the filters weren't working, when actually they were -
      I'll mention them here so others don't make the same mistakes (and file
      bug reports. ;-)
      
      One is that some of the tracers trace multiple events e.g. the
      sched_switch tracer uses the context_switch and wakeup events, and if
      you don't set filters on all of the traced events, the unfiltered output
      from the events without filters on them can make it look like the
      filtering as a whole isn't working properly, when actually it is doing
      what it was asked to do - it just wasn't asked to do the right thing.
      
      The other is that for the really high-volume tracers e.g. the function
      tracer, the volume of filtered events can be so high that it pushes the
      unfiltered events out of the ring buffer before they can be read so e.g.
      cat'ing the trace file repeatedly shows either no output, or once in
      awhile some output but that isn't there the next time you read the
      trace, which isn't what you normally expect when reading the trace file.
      If you read from the trace_pipe file though, you can catch them before
      they disappear.
      
      Changes from v1:
      
      As suggested by Frederic Weisbecker:
      
      - get rid of externs in functions
      - added unlikely() to filter_check_discard()
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e1112b4d
  21. 10 4月, 2009 4 次提交
    • L
      tracing: fix splice return too large · 93cfb3c9
      Lai Jiangshan 提交于
      I got these from strace:
      
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 16384
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
      
      I wanted to splice_read 4096 bytes, but it returns 8192 or larger.
      
      It is because the return value of tracing_buffers_splice_read()
      does not include "zero out any left over data" bytes.
      
      But tracing_buffers_read() includes these bytes, we make them
      consistent.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46674.9030804@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      93cfb3c9
    • L
      tracing: update file->f_pos when splice(2) it · c7625a55
      Lai Jiangshan 提交于
      Impact: Cleanup
      
      These two lines:
      
      	if (unlikely(*ppos))
      		return -ESPIPE;
      
      in tracing_buffers_splice_read() are not needed, VFS layer
      has disabled seek(2).
      
      We remove these two lines, and then we can update file->f_pos.
      
      And tracing_buffers_read() updates file->f_pos, this fix
      make tracing_buffers_splice_read() updates file->f_pos too.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46670.4010503@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c7625a55
    • L
      tracing: allocate page when needed · ddd538f3
      Lai Jiangshan 提交于
      Impact: Cleanup
      
      Sometimes, we open trace_pipe_raw, but we don't read(2) it,
      we just splice(2) it, thus, the page is not used.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D4666B.4010608@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ddd538f3
    • L
      tracing: disable seeking for trace_pipe_raw · d1e7e02f
      Lai Jiangshan 提交于
      Impact: disable pread()
      
      We set tracing_buffers_fops.llseek to no_llseek,
      but we can still perform pread() to read this file.
      
      That is not expected.
      
      This fix uses nonseekable_open() to disable it.
      
      tracing_buffers_fops.llseek is still set to no_llseek,
      it mark this file is a "non-seekable device" and is used by
      sys_splice(). See also do_splice() or manual of splice(2):
      
      ERRORS
             EINVAL Target file system doesn't support  splicing;
                    neither  of the descriptors refers to a pipe;
                    or offset given for non-seekable device.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46668.8030806@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d1e7e02f
  22. 07 4月, 2009 5 次提交
    • F
      tracing/ftrace: factorize the tracing files creation · 5452af66
      Frederic Weisbecker 提交于
      Impact: cleanup
      
      Most of the tracing files creation follow the same pattern:
      
      ret = debugfs_create_file(...)
      if (!ret)
      	pr_warning("Couldn't create ... entry\n")
      
      Unify it!
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1238109938-11840-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      5452af66
    • N
      Update /debug/tracing/README · bc2b6871
      Nikanth Karthikesan 提交于
      Some of the tracers have been renamed, which was not updated in the in-kernel
      run-time README file. Update it.
      Signed-off-by: NNikanth Karthikesan <knikanth@suse.de>
      LKML-Reference: <200903231158.32151.knikanth@suse.de>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bc2b6871
    • F
      tracing/ftrace: alloc the started cpumask for the trace file · b0dfa978
      Frederic Weisbecker 提交于
      Impact: fix a crash while cat trace file
      
      Currently we are using a cpumask to remind each cpu where a
      trace occured. It lets us notice the user that a cpu just had
      its first trace.
      
      But on latest -tip we have the following crash once we cat the trace
      file:
      
      IP: [<c0270c4a>] print_trace_fmt+0x45/0xe7
      *pde = 00000000
      Oops: 0000 [#1] PREEMPT SMP
      last sysfs file: /sys/class/net/eth0/carrier
      Pid: 3897, comm: cat Not tainted (2.6.29-tip-02825-g0f22972-dirty #81)
      EIP: 0060:[<c0270c4a>] EFLAGS: 00010297 CPU: 0
      EIP is at print_trace_fmt+0x45/0xe7
      EAX: 00000000 EBX: 00000000 ECX: c12d9e98 EDX: ccdb7010
      ESI: d31f4000 EDI: 00322401 EBP: d31f3f10 ESP: d31f3efc
      DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      Process cat (pid: 3897, ti=d31f2000 task=d3b3cf20 task.ti=d31f2000)
      Stack:
      d31f4080 ccdb7010 d31f4000 d691fe70 ccdb7010 d31f3f24 c0270e5c d31f4000
      d691fe70 d31f4000 d31f3f34 c02718e8 c12d9e98 d691fe70 d31f3f70 c02bfc33
      00001000 09130000 d3b46e00 d691fe98 00000000 00000079 00000001 00000000
      Call Trace:
      [<c0270e5c>] ? print_trace_line+0x170/0x17c
      [<c02718e8>] ? s_show+0xa7/0xbd
      [<c02bfc33>] ? seq_read+0x24a/0x327
      [<c02bf9e9>] ? seq_read+0x0/0x327
      [<c02ab18b>] ? vfs_read+0x86/0xe1
      [<c02ab289>] ? sys_read+0x40/0x65
      [<c0202d8f>] ? sysenter_do_call+0x12/0x3c
      Code: 00 00 00 89 45 ec f7 c7 00 20 00 00 89 55 f0 74 4e f6 86 98 10 00 00 02 74 45 8b 86 8c 10 00 00 8b 9e a8 10 00 00 e8 52 f3 ff ff <0f> a3 03 19 c0 85 c0 75 2b 8b 86 8c 10 00 00 8b 9e a8 10 00 00
      EIP: [<c0270c4a>] print_trace_fmt+0x45/0xe7 SS:ESP 0068:d31f3efc
      CR2: 0000000000000000
      ---[ end trace aa9cf38e5ebed9dd ]---
      
      This is because we alloc the iter->started cpumask on tracing_pipe_open but
      not on tracing_open.
      
      It hadn't been noticed until now because we need to have ring buffer overruns
      to activate the starting of cpu buffer detection.
      
      Also, we need a check to not print the messagge for the first trace on the file.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1238619188-6109-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b0dfa978
    • F
      tracing/ftrace: fix missing include string.h · 5f0c6c03
      Frederic Weisbecker 提交于
      Building a kernel with tracing can raise the following warning on
      tip/master:
      
      kernel/trace/trace.c:1249: error: implicit declaration of function 'vbin_printf'
      
      We are missing an include to string.h
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1238160130-7437-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5f0c6c03
    • L
      tracing: fix incorrect return type of ns2usecs() · cf8e3474
      Lai Jiangshan 提交于
      Impact: fix time output bug in 32bits system
      
      ns2usecs() returns 'long', it's incorrect.
      
      (In i386)
      ...
                <idle>-0     [000]   521.442100: _spin_lock <-tick_do_update_jiffies64
                <idle>-0     [000]   521.442101: do_timer <-tick_do_update_jiffies64
                <idle>-0     [000]   521.442102: update_wall_time <-do_timer
                <idle>-0     [000]   521.442102: update_xtime_cache <-update_wall_time
      ....
      (It always print the time less than 2200 seconds besides ...)
      Because 'long' is 32bits in i386. ( (1<<31) useconds is about 2200 seconds)
      
      ...
                <idle>-0     [001] 4154502640.134759: rcu_bh_qsctr_inc <-__do_softirq
                <idle>-0     [001] 4154502640.134760: _local_bh_enable <-__do_softirq
                <idle>-0     [001] 4154502640.134761: idle_cpu <-irq_exit
      ...
      (very large value)
      Because 'long' is a signed type and it is 32bits in i386.
      
      Changes in v2:
      return 'unsigned long long' instead of 'cycle_t'
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <49D05D10.4030009@cn.fujitsu.com>
      Reported-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cf8e3474