1. 03 6月, 2009 10 次提交
  2. 02 6月, 2009 10 次提交
    • F
      tracing/stat: remove unappropriate safe walk on list · 43bd1236
      Frederic Weisbecker 提交于
      register_stat_tracer() uses list_for_each_entry_safe
      to check whether a tracer is already present in the list.
      But we don't delete anything from the list here, so
      we don't need the safe version
      
      [ Impact: cleanup list use is stat tracing ]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      43bd1236
    • L
      tracing/stat: do some cleanups · dbd3fbdf
      Li Zefan 提交于
      - remove duplicate code in stat_seq_init()
      - update comments to reflect the change from stat list to stat rbtree
      
      [ Impact: clean up ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      dbd3fbdf
    • L
      tracing/stat: remember to free root node · e1622806
      Li Zefan 提交于
      When closing a trace_stat file, we destroy the rbtree constructed during
      file open, but there is memory leak that the root node is not freed.
      
      [ Impact: fix memory leak when closing a trace_stat file ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      e1622806
    • L
      tracing/stat: change dummpy_cmp() to return -1 · b3dd7ba7
      Li Zefan 提交于
      Currently the output of trace_stat/workqueues is totally reversed:
      
       # cat /debug/tracing/trace_stat/workqueues
          ...
          1       17       17      210       37   `-blk_unplug_work+0x0/0x57
          1     3779     3779      181       11   |-cfq_kick_queue+0x0/0x2f
          1     3796     3796                     kblockd/1:120
          ...
      
      The correct output should be:
      
          1     3796     3796                     kblockd/1:120
          1     3779     3779      181       11   |-cfq_kick_queue+0x0/0x2f
          1       17       17      210       37   `-blk_unplug_work+0x0/0x57
      
      It's caused by "tracing/stat: replace linked list by an rbtree for
      sorting"
      (53059c9b67a62a3dc8c80204d3da42b9267ea5a0).
      
      dummpy_cmp() should return -1, so rb_node will always be inserted as
      right-most node in the rbtree, thus we sort the output in ascending
      order.
      
      [ Impact: fix the output of trace_stat/workqueues ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      b3dd7ba7
    • F
      tracing/stat: replace linked list by an rbtree for sorting · 8f184f27
      Frederic Weisbecker 提交于
      When the stat tracing framework prepares the entries from a tracer
      to output them to the user, it starts by computing a linear sort
      through a linked list to give the entries ordered by relevance
      to the user.
      
      This is quite ugly and causes a small latency when we begin to
      read the file.
      
      This patch changes that by turning the linked list into a red-black
      tree. Athough the whole iteration using the start and next tracer
      callbacks while opening the file remain the same, it is now much
      more fast and scalable.
      
      The rbtree guarantees O(log(n)) insertions whereas a linked
      list with linear sorting brought us a O(n) despair. Now the
      (visible) latency has disapeared.
      
      [ Impact: kill the latency while starting to read a stat tracer file ]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      8f184f27
    • F
      tracing/stat: replace trace_stat_session by stat_session · 0d64f834
      Frederic Weisbecker 提交于
      The "trace" prefix in struct trace_stat_session type is annoying while
      reading the trace_stat.c file. It makes the lines longer, and
      is not that much useful to explain the sense of this type.
      
      Just keep "struct stat_session" for this type.
      
      [ Impact: make the code a bit more readable ]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      0d64f834
    • Z
      trace_workqueue: remove blank line between each cpu · f3c4ae26
      Zhaolei 提交于
      The blankline between each cpu's workqueue stat is not necessary, because
      the cpu number is enough to part them by eye.
      Old style also caused a blankline below headline, and made code complex
      by using lock, disableirq and get cpu var.
      
      Old style:
       # CPU  INSERTED  EXECUTED   NAME
       # |      |         |          |
      
         0   8644       8644       events/0
         0      0          0       cpuset
         ...
         0      1          1       kdmflush
      
         1  35365      35365       events/1
         ...
      
      New style:
       # CPU  INSERTED  EXECUTED   NAME
       # |      |         |          |
      
         0   8644       8644       events/0
         0      0          0       cpuset
         ...
         0      1          1       kdmflush
         1  35365      35365       events/1
         ...
      
      [ Impact: provide more readable code ]
      Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      f3c4ae26
    • Z
      trace_workqueue: remove cpu_workqueue_stats->first_entry · b8867164
      Zhaolei 提交于
      cpu_workqueue_stats->first_entry is useless because we can retrieve the
      header of a cpu workqueue using:
      if (&cpu_workqueue_stats->list == workqueue_cpu_stat(cpu)->list.next)
      
      [ Impact: cleanup ]
      Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      b8867164
    • Z
      trace_workqueue: use list_for_each_entry() instead of list_for_each_entry_safe() · 1fdfca9c
      Zhaolei 提交于
      No need to use list_for_each_entry_safe() in iteration without deleting
      any node, we can use list_for_each_entry() instead.
      
      [ Impact: cleanup ]
      Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      1fdfca9c
    • Z
      ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro · fb39125f
      Zhaolei 提交于
      v3: zhaolei@cn.fujitsu.com: Change TRACE_EVENT definition to new format
          introduced by Steven Rostedt: consolidate trace and trace_event headers
      v2: kosaki@jp.fujitsu.com: print the function names instead of addr, and zap
          the work addr
      v1: zhaolei@cn.fujitsu.com: Make workqueue tracepoints use TRACE_EVENT macro
      
      TRACE_EVENT is a more generic way to define tracepoints.
      Doing so adds these new capabilities to the tracepoints:
      
        - zero-copy and per-cpu splice() tracing
        - binary tracing without printf overhead
        - structured logging records exposed under /debug/tracing/events
        - trace events embedded in function tracer output and other plugins
        - user-defined, per tracepoint filter expressions
      
      Then, this patch converts DEFINE_TRACE to TRACE_EVENT in workqueue related
      tracepoints.
      
      [ Impact: expand workqueue tracer to events tracing ]
      Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      fb39125f
  3. 28 5月, 2009 2 次提交
    • Z
      ftrace: don't convert function's local variable name in macro · f2aebaee
      Zhaolei 提交于
      "call" is an argument of macro, but it is also used as a local
      variable name of function in macro.
      We should keep this local variable name distinct from any
      CPP macro parameter name if both are in the same macro scope,
      although it hasn't caused any problem yet.
      
      [ Impact: robustify macro ]
      Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      f2aebaee
    • H
      trace: disable preemption before taking raw spinlocks · 5b6045a9
      Heiko Carstens 提交于
      s390 code uses smp_processor_id() in __raw_spin_lock() code which
      reveals that a (raw) spinlock is taken without preemption disabled.
      This can potentially deadlock.
      
      To fix this explicitly disable and enable preemption.
      
      BUG: using smp_processor_id() in preemptible [00000000] code: cat/2278
      caller is trace_find_cmdline+0x40/0xfc
      CPU: 0 Not tainted 2.6.30-rc7-dirty #39
      Process cat (pid: 2278, task: 000000003faedb68, ksp: 000000003b33b988)
      000000003b33b988 000000003b33bae0 0000000000000002 0000000000000000
             000000003b33bb80 000000003b33baf8 000000003b33baf8 00000000000175d6
             0000000000000001 000000003b33b988 000000003f9b0000 000000000000000b
             000000000000000c 000000003b33bb40 000000003b33bae0 0000000000000000
             0000000000000000 00000000000175d6 000000003b33bae0 000000003b33bb28
      Call Trace:
      ([<00000000000174b2>] show_trace+0x112/0x170)
       [<0000000000017582>] show_stack+0x72/0x100
       [<0000000000441538>] dump_stack+0xc8/0xd8
       [<000000000025c350>] debug_smp_processor_id+0x114/0x130
       [<00000000000bf0e4>] trace_find_cmdline+0x40/0xfc
       [<00000000000c35d4>] trace_print_context+0x58/0xac
       [<00000000000bb676>] print_trace_line+0x416/0x470
       [<00000000000bc8fe>] s_show+0x4e/0x428
       [<000000000013834e>] seq_read+0x36a/0x5d4
       [<0000000000112a78>] vfs_read+0xc8/0x174
       [<0000000000112c58>] SyS_read+0x74/0xc4
       [<000000000002c7ae>] sysc_noemu+0x10/0x16
       [<000002000012436c>] 0x2000012436c
      1 lock held by cat/2278:
       #0:  (&p->lock){+.+.+.}, at: [<0000000000138056>] seq_read+0x72/0x5d4
      
      [ Impact: fix preempt-unsafe raw spinlock ]
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      5b6045a9
  4. 27 5月, 2009 5 次提交
  5. 26 5月, 2009 5 次提交
    • Z
      ftrace: clean up of using ftrace_event_enable_disable() · 0e907c99
      Zhaolei 提交于
      Always use ftrace_event_enable_disable() to enable/disable an event
      so that we can factorize out the event toggling code.
      
      [ Impact: factorize and cleanup event tracing code ]
      Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4A14FDFE.2080402@cn.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      0e907c99
    • Z
      ftrace: Add task_comm support for trace_event · b11c53e1
      Zhaolei 提交于
      If we enable a trace event alone without any tracer running (such as
      function tracer, sched switch tracer, etc...) it can't output enough
      task command information.
      
      We need to use the tracing_{start/stop}_cmdline_record() helpers
      which are designed to keep track of cmdlines for any tasks that
      were scheduled during the tracing.
      
      Before this patch:
       # echo 1 > debugfs/tracing/events/sched/sched_switch/enable
       # cat debugfs/tracing/trace
       # tracer: nop
       #
       #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
       #              | |       |          |         |
                  <...>-2289  [000] 526276.724790: sched_switch: task bash:2289 [120] ==> sshd:2287 [120]
                  <...>-2287  [000] 526276.725231: sched_switch: task sshd:2287 [120] ==> bash:2289 [120]
                  <...>-2289  [000] 526276.725452: sched_switch: task bash:2289 [120] ==> sshd:2287 [120]
                  <...>-2287  [000] 526276.727181: sched_switch: task sshd:2287 [120] ==> swapper:0 [140]
                 <idle>-0     [000] 526277.032734: sched_switch: task swapper:0 [140] ==> events/0:5 [115]
                  <...>-5     [000] 526277.032782: sched_switch: task events/0:5 [115] ==> swapper:0 [140]
       ...
      
      After this patch:
       # tracer: nop
       #
       #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
       #              | |       |          |         |
                   bash-2269  [000] 527347.989229: sched_switch: task bash:2269 [120] ==> sshd:2267 [120]
                   sshd-2267  [000] 527347.990960: sched_switch: task sshd:2267 [120] ==> bash:2269 [120]
                   bash-2269  [000] 527347.991143: sched_switch: task bash:2269 [120] ==> sshd:2267 [120]
                   sshd-2267  [000] 527347.992959: sched_switch: task sshd:2267 [120] ==> swapper:0 [140]
                 <idle>-0     [000] 527348.531989: sched_switch: task swapper:0 [140] ==> events/0:5 [115]
               events/0-5     [000] 527348.532115: sched_switch: task events/0:5 [115] ==> swapper:0 [140]
       ...
      
      Changelog:
      v1->v2: Update Kconfig to select CONTEXT_SWITCH_TRACER in
              ENABLE_EVENT_TRACING
      v2->v3: v2 can solve problem that was caused by config EVENT_TRACING
              alone, but when CONFIG_FTRACE is off and CONFIG_TRACING is
              selected by other config, compile fail happened again.
              This version solves it.
      
      [ Impact: fix incomplete output of event tracing ]
      Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <4A14FDFE.2080402@cn.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      b11c53e1
    • P
      kmemtrace: fix kernel parameter documentation · 29fcefba
      Pekka Enberg 提交于
      The kmemtrace.enable kernel parameter no longer works. To enable
      kmemtrace at boot-time, you must pass "ftrace=kmemtrace" instead.
      
      [ Impact: remove obsolete kernel parameter documentation ]
      
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      LKML-Reference: <alpine.DEB.2.00.0905241112190.10296@rocky>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      29fcefba
    • L
      tracing/events: change the type of __str_loc_item to unsigned short · b0aae68c
      Li Zefan 提交于
      When defining a dynamic size string, we add __str_loc_##item to the
      trace entry, and it stores the location of the actual string in
      entry->_str_data[]
      
      'unsigned short' should be sufficient to store this information, thus
      we save 2 bytes per dyn-size string in the ring buffer.
      
      [ Impact: reduce memory occupied by dyn-size strings in ring buffer ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <4A14EDB6.2050507@cn.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      b0aae68c
    • L
      tracing: add trace_event_read_lock() · 4f535968
      Lai Jiangshan 提交于
      I found that there is nothing to protect event_hash in
      ftrace_find_event(). Rcu protects the event hashlist
      but not the event itself while we use it after its extraction
      through ftrace_find_event().
      
      This lack of a proper locking in this spot opens a race
      window between any event dereferencing and module removal.
      
      Eg:
      
      --Task A--
      
      print_trace_line(trace) {
        event = find_ftrace_event(trace)
      
      --Task B--
      
      trace_module_remove_events(mod) {
        list_trace_events_module(ev, mod) {
          unregister_ftrace_event(ev->event) {
            hlist_del(ev->event->node)
              list_del(....)
          }
        }
      }
      |--> module removed, the event has been dropped
      
      --Task A--
      
        event->print(trace); // Dereferencing freed memory
      
      If the event retrieved belongs to a module and this module
      is concurrently removed, we may end up dereferencing a data
      from a freed module.
      
      RCU could solve this, but it would add latency to the kernel and
      forbid tracers output callbacks to call any sleepable code.
      So this fix converts 'trace_event_mutex' to a read/write semaphore,
      and adds trace_event_read_lock() to protect ftrace_find_event().
      
      [ Impact: fix possible freed memory dereference in ftrace ]
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <4A114806.7090302@cn.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      4f535968
  6. 21 5月, 2009 1 次提交
  7. 19 5月, 2009 2 次提交
    • S
      blktrace: remove debugfs entries on bad path · fd51d251
      Stefan Raspl 提交于
      debugfs directory entries for devices are not removed on some
      of the failure pathes in do_blk_trace_setup().
      One way to reproduce is to start blktrace on multiple devices
      with insufficient Vmalloc space: Devices will fail with
      a message like this:
      
      	BLKTRACESETUP(2) /dev/sdu failed: 5/Input/output error
      
      If so, the respective entries in debugfs
      (e.g. /sys/kernel/debug/block/sdu) will remain and subsequent
      attempts to start blktrace on the respective devices will not
      succeed due to existing directories.
      
      [ Impact: fix /debug/tracing file cleanup corner case ]
      Signed-off-by: NStefan Raspl <stefan.raspl@linux.vnet.ibm.com>
      Acked-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: schwidefsky@de.ibm.com
      Cc: heiko.carstens@de.ibm.com
      LKML-Reference: <4A1266CC.5040801@linux.vnet.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fd51d251
    • L
      tracing/events: Documentation updates · 143c145e
      Li Zefan 提交于
      - fix some typos
      - document the difference between '>' and '>>'
      - document the 'enable' toggle
      - remove section "Defining an event-enabled tracepoint", since it's
        out-dated and sample/trace_events/ already serves this purpose.
      
      v2: add "Updated by Li Zefan"
      
      [ Impact: make documentation up-to-date ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      LKML-Reference: <4A125503.5060406@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      143c145e
  8. 18 5月, 2009 2 次提交
  9. 16 5月, 2009 3 次提交