1. 03 5月, 2014 1 次提交
    • S
      tracing: Use rcu_dereference_sched() for trace event triggers · 561a4fe8
      Steven Rostedt (Red Hat) 提交于
      As trace event triggers are now part of the mainline kernel, I added
      my trace event trigger tests to my test suite I run on all my kernels.
      Now these tests get run under different config options, and one of
      those options is CONFIG_PROVE_RCU, which checks under lockdep that
      the rcu locking primitives are being used correctly. This triggered
      the following splat:
      
      ===============================
      [ INFO: suspicious RCU usage. ]
      3.15.0-rc2-test+ #11 Not tainted
      -------------------------------
      kernel/trace/trace_events_trigger.c:80 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 1, debug_locks = 0
      4 locks held by swapper/1/0:
       #0:  ((&(&j_cdbs->work)->timer)){..-...}, at: [<ffffffff8104d2cc>] call_timer_fn+0x5/0x1be
       #1:  (&(&pool->lock)->rlock){-.-...}, at: [<ffffffff81059856>] __queue_work+0x140/0x283
       #2:  (&p->pi_lock){-.-.-.}, at: [<ffffffff8106e961>] try_to_wake_up+0x2e/0x1e8
       #3:  (&rq->lock){-.-.-.}, at: [<ffffffff8106ead3>] try_to_wake_up+0x1a0/0x1e8
      
      stack backtrace:
      CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.15.0-rc2-test+ #11
      Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
       0000000000000001 ffff88007e083b98 ffffffff819f53a5 0000000000000006
       ffff88007b0942c0 ffff88007e083bc8 ffffffff81081307 ffff88007ad96d20
       0000000000000000 ffff88007af2d840 ffff88007b2e701c ffff88007e083c18
      Call Trace:
       <IRQ>  [<ffffffff819f53a5>] dump_stack+0x4f/0x7c
       [<ffffffff81081307>] lockdep_rcu_suspicious+0x107/0x110
       [<ffffffff810ee51c>] event_triggers_call+0x99/0x108
       [<ffffffff810e8174>] ftrace_event_buffer_commit+0x42/0xa4
       [<ffffffff8106aadc>] ftrace_raw_event_sched_wakeup_template+0x71/0x7c
       [<ffffffff8106bcbf>] ttwu_do_wakeup+0x7f/0xff
       [<ffffffff8106bd9b>] ttwu_do_activate.constprop.126+0x5c/0x61
       [<ffffffff8106eadf>] try_to_wake_up+0x1ac/0x1e8
       [<ffffffff8106eb77>] wake_up_process+0x36/0x3b
       [<ffffffff810575cc>] wake_up_worker+0x24/0x26
       [<ffffffff810578bc>] insert_work+0x5c/0x65
       [<ffffffff81059982>] __queue_work+0x26c/0x283
       [<ffffffff81059999>] ? __queue_work+0x283/0x283
       [<ffffffff810599b7>] delayed_work_timer_fn+0x1e/0x20
       [<ffffffff8104d3a6>] call_timer_fn+0xdf/0x1be^M
       [<ffffffff8104d2cc>] ? call_timer_fn+0x5/0x1be
       [<ffffffff81059999>] ? __queue_work+0x283/0x283
       [<ffffffff8104d823>] run_timer_softirq+0x1a4/0x22f^M
       [<ffffffff8104696d>] __do_softirq+0x17b/0x31b^M
       [<ffffffff81046d03>] irq_exit+0x42/0x97
       [<ffffffff81a08db6>] smp_apic_timer_interrupt+0x37/0x44
       [<ffffffff81a07a2f>] apic_timer_interrupt+0x6f/0x80
       <EOI>  [<ffffffff8100a5d8>] ? default_idle+0x21/0x32
       [<ffffffff8100a5d6>] ? default_idle+0x1f/0x32
       [<ffffffff8100ac10>] arch_cpu_idle+0xf/0x11
       [<ffffffff8107b3a4>] cpu_startup_entry+0x1a3/0x213
       [<ffffffff8102a23c>] start_secondary+0x212/0x219
      
      The cause is that the triggers are protected by rcu_read_lock_sched() but
      the data is dereferenced with rcu_dereference() which expects it to
      be protected with rcu_read_lock(). The proper reference should be
      rcu_dereference_sched().
      
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Cc: stable@vger.kernel.org # 3.14+
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      561a4fe8
  2. 01 5月, 2014 2 次提交
  3. 28 4月, 2014 1 次提交
    • S
      ftrace/module: Hardcode ftrace_module_init() call into load_module() · a949ae56
      Steven Rostedt (Red Hat) 提交于
      A race exists between module loading and enabling of function tracer.
      
      	CPU 1				CPU 2
      	-----				-----
        load_module()
         module->state = MODULE_STATE_COMING
      
      				register_ftrace_function()
      				 mutex_lock(&ftrace_lock);
      				 ftrace_startup()
      				  update_ftrace_function();
      				   ftrace_arch_code_modify_prepare()
      				    set_all_module_text_rw();
      				   <enables-ftrace>
      				    ftrace_arch_code_modify_post_process()
      				     set_all_module_text_ro();
      
      				[ here all module text is set to RO,
      				  including the module that is
      				  loading!! ]
      
         blocking_notifier_call_chain(MODULE_STATE_COMING);
          ftrace_init_module()
      
           [ tries to modify code, but it's RO, and fails!
             ftrace_bug() is called]
      
      When this race happens, ftrace_bug() will produces a nasty warning and
      all of the function tracing features will be disabled until reboot.
      
      The simple solution is to treate module load the same way the core
      kernel is treated at boot. To hardcode the ftrace function modification
      of converting calls to mcount into nops. This is done in init/main.c
      there's no reason it could not be done in load_module(). This gives
      a better control of the changes and doesn't tie the state of the
      module to its notifiers as much. Ftrace is special, it needs to be
      treated as such.
      
      The reason this would work, is that the ftrace_module_init() would be
      called while the module is in MODULE_STATE_UNFORMED, which is ignored
      by the set_all_module_text_ro() call.
      
      Link: http://lkml.kernel.org/r/1395637826-3312-1-git-send-email-indou.takao@jp.fujitsu.comReported-by: NTakao Indoh <indou.takao@jp.fujitsu.com>
      Acked-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: stable@vger.kernel.org # 2.6.38+
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      a949ae56
  4. 24 4月, 2014 2 次提交
  5. 17 4月, 2014 2 次提交
  6. 12 4月, 2014 1 次提交
  7. 11 4月, 2014 1 次提交
  8. 10 4月, 2014 1 次提交
  9. 09 4月, 2014 1 次提交
  10. 08 4月, 2014 1 次提交
  11. 02 4月, 2014 1 次提交
  12. 26 3月, 2014 1 次提交
    • S
      tracing: Fix traceon trigger condition to actually turn tracing on · 2c4a33ab
      Steven Rostedt (Red Hat) 提交于
      While working on my tutorial for 2014 Linux Collaboration Summit
      I found that the traceon trigger did not work when conditions were
      used. The other triggers worked fine though. Looking into it, it
      is because of the way the triggers use the ring buffer to store
      the fields it will use for the condition. But if tracing is off, nothing
      is stored in the buffer, and the tracepoint exits before calling the
      trigger to test the condition. This is fine for all the triggers that
      only work when tracing is on, but for traceon trigger that is to
      work when tracing is off, nothing happens.
      
      The fix is simple, just use a temp ring buffer to record the event
      if tracing is off and the event has a trace event conditional trigger
      enabled. The rest of the tracepoint code will work just fine, but
      the tracepoint wont be recorded in the other buffers.
      
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      2c4a33ab
  13. 24 3月, 2014 1 次提交
  14. 22 3月, 2014 1 次提交
  15. 21 3月, 2014 1 次提交
    • V
      tracing: Fix array size mismatch in format string · 87291347
      Vaibhav Nagarnaik 提交于
      In event format strings, the array size is reported in two locations.
      One in array subscript and then via the "size:" attribute. The values
      reported there have a mismatch.
      
      For e.g., in sched:sched_switch the prev_comm and next_comm character
      arrays have subscript values as [32] where as the actual field size is
      16.
      
      name: sched_switch
      ID: 301
      format:
              field:unsigned short common_type;       offset:0;       size:2; signed:0;
              field:unsigned char common_flags;       offset:2;       size:1; signed:0;
              field:unsigned char common_preempt_count;       offset:3;       size:1;signed:0;
              field:int common_pid;   offset:4;       size:4; signed:1;
      
              field:char prev_comm[32];       offset:8;       size:16;        signed:1;
              field:pid_t prev_pid;   offset:24;      size:4; signed:1;
              field:int prev_prio;    offset:28;      size:4; signed:1;
              field:long prev_state;  offset:32;      size:8; signed:1;
              field:char next_comm[32];       offset:40;      size:16;        signed:1;
              field:pid_t next_pid;   offset:56;      size:4; signed:1;
              field:int next_prio;    offset:60;      size:4; signed:1;
      
      After bisection, the following commit was blamed:
      92edca07 tracing: Use direct field, type and system names
      
      This commit removes the duplication of strings for field->name and
      field->type assuming that all the strings passed in
      __trace_define_field() are immutable. This is not true for arrays, where
      the type string is created in event_storage variable and field->type for
      all array fields points to event_storage.
      
      Use __stringify() to create a string constant for the type string.
      
      Also, get rid of event_storage and event_storage_mutex that are not
      needed anymore.
      
      also, an added benefit is that this reduces the overhead of events a bit more:
      
         text    data     bss     dec     hex filename
      8424787 2036472 1302528 11763787         b3804b vmlinux
      8420814 2036408 1302528 11759750         b37086 vmlinux.patched
      
      Link: http://lkml.kernel.org/r/1392349908-29685-1-git-send-email-vnagarnaik@google.com
      
      Cc: Laurent Chavey <chavey@google.com>
      Cc: stable@vger.kernel.org # 3.10+
      Signed-off-by: NVaibhav Nagarnaik <vnagarnaik@google.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      87291347
  16. 20 3月, 2014 1 次提交
    • S
      trace, ring-buffer: Fix CPU hotplug callback registration · d39ad278
      Srivatsa S. Bhat 提交于
      Subsystems that want to register CPU hotplug callbacks, as well as perform
      initialization for the CPUs that are already online, often do it as shown
      below:
      
      	get_online_cpus();
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	register_cpu_notifier(&foobar_cpu_notifier);
      
      	put_online_cpus();
      
      This is wrong, since it is prone to ABBA deadlocks involving the
      cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
      with CPU hotplug operations).
      
      Instead, the correct and race-free way of performing the callback
      registration is:
      
      	cpu_notifier_register_begin();
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	/* Note the use of the double underscored version of the API */
      	__register_cpu_notifier(&foobar_cpu_notifier);
      
      	cpu_notifier_register_done();
      
      Fix the tracing ring-buffer code by using this latter form of callback
      registration.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      d39ad278
  17. 19 3月, 2014 1 次提交
  18. 12 3月, 2014 2 次提交
  19. 11 3月, 2014 2 次提交
  20. 07 3月, 2014 8 次提交
  21. 06 3月, 2014 1 次提交
    • R
      blktrace: fix accounting of partially completed requests · af5040da
      Roman Pen 提交于
      trace_block_rq_complete does not take into account that request can
      be partially completed, so we can get the following incorrect output
      of blkparser:
      
        C   R 232 + 240 [0]
        C   R 240 + 232 [0]
        C   R 248 + 224 [0]
        C   R 256 + 216 [0]
      
      but should be:
      
        C   R 232 + 8 [0]
        C   R 240 + 8 [0]
        C   R 248 + 8 [0]
        C   R 256 + 8 [0]
      
      Also, the whole output summary statistics of completed requests and
      final throughput will be incorrect.
      
      This patch takes into account real completion size of the request and
      fixes wrong completion accounting.
      Signed-off-by: NRoman Pen <r.peniaev@gmail.com>
      CC: Steven Rostedt <rostedt@goodmis.org>
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      CC: Ingo Molnar <mingo@redhat.com>
      CC: linux-kernel@vger.kernel.org
      Cc: stable@kernel.org
      Signed-off-by: NJens Axboe <axboe@fb.com>
      af5040da
  22. 04 3月, 2014 1 次提交
  23. 27 2月, 2014 1 次提交
  24. 21 2月, 2014 5 次提交