1. 16 May, 2018 4 commits
    • rcu: Rename cond_resched_rcu_qs() to cond_resched_tasks_rcu_qs() · cee43939
      Paul E. McKenney committed
      Commit e31d28b6 ("trace: Eliminate cond_resched_rcu_qs() in favor
      of cond_resched()") substituted cond_resched() for the earlier call
      to cond_resched_rcu_qs().  However, the new-age cond_resched() does
      not do anything to help RCU-tasks grace periods because (1) RCU-tasks
      is only enabled when CONFIG_PREEMPT=y and (2) cond_resched() is a
      complete no-op when preemption is enabled.  This situation results
      in hangs when running the trace benchmarks.
      
      A number of potential fixes were discussed on LKML
      (https://lkml.kernel.org/r/20180224151240.0d63a059@vmware.local.home),
      including making cond_resched() not be a no-op; making cond_resched()
      not be a no-op, but only when running tracing benchmarks; reverting
      the aforementioned commit (which works because cond_resched_rcu_qs()
      does provide an RCU-tasks quiescent state); and adding a call to the
      scheduler/RCU rcu_note_voluntary_context_switch() function.  All were
      deemed unsatisfactory, either due to added cond_resched() overhead or
      due to magic functions inviting cargo culting.
      
      This commit renames cond_resched_rcu_qs() to cond_resched_tasks_rcu_qs(),
      which provides a clear hint as to what this function is doing and
      why and where it should be used, and then replaces the call to
      cond_resched() with cond_resched_tasks_rcu_qs() in the trace benchmark's
      benchmark_event_kthread() function.
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
    • rcu: Call wake_nocb_leader_defer() with 'FORCE' when nocb_q_count is high · efcd2d54
      Byungchul Park committed
      If an excessive number of callbacks have been queued, but the NOCB
      leader kthread's wakeup must be deferred, then we should wake up the
      leader unconditionally once it is safe to do so.
      
      This was handled correctly in commit fbce7497 ("rcu: Parallelize and
      economize NOCB kthread wakeups"), but then commit 8be6e1b1 ("rcu:
      Use timer as backstop for NOCB deferred wakeups") passed RCU_NOCB_WAKE
      instead of the correct RCU_NOCB_WAKE_FORCE to wake_nocb_leader_defer().
      As an interesting aside, RCU_NOCB_WAKE_FORCE is never passed to anything,
      which should have been taken as a hint.  ;-)
      
      This commit therefore passes RCU_NOCB_WAKE_FORCE instead of RCU_NOCB_WAKE
      to wake_nocb_leader_defer() when a callback is queued onto a NOCB CPU
      that already has an excessive number of callbacks pending.
      Signed-off-by: Byungchul Park <byungchul.park@lge.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
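The flag choice described in this commit can be illustrated with a small user-space sketch. It is not the kernel code: the enum values, the `QHIMARK_SKETCH` threshold, and the `choose_deferred_wake()` helper are hypothetical names introduced here, and only the idea (escalate to the "force" level once too many callbacks are pending) comes from the commit message.

```c
#include <stdbool.h>

/* Wake levels named after the constants in the commit message.
 * The numeric values are illustrative assumptions. */
enum nocb_wake_sketch {
    NOCB_WAKE_NOT_SKETCH = 0,   /* no deferred wakeup needed */
    NOCB_WAKE_SKETCH = 1,       /* ordinary deferred wakeup */
    NOCB_WAKE_FORCE_SKETCH = 2  /* wake unconditionally once it is safe */
};

#define QHIMARK_SKETCH 10000L   /* hypothetical "excessive callbacks" threshold */

/* Decide which deferred-wakeup level to request when a callback is queued
 * and the leader cannot be woken right now. */
enum nocb_wake_sketch choose_deferred_wake(long queued, bool was_empty)
{
    if (was_empty)
        return NOCB_WAKE_SKETCH;        /* first callback: normal wakeup */
    if (queued > QHIMARK_SKETCH)
        return NOCB_WAKE_FORCE_SKETCH;  /* backlog too large: force it */
    return NOCB_WAKE_NOT_SKETCH;        /* leader already knows about work */
}
```

In this sketch the bug fixed by the commit would correspond to returning the ordinary level from the over-threshold branch.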
    • rcu: Don't allocate rcu_nocb_mask if no one needs it · ef126206
      Paul E. McKenney committed
      Commit 44c65ff2 ("rcu: Eliminate NOCBs CPU-state Kconfig options")
      made allocation of rcu_nocb_mask depend only on the rcu_nocbs=,
      nohz_full=, or isolcpus= kernel boot parameters.  However, it failed
      to change the initial value of rcu_init_nohz()'s local variable
      need_rcu_nocb_mask to false, which can result in useless allocation
      of an all-zero rcu_nocb_mask.  This commit therefore fixes this bug by
      changing the initial value of need_rcu_nocb_mask to false.
      
      While we are in the area, also correct the error message that is printed
      when someone specifies that can-never-exist CPUs should be NOCBs CPUs.
      Reported-by: Byungchul Park <byungchul.park@lge.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Byungchul Park <byungchul.park@lge.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
    • rcu: Inline rcu_preempt_do_callback() into its sole caller · be01b4ca
      Byungchul Park committed
      The rcu_preempt_do_callbacks() function was introduced in commit
      09223371 ("rcu: Use softirq to address performance regression"), where it
      was necessary to handle kernel builds both containing and not containing
      RCU-preempt.  Since then, various changes (most notably f8b7fc6b
      ("rcu: use softirq instead of kthreads except when RCU_BOOST=y")) have
      resulted in this function being invoked only from rcu_kthread_do_work(),
      which is present only in kernels containing RCU-preempt, which in turn
      means that the rcu_preempt_do_callbacks() function is no longer needed.
      
      This commit therefore inlines rcu_preempt_do_callbacks() into its
      sole remaining caller and also removes the rcu_state_p and rcu_data_p
      indirection for added clarity.
      Signed-off-by: Byungchul Park <byungchul.park@lge.com>
      Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      [ paulmck: Remove the rcu_state_p and rcu_data_p indirection. ]
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
  2. 21 Feb, 2018 5 commits
    • rcu: Use wrapper for lockdep asserts · a32e01ee
      Matthew Wilcox committed
      Commits c0b334c5 and ea9b0c8a introduced new sparse warnings
      by accessing rcu_node->lock directly and ignoring the __private
      marker.  Introduce a new wrapper and use it.  Also fix a similar problem
      in srcutree.c introduced by a3883df3.
      Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Remove obsolete callback-invocation statistics for debugfs · 62df63e0
      Paul E. McKenney committed
      The debugfs interface displayed statistics on RCU callback invocation but
      this interface has since been removed.  This commit therefore removes the
      no-longer-used rcu_data structure's ->n_cbs_invoked and ->n_nocbs_invoked
      fields along with their updates.
      
      If this information proves necessary in the future, the corresponding
      event traces will be added.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Remove obsolete boost statistics for debugfs · bec06785
      Paul E. McKenney committed
      The debugfs interface displayed statistics on RCU priority boosting,
      but this interface has since been removed.  This commit therefore
      removes the no-longer-used rcu_data structure's ->n_tasks_boosted,
      ->n_exp_boosts, and ->n_normal_boosts fields and their updates.
      
      If this information proves necessary in the future, the corresponding
      event traces will be added.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Call touch_nmi_watchdog() while printing stall warnings · 3caa973b
      Tejun Heo committed
      When an RCU stall warning triggers, it can print out a lot of messages
      while holding spinlocks.  If the console device is slow (e.g. an
      actual or IPMI serial console), it may end up triggering the NMI hard
      lockup watchdog, as in the following.
      
      *** CPU printking while holding RCU spinlock
      
        PID: 4149739  TASK: ffff881a46baa880  CPU: 13  COMMAND: "CPUThreadPool8"
         #0 [ffff881fff945e48] crash_nmi_callback at ffffffff8103f7d0
         #1 [ffff881fff945e58] nmi_handle at ffffffff81020653
         #2 [ffff881fff945eb0] default_do_nmi at ffffffff81020c36
         #3 [ffff881fff945ed0] do_nmi at ffffffff81020d32
         #4 [ffff881fff945ef0] end_repeat_nmi at ffffffff81956a7e
            [exception RIP: io_serial_in+21]
            RIP: ffffffff81630e55  RSP: ffff881fff943b88  RFLAGS: 00000002
            RAX: 000000000000ca00  RBX: ffffffff8230e188  RCX: 0000000000000000
            RDX: 00000000000002fd  RSI: 0000000000000005  RDI: ffffffff8230e188
            RBP: ffff881fff943bb0   R8: 0000000000000000   R9: ffffffff820cb3c4
            R10: 0000000000000019  R11: 0000000000002000  R12: 00000000000026e1
            R13: 0000000000000020  R14: ffffffff820cd398  R15: 0000000000000035
            ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
        --- <NMI exception stack> ---
         #5 [ffff881fff943b88] io_serial_in at ffffffff81630e55
         #6 [ffff881fff943b90] wait_for_xmitr at ffffffff8163175c
         #7 [ffff881fff943bb8] serial8250_console_putchar at ffffffff816317dc
         #8 [ffff881fff943bd8] uart_console_write at ffffffff8162ac00
         #9 [ffff881fff943c08] serial8250_console_write at ffffffff81634691
        #10 [ffff881fff943c80] univ8250_console_write at ffffffff8162f7c2
        #11 [ffff881fff943c90] console_unlock at ffffffff810dfc55
        #12 [ffff881fff943cf0] vprintk_emit at ffffffff810dffb5
        #13 [ffff881fff943d50] vprintk_default at ffffffff810e01bf
        #14 [ffff881fff943d60] vprintk_func at ffffffff810e1127
        #15 [ffff881fff943d70] printk at ffffffff8119a8a4
        #16 [ffff881fff943dd0] print_cpu_stall_info at ffffffff810eb78c
        #17 [ffff881fff943e88] rcu_check_callbacks at ffffffff810ef133
        #18 [ffff881fff943ee8] update_process_times at ffffffff810f3497
        #19 [ffff881fff943f10] tick_sched_timer at ffffffff81103037
        #20 [ffff881fff943f38] __hrtimer_run_queues at ffffffff810f3f38
        #21 [ffff881fff943f88] hrtimer_interrupt at ffffffff810f442b
      
      *** CPU triggering the hardlockup watchdog
      
        PID: 4149709  TASK: ffff88010f88c380  CPU: 26  COMMAND: "CPUThreadPool35"
         #0 [ffff883fff1059d0] machine_kexec at ffffffff8104a874
         #1 [ffff883fff105a30] __crash_kexec at ffffffff811116cc
         #2 [ffff883fff105af0] __crash_kexec at ffffffff81111795
         #3 [ffff883fff105b08] panic at ffffffff8119a6ae
         #4 [ffff883fff105b98] watchdog_overflow_callback at ffffffff81135dbd
         #5 [ffff883fff105bb0] __perf_event_overflow at ffffffff81186866
         #6 [ffff883fff105be8] perf_event_overflow at ffffffff81192bc4
         #7 [ffff883fff105bf8] intel_pmu_handle_irq at ffffffff8100b265
         #8 [ffff883fff105df8] perf_event_nmi_handler at ffffffff8100489f
         #9 [ffff883fff105e58] nmi_handle at ffffffff81020653
        #10 [ffff883fff105eb0] default_do_nmi at ffffffff81020b94
        #11 [ffff883fff105ed0] do_nmi at ffffffff81020d32
        #12 [ffff883fff105ef0] end_repeat_nmi at ffffffff81956a7e
            [exception RIP: queued_spin_lock_slowpath+248]
            RIP: ffffffff810da958  RSP: ffff883fff103e68  RFLAGS: 00000046
            RAX: 0000000000000000  RBX: 0000000000000046  RCX: 00000000006d0000
            RDX: ffff883fff49a950  RSI: 0000000000d10101  RDI: ffffffff81e54300
            RBP: ffff883fff103e80   R8: ffff883fff11a950   R9: 0000000000000000
            R10: 000000000e5873ba  R11: 000000000000010f  R12: ffffffff81e54300
            R13: 0000000000000000  R14: ffff88010f88c380  R15: ffffffff81e54300
            ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
        --- <NMI exception stack> ---
        #13 [ffff883fff103e68] queued_spin_lock_slowpath at ffffffff810da958
        #14 [ffff883fff103e70] _raw_spin_lock_irqsave at ffffffff8195550b
        #15 [ffff883fff103e88] rcu_check_callbacks at ffffffff810eed18
        #16 [ffff883fff103ee8] update_process_times at ffffffff810f3497
        #17 [ffff883fff103f10] tick_sched_timer at ffffffff81103037
        #18 [ffff883fff103f38] __hrtimer_run_queues at ffffffff810f3f38
        #19 [ffff883fff103f88] hrtimer_interrupt at ffffffff810f442b
        --- <IRQ stack> ---
      
      Avoid spuriously triggering the NMI hardlockup watchdog by touching it
      from the print functions.  show_state_filter() shares the same problem
      and solution.
      
      v2: Relocate the comment to where it belongs.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
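The effect of touching the watchdog from the print path can be modeled in user space. The sketch below is a toy simulation under stated assumptions: `struct wd_sketch`, `wd_tick()`, `wd_touch()`, and `print_stall_lines()` are hypothetical names, and "ticks" stand in for the time a slow console spends emitting each line. It does not reproduce the kernel's watchdog implementation.

```c
#include <stdbool.h>

/* Toy watchdog: fires if it goes more than `threshold` ticks without a touch. */
struct wd_sketch {
    int ticks_since_touch;
    int threshold;
    bool fired;
};

/* One unit of time passes while the CPU is stuck printing. */
void wd_tick(struct wd_sketch *w)
{
    if (++w->ticks_since_touch > w->threshold)
        w->fired = true;
}

/* Analogue of touching the watchdog: restart its countdown. */
void wd_touch(struct wd_sketch *w)
{
    w->ticks_since_touch = 0;
}

/* Print `nlines` lines, each costing `ticks_per_line` ticks on a slow
 * console, optionally touching the watchdog after every line. */
void print_stall_lines(struct wd_sketch *w, int nlines, int ticks_per_line,
                       bool touch)
{
    for (int i = 0; i < nlines; i++) {
        for (int t = 0; t < ticks_per_line; t++)
            wd_tick(w);
        if (touch)
            wd_touch(w);
    }
}
```

With a threshold of 5 ticks and 3 ticks per line, ten untouched lines trip the toy watchdog, while the touched variant never accumulates more than 3 ticks.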
    • rcu: Fix CPU offload boot message when no CPUs are offloaded · 3016611e
      Paul E. McKenney committed
      In CONFIG_RCU_NOCB_CPU=y kernels, if the boot parameters indicate that
      none of the CPUs should in fact be offloaded, the following somewhat
      obtuse message appears:
      
      	Offload RCU callbacks from CPUs: .
      
      This commit therefore makes the message at least grammatically correct
      in this case:
      
      	Offload RCU callbacks from CPUs: (none)
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  3. 12 Dec, 2017 1 commit
    • rcu: Remove have_rcu_nocb_mask from tree_plugin.h · 84b12b75
      Rakib Mullick committed
      Currently have_rcu_nocb_mask is used to avoid double allocation of
      rcu_nocb_mask during boot.  Because cpumask_var_t is represented
      differently depending on the kernel configuration
      (CONFIG_CPUMASK_OFFSTACK=y or n), a separate variable was a reasonable
      way to track this.  But now we have the cpumask_available() helper,
      which can be used to check whether rcu_nocb_mask has been allocated
      without using a variable.
      
      Removing the variable also reduces vmlinux size.
      
      Unpatched version:
          text    data      bss      dec     hex filename
      13050393 7852470 14543408 35446271 21cddff vmlinux

      Patched version:
          text    data      bss      dec     hex filename
      13050390 7852438 14543408 35446236 21cdddc vmlinux
      Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
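The idea of replacing a separate "already allocated" flag with a check on the mask itself can be sketched in user-space C. Everything here is an assumption made for illustration: `mask_sketch_t` stands in for an off-stack cpumask_var_t, `mask_available_sketch()` plays the role of cpumask_available(), and `calloc()` replaces the kernel allocator.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for an off-stack mask: a pointer that is NULL until allocated. */
typedef unsigned long *mask_sketch_t;

mask_sketch_t nocb_mask_sketch;   /* zero-initialized, so "not allocated" */

/* The mask's own state answers "has it been allocated?", so no extra
 * have_..._mask boolean is needed. */
bool mask_available_sketch(mask_sketch_t m)
{
    return m != NULL;
}

/* Allocate the mask at most once; returns true if it is usable afterwards. */
bool ensure_nocb_mask_sketch(size_t nwords)
{
    if (!mask_available_sketch(nocb_mask_sketch))
        nocb_mask_sketch = calloc(nwords, sizeof(*nocb_mask_sketch));
    return mask_available_sketch(nocb_mask_sketch);
}
```

Calling `ensure_nocb_mask_sketch()` twice leaves the first allocation in place, which is the double-allocation guard the removed variable used to provide.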
  4. 29 Nov, 2017 1 commit
  5. 08 Nov, 2017 1 commit
  6. 03 Nov, 2017 1 commit
    • rcu: Convert timers to use timer_setup() · fd30b717
      Kees Cook committed
      In preparation for unconditionally passing the struct timer_list pointer to
      all timer callbacks, switch to using the new timer_setup() and from_timer()
      to pass the timer pointer explicitly.
      
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
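The pattern behind this conversion, where the callback receives the timer pointer and recovers its enclosing structure from it, can be sketched in user-space C. This is a hedged illustration rather than the kernel API: `struct timer_sketch`, `timer_setup_sketch()`, `container_of_sketch()`, `fire_sketch()`, and `struct nocb_state_sketch` are all hypothetical names.

```c
#include <stddef.h>

/* Minimal timer object: the callback takes the timer pointer itself. */
struct timer_sketch {
    void (*function)(struct timer_sketch *t);
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Example owner structure embedding a timer. */
struct nocb_state_sketch {
    int wakeups;
    struct timer_sketch timer;
};

/* Callback: no opaque data argument, the owner is derived from the timer. */
void nocb_timer_fn_sketch(struct timer_sketch *t)
{
    struct nocb_state_sketch *s =
        container_of_sketch(t, struct nocb_state_sketch, timer);
    s->wakeups++;
}

/* Register the callback on the timer. */
void timer_setup_sketch(struct timer_sketch *t,
                        void (*fn)(struct timer_sketch *))
{
    t->function = fn;
}

/* Simulate expiry by invoking the callback with the timer pointer. */
void fire_sketch(struct timer_sketch *t)
{
    t->function(t);
}
```

Passing the timer pointer keeps the callback type-safe, since the owner type is named explicitly at the point where it is recovered instead of being cast from an integer.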
  7. 27 Oct, 2017 2 commits
  8. 20 Oct, 2017 2 commits
  9. 10 Oct, 2017 2 commits
  10. 09 Sep, 2017 1 commit
  11. 17 Aug, 2017 3 commits
  12. 26 Jul, 2017 2 commits
    • rcu: Make NOCB CPUs migrate CBs directly from outgoing CPU · b1a2d79f
      Paul E. McKenney committed
      RCU's CPU-hotplug callback-migration code first moves the outgoing
      CPU's callbacks to ->orphan_done and ->orphan_pend, and only then
      moves them to the NOCB callback list.  This commit avoids the
      extra step (and simplifies the code) by moving the callbacks directly
      from the outgoing CPU's callback list to the NOCB callback list.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Use timer as backstop for NOCB deferred wakeups · 8be6e1b1
      Paul E. McKenney committed
      The handling of RCU's no-CBs CPUs has a maintenance headache, namely
      that if call_rcu() is invoked with interrupts disabled, the rcuo kthread
      wakeup must be deferred to a point where we can be sure that scheduler
      locks are not held.  Of course, there are a lot of code paths leading
      from an interrupts-disabled invocation of call_rcu(), and missing any
      one of these can result in excessive callback-invocation latency, and
      potentially even system hangs.
      
      This commit therefore uses a timer to guarantee that the wakeup will
      eventually occur.  If one of the deferred-wakeup points kicks in, then
      the timer is simply cancelled.
      
      This commit also fixes up an incomplete removal of commits that were
      intended to plug remaining exit paths, which should have the added
      benefit of reducing the overhead of RCU's context-switch hooks.  In
      addition, it simplifies leader-to-follower callback-list handoff by
      introducing locking.  The call_rcu()-to-leader handoff continues to
      use atomic operations in order to maintain good real-time latency for
      common-case use of call_rcu().
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      [ paulmck: Dan Carpenter fix for mod_timer() usage bug found by smatch. ]
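The backstop idea, where a timer guarantees the wakeup but is cancelled if a deferred-wakeup point gets there first, can be modeled as a small state machine. The sketch below is a single-threaded user-space toy; `struct deferred_wake_sketch` and its helper functions are hypothetical names, and real code would need locking or atomics around these fields.

```c
#include <stdbool.h>

struct deferred_wake_sketch {
    bool pending;      /* a wakeup was requested but had to be deferred */
    bool timer_armed;  /* the backstop timer is currently armed */
    int wakeups;       /* number of wakeups actually delivered */
};

/* Called where waking immediately is unsafe: record the request and arm
 * the backstop timer. */
void defer_wakeup_sketch(struct deferred_wake_sketch *d)
{
    if (!d->pending) {
        d->pending = true;
        d->timer_armed = true;
    }
}

/* Deliver the wakeup if one is still pending. */
static void do_wakeup_sketch(struct deferred_wake_sketch *d)
{
    if (d->pending) {
        d->pending = false;
        d->wakeups++;
    }
}

/* A safe point was reached: cancel the timer and wake now. */
void deferred_wakeup_point_sketch(struct deferred_wake_sketch *d)
{
    if (d->pending) {
        d->timer_armed = false;   /* analogue of cancelling the timer */
        do_wakeup_sketch(d);
    }
}

/* The backstop timer expired: wake unless someone already did. */
void timer_expired_sketch(struct deferred_wake_sketch *d)
{
    if (!d->timer_armed)
        return;
    d->timer_armed = false;
    do_wakeup_sketch(d);
}
```

Whichever path runs first delivers the wakeup exactly once, so a missed deferred-wakeup point costs latency (until the timer fires) rather than a hang.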
  13. 09 Jun, 2017 7 commits
    • rcu: Eliminate NOCBs CPU-state Kconfig options · 44c65ff2
      Paul E. McKenney committed
      The CONFIG_RCU_NOCB_CPU_ALL, CONFIG_RCU_NOCB_CPU_NONE, and
      CONFIG_RCU_NOCB_CPU_ZERO Kconfig options are used only in testing and
      are redundant with the rcu_nocbs= boot parameter.  This commit therefore
      removes these three Kconfig options and adjusts the rcutorture scripts
      to use the boot parameter instead.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Remove debugfs tracing · ae91aa0a
      Paul E. McKenney committed
      RCU's debugfs tracing used to be the only reasonable low-level debug
      information available, but ftrace and event tracing have since surpassed
      the RCU debugfs level of usefulness.  This commit therefore removes
      RCU's debugfs tracing.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Remove the now-obsolete PROVE_RCU_REPEATEDLY Kconfig option · c4a09ff7
      Paul E. McKenney committed
      The PROVE_RCU_REPEATEDLY Kconfig option was initially added due to
      the volume of messages from PROVE_RCU: Doing just one per boot would
      have required excessive numbers of boots to locate them all.  However,
      PROVE_RCU messages are now relatively rare, so there is no longer any
      reason to need more than one such message per boot.  This commit therefore
      removes the PROVE_RCU_REPEATEDLY Kconfig option.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
    • rcu: Remove nohz_full full-system-idle state machine · fe5ac724
      Paul E. McKenney committed
      The NO_HZ_FULL_SYSIDLE full-system-idle capability was added in 2013
      by commit 0edd1b17 ("nohz_full: Add full-system-idle state machine"),
      but has not been used.  This commit therefore removes it.
      
      If it turns out to be needed later, this commit can always be reverted.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rcu: Remove *_SLOW_* Kconfig options · 90040c9e
      Paul E. McKenney committed
      The RCU_TORTURE_TEST_SLOW_PREINIT, RCU_TORTURE_TEST_SLOW_PREINIT_DELAY,
      RCU_TORTURE_TEST_SLOW_INIT,
      RCU_TORTURE_TEST_SLOW_INIT_DELAY, RCU_TORTURE_TEST_SLOW_CLEANUP,
      and RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY Kconfig options are only
      useful for torture testing, and there are the rcutree.gp_cleanup_delay,
      rcutree.gp_init_delay, and rcutree.gp_preinit_delay kernel boot parameters
      that rcutorture can use instead.  The effect of these parameters is to
      artificially slow down grace period initialization and cleanup in order
      to make some types of race conditions happen more often.
      
      This commit therefore simplifies Tree RCU a bit by removing the Kconfig
      options and adding the corresponding kernel parameters to rcutorture's
      .boot files instead.  However, this commit also leaves out the kernel
      parameters for TREE02, TREE04, and TREE07 in order to have about the
      same number of tests slowed as not slowed.  TREE01, TREE03, TREE05,
      and TREE06 are slowed, and the rest are not slowed.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Move docbook comments out of rcupdate.h · a68a2bb2
      Paul E. McKenney committed
      The include/linux/rcupdate.h file is included by more than 200
      files, so shrinking it should provide some build-time benefits.
      This commit therefore moves several docbook comments from rcupdate.h to
      kernel/rcu/update.c, kernel/rcu/tree.c, and kernel/rcu/tree_plugin.h, thus
      reducing the number of times that the compiler has to scan these comments.
      This likely provides only a small benefit, but every little bit helps.
      
      This commit also fixes a malformed bulleted list noted by the 0day
      Test Robot.
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Add memory barriers for NOCB leader wakeup · 6b5fc3a1
      Paul E. McKenney committed
      Wait/wakeup operations do not guarantee ordering on their own.  Instead,
      either locking or memory barriers are required.  This commit therefore
      adds memory barriers to wake_nocb_leader() and nocb_leader_wait().
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Krister Johansen <kjlx@templeofstupid.com>
      Cc: <stable@vger.kernel.org> # 4.6.x
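The ordering requirement stated in this commit, that a wakeup alone does not order the waker's earlier stores against the waiter's later loads, can be illustrated in user space. The sketch below swaps in C11 release/acquire atomics as an analogue of the kernel's memory barriers; it is not the code added to wake_nocb_leader() or nocb_leader_wait(), and `struct handoff_sketch`, `waker_publish_sketch()`, and `waiter_poll_sketch()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct handoff_sketch {
    int payload;         /* plain data written before the wakeup */
    atomic_bool ready;   /* wake condition checked by the waiter */
};

/* Waker side: write the data, then publish the condition with release
 * ordering so the data store cannot be reordered after the flag store. */
void waker_publish_sketch(struct handoff_sketch *h, int value)
{
    h->payload = value;
    atomic_store_explicit(&h->ready, true, memory_order_release);
}

/* Waiter side: check the condition with acquire ordering; only if it is
 * set is the payload guaranteed to be visible. */
bool waiter_poll_sketch(struct handoff_sketch *h, int *out)
{
    if (!atomic_load_explicit(&h->ready, memory_order_acquire))
        return false;
    *out = h->payload;
    return true;
}
```

Without the release/acquire pairing (or an equivalent lock), a waiter on a weakly ordered machine could observe the flag set and still read a stale payload.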
  14. 08 Jun, 2017 6 commits
  15. 03 May, 2017 2 commits