  1. 09 Jun, 2017 (2 commits)
  2. 19 Apr, 2017 (2 commits)
  3. 15 Jan, 2017 (1 commit)
    • rcu: Narrow early boot window of illegal synchronous grace periods · 52d7e48b
      Authored by Paul E. McKenney
      The current preemptible RCU implementation goes through three phases
      during bootup.  In the first phase, there is only one CPU that is running
      with preemption disabled, so that a no-op is a synchronous grace period.
      In the second mid-boot phase, the scheduler is running, but RCU has
      not yet gotten its kthreads spawned (and, for expedited grace periods,
      workqueues are not yet running).  During this time, any attempt to do
      a synchronous grace period will hang the system (or complain bitterly,
      depending).  In the third and final phase, RCU is fully operational and
      everything works normally.
      
      This has been OK for some time, but some synchronous grace periods
      have recently been showing up during the second mid-boot phase.
      This code worked "by accident" for a while, but started failing as soon
      as expedited RCU grace periods switched over to workqueues in commit
      8b355e3b ("rcu: Drive expedited grace periods from workqueue").
      Note that the code was buggy even before this commit, as it was subject
      to failure on real-time systems that forced all expedited grace periods
      to run as normal grace periods (for example, using the rcu_normal ksysfs
      parameter).  The callchain from the failure case is as follows:
      
      early_amd_iommu_init()
      |-> acpi_put_table(ivrs_base);
      |-> acpi_tb_put_table(table_desc);
      |-> acpi_tb_invalidate_table(table_desc);
      |-> acpi_tb_release_table(...)
      |-> acpi_os_unmap_memory
      |-> acpi_os_unmap_iomem
      |-> acpi_os_map_cleanup
      |-> synchronize_rcu_expedited
      
      The kernel showing this callchain was built with CONFIG_PREEMPT_RCU=y,
      which caused the code to try using workqueues before they were
      initialized, which did not go well.
      
      This commit therefore reworks RCU to permit synchronous grace periods
      to proceed during this mid-boot phase.  This commit is therefore a
      fix for a regression introduced in v4.9, and is being put forward
      post-merge-window in v4.10.
      
      This commit sets a flag from the existing rcu_scheduler_starting()
      function which causes all synchronous grace periods to take the expedited
      path.  The expedited path now checks this flag, using the requesting task
      to drive the expedited grace period forward during the mid-boot phase.
      Finally, this flag is updated by a core_initcall() function named
      rcu_exp_runtime_mode(), which causes the runtime codepaths to be used.
      
      Note that this arrangement assumes that tasks are not sent POSIX signals
      (or anything similar) from the time that the first task is spawned
      through core_initcall() time.
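
      As a rough illustration, here is a minimal sketch of the three-phase
      gating described above.  rcu_scheduler_starting() and
      rcu_exp_runtime_mode() are named in this commit message; the tri-state
      values and the synchronize_rcu() dispatch shape are assumptions for
      illustration, not the verbatim patch:

          #include <linux/init.h>

          #define RCU_SCHEDULER_INACTIVE 0  /* phase 1: one task, no preemption */
          #define RCU_SCHEDULER_INIT     1  /* phase 2: scheduler up, no RCU kthreads */
          #define RCU_SCHEDULER_RUNNING  2  /* phase 3: fully operational */

          static int rcu_scheduler_active = RCU_SCHEDULER_INACTIVE;

          void rcu_scheduler_starting(void)
          {
                  rcu_scheduler_active = RCU_SCHEDULER_INIT;  /* enter mid-boot phase */
          }

          /* Runs at core_initcall() time, switching to the runtime code paths. */
          static int __init rcu_exp_runtime_mode(void)
          {
                  rcu_scheduler_active = RCU_SCHEDULER_RUNNING;
                  return 0;
          }
          core_initcall(rcu_exp_runtime_mode);

          void synchronize_rcu(void)
          {
                  if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE)
                          return;                      /* phase 1: a no-op is a GP */
                  if (rcu_scheduler_active == RCU_SCHEDULER_INIT) {
                          synchronize_rcu_expedited(); /* phase 2: caller drives the GP */
                          return;
                  }
                  /* phase 3: normal kthread/workqueue-driven grace period */
          }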
      
      Fixes: 8b355e3b ("rcu: Drive expedited grace periods from workqueue")
      Reported-by: "Zheng, Lv" <lv.zheng@intel.com>
      Reported-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Stan Kain <stan.kain@gmail.com>
      Tested-by: Ivan <waffolz@hotmail.com>
      Tested-by: Emanuel Castelo <emanuel.castelo@gmail.com>
      Tested-by: Bruno Pesavento <bpesavento@infinito.it>
      Tested-by: Borislav Petkov <bp@suse.de>
      Tested-by: Frederic Bezies <fredbezies@gmail.com>
      Cc: <stable@vger.kernel.org> # 4.9.0-
  4. 24 Feb, 2016 (1 commit)
    • rcu: Make rcu/tiny_plugin.h explicitly non-modular · 9fc9204e
      Authored by Paul Gortmaker
      The Kconfig currently controlling compilation of this code is:
      
      init/Kconfig:config TINY_RCU
      init/Kconfig:   bool
      
      ...meaning that it currently is not being built as a module by anyone.
      
      Let's remove the modular code that is essentially orphaned, so that
      when reading the code there is no doubt it is builtin-only.
      
      Since module_init translates to device_initcall in the non-modular
      case, the init ordering remains unchanged with this commit.  We could
      consider moving this to an earlier initcall (subsys?) if desired.
      
      We also delete the MODULE_LICENSE tag etc. since all that information
      is already contained at the top of the file in the comments.
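
      A before/after sketch of the kind of change involved; the init
      function name rcutiny_trace_init is an assumption for illustration:

          #include <linux/init.h>
          #include <linux/module.h>

          /* Before: modular boilerplate in code that TINY_RCU (a bool
           * Kconfig option) can only ever build in. */
          module_init(rcutiny_trace_init);
          MODULE_LICENSE("GPL");

          /* After: module_init() maps to device_initcall() for built-in
           * code, so calling it directly keeps the init ordering unchanged. */
          device_initcall(rcutiny_trace_init);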
      
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  5. 28 May, 2015 (1 commit)
  6. 16 Jan, 2015 (1 commit)
    • rcu: Fix RCU CPU stall detection in tiny implementation · ec1fe396
      Authored by Miroslav Benes
      The tiny RCU CPU stall detection depends on *rcp->curtail not being
      NULL. It is however a tail pointer and thus NULL by definition. Instead we
      should check rcp->rcucblist for the presence of pending callbacks which
      need to be processed.  With this fix, the INFO message about the stall
      is printed and jiffies_stall (the time of the next stall check, in
      jiffies) is correctly updated.
      
      Note that the check for pending callbacks is necessary to avoid
      spurious warnings if there are no pending callbacks.
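
      A sketch of the corrected check, assuming the rcu_ctrlblk fields
      named above and an illustrative re-arm interval; not the verbatim
      patch:

          #include <linux/jiffies.h>

          static void check_cpu_stall(struct rcu_ctrlblk *rcp)
          {
                  unsigned long j = jiffies;

                  /* *rcp->curtail is NULL by definition (tail pointer), so
                   * test the head of the callback list for pending work. */
                  if (rcp->rcucblist && time_after(j, rcp->jiffies_stall)) {
                          pr_info("INFO: rcu stall detected\n");
                          rcp->jiffies_stall = j + 3 * HZ;  /* illustrative re-arm */
                  }
          }
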
      Signed-off-by: Miroslav Benes <mbenes@suse.cz>
      [ paulmck: Fused identical "if" statements, ported to -rcu. ]
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  7. 07 Jan, 2015 (1 commit)
    • tiny_rcu: Directly force QS when call_rcu_[bh|sched]() on idle_task · 5f6130fa
      Authored by Lai Jiangshan
      For RCU in UP, context-switch = QS = GP, thus we can force a
      context switch whenever call_rcu_[bh|sched]() happens on the idle_task.
      After doing so, rcu_idle/irq_enter/exit() are useless, so we can simply
      make these functions empty.  (A sketch of the idea follows the size
      figures below.)

      More importantly, this change does not change the functionality logically.
      Note: raise_softirq(RCU_SOFTIRQ)/rcu_sched_qs() in rcu_idle_enter() and
      the outermost rcu_irq_exit() will have to wake up the ksoftirqd
      (because in_interrupt() == 0).
      
      Before this patch:                After this patch:
      call_rcu_sched() in idle          call_rcu_sched() in idle
                                          set resched
      do other stuff                    do other stuff
      outermost rcu_irq_exit()          outermost rcu_irq_exit() (empty function)
        (or rcu_idle_enter())             (or rcu_idle_enter(), also empty function)
                                        start to resched (see above)
        rcu_sched_qs()                    rcu_sched_qs()
          QS, and GP, and advance cb        QS, and GP, and advance cb
          wake up the ksoftirqd             wake up the ksoftirqd
            set resched
      resched to ksoftirqd (or other)   resched to ksoftirqd (or other)

      These two code paths are almost the same.
      
      Size change after the patch:
      
      size kernel/rcu/tiny-old.o kernel/rcu/tiny-patched.o
         text	   data	    bss	    dec	    hex	filename
         3449	    206	      8	   3663	    e4f	kernel/rcu/tiny-old.o
         2406	    144	      8	   2558	    9fe	kernel/rcu/tiny-patched.o
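
      A sketch of the forced quiescent state (the enqueue details are
      elided, and the use of resched_cpu() is an assumption for
      illustration):

          #include <linux/sched.h>

          static void __call_rcu(struct rcu_head *head,
                                 void (*func)(struct rcu_head *),
                                 struct rcu_ctrlblk *rcp)
          {
                  /* ... link head onto rcp's callback list ... */

                  if (unlikely(is_idle_task(current))) {
                          /* On UP, context-switch = QS = GP: force a context
                           * switch so the callback is not deferred forever. */
                          resched_cpu(0);  /* the sole CPU on UP */
                  }
          }
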
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  8. 29 Apr, 2014 (1 commit)
  9. 18 Feb, 2014 (1 commit)
  10. 16 Oct, 2013 (1 commit)
  11. 30 Jul, 2013 (1 commit)
    • rcu: Add const annotation to char * for RCU tracepoints and functions · e66c33d5
      Authored by Steven Rostedt (Red Hat)
      All the RCU tracepoints and functions that reference char pointers do
      so with just 'char *' even though they do not modify the contents of
      the string itself. This will cause warnings if a const char * is used
      in one of these functions.
      
      The RCU tracepoints store the pointer to the string to refer back to it
      when the trace output is displayed.  As this can be minutes, hours, or
      even days later, those strings had better be constant.
      
      This change also opens the door to allow the RCU tracepoint strings and
      their addresses to be exported so that userspace tracing tools can
      translate the contents of the pointers of the RCU tracepoints.
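
      A representative before/after signature; trace_rcu_utilization() is
      one of the affected tracepoints, though the exact prototype here is
      illustrative:

          /* Before: passing a string literal or const char * warns. */
          void trace_rcu_utilization(char *s);

          /* After: the string is only stored and read back later for
           * display, never modified, so const is the honest type. */
          void trace_rcu_utilization(const char *s);
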
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  12. 11 Jun, 2013 (9 commits)
  13. 29 Jan, 2013 (1 commit)
  14. 24 Oct, 2012 (1 commit)
  15. 23 Sep, 2012 (1 commit)
  16. 06 Jul, 2012 (1 commit)
  17. 03 Jul, 2012 (2 commits)
  18. 03 May, 2012 (1 commit)
    • rcu: Make exit_rcu() more precise and consolidate · 9dd8fb16
      Authored by Paul E. McKenney
      When running preemptible RCU, if a task exits in an RCU read-side
      critical section having blocked within that same RCU read-side critical
      section, the task must be removed from the list of tasks blocking a
      grace period (perhaps the current grace period, perhaps the next grace
      period, depending on timing).  The exit() path invokes exit_rcu() to
      do this cleanup.
      
      However, the current implementation of exit_rcu() needlessly does the
      cleanup even if the task did not block within the current RCU read-side
      critical section, which wastes time and needlessly increases the size
      of the state space.  Fix this by only doing the cleanup if the current
      task is actually on the list of tasks blocking some grace period.
      
      While we are at it, consolidate the two identical exit_rcu() functions
      into a single function.
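
      A sketch of the consolidated exit_rcu(), using the task_struct
      fields that preemptible RCU maintains; treat it as an approximation
      rather than the verbatim patch:

          void exit_rcu(void)
          {
                  struct task_struct *t = current;

                  /* Fast path: the task never blocked inside an RCU
                   * read-side critical section, so it is not on any
                   * blocked-tasks list and no cleanup is needed. */
                  if (likely(list_empty(&t->rcu_node_entry)))
                          return;

                  /* Pretend to be in a critical section that blocked, and
                   * let the normal unlock path perform the dequeueing. */
                  t->rcu_read_lock_nesting = 1;
                  barrier();
                  t->rcu_read_unlock_special = RCU_READ_UNLOCK_BLOCKED;
                  __rcu_read_unlock();
          }
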
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
      
      Conflicts:
      
      	kernel/rcupdate.c
  19. 22 Feb, 2012 (6 commits)
    • rcu: Simplify unboosting checks · 1aa03f11
      Authored by Paul E. McKenney
      This is a port of commit #82e78d80 from TREE_PREEMPT_RCU to
      TINY_PREEMPT_RCU.
      
      This commit uses the fact that current->rcu_boost_mutex is set
      any time that the RCU_READ_UNLOCK_BOOSTED flag is set in the
      current->rcu_read_unlock_special bitmask.  This allows tests of
      the bit to be changed to tests of the pointer, which in turn allows
      the RCU_READ_UNLOCK_BOOSTED flag to be eliminated.
      
      Please note that the check of current->rcu_read_unlock_special need not
      change because any time that RCU_READ_UNLOCK_BOOSTED was set, so was
      RCU_READ_UNLOCK_BLOCKED.  Therefore, __rcu_read_unlock() can continue
      testing current->rcu_read_unlock_special for non-zero, as before.
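
      A sketch of the changed test inside rcu_read_unlock_special(),
      assuming the field names given above; not the verbatim patch:

          /* Before: a dedicated flag bit recorded the boosted state. */
          if (special & RCU_READ_UNLOCK_BOOSTED) {
                  rt_mutex_unlock(t->rcu_boost_mutex);
                  t->rcu_boost_mutex = NULL;
          }

          /* After: ->rcu_boost_mutex is non-NULL exactly when boosted,
           * so the pointer is the flag and the bit can be eliminated. */
          if (t->rcu_boost_mutex) {
                  struct rt_mutex *rbmp = t->rcu_boost_mutex;

                  t->rcu_boost_mutex = NULL;
                  rt_mutex_unlock(rbmp);
          }
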
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Inform RCU of irq_exit() activity · 8762705a
      Authored by Paul E. McKenney
      This is a port to TINY_RCU of Peter Zijlstra's commit #ec433f0c.
      
      The rcu_read_unlock_special() function relies on in_irq() to exclude
      scheduler activity from interrupt level.  This fails because irq_exit()
      can invoke the scheduler after clearing the preempt_count() bits that
      in_irq() uses to determine that it is at interrupt level.  This situation
      can result in failures as follows:
      
           $task			IRQ		SoftIRQ
      
           rcu_read_lock()
      
           /* do stuff */
      
           <preempt> |= UNLOCK_BLOCKED
      
           rcu_read_unlock()
             --t->rcu_read_lock_nesting
      
          			irq_enter();
          			/* do stuff, don't use RCU */
          			irq_exit();
          			  sub_preempt_count(IRQ_EXIT_OFFSET);
          			  invoke_softirq()
      
          					ttwu();
          					  spin_lock_irq(&pi->lock)
          					  rcu_read_lock();
          					  /* do stuff */
          					  rcu_read_unlock();
          					    rcu_read_unlock_special()
          					      rcu_report_exp_rnp()
          					        ttwu()
          					          spin_lock_irq(&pi->lock) /* deadlock */
      
             rcu_read_unlock_special(t);
      
      This can be triggered 'easily' because invoke_softirq() immediately does
      a ttwu() of ksoftirqd/# instead of doing the in-place softirq stuff first,
      but even without that the above happens.
      
      Cure this by also excluding softirqs from the rcu_read_unlock_special()
      handler and ensuring the force_irqthreads ksoftirqd/# wakeup is done
      from full softirq context.
      
      It is also necessary to delay the ->rcu_read_lock_nesting decrement until
      after rcu_read_unlock_special().  This delay is handled by the commit
      "Protect __rcu_read_unlock() against scheduler-using irq handlers".
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Prevent RCU callbacks from executing before scheduler initialized · 768dfffd
      Authored by Paul E. McKenney
      This is a port of commit #b0d30417 from TREE_RCU to TINY_PREEMPT_RCU.
      
      Under some rare but real combinations of configuration parameters, RCU
      callbacks are posted during early boot that use kernel facilities that are
      not yet initialized.  Therefore, when these callbacks are invoked, hard
      hangs and crashes ensue.  This commit therefore prevents RCU callbacks
      from being invoked until after the scheduler is fully up and running,
      as in after multiple tasks have been spawned.
      
      It might well turn out that a better approach is to identify the specific
      RCU callbacks that are causing this problem, but that discussion will
      wait until such time as someone really needs an RCU callback to be invoked
      (as opposed to merely registered) during early boot.
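
      A sketch of the gating, reusing the rcu_scheduler_fully_active name
      from the TREE_RCU original; the initcall placement is an assumption:

          #include <linux/interrupt.h>

          static int rcu_scheduler_fully_active;

          static void invoke_rcu_callbacks(void)
          {
                  /* Too early: callbacks could use uninitialized facilities. */
                  if (unlikely(!rcu_scheduler_fully_active))
                          return;
                  raise_softirq(RCU_SOFTIRQ);  /* normal runtime path */
          }

          /* Flipped once the scheduler can really run multiple tasks. */
          static int __init rcu_scheduler_really_started(void)
          {
                  rcu_scheduler_fully_active = 1;
                  return 0;
          }
          early_initcall(rcu_scheduler_really_started);
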
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Streamline code produced by __rcu_read_unlock() · afef2054
      Authored by Paul E. McKenney
      This is a port of commit #be0e1e21 to TINY_PREEMPT_RCU.  This uses
      noinline to prevent rcu_read_unlock_special() from being inlined into
      __rcu_read_unlock().
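
      The essence of the change, sketched; only the attribute matters:

          /* Out of line: this slow path is rarely executed, and inlining
           * it would bloat every __rcu_read_unlock() call site. */
          static noinline void rcu_read_unlock_special(struct task_struct *t)
          {
                  /* ... blocked-reader cleanup, deboosting, and so on ... */
          }
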
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Protect __rcu_read_unlock() against scheduler-using irq handlers · 26861faf
      Authored by Paul E. McKenney
      This commit ports commit #10f39bb1 (rcu: protect __rcu_read_unlock()
      against scheduler-using irq handlers) from TREE_PREEMPT_RCU to
      TINY_PREEMPT_RCU.  The following is a corresponding port of that
      commit message.
      
      The addition of RCU read-side critical sections within runqueue and
      priority-inheritance critical sections introduced some deadlocks,
      for example, involving interrupts from __rcu_read_unlock() where the
      interrupt handlers call wake_up().  This situation can cause the
      instance of __rcu_read_unlock() invoked from interrupt to do some
      of the processing that would otherwise have been carried out by the
      task-level instance of __rcu_read_unlock().  When the interrupt-level
      instance of __rcu_read_unlock() is called with a scheduler lock held from
      interrupt-entry/exit situations where in_irq() returns false, deadlock can
      result.  Of course, in a UP kernel, there are not really any deadlocks,
      but the upper-level critical section can still be fatally confused
      by the lower-level critical section changing things out from under it.
      
      This commit resolves these deadlocks by using negative values of the
      per-task ->rcu_read_lock_nesting counter to indicate that an instance of
      __rcu_read_unlock() is in flight, which in turn prevents instances from
      interrupt handlers from doing any special processing.  Note that nested
      rcu_read_lock()/rcu_read_unlock() pairs are still permitted, but they will
      never see ->rcu_read_lock_nesting go to zero, and will therefore never
      invoke rcu_read_unlock_special(), thus preventing them from seeing the
      RCU_READ_UNLOCK_BLOCKED bit should it be set in ->rcu_read_unlock_special.
      This patch also adds a check for ->rcu_read_lock_nesting being negative
      in rcu_check_callbacks(), thus preventing the RCU_READ_UNLOCK_NEED_QS
      bit from being set should a scheduling-clock interrupt occur while
      __rcu_read_unlock() is exiting from an outermost RCU read-side critical
      section.
      
      Of course, __rcu_read_unlock() can be preempted during the time that
      ->rcu_read_lock_nesting is negative.  This could result in the setting
      of the RCU_READ_UNLOCK_BLOCKED bit after __rcu_read_unlock() checks it,
      and would also result in this task being queued on the corresponding
      rcu_node structure's blkd_tasks list.  Therefore, some later RCU read-side
      critical section would enter rcu_read_unlock_special() to clean up --
      which could result in deadlock (OK, OK, fatal confusion) if that RCU
      read-side critical section happened to be in the scheduler where the
      runqueue or priority-inheritance locks were held.
      
      To prevent the possibility of fatal confusion that might result from
      preemption during the time that ->rcu_read_lock_nesting is negative,
      this commit also makes rcu_preempt_note_context_switch() check for
      negative ->rcu_read_lock_nesting, thus refraining from queuing the task
      (and from setting RCU_READ_UNLOCK_BLOCKED) if we are already exiting
      from the outermost RCU read-side critical section (in other words,
      we really are no longer actually in that RCU read-side critical
      section).  In addition, rcu_preempt_note_context_switch() invokes
      rcu_read_unlock_special() to carry out the cleanup in this case, which
      clears out the ->rcu_read_unlock_special bits and dequeues the task
      (if necessary), in turn avoiding needless delay of the current RCU grace
      period and needless RCU priority boosting.
      
      It is still illegal to call rcu_read_unlock() while holding a scheduler
      lock if the prior RCU read-side critical section has ever had both
      preemption and irqs enabled.  However, the common use case is legal,
      namely where the entire RCU read-side critical section executes with
      irqs disabled, for example, when the scheduler lock is held across the
      entire lifetime of the RCU read-side critical section.
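
      A sketch of the negative-value trick in __rcu_read_unlock(),
      assuming INT_MIN as the in-flight marker; not the verbatim patch:

          void __rcu_read_unlock(void)
          {
                  struct task_struct *t = current;

                  if (t->rcu_read_lock_nesting != 1) {
                          /* Nested: never reaches zero, so no special work. */
                          --t->rcu_read_lock_nesting;
                  } else {
                          /* Mark an outermost unlock as in flight so that
                           * interrupt handlers refrain from special work. */
                          t->rcu_read_lock_nesting = INT_MIN;
                          barrier();
                          if (unlikely(t->rcu_read_unlock_special))
                                  rcu_read_unlock_special(t);
                          barrier();
                          t->rcu_read_lock_nesting = 0;  /* really outside now */
                  }
          }
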
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Add lockdep-RCU checks for simple self-deadlock · fe15d706
      Authored by Paul E. McKenney
      It is illegal to have a grace period within a same-flavor RCU read-side
      critical section, so this commit adds lockdep-RCU checks to splat when
      such abuse is encountered.  This commit does not detect more elaborate
      RCU deadlock situations.  These situations might be a job for lockdep
      enhancements.
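
      A sketch of the added assertion, using the rcu_lockdep_assert()
      interface of that era and the existing rcu_lock_map lockdep class:

          void synchronize_rcu(void)
          {
                  rcu_lockdep_assert(!lock_is_held(&rcu_lock_map),
                                     "Illegal synchronize_rcu() in RCU read-side critical section");
                  /* ... wait for a grace period to elapse ... */
          }
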
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  20. 12 Dec, 2011 (2 commits)
  21. 31 Oct, 2011 (1 commit)
  22. 29 Sep, 2011 (2 commits)