1. 09 Jun, 2017 2 commits
  2. 21 Apr, 2017 1 commit
    • P
      rcu: Make non-preemptive schedule be Tasks RCU quiescent state · bcbfdd01
      Paul E. McKenney committed
      Currently, a call to schedule() acts as a Tasks RCU quiescent state
      only if a context switch actually takes place.  However, just the
      call to schedule() guarantees that the calling task has moved off of
      whatever tracing trampoline that it might have been on previously.
      This commit therefore plumbs schedule()'s "preempt" parameter into
      rcu_note_context_switch(), which then records the Tasks RCU quiescent
      state, but only if this call to schedule() was -not- due to a preemption.
      
      To avoid adding overhead to the common-case context-switch path,
      this commit hides the rcu_note_context_switch() check under an existing
      non-common-case check.
      Suggested-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      bcbfdd01
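The rule described in this commit can be sketched as a small userspace model. This is only an illustration, not the kernel's code: `struct task_model`, its `rcu_tasks_holdout` field, and `model_rcu_note_context_switch()` are hypothetical stand-ins, and the real hook does considerably more work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-task Tasks RCU state. */
struct task_model {
	bool rcu_tasks_holdout;	/* true while a grace period still waits on this task */
};

/*
 * Model of the hook called from schedule().  A voluntary call
 * (preempt == false) shows that the task has left any trampoline, so it
 * stops being a holdout.  A preemption proves nothing, because the task
 * may have been preempted while still executing on a trampoline.
 */
static void model_rcu_note_context_switch(struct task_model *t, bool preempt)
{
	assert(t != NULL);
	if (!preempt)
		t->rcu_tasks_holdout = false;
}
```

In this model a preempted task stays a holdout, while any voluntary call clears the flag whether or not a context switch follows.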
  3. 19 Apr, 2017 2 commits
    • P
      srcu: Allow SRCU to access rcu_scheduler_active · 900b1028
      Paul E. McKenney committed
      This is primarily a code-movement commit in preparation for allowing
      SRCU to handle early-boot SRCU grace periods.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      900b1028
    • P
      rcu: Maintain special bits at bottom of ->dynticks counter · b8c17e66
      Paul E. McKenney committed
      Currently, IPIs are used to force other CPUs to invalidate their TLBs
      in response to a kernel virtual-memory mapping change.  This works, but
      degrades both battery lifetime (for idle CPUs) and real-time response
      (for nohz_full CPUs), and in addition results in unnecessary IPIs due to
      the fact that CPUs executing in usermode are unaffected by stale kernel
      mappings.  It would be better to cause a CPU executing in usermode to
      wait until it is entering kernel mode to do the flush, first to avoid
      interrupting usermode tasks and second to handle multiple flush requests
      with a single flush in the case of a long-running user task.
      
      This commit therefore reserves a bit at the bottom of the ->dynticks
      counter, which is checked upon exit from extended quiescent states.
      If it is set, it is cleared and then a new rcu_eqs_special_exit() macro is
      invoked, which, if not supplied, is an empty single-pass do-while loop.
      If this bottom bit is set on -entry- to an extended quiescent state,
      then a WARN_ON_ONCE() triggers.
      
      This bottom bit may be set using a new rcu_eqs_special_set() function,
      which returns true if the bit was set, or false if the CPU turned
      out to not be in an extended quiescent state.  Please note that this
      function refuses to set the bit for a non-nohz_full CPU when that CPU
      is executing in usermode because usermode execution is tracked by RCU
      as a dyntick-idle extended quiescent state only for nohz_full CPUs.
      Reported-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      b8c17e66
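The reserved-bit scheme can be modeled in userspace with C11 atomics. The sketch below is a simplified model, not the kernel implementation: the `model_*` names, `struct cpu_model`, and the two `DYNTICK_*` constants are assumptions made for illustration, plain `assert()` stands in for WARN_ON_ONCE(), and memory-ordering details are left to the default sequentially consistent atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DYNTICK_SPECIAL_BIT	0x1				/* reserved bottom bit */
#define DYNTICK_CTR		(DYNTICK_SPECIAL_BIT + 1)	/* counter increment */

struct cpu_model {
	atomic_int dynticks;	/* (value & DYNTICK_CTR) == 0 means "in EQS" */
	int special_exits;	/* counts simulated rcu_eqs_special_exit() calls */
};

static void model_cpu_init(struct cpu_model *c)
{
	atomic_init(&c->dynticks, DYNTICK_CTR);	/* start outside any EQS */
	c->special_exits = 0;
}

/* Enter an extended quiescent state; the special bit must not be set. */
static void model_eqs_enter(struct cpu_model *c)
{
	int seq = atomic_fetch_add(&c->dynticks, DYNTICK_CTR) + DYNTICK_CTR;

	assert(!(seq & DYNTICK_CTR));		/* now in EQS */
	assert(!(seq & DYNTICK_SPECIAL_BIT));	/* stands in for WARN_ON_ONCE() */
}

/* Leave the EQS; if deferred work was requested, clear the bit and run it. */
static void model_eqs_exit(struct cpu_model *c)
{
	int seq = atomic_fetch_add(&c->dynticks, DYNTICK_CTR) + DYNTICK_CTR;

	assert(seq & DYNTICK_CTR);		/* no longer in EQS */
	if (seq & DYNTICK_SPECIAL_BIT) {
		atomic_fetch_and(&c->dynticks, ~DYNTICK_SPECIAL_BIT);
		c->special_exits++;		/* e.g. the deferred TLB flush */
	}
}

/* Request deferred work; succeeds only if the CPU is currently in an EQS. */
static bool model_eqs_special_set(struct cpu_model *c)
{
	int old = atomic_load(&c->dynticks);

	do {
		if (old & DYNTICK_CTR)
			return false;		/* CPU is running kernel code */
	} while (!atomic_compare_exchange_weak(&c->dynticks, &old,
					       old | DYNTICK_SPECIAL_BIT));
	return true;
}
```

Because the counter advances in steps of two, the bottom bit is free to carry the request. Several requests made during one quiescent period collapse into a single action on exit, which is the batching effect the commit message describes.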
  4. 02 Mar, 2017 1 commit
    • I
      rcu: Separate the RCU synchronization types and APIs into <linux/rcupdate_wait.h> · f9411ebe
      Ingo Molnar committed
      So rcupdate.h is a pretty complex header, in particular it includes
      <linux/completion.h> which includes <linux/wait.h> - creating a
      dependency that includes <linux/wait.h> in <linux/sched.h>,
      which prevents the isolation of <linux/sched.h> from the derived
      <linux/wait.h> header.
      
      Solve part of the problem by decoupling rcupdate.h from completions:
      this can be done by separating out the rcu_synchronize types and APIs,
      and updating their usage sites.
      
      Since these are mostly RCU-internal types, this will not just simplify
      <linux/sched.h>'s dependencies, but will make all the hundreds of
      .c files that include rcupdate.h but not completions or wait.h build
      faster.
      
      ( For rcutiny this means that two dependent APIs have to be uninlined,
        but that shouldn't be much of a problem as they are rare variants. )
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f9411ebe
  5. 24 Jan, 2017 1 commit
  6. 15 Jul, 2016 1 commit
  7. 01 Apr, 2016 1 commit
  8. 08 Dec, 2015 1 commit
    • P
      rcu: Don't redundantly disable irqs in rcu_irq_{enter,exit}() · 7c9906ca
      Paul E. McKenney committed
      This commit replaces a local_irq_save()/local_irq_restore() pair with
      a lockdep assertion that interrupts are already disabled.  This should
      remove the corresponding overhead from the interrupt entry/exit fastpaths.
      
      This change was inspired by the fact that Iftekhar Ahmed's mutation
      testing showed that removing rcu_irq_enter()'s call to local_irq_restore()
      had no effect, which might indicate that interrupts were always disabled
      anyway.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      7c9906ca
  9. 07 Oct, 2015 2 commits
    • B
      rcu: Don't disable preemption for Tiny and Tree RCU readers · bb73c52b
      Boqun Feng committed
      Because preempt_disable() maps to barrier() for non-debug builds,
      it forces the compiler to spill and reload registers.  Because Tree
      RCU and Tiny RCU now only appear in CONFIG_PREEMPT=n builds, these
      barrier() instances generate needless extra code for each instance of
      rcu_read_lock() and rcu_read_unlock().  This extra code slows down Tree
      RCU and bloats Tiny RCU.
      
      This commit therefore removes the preempt_disable() and preempt_enable()
      from the non-preemptible implementations of __rcu_read_lock() and
      __rcu_read_unlock(), respectively.  However, for debug purposes,
      preempt_disable() and preempt_enable() are still invoked if
      CONFIG_PREEMPT_COUNT=y, because this allows detection of sleeping inside
      atomic sections in non-preemptible kernels.
      
      However, Tiny and Tree RCU operate by coalescing all RCU read-side
      critical sections on a given CPU that lie between successive quiescent
      states.  It is therefore necessary to compensate for removing barriers
      from __rcu_read_lock() and __rcu_read_unlock() by adding them to a
      couple of the RCU functions invoked during quiescent states, namely to
      rcu_all_qs() and rcu_note_context_switch().  However, note that the latter
      is more paranoia than necessity, at least until link-time optimizations
      become more aggressive.
      
      This is based on an earlier patch by Paul E. McKenney, fixing
      a bug encountered in kernels built with CONFIG_PREEMPT=n and
      CONFIG_PREEMPT_COUNT=y.
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      bb73c52b
    • B
      rcu: Use rcu_callback_t in call_rcu*() and friends · b6a4ae76
      Boqun Feng committed
      Now that the rcu_callback_t typedef exists as the type of RCU callbacks,
      call_rcu*() and friends should use it as the type of their parameters. This
      saves a few lines of code and makes it clear which functions require an
      RCU callback, rather than some other kind of callback, as an argument.
      
      Besides, this can also help cscope to generate a better database for
      code reading.
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      b6a4ae76
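To illustrate what the typedef buys, here is a minimal userspace sketch of a call_rcu()-shaped function that takes an `rcu_callback_t`. The typedef has the same shape as the kernel's. Everything else is assumed for illustration: the single global list, `model_call_rcu()`, `model_invoke_callbacks()`, `struct foo`, and `foo_reclaim()`.

```c
#include <assert.h>
#include <stddef.h>

struct rcu_head;
typedef void (*rcu_callback_t)(struct rcu_head *head);

struct rcu_head {
	struct rcu_head *next;
	rcu_callback_t func;
};

/* Toy callback queue; the real kernel keeps per-CPU lists instead. */
static struct rcu_head *model_cb_list;

/* Same shape as call_rcu(): the parameter type now names its purpose. */
static void model_call_rcu(struct rcu_head *head, rcu_callback_t func)
{
	assert(head && func);
	head->func = func;
	head->next = model_cb_list;
	model_cb_list = head;
}

/* Pretend a grace period has ended: invoke and drop every queued callback. */
static int model_invoke_callbacks(void)
{
	int n = 0;

	while (model_cb_list) {
		struct rcu_head *head = model_cb_list;

		model_cb_list = head->next;
		head->func(head);
		n++;
	}
	return n;
}

/* Example user: an object with an embedded rcu_head. */
struct foo {
	int freed;
	struct rcu_head rh;
};

static void foo_reclaim(struct rcu_head *head)
{
	/* Recover the enclosing object, as container_of() would. */
	struct foo *f = (struct foo *)((char *)head - offsetof(struct foo, rh));

	f->freed = 1;
}
```

A caller would queue an object with `model_call_rcu(&obj.rh, foo_reclaim)`. Passing a function of any other signature is diagnosed by the compiler as a pointer type mismatch against the named typedef.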
  10. 23 Jul, 2015 1 commit
  11. 28 May, 2015 1 commit
  12. 16 Jan, 2015 1 commit
    • P
      rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors · 5cd37193
      Paul E. McKenney committed
      Although cond_resched_rcu_qs() only applies to TASKS_RCU, it is used
      in places where it would be useful for it to apply to the normal RCU
      flavors, rcu_preempt, rcu_sched, and rcu_bh.  This is especially the
      case for workloads that aggressively overload the system, particularly
      those that generate large numbers of RCU updates on systems running
      NO_HZ_FULL CPUs.  This commit therefore communicates quiescent states
      from cond_resched_rcu_qs() to the normal RCU flavors.
      
      Note that it is unfortunately necessary to leave the old ->passed_quiesce
      mechanism in place to allow quiescent states that apply to only one
      flavor to be recorded.  (Yes, we could decrement ->rcu_qs_ctr_snap in
      that case, but that is not so good for debugging of RCU internals.)
      In addition, if one of the RCU flavors' grace periods has stalled, this
      will invoke rcu_momentary_dyntick_idle(), resulting in a heavy-weight
      quiescent state visible from other CPUs.
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      [ paulmck: Merge commit from Sasha Levin fixing a bug where __this_cpu()
        was used in preemptible code. ]
      5cd37193
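The interplay between the old ->passed_quiesce flag and the counter snapshot can be sketched as follows. This is a hedged single-CPU model: the field names follow the commit message, but `struct cpu_qs_model` and the `model_*` functions are hypothetical, and the per-flavor bookkeeping and the stall handling are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cpu_qs_model {
	unsigned long rcu_qs_ctr;	/* bumped by each cond_resched_rcu_qs() */
	unsigned long rcu_qs_ctr_snap;	/* counter value when the grace period began */
	bool passed_quiesce;		/* flavor-specific quiescent state seen */
};

/* Start watching this CPU for a new grace period. */
static void model_gp_begin(struct cpu_qs_model *c)
{
	assert(c != NULL);
	c->passed_quiesce = false;
	c->rcu_qs_ctr_snap = c->rcu_qs_ctr;
}

/* What cond_resched_rcu_qs() contributes: a quiescent state for every flavor. */
static void model_cond_resched_rcu_qs(struct cpu_qs_model *c)
{
	c->rcu_qs_ctr++;
}

/* The CPU has passed a quiescent state if either mechanism says so. */
static bool model_cpu_quiesced(const struct cpu_qs_model *c)
{
	return c->passed_quiesce || c->rcu_qs_ctr != c->rcu_qs_ctr_snap;
}
```

Keeping both checks lets a quiescent state that applies to only one flavor be recorded through the flag, while the shared counter covers all flavors at once.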
  13. 11 Jan, 2015 3 commits
  14. 04 Nov, 2014 1 commit
  15. 08 Sep, 2014 1 commit
  16. 15 May, 2014 1 commit
  17. 21 Mar, 2014 1 commit
    • P
      rcu: Provide grace-period piggybacking API · 765a3f4f
      Paul E. McKenney committed
      The following pattern is currently not well supported by RCU:
      
      1.	Make data element inaccessible to RCU readers.
      
      2.	Do work that probably lasts for more than one grace period.
      
      3.	Do something to make sure RCU readers in flight before #1 above
      	have completed.
      
      Here are some things that could currently be done:
      
      a.	Do a synchronize_rcu() unconditionally at either #1 or #3 above.
      	This works, but imposes needless work and latency.
      
      b.	Post an RCU callback at #1 above that does a wakeup, then
      	wait for the wakeup at #3.  This works well, but likely results
      	in an extra unneeded grace period.  Open-coding this is also
      	a bit more semi-tricky code than would be good.
      
      This commit therefore adds get_state_synchronize_rcu() and
      cond_synchronize_rcu() APIs.  Call get_state_synchronize_rcu() at #1
      above and pass its return value to cond_synchronize_rcu() at #3 above.
      This results in a call to synchronize_rcu() if no grace period has
      elapsed between #1 and #3, but requires only a load, comparison, and
      memory barrier if a full grace period did elapse.
      Requested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      765a3f4f
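The snapshot-then-check pattern can be modeled with two counters. The sketch below is a userspace toy under stated assumptions: `model_gpnum` and `model_completed` are hypothetical counters for the newest grace period started and completed, `model_synchronize_rcu()` "waits" by advancing them, and the memory barriers mentioned in the commit message are omitted.

```c
#include <assert.h>
#include <limits.h>

static unsigned long model_gpnum;	/* newest grace period started */
static unsigned long model_completed;	/* newest grace period completed */
static unsigned long model_sync_calls;	/* how often the caller had to block */

static void model_gp_start(void)
{
	assert(model_gpnum == model_completed);
	model_gpnum++;
}

static void model_gp_end(void)
{
	assert(model_gpnum == model_completed + 1);
	model_completed = model_gpnum;
}

/* Stand-in for synchronize_rcu(): force one complete grace period. */
static void model_synchronize_rcu(void)
{
	model_sync_calls++;
	if (model_gpnum != model_completed)
		model_gp_end();		/* let any in-flight grace period finish */
	model_gp_start();
	model_gp_end();
}

/* Step 1: take a cookie describing the current grace-period state. */
static unsigned long model_get_state_synchronize_rcu(void)
{
	return model_gpnum;
}

/* Step 3: block only if no full grace period has elapsed since the cookie. */
static void model_cond_synchronize_rcu(unsigned long oldstate)
{
	/* Wraparound-safe "oldstate >= completed" comparison. */
	if (ULONG_MAX / 2 >= oldstate - model_completed)
		model_synchronize_rcu();
}
```

A caller takes the cookie at step 1, does its long-running work, and passes the cookie to `model_cond_synchronize_rcu()` at step 3. In this model a grace period that was already in flight when the cookie was taken does not count, since readers that began before step 1 might not be covered by it.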
  18. 18 Feb, 2014 2 commits
  19. 13 Dec, 2013 1 commit
  20. 25 Sep, 2013 2 commits
  21. 11 Jun, 2013 5 commits
  22. 03 Jul, 2012 1 commit
  23. 07 Jun, 2012 1 commit
    • P
      rcu: Precompute RCU_FAST_NO_HZ timer offsets · aa9b1630
      Paul E. McKenney committed
      When a CPU is entering dyntick-idle mode, tick_nohz_stop_sched_tick()
      calls rcu_needs_cpu() to see if RCU needs that CPU, and, if not, computes the
      next wakeup time based on the timer wheels.  Only later, when actually
      entering the idle loop, rcu_prepare_for_idle() will be invoked.  In some
      cases, rcu_prepare_for_idle() will post timers to wake the CPU back up.
      But all for naught: The next wakeup time for the CPU has already been
      computed, and posting a timer afterwards does not force that wakeup
      time to be recomputed.  This means that rcu_prepare_for_idle()'s timers have
      no effect.
      
      This is not a problem on a busy system because something else will wake
      up the CPU soon enough.  However, on lightly loaded systems, the CPU
      might stay asleep for a considerable length of time.  If that CPU has
      a callback that the rest of the system is waiting on, the system might
      run very slowly or (in theory) even hang.
      
      This commit avoids this problem by having rcu_needs_cpu() give
      tick_nohz_stop_sched_tick() an estimate of when RCU will need the CPU
      to wake back up, which tick_nohz_stop_sched_tick() takes into account
      when programming the CPU's wakeup time.  An alternative approach is
      for rcu_prepare_for_idle() to use hrtimers instead of normal timers,
      but timers are much more efficient than are hrtimers for frequently
      and repeatedly posting and cancelling a given timer, which is exactly
      what RCU_FAST_NO_HZ does.
      Reported-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
      Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
      aa9b1630
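The fix amounts to folding RCU's estimate into the wakeup computation before the CPU goes idle. A minimal sketch of that idea, with hypothetical names throughout: `struct rcu_cpu_model`, `MODEL_RCU_IDLE_GP_DELAY` and its value, and both `model_*` functions are assumptions rather than the kernel's definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_RCU_IDLE_GP_DELAY	6	/* assumed: jiffies until RCU wants the CPU back */

struct rcu_cpu_model {
	bool needs_cpu_now;	/* RCU has immediate work for this CPU */
	bool has_callbacks;	/* callbacks queued, waiting for a grace period */
};

/*
 * Model of rcu_needs_cpu(): returns true if the tick must keep running.
 * Otherwise *delta_jiffies reports how soon RCU needs a wakeup, with
 * ULONG_MAX meaning "no deadline".
 */
static bool model_rcu_needs_cpu(const struct rcu_cpu_model *r,
				unsigned long *delta_jiffies)
{
	assert(r != NULL && delta_jiffies != NULL);
	*delta_jiffies = ULONG_MAX;
	if (r->needs_cpu_now)
		return true;
	if (r->has_callbacks)
		*delta_jiffies = MODEL_RCU_IDLE_GP_DELAY;
	return false;
}

/*
 * Model of the tick-stop path: sleep until the earlier of the timer
 * wheel's next event and RCU's estimate.  Returns 0 if the tick must stay.
 */
static unsigned long model_stop_tick(const struct rcu_cpu_model *r,
				     unsigned long timer_wheel_delta)
{
	unsigned long rcu_delta;

	if (model_rcu_needs_cpu(r, &rcu_delta))
		return 0;
	return rcu_delta < timer_wheel_delta ? rcu_delta : timer_wheel_delta;
}
```

Because the estimate is available when the sleep length is chosen, no timer has to be posted after the wakeup time has already been computed.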
  24. 03 May, 2012 2 commits
  25. 22 Feb, 2012 2 commits
    • P
      rcu: Prevent RCU callbacks from executing before scheduler initialized · 768dfffd
      Paul E. McKenney committed
      This is a port of commit #b0d30417 from TREE_RCU to TREE_PREEMPT_RCU.
      
      Under some rare but real combinations of configuration parameters, RCU
      callbacks are posted during early boot that use kernel facilities that are
      not yet initialized.  Therefore, when these callbacks are invoked, hard
      hangs and crashes ensue.  This commit therefore prevents RCU callbacks
      from being invoked until after the scheduler is fully up and running,
      as in after multiple tasks have been spawned.
      
      It might well turn out that a better approach is to identify the specific
      RCU callbacks that are causing this problem, but that discussion will
      wait until such time as someone really needs an RCU callback to be invoked
      (as opposed to merely registered) during early boot.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      768dfffd
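The gating described here separates registering a callback from invoking it. A rough userspace model, in which the flag and function names are illustrative assumptions rather than the kernel's own:

```c
#include <assert.h>
#include <stdbool.h>

static bool model_scheduler_fully_active;	/* set once multiple tasks exist */
static int model_pending;			/* callbacks registered, not yet invoked */
static int model_invoked;			/* callbacks invoked so far */

/* Registration is always allowed, even during early boot. */
static void model_register_callback(void)
{
	model_pending++;
}

/* Invocation does nothing during early boot; callbacks simply stay queued. */
static void model_do_batch(void)
{
	assert(model_pending >= 0);
	if (!model_scheduler_fully_active)
		return;
	model_invoked += model_pending;
	model_pending = 0;
}
```

Callbacks registered before the flag is set are not lost in this model; they run on the first batch after the scheduler is declared fully active.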
    • P
      rcu: Avoid waking up CPUs having only kfree_rcu() callbacks · 486e2593
      Paul E. McKenney committed
      When CONFIG_RCU_FAST_NO_HZ is enabled, RCU will allow a given CPU to
      enter dyntick-idle mode even if it still has RCU callbacks queued.
      RCU avoids system hangs in this case by scheduling a timer for several
      jiffies in the future.  However, if all of the callbacks on that CPU
      are from kfree_rcu(), there is no reason to wake the CPU up, as it is
      not a problem to defer freeing of memory.
      
      This commit therefore tracks the number of callbacks on a given CPU
      that are from kfree_rcu(), and avoids scheduling the timer if all of
      a given CPU's callbacks are from kfree_rcu().
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      486e2593
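The bookkeeping behind this decision can be sketched with two counters per CPU. This is a minimal model: `struct cb_counts`, its field names, and the `model_*` helpers are chosen for illustration and are not taken from the commit itself.

```c
#include <assert.h>
#include <stdbool.h>

struct cb_counts {
	long qlen;	/* all callbacks queued on this CPU */
	long qlen_lazy;	/* subset that only frees memory (from kfree_rcu()) */
};

static void model_enqueue(struct cb_counts *c, bool lazy)
{
	c->qlen++;
	if (lazy)
		c->qlen_lazy++;
}

/* A wakeup timer is needed only if some callback does more than free memory. */
static bool model_needs_wakeup_timer(const struct cb_counts *c)
{
	assert(c->qlen_lazy <= c->qlen);
	return c->qlen != c->qlen_lazy;
}
```

A CPU whose two counts are equal can stay in dyntick-idle mode without a timer, because the only cost of delay is that memory is freed later.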
  26. 29 Sep, 2011 2 commits