1. 25 Apr, 2012 3 commits
  2. 22 Feb, 2012 14 commits
  3. 12 Dec, 2011 15 commits
    • rcu: Apply ACCESS_ONCE() to rcu_boost() return value · 4f89b336
      Committed by Paul E. McKenney
      Both TINY_RCU's and TREE_RCU's implementations of rcu_boost() access
      the ->boost_tasks and ->exp_tasks fields without preventing concurrent
      changes to these fields.  This commit therefore applies ACCESS_ONCE in
      order to prevent compiler mischief.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
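A minimal userspace sketch of the idea behind this commit. The macro follows the classic kernel definition of ACCESS_ONCE() (it relies on the GCC/Clang `__typeof__` extension); `struct node_sketch` and `more_boost_work()` are hypothetical stand-ins, not the kernel's rcu_node structure or rcu_boost() itself.

```c
#include <assert.h>
#include <stddef.h>

/* Read or write x exactly once by going through a volatile lvalue,
 * so the compiler can neither refetch nor fuse the access. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the two fields named in the commit message. */
struct node_sketch {
    void *boost_tasks;
    void *exp_tasks;
};

/* Returns 1 if either list still has entries, 0 otherwise.  Each field
 * is loaded once, even if another thread updates it concurrently. */
static int more_boost_work(struct node_sketch *np)
{
    return ACCESS_ONCE(np->exp_tasks) != NULL ||
           ACCESS_ONCE(np->boost_tasks) != NULL;
}
```

Note that the volatile access only constrains the compiler; it does not by itself provide any memory ordering between CPUs.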
    • Revert "rcu: Permit rt_mutex_unlock() with irqs disabled" · 70321d44
      Committed by Paul E. McKenney
      This reverts commit 5342e269.
      
      The approach taken in this patch was deemed too abusive to mutexes,
      and thus too likely to result in maintenance problems in the future.
      Instead, we will disallow RCU read-side critical sections that partially
      overlap with interrupt-disabled code segments.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Adaptive dyntick-idle preparation · f23f7fa1
      Committed by Paul E. McKenney
      If there are other CPUs active at a given point in time, then there is a
      limit to what a given CPU can do to advance the current RCU grace period.
      Beyond this limit, attempting to force the RCU grace period forward will
      do nothing but consume energy burning CPU cycles.
      
      Therefore, this commit takes an adaptive approach to RCU_FAST_NO_HZ
      preparations for idle.  It pushes the RCU core state machine for
      two cycles unconditionally, and then it will push from zero to three
      additional cycles, but only as long as the RCU core has work for this
      CPU to do immediately.  The rcu_pending() function is used to check
      whether the RCU core has such work.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
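The adaptive loop described above can be sketched roughly as follows. This is a toy userspace model, not the kernel's rcu_prepare_for_idle(): `struct cpu_sketch`, `rcu_pending_sketch()`, and `push_rcu_core_sketch()` are hypothetical, and the model assumes each push retires one unit of immediate work. Only the cycle counts (two unconditional, up to three more) come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define UNCONDITIONAL_CYCLES 2  /* "two cycles unconditionally" */
#define MAX_EXTRA_CYCLES     3  /* "zero to three additional cycles" */

/* Hypothetical per-CPU model: how many passes of immediate work remain. */
struct cpu_sketch {
    int immediate_work;
};

/* Stand-in for rcu_pending(): does the RCU core need this CPU right now? */
static bool rcu_pending_sketch(const struct cpu_sketch *cpu)
{
    return cpu->immediate_work > 0;
}

/* Stand-in for one push of the RCU core state machine. */
static void push_rcu_core_sketch(struct cpu_sketch *cpu)
{
    if (cpu->immediate_work > 0)
        cpu->immediate_work--;
}

/* Returns how many cycles were spent before going idle. */
static int prepare_for_idle_sketch(struct cpu_sketch *cpu)
{
    int cycles = 0;
    int i;

    /* Phase 1: always push twice. */
    for (i = 0; i < UNCONDITIONAL_CYCLES; i++) {
        push_rcu_core_sketch(cpu);
        cycles++;
    }
    /* Phase 2: keep pushing only while there is immediate work. */
    for (i = 0; i < MAX_EXTRA_CYCLES && rcu_pending_sketch(cpu); i++) {
        push_rcu_core_sketch(cpu);
        cycles++;
    }
    assert(cycles >= UNCONDITIONAL_CYCLES &&
           cycles <= UNCONDITIONAL_CYCLES + MAX_EXTRA_CYCLES);
    return cycles;
}
```

In this model a CPU with no immediate work stops after two cycles, while a CPU with a large backlog stops after five and leaves the remainder for later.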
    • rcu: Keep invoking callbacks if CPU otherwise idle · dff1672d
      Committed by Paul E. McKenney
      The rcu_do_batch() function that invokes callbacks for TREE_RCU and
      TREE_PREEMPT_RCU normally throttles callback invocation to avoid degrading
      scheduling latency.  However, as long as the CPU would otherwise be idle,
      there is no downside to continuing to invoke any callbacks that have passed
      through their grace periods.  In fact, processing such callbacks in a
      timely manner has the benefit of increasing the probability that the
      CPU can enter the power-saving dyntick-idle mode.
      
      Therefore, this commit allows callback invocation to continue beyond the
      preset limit as long as the scheduler does not have some other task to
      run and as long as context is that of the idle task or the relevant
      RCU kthread.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
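A rough sketch of the throttling rule described above, as a userspace model. The structure and function names are hypothetical and the real rcu_do_batch() differs in detail; the sketch only captures the condition from the commit message: the batch limit may be exceeded when the scheduler has nothing else to run and the context is the idle task or the relevant RCU kthread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the execution context. */
struct batch_ctx_sketch {
    bool need_resched;    /* scheduler has some other task to run */
    bool is_idle_task;    /* running in the idle task */
    bool is_rcu_kthread;  /* running in the relevant RCU kthread */
};

/* True if invocation may continue past the preset limit. */
static bool may_exceed_limit(const struct batch_ctx_sketch *ctx)
{
    return !ctx->need_resched &&
           (ctx->is_idle_task || ctx->is_rcu_kthread);
}

/* Invokes up to 'ready' callbacks and returns how many were invoked.
 * Once 'limit' is reached, stops unless the CPU would otherwise be idle. */
static int do_batch_sketch(int ready, int limit,
                           const struct batch_ctx_sketch *ctx)
{
    int invoked = 0;

    assert(limit > 0);
    while (invoked < ready) {
        invoked++;  /* stands in for invoking one callback */
        if (invoked >= limit && !may_exceed_limit(ctx))
            break;
    }
    return invoked;
}
```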
    • rcu: Permit dyntick-idle with callbacks pending · 7cb92499
      Committed by Paul E. McKenney
      The current implementation of RCU_FAST_NO_HZ prevents CPUs from entering
      dyntick-idle state if they have RCU callbacks pending.  Unfortunately,
      this has the side-effect of often preventing them from entering this
      state, especially if at least one other CPU is not in dyntick-idle state.
      However, the resulting per-tick wakeup is wasteful in many cases: if the
      CPU has already fully responded to the current RCU grace period, there
      will be nothing for it to do until this grace period ends, which will
      frequently take several jiffies.
      
      This commit therefore permits a CPU that has done everything that the
      current grace period has asked of it (rcu_pending() == 0) to enter
      dyntick-idle state even if it still has RCU callbacks pending.
      However, such a CPU posts a timer to
      wake it up several jiffies later (6 jiffies, based on experience with
      grace-period lengths).  This wakeup is required to handle situations
      that can result in all CPUs being in dyntick-idle mode, thus failing
      to ever complete the current grace period.  If a CPU wakes up before
      the timer goes off, then it cancels that timer, thus avoiding spurious
      wakeups.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
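The idle-entry decision described above might be modeled as follows. This is an illustrative userspace sketch with hypothetical names (`struct idle_plan`, `plan_idle_sketch()`, `wake_early_sketch()`); the 6-jiffy delay is the only value taken from the commit message, and real kernel timers are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define IDLE_GP_DELAY_SKETCH 6UL  /* jiffies, per the commit message */

/* Hypothetical result of the idle-entry decision. */
struct idle_plan {
    bool enter_dyntick_idle;
    bool timer_armed;
    unsigned long timer_expires;  /* meaningful only if timer_armed */
};

/* Decide whether the CPU may go dyntick-idle, and whether it needs a
 * wakeup timer so that the grace period cannot stall forever. */
static struct idle_plan plan_idle_sketch(bool rcu_pending, bool has_callbacks,
                                         unsigned long now)
{
    struct idle_plan plan = { false, false, 0 };

    if (rcu_pending)
        return plan;  /* RCU still needs this CPU: stay awake */

    plan.enter_dyntick_idle = true;
    if (has_callbacks) {
        /* Callbacks remain, so arrange to be woken a few jiffies later. */
        plan.timer_armed = true;
        plan.timer_expires = now + IDLE_GP_DELAY_SKETCH;
    }
    return plan;
}

/* A CPU that wakes before the timer fires cancels it. */
static void wake_early_sketch(struct idle_plan *plan)
{
    plan->timer_armed = false;
}
```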
    • rcu: Reduce latency of rcu_prepare_for_idle() · 3ad0decf
      Committed by Paul E. McKenney
      Re-enable interrupts across calls to quiescent-state functions and
      also across force_quiescent_state() to reduce latency.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Eliminate RCU_FAST_NO_HZ grace-period hang · f535a607
      Committed by Paul E. McKenney
      With the new implementation of RCU_FAST_NO_HZ, it was possible to hang
      RCU grace periods as follows:
      
      o	CPU 0 attempts to go idle, cycles several times through the
      	rcu_prepare_for_idle() loop, then goes dyntick-idle when
      	RCU needs nothing more from it, while still having at least
      	one RCU callback pending.
      
      o	CPU 1 goes idle with no callbacks.
      
      Both CPUs can then stay in dyntick-idle mode indefinitely, preventing
      the RCU grace period from ever completing, possibly hanging the system.
      
      This commit therefore prevents CPUs that have RCU callbacks from entering
      dyntick-idle mode.  This approach also eliminates the need for the
      end-of-grace-period IPIs used previously.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Avoid needlessly IPIing CPUs at GP end · 84ad00cb
      Committed by Paul E. McKenney
      If a CPU enters dyntick-idle mode with callbacks pending, it will need
      an IPI at the end of the grace period.  However, if it exits dyntick-idle
      mode before the grace period ends, it will be needlessly IPIed at the
      end of the grace period.
      
      Therefore, this commit clears the per-CPU rcu_awake_at_gp_end flag
      when a CPU determines that it does not need it.  This in turn requires
      disabling interrupts across much of rcu_prepare_for_idle() in order to
      avoid having nested interrupts clearing this state out from under us.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Go dyntick-idle more quickly if CPU has serviced current grace period · 3084f2f8
      Committed by Paul E. McKenney
      The earlier version would attempt to push callbacks through five times
      before going into dyntick-idle mode if callbacks remained, but the CPU
      had done all that it needed to do for the current RCU grace periods.
      This is wasteful:  In most cases, once the CPU has done all that it
      needs to for the current RCU grace periods, it will make no further
      progress on the callbacks no matter how many times it loops through
      the RCU core processing and the idle-entry code.
      
      This commit therefore goes to dyntick-idle mode whenever the current
      CPU has done all it can for the current grace period.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Add tracing for RCU_FAST_NO_HZ · 433cdddc
      Committed by Paul E. McKenney
      This commit adds trace_rcu_prep_idle(), which is invoked from
      rcu_prepare_for_idle() and rcu_wake_cpu() to trace attempts on
      the part of RCU to force CPUs into dyntick-idle mode.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Allow dyntick-idle mode for CPUs with callbacks · aea1b35e
      Committed by Paul E. McKenney
      Currently, RCU does not permit a CPU to enter dyntick-idle mode if that
      CPU has any RCU callbacks queued.  This means that workloads for which
      each CPU wakes up and does some RCU updates every few ticks will never
      enter dyntick-idle mode.  This can result in significant unnecessary power
      consumption, so this patch permits a given CPU to enter dyntick-idle mode if
      it has callbacks, but only if that same CPU has completed all current
      work for the RCU core.  We use rcu_pending() to determine
      whether a given CPU has completed all current work for the RCU core.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Remove redundant return from rcu_report_exp_rnp() · a0f8eefb
      Committed by Thomas Gleixner
      Empty void functions do not need "return", so this commit removes it
      from rcu_report_exp_rnp().
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Omit self-awaken when setting up expedited grace period · b40d293e
      Committed by Thomas Gleixner
      When setting up an expedited grace period, if there were no readers, the
      task will awaken itself.  This commit removes this useless self-awakening.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Make synchronize_sched_expedited() better at work sharing · 7077714e
      Committed by Paul E. McKenney
      When synchronize_sched_expedited() takes its second and subsequent
      snapshots of sync_sched_expedited_started, it subtracts 1.  This
      means that even if the concurrent caller of synchronize_sched_expedited()
      that incremented to that value sees our successful completion, it
      will not be able to take advantage of it.  This restriction is
      pointless, given that our full expedited grace period would have
      happened after the other guy started, and thus should be able to
      serve as a proxy for the other guy successfully executing
      try_stop_cpus().
      
      This commit therefore removes the subtraction of 1.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Avoid RCU-preempt expedited grace-period botch · 389abd48
      Committed by Paul E. McKenney
      Because rcu_read_unlock_special() samples rcu_preempted_readers_exp(rnp)
      after dropping rnp->lock, the following sequence of events is possible:
      
      1.	Task A exits its RCU read-side critical section, and removes
      	itself from the ->blkd_tasks list, releases rnp->lock, and is
      	then preempted.  Task B remains on the ->blkd_tasks list, and
      	blocks the current expedited grace period.
      
      2.	Task B exits from its RCU read-side critical section and removes
      	itself from the ->blkd_tasks list.  Because it is the last task
      	blocking the current expedited grace period, it ends that
      	expedited grace period.
      
      3.	Task A resumes, and samples rcu_preempted_readers_exp(rnp) which
      	of course indicates that nothing is blocking the nonexistent
      	expedited grace period. Task A is again preempted.
      
      4.	Some other CPU starts an expedited grace period.  There are several
      	tasks blocking this expedited grace period queued on the
      	same rcu_node structure that Task A was using in step 1 above.
      
      5.	Task A examines its state and incorrectly concludes that it was
      	the last task blocking the expedited grace period on the current
      	rcu_node structure.  It therefore reports completion up the
      	rcu_node tree.
      
      6.	The expedited grace period can then incorrectly complete before
      	the tasks blocked on this same rcu_node structure exit their
      	RCU read-side critical sections.  Arbitrarily bad things happen.
      
      This commit therefore takes a snapshot of rcu_preempted_readers_exp(rnp)
      prior to dropping the lock, so that only the last task thinks that it is
      the last task, thus avoiding the failure scenario laid out above.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
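The difference between sampling after the lock is dropped and snapshotting beforehand can be illustrated with a deterministic, single-threaded model of the six steps above. Everything here is hypothetical: the node is reduced to a counter of tasks blocking the expedited grace period, the lock is implied rather than taken, and the function names do not correspond to kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an rcu_node: number of tasks currently blocking
 * the expedited grace period on this node. */
struct rnp_sketch {
    int exp_blockers;
};

/* What a departing task learns while it still holds the node lock. */
struct exit_snapshot {
    bool was_blocking;  /* something was blocking before we left */
    bool now_empty;     /* nothing is blocking after we left */
};

/* Models the work done under rnp->lock: snapshot, remove self, snapshot. */
static struct exit_snapshot remove_self_locked(struct rnp_sketch *rnp)
{
    struct exit_snapshot s;

    s.was_blocking = rnp->exp_blockers > 0;
    if (s.was_blocking)
        rnp->exp_blockers--;
    s.now_empty = rnp->exp_blockers == 0;
    return s;
}

/* Fixed approach: decide purely from the snapshot taken under the lock. */
static bool should_report_fixed(const struct exit_snapshot *s)
{
    return s->was_blocking && s->now_empty;
}

/* Buggy approach: re-sample the node after the lock has been dropped,
 * by which time other tasks may have changed its state. */
static bool should_report_buggy(const struct exit_snapshot *s,
                                const struct rnp_sketch *rnp)
{
    return s->was_blocking && rnp->exp_blockers == 0;
}
```

With two blockers, Task A's snapshot says it was not the last one out, so only Task B reports under the fixed rule. Under the buggy rule, a late sample by Task A after Task B has left sees an empty node and wrongly concludes that Task A was last.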
  4. 29 Sep, 2011 8 commits
    • rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states · e90c53d3
      Committed by Paul E. McKenney
      The purpose of rcu_needs_cpu_flush() was to iterate on pushing the
      current grace period in order to help the current CPU enter dyntick-idle
      mode.  However, this can result in failures if the CPU starts entering
      dyntick-idle mode, but then backs out.  In this case, the call to
      rcu_pending() from rcu_needs_cpu_flush() might end up announcing a
      non-existing quiescent state.
      
      This commit therefore removes rcu_needs_cpu_flush() in favor of letting
      the dyntick-idle machinery at the end of the softirq handler push the
      loop along via its call to rcu_pending().
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Wire up RCU_BOOST_PRIO for rcutree · 5b61b0ba
      Committed by Mike Galbraith
      RCU boost threads start life at RCU_BOOST_PRIO, while others remain
      at RCU_KTHREAD_PRIO.  While here, change thread names to match other
      kthreads, and adjust rcu_yield() to not override the priority set by
      the user.  This last change sets the stage for runtime changes to
      priority in the -rt tree.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Permit rt_mutex_unlock() with irqs disabled · 5342e269
      Committed by Paul E. McKenney
      Create a separate lockdep class for the rt_mutex used for RCU priority
      boosting and enable use of rt_mutex_lock() with irqs disabled.  This
      prevents RCU priority boosting from falling prey to deadlocks when
      someone begins an RCU read-side critical section in preemptible state,
      but releases it with an irq-disabled lock held.
      
      Unfortunately, the scheduler's runqueue and priority-inheritance locks
      still must either completely enclose or be completely enclosed by any
      overlapping RCU read-side critical section.
      
      This version removes a redundant local_irq_restore() noted by
      Yong Zhang.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Suppress NMI backtraces when stall ends before dump · 9bc8b558
      Committed by Paul E. McKenney
      It is possible for an RCU CPU stall to end just as it is detected, in
      which case the current code will uselessly dump all CPUs' stacks.
      This commit therefore checks for this condition and refrains from
      sending needless NMIs.
      
      And yes, the stall might also end just after we checked all CPUs and
      tasks, but in that case we would at least have given some clue as
      to which CPU/task was at fault.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Simplify unboosting checks · 82e78d80
      Committed by Paul E. McKenney
      Commit 7765be (Fix RCU_BOOST race handling current->rcu_read_unlock_special)
      introduced a new ->rcu_boosted field in the task structure.  This is
      redundant because the existing ->rcu_boost_mutex will be non-NULL at
      any time that ->rcu_boosted is nonzero.  Therefore, this commit removes
      ->rcu_boosted and tests ->rcu_boost_mutex instead.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Move __rcu_read_unlock()'s barrier() within if-statement · 6206ab9b
      Committed by Paul E. McKenney
      We only need to constrain the compiler if we are actually exiting
      the top-level RCU read-side critical section.  This commit therefore
      moves the first barrier() call in __rcu_read_unlock() to inside the
      "if" statement, thus avoiding needless register flushes for inner
      rcu_read_unlock() calls.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
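A simplified userspace sketch of the pattern this commit describes: the compiler barrier is paid only when leaving the outermost read-side critical section. This is not the kernel's __rcu_read_unlock(); `struct task_sketch` and both functions are hypothetical, the `outermost_exits` counter exists only to make the behavior observable, and the barrier() macro uses GCC/Clang inline-assembly syntax.

```c
#include <assert.h>

/* Compiler-only barrier: forbids the compiler from moving memory
 * accesses across this point (GCC/Clang syntax). */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical per-task state. */
struct task_sketch {
    int nesting;          /* depth of nested read-side critical sections */
    int outermost_exits;  /* instrumentation for this sketch only */
};

static void read_lock_sketch(struct task_sketch *t)
{
    t->nesting++;
    barrier();  /* keep the critical section after the increment */
}

static void read_unlock_sketch(struct task_sketch *t)
{
    assert(t->nesting > 0);
    if (t->nesting != 1) {
        /* Inner unlock: just drop one level, no barrier needed. */
        t->nesting--;
    } else {
        /* Outermost unlock: keep the critical section's accesses
         * before the point where nesting drops to zero. */
        barrier();
        t->nesting = 0;
        t->outermost_exits++;
    }
}
```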
    • rcu: Simplify quiescent-state accounting · e4cc1f22
      Committed by Paul E. McKenney
      There is often a delay between the time that a CPU passes through a
      quiescent state and the time that this quiescent state is reported to the
      RCU core.  It is quite possible that the grace period ended before the
      quiescent state could be reported, for example, some other CPU might have
      deduced that this CPU passed through dyntick-idle mode.  It is critically
      important that a quiescent state be counted only against the grace period
      that was in effect at the time that the quiescent state was detected.
      
      Previously, this was handled by recording the number of the last grace
      period to complete when passing through a quiescent state.  The RCU
      core then checks this number against the current value, and rejects
      the quiescent state if there is a mismatch.  However, one additional
      possibility must be accounted for, namely that the quiescent state was
      recorded after the prior grace period completed but before the current
      grace period started.  In this case, the RCU core must reject the
      quiescent state, but the recorded number will match.  This is handled
      when the CPU becomes aware of a new grace period -- at that point,
      it invalidates any prior quiescent state.
      
      This works, but is a bit indirect.  The new approach records the current
      grace period, and the RCU core checks to see (1) that this is still the
      current grace period and (2) that this grace period has not yet ended.
      This approach simplifies reasoning about correctness, and this commit
      changes over to this new approach.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
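The two checks in the new approach can be sketched as below. This is an illustrative model rather than kernel code: the structures and function names are hypothetical, and it assumes the convention that a grace period is in progress exactly when the current grace-period number differs from the number of the last completed one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical global grace-period state.  Assumption: a grace period
 * is in progress exactly when gpnum != completed. */
struct gp_state_sketch {
    unsigned long gpnum;      /* number of the current grace period */
    unsigned long completed;  /* number of the last completed one */
};

/* Hypothetical per-CPU record of a quiescent state. */
struct cpu_qs_sketch {
    bool passed;          /* a quiescent state has been recorded */
    unsigned long gpnum;  /* grace period in effect when it was recorded */
};

/* Record a quiescent state against the grace period in effect now. */
static void note_qs_sketch(struct cpu_qs_sketch *cpu,
                           const struct gp_state_sketch *gp)
{
    cpu->passed = true;
    cpu->gpnum = gp->gpnum;
}

/* Accept the quiescent state only if (1) it was recorded against what is
 * still the current grace period and (2) that period has not yet ended. */
static bool qs_counts_sketch(const struct cpu_qs_sketch *cpu,
                             const struct gp_state_sketch *gp)
{
    return cpu->passed &&
           cpu->gpnum == gp->gpnum &&
           gp->completed != gp->gpnum;
}
```

In this model, a quiescent state recorded between grace periods carries the old grace-period number, so it is rejected by check (1) once the next grace period starts, with no separate invalidation step.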
    • rcu: Add grace-period, quiescent-state, and call_rcu trace events · d4c08f2a
      Committed by Paul E. McKenney
      Add trace events to record grace-period start and end, quiescent states,
      CPUs noticing grace-period start and end, grace-period initialization,
      call_rcu() invocation, tasks blocking in RCU read-side critical sections,
      tasks exiting those same critical sections, force_quiescent_state()
      detection of dyntick-idle and offline CPUs, CPUs entering and leaving
      dyntick-idle mode (except from NMIs), CPUs coming online and going
      offline, and CPUs being kicked for staying in dyntick-idle mode for too
      long (as in many weeks, even on 32-bit systems).
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      
      rcu: Add the rcu flavor to callback trace events
      
      The earlier trace events for registering RCU callbacks and for invoking
      them did not include the RCU flavor (rcu_bh, rcu_preempt, or rcu_sched).
      This commit adds the RCU flavor to those trace events.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>