1. 23 May 2018 (1 commit)
  2. 16 May 2018 (22 commits)
    • rcu: Drop early GP request check from rcu_gp_kthread() · a458360a
      Committed by Paul E. McKenney
      Now that grace-period requests use funnel locking and now that they
      set ->gp_flags to RCU_GP_FLAG_INIT even when the RCU grace-period
      kthread has not yet started, rcu_gp_kthread() no longer needs to check
      need_any_future_gp() at startup time.  This commit therefore removes
      this check.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      a458360a
    • rcu: Simplify and inline cpu_needs_another_gp() · c1935209
      Committed by Paul E. McKenney
      Now that RCU no longer relies on failsafe checks, cpu_needs_another_gp()
      can be greatly simplified.  This simplification eliminates the last
      call to rcu_future_needs_gp() and to rcu_segcblist_future_gp_needed(),
      both of which can then be eliminated.  And then, because
      cpu_needs_another_gp() is called only from __rcu_pending(), it can be
      inlined and eliminated.
      
      This commit carries out the simplification, inlining, and elimination
      called out above.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      c1935209
    • rcu: The rcu_gp_cleanup() function does not need cpu_needs_another_gp() · 384f77f4
      Committed by Paul E. McKenney
      All of the cpu_needs_another_gp() function's checks (except for
      newly arrived callbacks) have been subsumed into the rcu_gp_cleanup()
      function's scan of the rcu_node tree.  This commit therefore drops the
      call to cpu_needs_another_gp().  The check for newly arrived callbacks
      is supplied by rcu_accelerate_cbs().  Any needed advancing (as in the
      earlier rcu_advance_cbs() call) will be supplied when the corresponding
      CPU becomes aware of the end of the now-completed grace period.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      384f77f4
    • rcu: Make rcu_start_this_gp() check for out-of-range requests · 665f08f1
      Committed by Paul E. McKenney
      If rcu_start_this_gp() is invoked with a requested grace period more
      than three in the future, then either the ->need_future_gp[] array
      needs to be bigger or the caller needs to be repaired.  This commit
      therefore adds a WARN_ON_ONCE() checking for this condition.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      665f08f1
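
      The window check above can be pictured with a small standalone sketch.
      Everything in it (the four-entry window, the warn_on_once() helper, the
      start_this_gp() stub) is invented for illustration; only the idea of
      warning once when a request falls outside the ->need_future_gp[] window
      comes from the commit.

      #include <stdbool.h>
      #include <stdio.h>

      #define NEED_FUTURE_GP_LEN 4    /* assumed size of the request window */

      static bool warned;

      /* Poor man's WARN_ON_ONCE(): complain once, return the condition. */
      static bool warn_on_once(bool cond, const char *msg)
      {
              if (cond && !warned) {
                      warned = true;
                      fprintf(stderr, "WARN: %s\n", msg);
              }
              return cond;
      }

      /* Record a request for @gp_requested while @gp_current is underway. */
      static void start_this_gp(unsigned long gp_current,
                                unsigned long gp_requested)
      {
              /*
               * A request further out than the window can hold means either
               * the array is too small or the caller is broken.
               */
              if (warn_on_once(gp_requested - gp_current >= NEED_FUTURE_GP_LEN,
                               "grace-period request out of range"))
                      return;
              printf("recorded request for GP %lu\n", gp_requested);
      }

      int main(void)
      {
              start_this_gp(100, 102);        /* in range: recorded */
              start_this_gp(100, 110);        /* out of range: warns once */
              return 0;
      }
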
    • rcu: Add funnel locking to rcu_start_this_gp() · 360e0da6
      Committed by Paul E. McKenney
      The rcu_start_this_gp() function had a simple form of funnel locking that
      used only the leaves and root of the rcu_node tree, which is fine for
      systems with only a few hundred CPUs, but sub-optimal for systems having
      thousands of CPUs.  This commit therefore adds full-tree funnel locking.
      
      This variant of funnel locking is unusual in the following ways:
      
      1.	The leaf-level rcu_node structure's ->lock is held throughout.
      	Other funnel-locking implementations drop the leaf-level lock
      	before progressing to the next level of the tree.
      
      2.	Funnel locking can be started at the root, which is convenient
      	for code that already holds the root rcu_node structure's ->lock.
      	Other funnel-locking implementations start at the leaves.
      
      3.	If an rcu_node structure other than the initial one believes
      	that a grace period is in progress, it is not necessary to
      	go further up the tree.  This is because grace-period cleanup
      	scans the full tree, so that marking the need for a subsequent
      	grace period anywhere in the tree suffices -- but only if
      	a grace period is currently in progress.
      
      4.	It is possible that the RCU grace-period kthread has not yet
      	started, and this case must be handled appropriately.
      
      However, the general approach of using a tree to control lock contention
      is still in place.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      360e0da6
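
      The funnel-locking walk described in the list above can be modeled with
      a toy userspace program.  The node layout, the field names, and the
      two-level tree built in main() are assumptions of the sketch; pthread
      mutexes stand in for the rcu_node ->lock, and the real
      rcu_start_this_gp() does far more bookkeeping.

      #include <pthread.h>
      #include <stdbool.h>
      #include <stdio.h>

      struct node {
              struct node *parent;    /* NULL at the root */
              pthread_mutex_t lock;
              bool gp_in_progress;    /* this node thinks a GP is running */
              bool need_gp;           /* a future GP was requested here */
      };

      /*
       * Record a grace-period request starting at @start and walking toward
       * the root.  The initial node's lock is held for the whole walk (point
       * 1 above) and the walk may start at the root itself (point 2).  If a
       * later node already believes a GP is in progress, stop early (point
       * 3): grace-period cleanup rescans the whole tree anyway.
       */
      static void request_gp(struct node *start)
      {
              struct node *np;

              pthread_mutex_lock(&start->lock);
              for (np = start; np; np = np->parent) {
                      if (np != start)
                              pthread_mutex_lock(&np->lock);
                      np->need_gp = true;
                      if (np->gp_in_progress) {
                              if (np != start)
                                      pthread_mutex_unlock(&np->lock);
                              break;
                      }
                      if (np != start)
                              pthread_mutex_unlock(&np->lock);
              }
              pthread_mutex_unlock(&start->lock);
      }

      int main(void)
      {
              struct node root = { .parent = NULL,
                                   .lock = PTHREAD_MUTEX_INITIALIZER };
              struct node leaf = { .parent = &root,
                                   .lock = PTHREAD_MUTEX_INITIALIZER };

              request_gp(&leaf);      /* start at a leaf ... */
              request_gp(&root);      /* ... or directly at the root */
              printf("root needs GP: %d, leaf needs GP: %d\n",
                     root.need_gp, leaf.need_gp);
              return 0;
      }
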
    • rcu: Make rcu_start_future_gp() caller select grace period · 41e80595
      Committed by Paul E. McKenney
      The rcu_accelerate_cbs() function selects a grace-period target, which
      it uses to have rcu_segcblist_accelerate() assign numbers to recently
      queued callbacks.  Then it invokes rcu_start_future_gp(), which selects
      a grace-period target again, which is a bit pointless.  This commit
      therefore changes rcu_start_future_gp() to take the grace-period target as
      a parameter, thus avoiding double selection.  This commit also changes
      the name of rcu_start_future_gp() to rcu_start_this_gp() to reflect
      this change in functionality, and also makes a similar change to the
      name of trace_rcu_future_gp().
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      41e80595
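
      A minimal sketch of the new calling convention, using invented helpers
      (cbs_completed_target(), start_this_gp(), accelerate_cbs()) and a
      made-up "wait two grace periods" rule; the only point carried over from
      the commit is that the caller computes the target once and passes it
      down.

      #include <stdio.h>

      /* Pick the grace period after which newly queued callbacks are safe. */
      static unsigned long cbs_completed_target(unsigned long completed)
      {
              return completed + 2;   /* conservative placeholder rule */
      }

      /* After the change: the grace-period target arrives as a parameter. */
      static void start_this_gp(unsigned long c)
      {
              printf("requesting grace period %lu\n", c);
      }

      static void accelerate_cbs(unsigned long completed)
      {
              unsigned long c = cbs_completed_target(completed);

              /* ... assign c to the newly queued callbacks ... */
              start_this_gp(c);       /* no second target computation */
      }

      int main(void)
      {
              accelerate_cbs(40);
              return 0;
      }
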
    • rcu: Inline rcu_start_gp_advanced() into rcu_start_future_gp() · d5cd9685
      Committed by Paul E. McKenney
      The rcu_start_gp_advanced() function is invoked only from rcu_start_future_gp() and
      much of its code is redundant when invoked from that context.  This commit
      therefore inlines rcu_start_gp_advanced() into rcu_start_future_gp(),
      then removes rcu_start_gp_advanced().
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      d5cd9685
    • rcu: Clear request other than RCU_GP_FLAG_INIT at GP end · a824a287
      Committed by Paul E. McKenney
      Once the grace period has ended, any RCU_GP_FLAG_FQS requests are
      irrelevant:  The grace period has ended, so there is no longer any
      point in forcing quiescent states in order to try to make it end sooner.
      This commit therefore causes rcu_gp_cleanup() to clear any bits other
      than RCU_GP_FLAG_INIT from ->gp_flags at the end of the grace period.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      a824a287
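
      A tiny illustration of the masking step, with stand-in flag values
      rather than the kernel's RCU_GP_FLAG_* definitions:

      #include <stdio.h>

      #define GP_FLAG_INIT 0x1        /* another grace period is needed */
      #define GP_FLAG_FQS  0x2        /* force quiescent states (stale now) */

      int main(void)
      {
              unsigned int gp_flags = GP_FLAG_INIT | GP_FLAG_FQS;

              /* End of grace period: keep only the INIT request, if any. */
              gp_flags &= GP_FLAG_INIT;

              printf("gp_flags after cleanup: 0x%x\n", gp_flags);
              return 0;
      }
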
    • rcu: Cleanup, don't put ->completed into an int · a508aa59
      Committed by Paul E. McKenney
      It is true that currently only the low-order two bits are used, so
      there should be no problem given modern machines and compilers, but
      good hygiene and maintainability dictate use of an unsigned long
      instead of an int.  This commit therefore makes this change.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      a508aa59
    • rcu: Switch __rcu_process_callbacks() to rcu_accelerate_cbs() · bd7af846
      Committed by Paul E. McKenney
      The __rcu_process_callbacks() function currently checks to see if
      the current CPU needs a grace period and also if there is any other
      reason to kick off a new grace period.  This is one of the fail-safe
      checks that has been rendered unnecessary by the changes that increase
      the accuracy of rcu_gp_cleanup()'s estimate as to whether another grace
      period is required.  Because this particular fail-safe involved acquiring
      the root rcu_node structure's ->lock, which has seen excessive contention
      in real life, this fail-safe needs to go.
      
      However, one check must remain, namely the check for newly arrived
      RCU callbacks that have not yet been associated with a grace period.
      One might hope that the checks in __note_gp_changes(), which is invoked
      indirectly from rcu_check_quiescent_state(), would suffice, but this
      function won't be invoked at all if RCU is idle.  It is therefore necessary
      to replace the fail-safe checks with a simpler check for newly arrived
      callbacks during an RCU idle period, which is exactly what this commit
      does.  This change removes the final call to rcu_start_gp(), so this
      function is removed as well.
      
      Note that lockless use of cpu_needs_another_gp() is racy, but that
      these races are harmless in this case.  If RCU really is idle, the
      values will not change, so the return value from cpu_needs_another_gp()
      will be correct.  If RCU is not idle, the resulting redundant call to
      rcu_accelerate_cbs() will be harmless, and might even have the benefit
      of reducing grace-period latency a bit.
      
      This commit also moves interrupt disabling into the "if" statement to
      improve real-time response a bit.
      Reported-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      bd7af846
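
      The resulting "check locklessly, lock only inside the if" shape can be
      sketched in userspace as follows.  The helper names, the plain booleans,
      and the pthread mutex standing in for the leaf rcu_node ->lock (plus
      interrupt disabling) are all assumptions of the sketch.

      #include <pthread.h>
      #include <stdbool.h>
      #include <stdio.h>

      static pthread_mutex_t leaf_lock = PTHREAD_MUTEX_INITIALIZER;
      static bool gp_in_progress;             /* racy read is tolerated */
      static bool cbs_unassigned = true;      /* queued, but no GP number yet */

      static void accelerate_cbs(void)
      {
              cbs_unassigned = false;
              printf("callbacks assigned to a future grace period\n");
      }

      static void process_callbacks(void)
      {
              /*
               * The lockless check may race with a grace period starting,
               * but a redundant accelerate is harmless, so no lock is taken
               * unless the check says work is needed.
               */
              if (!gp_in_progress && cbs_unassigned) {
                      pthread_mutex_lock(&leaf_lock);
                      accelerate_cbs();
                      pthread_mutex_unlock(&leaf_lock);
              }
      }

      int main(void)
      {
              process_callbacks();
              return 0;
      }
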
    • rcu: Avoid __call_rcu_core() root rcu_node ->lock acquisition · a6058d85
      Committed by Paul E. McKenney
      When __call_rcu_core() notices excessive numbers of callbacks pending
      on the current CPU, we know that at least one of them is not yet
      classified, namely the one that was just now queued.  Therefore, it
      is not necessary to invoke rcu_start_gp() and thus not necessary to
      acquire the root rcu_node structure's ->lock.  This commit therefore
      replaces the rcu_start_gp() with rcu_accelerate_cbs(), thus replacing
      an acquisition of the root rcu_node structure's ->lock with that of
      this CPU's leaf rcu_node structure.
      
      This decreases contention on the root rcu_node structure's ->lock.
      Reported-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      a6058d85
    • rcu: Make rcu_migrate_callbacks wake GP kthread when needed · ec4eacce
      Committed by Paul E. McKenney
      The rcu_migrate_callbacks() function invokes rcu_advance_cbs()
      twice, ignoring the return value.  This is OK at present because of
      failsafe code that does the wakeup when needed.  However, this failsafe
      code acquires the root rcu_node structure's lock frequently, while
      rcu_migrate_callbacks() does so only once per CPU-offline operation.
      
      This commit therefore makes rcu_migrate_callbacks()
      wake up the RCU GP kthread when either call to rcu_advance_cbs()
      returns true, thus removing the need for the failsafe code.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      ec4eacce
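
      The resulting control flow is simple enough to sketch with placeholder
      functions; advance_cbs() and wake_gp_kthread() below are stand-ins for
      the kernel helpers, and the point is only that the two return values
      are OR-ed and at most one wakeup is issued.

      #include <stdbool.h>
      #include <stdio.h>

      /* Pretend to advance callbacks; return true if a wakeup is needed. */
      static bool advance_cbs(const char *which)
      {
              printf("advancing %s callbacks\n", which);
              return true;
      }

      static void wake_gp_kthread(void)
      {
              printf("waking grace-period kthread\n");
      }

      static void migrate_callbacks(void)
      {
              bool needwake;

              needwake  = advance_cbs("outgoing CPU's");
              needwake |= advance_cbs("surviving CPU's");
              if (needwake)
                      wake_gp_kthread();      /* replaces the failsafe */
      }

      int main(void)
      {
              migrate_callbacks();
              return 0;
      }
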
    • rcu: Convert ->need_future_gp[] array to boolean · 6f576e28
      Committed by Paul E. McKenney
      There is no longer any need for ->need_future_gp[] to count the number of
      requests for future grace periods, so this commit converts the additions
      to assignments to "true" and reduces the size of each element to one byte.
      While we are in the area, fix an obsolete comment.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      6f576e28
    • rcu: Make rcu_future_needs_gp() check all ->need_future_gps[] elements · 0ae94e00
      Committed by Paul E. McKenney
      Currently, the rcu_future_needs_gp() function checks only the current
      element of the ->need_future_gp[] array, which might miss elements that
      were offset from the expected element, for example, due to races with
      the start or the end of a grace period.  This commit therefore makes
      rcu_future_needs_gp() use the need_any_future_gp() macro to check all
      of the elements of this array.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      0ae94e00
    • rcu: Make rcu_gp_cleanup() more accurately predict need for new GP · fb31340f
      Committed by Paul E. McKenney
      Currently, rcu_gp_cleanup() scans the rcu_node tree in order to reset
      state to reflect the end of the grace period.  It also checks to see
      whether a new grace period is needed, but in a number of cases, rather
      than directly cause the new grace period to be immediately started, it
      instead leaves the grace-period-needed state where various fail-safes
      can find it.  This works fine, but results in higher contention on the
      root rcu_node structure's ->lock, which is undesirable, and contention
      on that lock has recently become noticeable.
      
      This commit therefore makes rcu_gp_cleanup() immediately start a new
      grace period if there is any need for one.
      
      It is quite possible that it will later be necessary to throttle the
      grace-period rate, but that can be dealt with when and if the need arises.
      Reported-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      fb31340f
    • rcu: Make rcu_gp_kthread() check for early-boot activity · 5fe0a562
      Committed by Paul E. McKenney
      The rcu_gp_kthread() function immediately sleeps waiting to be notified
      of the need for a new grace period, which currently works because there
      are a number of code sequences that will provide the needed wakeup later.
      However, some of these code sequences need to acquire the root rcu_node
      structure's ->lock, and contention on that lock has started manifesting.
      This commit therefore makes rcu_gp_kthread() check for early-boot activity
      when it starts up, omitting the initial sleep in that case.
      Reported-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      5fe0a562
    • rcu: Add accessor macros for the ->need_future_gp[] array · c91a8675
      Committed by Paul E. McKenney
      Accessors for the ->need_future_gp[] array are currently open-coded,
      which makes them difficult to change.  To improve maintainability, this
      commit adds need_future_gp_mask() to compute the indexing mask from the
      array size, need_future_gp_element() to access the element corresponding
      to the specified grace-period number, and need_any_future_gp() to
      determine if any future grace period is needed.  This commit also applies
      need_future_gp_element() to existing open-coded single-element accesses.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      c91a8675
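
      A standalone illustration of the three accessors, written against a toy
      node structure.  The macro bodies below are simplified guesses that
      assume a power-of-two array size; the real definitions operate on
      struct rcu_node in the kernel's RCU tree code.

      #include <stdbool.h>
      #include <stdio.h>

      struct node {
              bool need_future_gp[4];         /* power-of-two request window */
      };

      #define ARRAY_SIZE(a)   (sizeof(a) / sizeof((a)[0]))

      /* Index mask derived from the array size. */
      #define need_future_gp_mask(rnp) \
              (ARRAY_SIZE((rnp)->need_future_gp) - 1)

      /* Element recording a request for grace-period number c. */
      #define need_future_gp_element(rnp, c) \
              ((rnp)->need_future_gp[(c) & need_future_gp_mask(rnp)])

      /* True if any future grace period has been requested on this node. */
      static bool need_any_future_gp(struct node *rnp)
      {
              unsigned int i;

              for (i = 0; i < ARRAY_SIZE(rnp->need_future_gp); i++)
                      if (rnp->need_future_gp[i])
                              return true;
              return false;
      }

      int main(void)
      {
              struct node rnp = { { false } };

              need_future_gp_element(&rnp, 102) = true;   /* request GP 102 */
              printf("any future GP needed? %d\n", need_any_future_gp(&rnp));
              return 0;
      }
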
    • rcu: Make rcu_start_future_gp()'s grace-period check more precise · 825a9911
      Committed by Paul E. McKenney
      The rcu_start_future_gp() function uses a sloppy check for a grace
      period being in progress, which works today because there are a number
      of code sequences that resolve the resulting races.  However, some of
      these race-resolution code sequences must acquire the root rcu_node
      structure's ->lock, and contention on that lock has started manifesting.
      This commit therefore makes rcu_start_future_gp() check more precise,
      eliminating the sloppy lockless check of the rcu_state structure's ->gpnum
      and ->completed fields.  The effect is that rcu_start_future_gp() will
      sometimes unnecessarily attempt to start a new grace period, but this
      overhead will be reduced later using funnel locking.
      Reported-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      825a9911
    • rcu: Improve non-root rcu_cbs_completed() accuracy · 9036c2ff
      Committed by Paul E. McKenney
      When rcu_cbs_completed() is invoked on a non-root rcu_node structure,
      it unconditionally assumes that two grace periods must complete before
      the callbacks at hand can be invoked.  This is overly conservative because
      if that non-root rcu_node structure believes that no grace period is in
      progress, and if the corresponding rcu_state structure's ->gpnum field
      has not yet been incremented, then these callbacks may safely be invoked
      after only one grace period has completed.
      
      This change is required to permit grace-period start requests to use
      funnel locking, which in turn is needed to reduce root rcu_node ->lock
      contention, which has been observed by Nick Piggin.  Furthermore, such
      contention will likely be increased by the merging of RCU-bh, RCU-preempt,
      and RCU-sched, so it makes sense to take steps to decrease it.
      
      This commit therefore improves the accuracy of rcu_cbs_completed() when
      invoked on a non-root rcu_node structure as described above.
      Reported-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      9036c2ff
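
      The one-versus-two grace-period decision can be sketched as follows.
      The structure layout and the exact condition are simplified assumptions
      for illustration, not the kernel's rcu_cbs_completed().

      #include <stdio.h>

      struct state { unsigned long gpnum; };                  /* global view */
      struct node  { unsigned long gpnum, completed; };       /* node's view */

      static unsigned long cbs_completed(struct state *rsp, struct node *rnp)
      {
              /*
               * If this node believes no grace period is in progress and the
               * global counter has not yet moved on, one more GP suffices.
               */
              if (rnp->gpnum == rnp->completed && rsp->gpnum == rnp->gpnum)
                      return rnp->completed + 1;

              /* Otherwise be conservative: wait for two full grace periods. */
              return rnp->completed + 2;
      }

      int main(void)
      {
              struct state rsp  = { .gpnum = 40 };
              struct node  idle = { .gpnum = 40, .completed = 40 };
              struct node  busy = { .gpnum = 41, .completed = 40 };

              printf("idle node target: %lu\n", cbs_completed(&rsp, &idle));
              printf("busy node target: %lu\n", cbs_completed(&rsp, &busy));
              return 0;
      }
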
    • rcu: Add leaf-node macros · 5b4c11d5
      Committed by Paul E. McKenney
      This commit adds rcu_first_leaf_node() that returns a pointer to
      the first leaf rcu_node structure in the specified RCU flavor and an
      rcu_is_leaf_node() that returns true iff the specified rcu_node structure
      is a leaf.  This commit also uses these macros where appropriate.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      5b4c11d5
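
      A toy version of the two helpers, assuming (as the kernel does) that the
      leaves occupy the tail of a flat node array; the constants and the node
      structure are invented for the sketch.

      #include <stdbool.h>
      #include <stdio.h>

      #define NUM_NODES  7    /* e.g. 1 root + 2 interior + 4 leaves */
      #define NUM_LEAVES 4

      struct node { int id; };

      static struct node nodes[NUM_NODES];

      /* First leaf: the leaves are the last NUM_LEAVES entries of the array. */
      static struct node *first_leaf_node(void)
      {
              return &nodes[NUM_NODES - NUM_LEAVES];
      }

      /* A node is a leaf iff it lies at or beyond the first leaf. */
      static bool is_leaf_node(struct node *np)
      {
              return np >= first_leaf_node();
      }

      int main(void)
      {
              printf("node[0] is leaf: %d\n", is_leaf_node(&nodes[0]));
              printf("node[5] is leaf: %d\n", is_leaf_node(&nodes[5]));
              return 0;
      }
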
    • rcu: Rename cond_resched_rcu_qs() to cond_resched_tasks_rcu_qs() · cee43939
      Committed by Paul E. McKenney
      Commit e31d28b6 ("trace: Eliminate cond_resched_rcu_qs() in favor
      of cond_resched()") substituted cond_resched() for the earlier call
      to cond_resched_rcu_qs().  However, the new-age cond_resched() does
      not do anything to help RCU-tasks grace periods because (1) RCU-tasks
      is only enabled when CONFIG_PREEMPT=y and (2) cond_resched() is a
      complete no-op when preemption is enabled.  This situation results
      in hangs when running the trace benchmarks.
      
      A number of potential fixes were discussed on LKML
      (https://lkml.kernel.org/r/20180224151240.0d63a059@vmware.local.home),
      including making cond_resched() not be a no-op; making cond_resched()
      not be a no-op, but only when running tracing benchmarks; reverting
      the aforementioned commit (which works because cond_resched_rcu_qs()
      does provide an RCU-tasks quiescent state); and adding a call to the
      scheduler/RCU rcu_note_voluntary_context_switch() function.  All were
      deemed unsatisfactory, either due to added cond_resched() overhead or
      due to magic functions inviting cargo culting.
      
      This commit renames cond_resched_rcu_qs() to cond_resched_tasks_rcu_qs(),
      which provides a clear hint as to what this function is doing and
      why and where it should be used, and then replaces the call to
      cond_resched() with cond_resched_tasks_rcu_qs() in the trace benchmark's
      benchmark_event_kthread() function.
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      cee43939
    • rcu: Parallelize expedited grace-period initialization · 25f3d7ef
      Committed by Paul E. McKenney
      The latency of RCU expedited grace periods grows with increasing numbers
      of CPUs, eventually failing to be all that expedited.  Much of the growth
      in latency is in the initialization phase, so this commit uses workqueues
      to carry out this initialization concurrently on an rcu_node-by-rcu_node
      basis.
      
      This change makes use of a new rcu_par_gp_wq because flushing a work
      item from another work item running from the same workqueue can result
      in deadlock.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Nicholas Piggin <npiggin@gmail.com>
      25f3d7ef
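
      The concurrency pattern, reduced to a userspace sketch in which plain
      threads stand in for the rcu_par_gp_wq workqueue and its per-rcu_node
      work items; the node structure and worker body are placeholders.

      #include <pthread.h>
      #include <stdio.h>

      #define NUM_LEAVES 4

      struct node { int id; };

      /* Per-node initialization, run concurrently for each leaf. */
      static void *init_one_node(void *arg)
      {
              struct node *np = arg;

              /* ... per-node expedited-GP setup would go here ... */
              printf("initialized node %d\n", np->id);
              return NULL;
      }

      int main(void)
      {
              struct node nodes[NUM_LEAVES];
              pthread_t workers[NUM_LEAVES];
              int i;

              for (i = 0; i < NUM_LEAVES; i++) {
                      nodes[i].id = i;
                      pthread_create(&workers[i], NULL, init_one_node,
                                     &nodes[i]);
              }
              for (i = 0; i < NUM_LEAVES; i++)
                      pthread_join(workers[i], NULL);  /* wait for all nodes */
              return 0;
      }
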
  3. 24 February 2018 (1 commit)
    • rcu: Create RCU-specific workqueues with rescuers · ad7c946b
      Committed by Paul E. McKenney
      RCU's expedited grace periods can participate in out-of-memory deadlocks
      due to all available system_wq kthreads being blocked and there not being
      memory available to create more.  This commit prevents such deadlocks
      by allocating an RCU-specific workqueue_struct at early boot time, and
      providing it with a rescuer to ensure forward progress.  This uses the
      shiny new init_rescuer() function provided by Tejun (but indirectly).
      
      This commit also causes SRCU to use this new RCU-specific
      workqueue_struct.  Note that SRCU's use of workqueues never blocks them
      waiting for readers, so this should be safe from a forward-progress
      viewpoint.  Note that this moves SRCU from system_power_efficient_wq
      to a normal workqueue.  In the unlikely event that this results in
      measurable degradation, a separate power-efficient workqueue will be
      created for SRCU.
      Reported-by: Prateek Sood <prsood@codeaurora.org>
      Reported-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      ad7c946b
  4. 21 February 2018 (5 commits)
  5. 16 February 2018 (1 commit)
  6. 12 December 2017 (1 commit)
  7. 29 November 2017 (8 commits)
  8. 28 November 2017 (1 commit)