提交 · 665f08f1ce9cf608a9435e11d66f55be4e72540a · OpenHarmony / kernel_linux

16 5月, 2018 17 次提交

rcu: Make rcu_start_this_gp() check for out-of-range requests · 665f08f1

由 Paul E. McKenney 提交于 4月 19, 2018

If rcu_start_this_gp() is invoked with a requested grace period more
than three in the future, then either the ->need_future_gp[] array
needs to be bigger or the caller needs to be repaired. This commit
therefore adds a WARN_ON_ONCE() checking for this condition.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

665f08f1

rcu: Add funnel locking to rcu_start_this_gp() · 360e0da6

由 Paul E. McKenney 提交于 4月 12, 2018

The rcu_start_this_gp() function had a simple form of funnel locking that
used only the leaves and root of the rcu_node tree, which is fine for
systems with only a few hundred CPUs, but sub-optimal for systems having
thousands of CPUs.  This commit therefore adds full-tree funnel locking.

This variant of funnel locking is unusual in the following ways:

1.	The leaf-level rcu_node structure's ->lock is held throughout.
	Other funnel-locking implementations drop the leaf-level lock
	before progressing to the next level of the tree.

2.	Funnel locking can be started at the root, which is convenient
	for code that already holds the root rcu_node structure's ->lock.
	Other funnel-locking implementations start at the leaves.

3.	If an rcu_node structure other than the initial one believes
	that a grace period is in progress, it is not necessary to
	go further up the tree.  This is because grace-period cleanup
	scans the full tree, so that marking the need for a subsequent
	grace period anywhere in the tree suffices -- but only if
	a grace period is currently in progress.

4.	It is possible that the RCU grace-period kthread has not yet
	started, and this case must be handled appropriately.

However, the general approach of using a tree to control lock contention
is still in place.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

360e0da6

rcu: Make rcu_start_future_gp() caller select grace period · 41e80595

由 Paul E. McKenney 提交于 4月 12, 2018

The rcu_accelerate_cbs() function selects a grace-period target, which
it uses to have rcu_segcblist_accelerate() assign numbers to recently
queued callbacks. Then it invokes rcu_start_future_gp(), which selects
a grace-period target again, which is a bit pointless. This commit
therefore changes rcu_start_future_gp() to take the grace-period target as
a parameter, thus avoiding double selection. This commit also changes
the name of rcu_start_future_gp() to rcu_start_this_gp() to reflect
this change in functionality, and also makes a similar change to the
name of trace_rcu_future_gp().
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

41e80595

rcu: Inline rcu_start_gp_advanced() into rcu_start_future_gp() · d5cd9685

由 Paul E. McKenney 提交于 4月 12, 2018

The rcu_start_gp_advanced() is invoked only from rcu_start_future_gp() and
much of its code is redundant when invoked from that context.  This commit
therefore inlines rcu_start_gp_advanced() into rcu_start_future_gp(),
then removes rcu_start_gp_advanced().
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

d5cd9685

rcu: Clear request other than RCU_GP_FLAG_INIT at GP end · a824a287

由 Paul E. McKenney 提交于 4月 19, 2018

Once the grace period has ended, any RCU_GP_FLAG_FQS requests are
irrelevant: The grace period has ended, so there is no longer any
point in forcing quiescent states in order to try to make it end sooner.
This commit therefore causes rcu_gp_cleanup() to clear any bits other
than RCU_GP_FLAG_INIT from ->gp_flags at the end of the grace period.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

a824a287

rcu: Cleanup, don't put ->completed into an int · a508aa59

由 Paul E. McKenney 提交于 4月 19, 2018

It is true that currently only the low-order two bits are used, so
there should be no problem given modern machines and compilers, but
good hygiene and maintainability dictates use of an unsigned long
instead of an int. This commit therefore makes this change.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

a508aa59

rcu: Switch __rcu_process_callbacks() to rcu_accelerate_cbs() · bd7af846

由 Paul E. McKenney 提交于 4月 11, 2018

The __rcu_process_callbacks() function currently checks to see if
the current CPU needs a grace period and also if there is any other
reason to kick off a new grace period. This is one of the fail-safe
checks that has been rendered unnecessary by the changes that increase
the accuracy of rcu_gp_cleanup()'s estimate as to whether another grace
period is required. Because this particular fail-safe involved acquiring
the root rcu_node structure's ->lock, which has seen excessive contention
in real life, this fail-safe needs to go.

However, one check must remain, namely the check for newly arrived
RCU callbacks that have not yet been associated with a grace period.
One might hope that the checks in __note_gp_changes(), which is invoked
indirectly from rcu_check_quiescent_state(), would suffice, but this
function won't be invoked at all if RCU is idle. It is therefore necessary
to replace the fail-safe checks with a simpler check for newly arrived
callbacks during an RCU idle period, which is exactly what this commit
does. This change removes the final call to rcu_start_gp(), so this
function is removed as well.

Note that lockless use of cpu_needs_another_gp() is racy, but that
these races are harmless in this case. If RCU really is idle, the
values will not change, so the return value from cpu_needs_another_gp()
will be correct. If RCU is not idle, the resulting redundant call to
rcu_accelerate_cbs() will be harmless, and might even have the benefit
of reducing grace-period latency a bit.

This commit also moves interrupt disabling into the "if" statement to
improve real-time response a bit.
Reported-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

bd7af846

rcu: Avoid __call_rcu_core() root rcu_node ->lock acquisition · a6058d85

由 Paul E. McKenney 提交于 4月 11, 2018

When __call_rcu_core() notices excessive numbers of callbacks pending
on the current CPU, we know that at least one of them is not yet
classified, namely the one that was just now queued.  Therefore, it
is not necessary to invoke rcu_start_gp() and thus not necessary to
acquire the root rcu_node structure's ->lock.  This commit therefore
replaces the rcu_start_gp() with rcu_accelerate_cbs(), thus replacing
an acquisition of the root rcu_node structure's ->lock with that of
this CPU's leaf rcu_node structure.

This decreases contention on the root rcu_node structure's ->lock.
Reported-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

a6058d85

rcu: Make rcu_migrate_callbacks wake GP kthread when needed · ec4eacce

由 Paul E. McKenney 提交于 4月 22, 2018

The rcu_migrate_callbacks() function invokes rcu_advance_cbs()
twice, ignoring the return value. This is OK at pressent because of
failsafe code that does the wakeup when needed. However, this failsafe
code acquires the root rcu_node structure's lock frequently, while
rcu_migrate_callbacks() does so only once per CPU-offline operation.

This commit therefore makes rcu_migrate_callbacks()
wake up the RCU GP kthread when either call to rcu_advance_cbs()
returns true, thus removing need for the failsafe code.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

ec4eacce

rcu: Convert ->need_future_gp[] array to boolean · 6f576e28

由 Paul E. McKenney 提交于 4月 18, 2018

There is no longer any need for ->need_future_gp[] to count the number of
requests for future grace periods, so this commit converts the additions
to assignments to "true" and reduces the size of each element to one byte.
While we are in the area, fix an obsolete comment.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

6f576e28

rcu: Make rcu_future_needs_gp() check all ->need_future_gps[] elements · 0ae94e00

由 Paul E. McKenney 提交于 4月 18, 2018

Currently, the rcu_future_needs_gp() function checks only the current
element of the ->need_future_gps[] array, which might miss elements that
were offset from the expected element, for example, due to races with
the start or the end of a grace period.  This commit therefore makes
rcu_future_needs_gp() use the need_any_future_gp() macro to check all
of the elements of this array.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

0ae94e00

rcu: Avoid losing ->need_future_gp[] values due to GP start/end races · 51af970d

由 Paul E. McKenney 提交于 4月 14, 2018

The rcu_cbs_completed() function provides the value of ->completed
at which new callbacks can safely be invoked.  This is recorded in
two-element ->need_future_gp[] arrays in the rcu_node structure, and
the elements of these arrays corresponding to the just-completed grace
period are zeroed at the end of that grace period.  However, the
rcu_cbs_completed() function can return the current ->completed value
plus either one or two, so it is possible for the corresponding
->need_future_gp[] entry to be cleared just after it was set, thus
losing a request for a future grace period.

This commit avoids this race by expanding ->need_future_gp[] to four
elements.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

51af970d

rcu: Make rcu_gp_cleanup() more accurately predict need for new GP · fb31340f

由 Paul E. McKenney 提交于 4月 12, 2018

Currently, rcu_gp_cleanup() scans the rcu_node tree in order to reset
state to reflect the end of the grace period. It also checks to see
whether a new grace period is needed, but in a number of cases, rather
than directly cause the new grace period to be immediately started, it
instead leaves the grace-period-needed state where various fail-safes
can find it. This works fine, but results in higher contention on the
root rcu_node structure's ->lock, which is undesirable, and contention
on that lock has recently become noticeable.

This commit therefore makes rcu_gp_cleanup() immediately start a new
grace period if there is any need for one.

It is quite possible that it will later be necessary to throttle the
grace-period rate, but that can be dealt with when and if.
Reported-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

fb31340f

rcu: Make rcu_gp_kthread() check for early-boot activity · 5fe0a562

由 Paul E. McKenney 提交于 4月 11, 2018

The rcu_gp_kthread() function immediately sleeps waiting to be notified
of the need for a new grace period, which currently works because there
are a number of code sequences that will provide the needed wakeup later.
However, some of these code sequences need to acquire the root rcu_node
structure's ->lock, and contention on that lock has started manifesting.
This commit therefore makes rcu_gp_kthread() check for early-boot activity
when it starts up, omitting the initial sleep in that case.
Reported-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

5fe0a562

rcu: Add accessor macros for the ->need_future_gp[] array · c91a8675

由 Paul E. McKenney 提交于 4月 18, 2018

Accessors for the ->need_future_gp[] array are currently open-coded,
which makes them difficult to change. To improve maintainability, this
commit adds need_future_gp_mask() to compute the indexing mask from the
array size, need_future_gp_element() to access the element corresponding
to the specified grace-period number, and need_any_future_gp() to
determine if any future grace period is needed. This commit also applies
need_future_gp_element() to existing open-coded single-element accesses.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

c91a8675

rcu: Make rcu_start_future_gp()'s grace-period check more precise · 825a9911

由 Paul E. McKenney 提交于 4月 11, 2018

The rcu_start_future_gp() function uses a sloppy check for a grace
period being in progress, which works today because there are a number
of code sequences that resolve the resulting races. However, some of
these race-resolution code sequences must acquire the root rcu_node
structure's ->lock, and contention on that lock has started manifesting.
This commit therefore makes rcu_start_future_gp() check more precise,
eliminating the sloppy lockless check of the rcu_state structure's ->gpnum
and ->completed fields. The effect is that rcu_start_future_gp() will
sometimes unnecessarily attempt to start a new grace period, but this
overhead will be reduced later using funnel locking.
Reported-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

825a9911

rcu: Improve non-root rcu_cbs_completed() accuracy · 9036c2ff

由 Paul E. McKenney 提交于 4月 10, 2018

When rcu_cbs_completed() is invoked on a non-root rcu_node structure,
it unconditionally assumes that two grace periods must complete before
the callbacks at hand can be invoked. This is overly conservative because
if that non-root rcu_node structure believes that no grace period is in
progress, and if the corresponding rcu_state structure's ->gpnum field
has not yet been incremented, then these callbacks may safely be invoked
after only one grace period has completed.

This change is required to permit grace-period start requests to use
funnel locking, which is in turn permitted to reduce root rcu_node ->lock
contention, which has been observed by Nick Piggin. Furthermore, such
contention will likely be increased by the merging of RCU-bh, RCU-preempt,
and RCU-sched, so it makes sense to take steps to decrease it.

This commit therefore improves the accuracy of rcu_cbs_completed() when
invoked on a non-root rcu_node structure as described above.
Reported-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NNicholas Piggin <npiggin@gmail.com>

9036c2ff

14 4月, 2018 14 次提交

kernel/kexec_file.c: allow archs to set purgatory load address · 3be3f61d