1. 09 Jun 2017, 6 commits
  2. 08 Jun 2017, 6 commits
  3. 03 May 2017, 2 commits
  4. 02 May 2017, 1 commit
  5. 27 Apr 2017, 1 commit
  6. 21 Apr 2017, 2 commits
    • rcu: Make non-preemptive schedule be Tasks RCU quiescent state · bcbfdd01
      Paul E. McKenney committed
      Currently, a call to schedule() acts as a Tasks RCU quiescent state
      only if a context switch actually takes place.  However, just the
      call to schedule() guarantees that the calling task has moved off of
      whatever tracing trampoline it might have been on previously.
      This commit therefore plumbs schedule()'s "preempt" parameter into
      rcu_note_context_switch(), which then records the Tasks RCU quiescent
      state, but only if this call to schedule() was -not- due to a preemption.
      
      To avoid adding overhead to the common-case context-switch path,
      this commit hides the rcu_note_context_switch() check under an existing
      non-common-case check.
      Suggested-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
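      A minimal standalone C model of the idea (illustrative only --
      simplified names, not the kernel implementation):

	#include <stdbool.h>
	#include <stdio.h>

	/* A per-task count of voluntary context switches stands in
	 * for the Tasks RCU quiescent-state report. */
	static unsigned long tasks_rcu_qs_count;

	static void rcu_note_context_switch_model(bool preempt)
	{
		/* common-case context-switch bookkeeping elided */
		if (!preempt)
			tasks_rcu_qs_count++;  /* voluntary: Tasks RCU QS */
	}

	static void schedule_model(bool preempt)
	{
		/* The call itself guarantees the task is off any
		 * tracing trampoline, context switch or not. */
		rcu_note_context_switch_model(preempt);
	}

	int main(void)
	{
		schedule_model(false);  /* blocking/cond_resched(): QS */
		schedule_model(true);   /* preemption: not a QS */
		printf("Tasks RCU QS reports: %lu\n", tasks_rcu_qs_count);
		return 0;
	}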
    • srcu: Parallelize callback handling · da915ad5
      Paul E. McKenney committed
      Peter Zijlstra proposed using SRCU to reduce mmap_sem contention [1,2];
      however, there are workloads that could result in a high volume of
      concurrent invocations of call_srcu(), which with current SRCU would
      result in excessive lock contention on the srcu_struct structure's
      ->queue_lock, which protects SRCU's callback lists.  This commit therefore
      moves SRCU to per-CPU callback lists, thus greatly reducing contention.
      
      Because a given SRCU instance no longer has a single centralized callback
      list, starting grace periods and invoking callbacks are both more complex
      than in the single-list Classic SRCU implementation.  Starting grace
      periods and handling callbacks are now handled using an srcu_node tree
      that is in some ways similar to the rcu_node trees used by RCU-bh,
      RCU-preempt, and RCU-sched (for example, the srcu_node tree shape is
      controlled by exactly the same Kconfig options and boot parameters that
      control the shape of the rcu_node tree).
      
      In addition, the old per-CPU srcu_array structure is now named srcu_data
      and contains an rcu_segcblist structure named ->srcu_cblist for its
      callbacks (and a spinlock to protect this).  The srcu_struct gets
      an srcu_gp_seq that is used to associate callback segments with the
      corresponding completion-time grace-period number.  These completion-time
      grace-period numbers are propagated up the srcu_node tree so that the
      grace-period workqueue handler can determine whether additional grace
      periods are needed on the one hand and where to look for callbacks that
      are ready to be invoked.
      
      The srcu_barrier() function must now wait on all instances of the per-CPU
      ->srcu_cblist.  Because each ->srcu_cblist is protected by ->lock,
      srcu_barrier() can remotely add the needed callbacks.  In theory,
      it could also remotely start grace periods, but in practice doing so
      is complex and racy.  And interestingly enough, it is never necessary
      for srcu_barrier() to start a grace period because srcu_barrier() only
      enqueues a callback when a callback is already present--and it turns out
      that a grace period has to have already been started for this pre-existing
      callback.  Furthermore, it is only the callback that srcu_barrier()
      needs to wait on, not any particular grace period.  Therefore, a new
      rcu_segcblist_entrain() function enqueues the srcu_barrier() function's
      callback into the same segment occupied by the last pre-existing callback
      in the list.  The special case where all the pre-existing callbacks are
      on a different list (because they are in the process of being invoked)
      is handled by enqueuing srcu_barrier()'s callback into the RCU_DONE_TAIL
      segment, relying on the done-callbacks check that takes place after all
      callbacks are invoked.
      
      Note that the readers use the same algorithm as before, and that there
      is a separate srcu_idx that tells the readers which counter to increment.
      This unfortunately cannot be combined with srcu_gp_seq because the two
      need to be incremented at different times.
      
      This commit introduces some ugly #ifdefs in rcutorture.  These will go
      away when I feel good enough about Tree SRCU to ditch Classic SRCU.
      
      Some crude performance comparisons, courtesy of a quickly hacked rcuperf
      asynchronous-grace-period capability:
      
      			Callback Queuing Overhead
      			-------------------------
      	# CPUS		Classic SRCU	Tree SRCU
      	------          ------------    ---------
      	     2              0.349 us     0.342 us
      	    16             31.66  us     0.4   us
      	    41             ---------     0.417 us
      
      The times are the 90th percentiles, a statistic that was chosen to reject
      the overheads of the occasional srcu_barrier() call needed to avoid OOMing
      the test machine.  The rcuperf test hangs when running Classic SRCU at 41
      CPUs, hence the line of dashes.  Despite the hacks to both the rcuperf code
      and the statistics, this is a convincing demonstration of Tree SRCU's
      performance and scalability advantages.
      
      [1] https://lwn.net/Articles/309030/
      [2] https://patchwork.kernel.org/patch/5108281/
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      [ paulmck: Fix initialization if synchronize_srcu_expedited() called first. ]
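      A compilable model of the per-CPU layout and the entrain logic
      described above (illustrative names and simplifications, not the
      kernel code):

	#include <stdio.h>

	/* Four callback segments, as in rcu_segcblist. */
	enum seg { RCU_DONE_TAIL, RCU_WAIT_TAIL, RCU_NEXT_READY_TAIL,
		   RCU_NEXT_TAIL, NSEGS };

	struct cb { struct cb *next; };

	struct segcblist {              /* stand-in for rcu_segcblist */
		struct cb *segs[NSEGS]; /* simplified: a list per segment */
	};

	struct srcu_data_model {        /* per-CPU, replaces srcu_array */
		struct segcblist srcu_cblist;  /* this CPU's callbacks */
		/* spinlock protecting srcu_cblist omitted */
	};

	/* Model of rcu_segcblist_entrain(): put srcu_barrier()'s callback
	 * into the segment holding the last pre-existing callback, falling
	 * back to RCU_DONE_TAIL when every segment is empty (callbacks are
	 * off being invoked). */
	static void entrain(struct segcblist *l, struct cb *c)
	{
		int i;

		c->next = NULL;
		for (i = NSEGS - 1; i > RCU_DONE_TAIL; i--)
			if (l->segs[i])
				break;
		if (!l->segs[i]) {
			l->segs[i] = c;
		} else {
			struct cb *p = l->segs[i];

			while (p->next)
				p = p->next;
			p->next = c;
		}
	}

	int main(void)
	{
		struct srcu_data_model sdp = { { { NULL } } };
		struct cb pre = { NULL }, barrier = { NULL };

		sdp.srcu_cblist.segs[RCU_WAIT_TAIL] = &pre;
		entrain(&sdp.srcu_cblist, &barrier);
		printf("barrier cb entrained behind last cb: %d\n",
		       sdp.srcu_cblist.segs[RCU_WAIT_TAIL]->next == &barrier);
		return 0;
	}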
  7. 20 Apr 2017, 4 commits
  8. 19 Apr 2017, 12 commits
  9. 11 Apr 2017, 2 commits
    • rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not work · 03ecd3f4
      Steven Rostedt (VMware) committed
      Tracing uses rcu_irq_enter() as a way to make sure that RCU is watching when
      it needs to use rcu_read_lock() and friends. This is because tracing can
      happen as RCU is about to enter user space, or about to go idle, and RCU
      does not watch for RCU read side critical sections as it makes the
      transition.
      
      There is a small window within the RCU infrastructure in which
      rcu_irq_enter() itself will not work.  If tracing occurs in that
      section and tries to use rcu_irq_enter(), it will break.
      
      Originally, this happened with the stack_tracer, because it calls
      save_stack_trace() when it encounters stack usage greater than any
      it had encountered previously.  There was a case where that happened
      in the RCU section where rcu_irq_enter() did not work, and lockdep
      complained loudly about it.  To fix it, stack tracing added a way to
      be disabled, and RCU disables stack tracing during the critical
      section in which rcu_irq_enter() is inoperable.  This solution worked,
      but there are other users of rcu_irq_enter(), and it would be a good
      idea for RCU to provide a way to let others know that rcu_irq_enter()
      will not work -- for example, in trace events.
      
      Another helpful aspect of this change is that it also moves the per-CPU
      variable checked in the RCU critical section into the same cache
      locality as the other RCU per-CPU variables used in that same location.
      
      I'm keeping the stack_trace_disable() code, as it could still be used
      in the future by places that really need to disable stack tracing.  And
      since it's only a static inline, it won't take up any kernel text if it
      is not used.
      
      Link: http://lkml.kernel.org/r/20170405093207.404f8deb@gandalf.local.home
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
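      A sketch of the mechanism the title describes (hypothetical model --
      the flag and function names here are illustrative, not the kernel's
      exact API):

	#include <stdbool.h>
	#include <stdio.h>

	/* Flag RCU would set around the window in which rcu_irq_enter()
	 * cannot be used; tracers consult it and bail out. */
	static bool rcu_irq_enter_disabled_flag;

	static void trace_event_model(const char *ev)
	{
		if (rcu_irq_enter_disabled_flag)
			return;  /* cannot safely make RCU watch: skip */
		/* rcu_irq_enter() would go here to make RCU watch */
		printf("trace: %s\n", ev);
	}

	int main(void)
	{
		trace_event_model("traced");         /* emitted */
		rcu_irq_enter_disabled_flag = true;  /* unsafe window */
		trace_event_model("dropped");        /* silently skipped */
		rcu_irq_enter_disabled_flag = false;
		return 0;
	}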
    • rcu: Fix dyntick-idle tracing · a278d471
      Paul E. McKenney committed
      The tracing subsystem started using rcu_irq_enter() and rcu_irq_exit()
      (with my blessing) to allow the current _rcuidle alternative tracepoint
      name to be dispensed with while still maintaining good performance.
      Unfortunately, this causes RCU's dyntick-idle entry code's tracing to
      appear to RCU like an interrupt that occurs where RCU is not designed
      to handle interrupts.
      
      This commit fixes this problem by moving the zeroing of ->dynticks_nesting
      after the offending trace_rcu_dyntick() statement, which narrows the
      window of vulnerability to a pair of adjacent statements that are now
      marked with comments to that effect.
      
      Link: http://lkml.kernel.org/r/20170405093207.404f8deb@gandalf.local.home
      Link: http://lkml.kernel.org/r/20170405193928.GM1600@linux.vnet.ibm.com
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
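      The fix is purely an ordering one; a small model of the constraint
      (illustrative, not the kernel code):

	#include <assert.h>
	#include <stdio.h>

	static int dynticks_nesting = 1;  /* nonzero: RCU is watching */

	static void trace_rcu_dyntick_model(const char *s)
	{
		/* tracepoints may fire only while RCU is watching */
		assert(dynticks_nesting > 0);
		printf("trace: %s\n", s);
	}

	static void rcu_eqs_enter_model(void)
	{
		trace_rcu_dyntick_model("Start");  /* trace first... */
		dynticks_nesting = 0;  /* ...then zero the nesting count;
					* reversing these two statements
					* reintroduces the bug */
	}

	int main(void)
	{
		rcu_eqs_enter_model();
		return 0;
	}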
  10. 02 Mar 2017, 3 commits
    • sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> · b17b0153
      Ingo Molnar committed
      We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which
      will have to be picked up from other headers and a couple of .c files.
      
      Create a trivial placeholder <linux/sched/debug.h> file that just
      maps to <linux/sched.h> to make this patch obviously correct and
      bisectable.
      
      Include the new header in the files that are going to need it.
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
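      A sketch of such a placeholder (per the commit's description; the
      include-guard macro name is an assumption):

	/* include/linux/sched/debug.h -- trivial placeholder that just
	 * maps back to <linux/sched.h> until code is actually moved */
	#ifndef _LINUX_SCHED_DEBUG_H
	#define _LINUX_SCHED_DEBUG_H

	#include <linux/sched.h>

	#endif /* _LINUX_SCHED_DEBUG_H */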
    • sched/headers: Prepare for new header dependencies before moving code to <uapi/linux/sched/types.h> · ae7e81c0
      Ingo Molnar committed
      We are going to move scheduler ABI details to <uapi/linux/sched/types.h>,
      which will be used from a number of .c files.
      
      Create an empty placeholder header that maps to <linux/types.h>.
      
      Include the new header in the files that are going to need it.
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
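      The same pattern, sketched for the UAPI header (guard macro again
      an assumption):

	/* include/uapi/linux/sched/types.h -- empty placeholder that
	 * maps to <linux/types.h> until the ABI details are moved */
	#ifndef _UAPI_LINUX_SCHED_TYPES_H
	#define _UAPI_LINUX_SCHED_TYPES_H

	#include <linux/types.h>

	#endif /* _UAPI_LINUX_SCHED_TYPES_H */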
    • rcu: Separate the RCU synchronization types and APIs into <linux/rcupdate_wait.h> · f9411ebe
      Ingo Molnar committed
      So rcupdate.h is a pretty complex header; in particular, it includes
      <linux/completion.h>, which includes <linux/wait.h> -- creating a
      dependency that pulls <linux/wait.h> into <linux/sched.h> and
      prevents the isolation of <linux/sched.h> from the derived
      <linux/wait.h> header.
      
      Solve part of the problem by decoupling rcupdate.h from completions:
      this can be done by separating out the rcu_synchronize types and APIs,
      and updating their usage sites.
      
      Since these are mostly RCU-internal types, this will not just simplify
      <linux/sched.h>'s dependencies, but will also make the hundreds of
      .c files that include rcupdate.h but not completions or wait.h build
      faster.
      
      ( For rcutiny this means that two dependent APIs have to be uninlined,
        but that shouldn't be much of a problem as they are rare variants. )
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
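      A sketch of the resulting split (rcu_synchronize and wakeme_after_rcu
      are the kernel's actual names; the surrounding details are elided):

	/* <linux/rcupdate_wait.h>: the completion-based pieces pulled
	 * out of rcupdate.h, so that rcupdate.h itself no longer has
	 * to drag in <linux/completion.h>. */
	#include <linux/completion.h>
	#include <linux/rcupdate.h>

	struct rcu_synchronize {
		struct rcu_head head;
		struct completion completion;
	};

	void wakeme_after_rcu(struct rcu_head *head);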
  11. 24 Jan 2017, 1 commit