提交 · bc17ea1092c48227334a311a130c1a41966333fe · openeuler / raspberrypi-kernel

23 7月, 2015 1 次提交

rcu: Use WRITE_ONCE in RCU_INIT_POINTER · 155d1d12

由 Peter Zijlstra 提交于 6月 02, 2015

For the paranoid amongst us GCC would be in its right to use byte stores
to write our NULL value, tell it not to do that.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

155d1d12

16 7月, 2015 1 次提交

rcu: Deinline rcu_read_lock_sched_held() if DEBUG_LOCK_ALLOC · d5671f6b

由 Denys Vlasenko 提交于 5月 26, 2015

DEBUG_LOCK_ALLOC=y is not a production setting, but it is
not very unusual either. Many developers routinely
use kernels built with it enabled.

Apart from being selected by hand, it is also auto-selected by
PROVE_LOCKING "Lock debugging: prove locking correctness" and
LOCK_STAT "Lock usage statistics" config options.
LOCK STAT is necessary for "perf lock" to work.

I wouldn't spend too much time optimizing it, but this particular
function has a very large cost in code size: when it is deinlined,
code size decreases by 830,000 bytes:

    text     data      bss       dec     hex filename
85674192 22294776 20627456 128596424 7aa39c8 vmlinux.before
84837612 22294424 20627456 127759492 79d7484 vmlinux

(with this config: http://busybox.net/~vda/kernel_config)
Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: Tejun Heo <tj@kernel.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: linux-kernel@vger.kernel.org
Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

d5671f6b

07 7月, 2015 1 次提交

rcu: Drop RCU_USER_QS in favor of NO_HZ_FULL · d1ec4c34

由 Paul E. McKenney 提交于 5月 13, 2015

The RCU_USER_QS Kconfig parameter is now just a synonym for NO_HZ_FULL,
so this commit eliminates RCU_USER_QS, replacing all uses with NO_HZ_FULL.
Reported-by: NFrederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>

d1ec4c34

28 5月, 2015 5 次提交

rcu: Move lockless_dereference() out of rcupdate.h · 0a04b016

由 Peter Zijlstra 提交于 5月 27, 2015

I want to use lockless_dereference() from seqlock.h, which would mean
including rcupdate.h from it, however rcupdate.h already includes
seqlock.h.

Avoid this by moving lockless_dereference() into compiler.h. This is
somewhat tricky since it uses smp_read_barrier_depends() which isn't
available there, but its a CPP macro so we can get away with it.

The alternative would be moving it into asm/barrier.h, but that would
be updating each arch (I can do if people feel that is more
appropriate).

Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

0a04b016

rcu: Further shrink Tiny RCU by making empty functions static inlines · 51952bc6

由 Paul E. McKenney 提交于 4月 21, 2015

The Tiny RCU counterparts to rcu_idle_enter(), rcu_idle_exit(),
rcu_irq_enter(), and rcu_irq_exit() are empty functions, but each has
EXPORT_SYMBOL_GPL(), which needlessly consumes extra memory, especially
in kernels built with module support. This commit therefore moves these
functions to static inlines in rcutiny.h, removing the need for exports.

This won't affect the size of the tiniest kernels, which are likely
built without module support, but might help semi-tiny kernels that
might include module support.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: NJosh Triplett <josh@joshtriplett.org>

51952bc6

rcu: Eliminate a few CONFIG_RCU_NOCB_CPU_ALL #ifdefs · 3382adbc

由 Paul E. McKenney 提交于 3月 04, 2015

This commit converts several CONFIG_RCU_NOCB_CPU_ALL #ifdefs to
instead use IS_ENABLED().  This change should help avoid hiding
code from compiler diagnostics.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

3382adbc

rcu: Eliminate array-index-based RCU primitives · 1ebee801

由 Paul E. McKenney 提交于 4月 19, 2015

Now that rcu_access_index() and rcu_dereference_index_check() are no
longer used, the commit removes them from the RCU API.  This means that
RCU's data dependencies now involve only pointers, give or take the
occasional cast to and then back from an integer type to do pointer
arithmetic.  This in turn eliminates the need for a number of operations
on values carrying RCU data dependencies.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: linux-edac@vger.kernel.org
Cc: Tony Luck <tony.luck@intel.com>
Acked-by: NBorislav Petkov <bp@suse.de>

1ebee801

rcu: Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE() · 7d0ae808

由 Paul E. McKenney 提交于 3月 03, 2015

This commit moves from the old ACCESS_ONCE() API to the new READ_ONCE()
and WRITE_ONCE() APIs.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck:  Updated to include kernel/torture.c as suggested by Jason Low. ]

7d0ae808

22 4月, 2015 1 次提交

tick: Nohz: Rework next timer evaluation · c1ad348b

由 Thomas Gleixner 提交于 4月 14, 2015

The evaluation of the next timer in the nohz code is based on jiffies
while all the tick internals are nano seconds based. We have also to
convert hrtimer nanoseconds to jiffies in the !highres case. That's
just wrong and introduces interesting corner cases.

Turn it around and convert the next timer wheel timer expiry and the
rcu event to clock monotonic and base all calculations on
nanoseconds. That identifies the case where no timer is pending
clearly with an absolute expiry value of KTIME_MAX.

Makes the code more readable and gets rid of the jiffies magic in the
nohz code.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Link: http://lkml.kernel.org/r/20150414203502.184198593@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

c1ad348b

13 3月, 2015 1 次提交

rcu: Handle outgoing CPUs on exit from idle loop · 88428cc5

由 Paul E. McKenney 提交于 1月 28, 2015

This commit informs RCU of an outgoing CPU just before that CPU invokes
arch_cpu_idle_dead() during its last pass through the idle loop (via a
new CPU_DYING_IDLE notifier value). This change means that RCU need not
deal with outgoing CPUs passing through the scheduler after informing
RCU that they are no longer online. Note that removing the CPU from
the rcu_node ->qsmaskinit bit masks is done at CPU_DYING_IDLE time,
and orphaning callbacks is still done at CPU_DEAD time, the reason being
that at CPU_DEAD time we have another CPU that can adopt them.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

88428cc5

04 3月, 2015 2 次提交

rcu: Reverse rcu_dereference_check() conditions · b826565a

由 Paul E. McKenney 提交于 2月 02, 2015

The rcu_dereference_check() family of primitives evaluates the RCU
lockdep expression first, and only then evaluates the expression passed
in.  This works fine normally, but can potentially fail in environments
(such as NMI handlers) where lockdep cannot be invoked.  The problem is
that even if the expression passed in is "1", the compiler would need to
prove that the RCU lockdep expression (rcu_read_lock_held(), for example)
is free of side effects in order to be able to elide it.  Given that
rcu_read_lock_held() is sometimes separately compiled, the compiler cannot
always use this optimization.

This commit therefore reverse the order of evaluation, so that the
expression passed in is evaluated first, and the RCU lockdep expression is
evaluated only if the passed-in expression evaluated to false, courtesy
of the C-language short-circuit boolean evaluation rules.  This compells
the compiler to forego executing the RCU lockdep expression in cases
where the passed-in expression evaluates to "1" at compile time, so that
(for example) rcu_dereference_raw() can be guaranteed to execute safely
within an NMI handler.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>

b826565a

rcu: Improve diagnostics for blocked critical sections in irq · d24209bb

由 Paul E. McKenney 提交于 1月 21, 2015

If an RCU read-side critical section occurs within an interrupt handler
or a softirq handler, it cannot have been preempted. Therefore, there is
a check in rcu_read_unlock_special() checking for this error. However,
when this check triggers, it lacks diagnostic information. This commit
therefore moves rcu_read_unlock()'s lockdep annotation to follow the
call to __rcu_read_unlock() and changes rcu_read_unlock_special()'s
WARN_ON_ONCE() to an lockdep_rcu_suspicious() in order to locate where
the offending RCU read-side critical section began. In addition, the
value of the ->rcu_read_unlock_special field is printed.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

d24209bb

27 2月, 2015 2 次提交

rcu: Add Kconfig option to expedite grace periods during boot · ee42571f

由 Paul E. McKenney 提交于 2月 19, 2015

This commit adds a CONFIG_RCU_EXPEDITE_BOOT Kconfig parameter
that emulates a very early boot rcu_expedite_gp().  A late-boot
call to rcu_end_inkernel_boot() will provide the corresponding
rcu_unexpedite_gp().  The late-boot call to rcu_end_inkernel_boot()
should be made just before init is spawned.

According to Arjan:

> To show the boot time, I'm using the timestamp of the "Write protecting"
> line, that's pretty much the last thing we print prior to ring 3 execution.
>
> A kernel with default RCU behavior (inside KVM, only virtual devices)
> looks like this:
>
> [    0.038724] Write protecting the kernel read-only data: 10240k
>
> a kernel with expedited RCU (using the command line option, so that I
> don't have to recompile between measurements and thus am completely
> oranges-to-oranges)
>
> [    0.031768] Write protecting the kernel read-only data: 10240k
>
> which, in percentage, is an 18% improvement.
Reported-by: NArjan van de Ven <arjan@linux.intel.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NArjan van de Ven <arjan@linux.intel.com>

ee42571f

rcu: Provide rcu_expedite_gp() and rcu_unexpedite_gp() · 0d39482c

由 Paul E. McKenney 提交于 2月 18, 2015

Currently, expediting of normal synchronous grace-period primitives
(synchronize_rcu() and friends) is controlled by the rcu_expedited()
boot/sysfs parameter.  This works well, but does not handle nesting.
This commit therefore provides rcu_expedite_gp() to enable expediting
and rcu_unexpedite_gp() to cancel a prior rcu_expedite_gp(), both of
which support nesting.
Reported-by: NArjan van de Ven <arjan@linux.intel.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

0d39482c

26 2月, 2015 1 次提交

rcu: Consolidate rcu_synchronize and wakeme_after_rcu() · ee376dbd

由 Paul E. McKenney 提交于 1月 10, 2015

There are currently duplicate identical definitions of the
rcu_synchronize() structure and the wakeme_after_rcu() function.
Thie commit therefore consolidates them.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

ee376dbd

16 1月, 2015 1 次提交

rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors · 5cd37193

由 Paul E. McKenney 提交于 12月 13, 2014

Although cond_resched_rcu_qs() only applies to TASKS_RCU, it is used
in places where it would be useful for it to apply to the normal RCU
flavors, rcu_preempt, rcu_sched, and rcu_bh.  This is especially the
case for workloads that aggressively overload the system, particularly
those that generate large numbers of RCU updates on systems running
NO_HZ_FULL CPUs.  This commit therefore communicates quiescent states
from cond_resched_rcu_qs() to the normal RCU flavors.

Note that it is unfortunately necessary to leave the old ->passed_quiesce
mechanism in place to allow quiescent states that apply to only one
flavor to be recorded.  (Yes, we could decrement ->rcu_qs_ctr_snap in
that case, but that is not so good for debugging of RCU internals.)
In addition, if one of the RCU flavor's grace period has stalled, this
will invoke rcu_momentary_dyntick_idle(), resulting in a heavy-weight
quiescent state visible from other CPUs.
Reported-by: NSasha Levin <sasha.levin@oracle.com>
Reported-by: NDave Jones <davej@redhat.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Merge commit from Sasha Levin fixing a bug where __this_cpu()
  was used in preemptible code. ]

5cd37193

07 1月, 2015 1 次提交

rcupdate: Replace smp_read_barrier_depends() with lockless_dereference() · ac59853c

由 Pranith Kumar 提交于 11月 13, 2014

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
Signed-off-by: NPranith Kumar <bobby.prani@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

ac59853c

14 11月, 2014 3 次提交

rcu: More info about potential deadlocks with rcu_read_unlock() · ce36f2f3

由 Oleg Nesterov 提交于 9月 28, 2014

The comment above rcu_read_unlock() explains the potential deadlock
if the caller holds one of the locks taken by rt_mutex_unlock() paths,
but it is not clear from this documentation that any lock which can
be taken from interrupt can lead to deadlock as well and we need to
take rt_mutex_lock() into account too.

The problem is that rt_mutex_lock() takes wait_lock without disabling
irqs, and thus an interrupt taking some LOCK can obviously race with
rcu_read_unlock_special() called with the same LOCK held.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

ce36f2f3

rcu: Optimize cond_resched_rcu_qs() · b6331ae8

由 Paul E. McKenney 提交于 10月 04, 2014

The current implementation of cond_resched_rcu_qs() can invoke
rcu_note_voluntary_context_switch() twice in the should_resched()
case, once via the call to __schedule() and once directly. However, as
noted by Joe Lawrence in a patch to the team subsystem, cond_resched()
returns an indication as to whether or not the call to __schedule()
actually happened. This commit therefore changes cond_resched_rcu_qs()
so as to invoke rcu_note_voluntary_context_switch() only when __schedule()
was not called.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

b6331ae8

rcu: Add sparse check for RCU_INIT_POINTER() · 1a6c9b26

由 Pranith Kumar 提交于 9月 25, 2014

Add a sparse check when RCU_INIT_POINTER() is used to assign a non __rcu
annotated pointer.
Signed-off-by: NPranith Kumar <bobby.prani@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

1a6c9b26

04 11月, 2014 2 次提交

rcu: Remove "cpu" argument to rcu_needs_cpu() · aa6da514

由 Paul E. McKenney 提交于 10月 21, 2014

The "cpu" argument to rcu_needs_cpu() is always the current CPU, so drop
it. This in turn allows the "cpu" argument to rcu_cpu_has_callbacks()
to be removed, which allows the uses of "cpu" in both functions to be
replaced with a this_cpu_ptr(). Again, the anticipated cross-CPU uses
of these functions has been replaced by NO_HZ_FULL.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: NPranith Kumar <bobby.prani@gmail.com>

aa6da514

rcu: Remove "cpu" argument to rcu_check_callbacks() · c3377c2d

由 Paul E. McKenney 提交于 10月 21, 2014

The "cpu" argument was kept around on the off-chance that RCU might
offload scheduler-clock interrupts. However, this offload approach
has been replaced by NO_HZ_FULL, which offloads -all- RCU processing
from qualifying CPUs. It is therefore time to remove the "cpu" argument
to rcu_check_callbacks(), which this commit does.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: NPranith Kumar <bobby.prani@gmail.com>

c3377c2d

30 10月, 2014 1 次提交

rcu: Remove redundant TREE_PREEMPT_RCU config option · 28f6569a

由 Pranith Kumar 提交于 9月 22, 2014

PREEMPT_RCU and TREE_PREEMPT_RCU serve the same function after
TINY_PREEMPT_RCU has been removed. This patch removes TREE_PREEMPT_RCU
and uses PREEMPT_RCU config option in its place.
Signed-off-by: NPranith Kumar <bobby.prani@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

28f6569a

29 10月, 2014 1 次提交

rcu: Provide counterpart to rcu_dereference() for non-RCU situations · 54ef6df3

由 Paul E. McKenney 提交于 10月 27, 2014

Although rcu_dereference() and friends can be used in situations where
object lifetimes are being managed by something other than RCU, the
resulting sparse and lockdep-RCU noise can be annoying. This commit
therefore supplies a lockless_dereference(), which provides the
protection for dereferences without the RCU-related debugging noise.
Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

54ef6df3

17 9月, 2014 2 次提交

rcutorture: Rename rcutorture_runnable parameter · 59da22a0

由 Paul E. McKenney 提交于 9月 12, 2014

This commit changes rcutorture_runnable to torture_runnable, which is
consistent with the names of the other parameters and is a bit shorter
as well.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

59da22a0

rcu: Fix attempt to avoid unsolicited offloading of callbacks · f4579fc5

由 Paul E. McKenney 提交于 7月 25, 2014

Commit b58cc46c (rcu: Don't offload callbacks unless specifically
requested) failed to adjust the callback lists of the CPUs that are
known to be no-CBs CPUs only because they are also nohz_full= CPUs.
This failure can result in callbacks that are posted during early boot
getting stranded on nxtlist for CPUs whose no-CBs property becomes
apparent late, and there can also be spurious warnings about offline
CPUs posting callbacks.

This commit fixes these problems by adding an early-boot rcu_init_nohz()
that properly initializes the no-CBs CPUs.

Note that kernels built with CONFIG_RCU_NOCB_CPU_ALL=y or with
CONFIG_RCU_NOCB_CPU=n do not exhibit this bug.  Neither do kernels
booted without the nohz_full= boot parameter.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: NPranith Kumar <bobby.prani@gmail.com>
Tested-by: NPaul Gortmaker <paul.gortmaker@windriver.com>

f4579fc5

08 9月, 2014 9 次提交

rcu: Per-CPU operation cleanups to rcu_*_qs() functions · 284a8c93

由 Paul E. McKenney 提交于 8月 14, 2014

The rcu_bh_qs(), rcu_preempt_qs(), and rcu_sched_qs() functions use
old-style per-CPU variable access and write to ->passed_quiesce even
if it is already set. This commit therefore updates to use the new-style
per-CPU variable access functions and avoids the spurious writes.
This commit also eliminates the "cpu" argument to these functions because
they are always invoked on the indicated CPU.
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

284a8c93

rcu: Remove redundant preempt_disable() from rcu_note_voluntary_context_switch() · 01a81330

由 Paul E. McKenney 提交于 8月 05, 2014

In theory, synchronize_sched() requires a read-side critical section
to order against.  In practice, preemption can be thought of as
being disabled across every machine instruction, at least for those
machine instructions that are not in the idle loop and not on offline
CPUs.  So this commit removes the redundant preempt_disable() from
rcu_note_voluntary_context_switch().

Please note that the single instruction in question is the store of
zero to ->rcu_tasks_holdout.  The "if" is simply a performance optimization
that avoids unnecessary stores.  To see this, keep in mind that both
the "if" condition and the store are in a quiescent state.  Therefore,
even if the task is preempted for a full grace period (presumably due
to its having done a context switch beforehand), the store will be
recording a legitimate quiescent state.
Reported-by: NLai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

Conflicts:
	include/linux/rcupdate.h

01a81330

rcutorture: Add torture tests for RCU-tasks · 69c60455

由 Paul E. McKenney 提交于 7月 01, 2014

This commit adds torture tests for RCU-tasks. It also fixes a bug that
would segfault for an RCU flavor lacking a callback-barrier function.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: NJosh Triplett <josh@joshtriplett.org>

69c60455

rcu: Make TASKS_RCU handle tasks that are almost done exiting · 3f95aa81

由 Paul E. McKenney 提交于 8月 04, 2014

Once a task has passed exit_notify() in the do_exit() code path, it
is no longer on the task lists, and is therefore no longer visible
to rcu_tasks_kthread(). This means that an almost-exited task might
be preempted while within a trampoline, and this task won't be waited
on by rcu_tasks_kthread(). This commit fixes this bug by adding an
srcu_struct. An exiting task does srcu_read_lock() just before calling
exit_notify(), and does the corresponding srcu_read_unlock() after
doing the final preempt_disable(). This means that rcu_tasks_kthread()
can do synchronize_srcu() to wait for all mostly-exited tasks to reach
their final preempt_disable() region, and then use synchronize_sched()
to wait for those tasks to finish exiting.
Reported-by: NOleg Nesterov <oleg@redhat.com>
Suggested-by: NLai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

3f95aa81

rcu: Add synchronous grace-period waiting for RCU-tasks · 53c6d4ed

由 Paul E. McKenney 提交于 7月 01, 2014

It turns out to be easier to add the synchronous grace-period waiting
functions to RCU-tasks than to work around their absense in rcutorture,
so this commit adds them. The key point is that the existence of
call_rcu_tasks() means that rcutorture needs an rcu_barrier_tasks().
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

53c6d4ed

rcu: Provide cond_resched_rcu_qs() to force quiescent states in long loops · bde6c3aa

由 Paul E. McKenney 提交于 7月 01, 2014

RCU-tasks requires the occasional voluntary context switch
from CPU-bound in-kernel tasks.  In some cases, this requires
instrumenting cond_resched().  However, there is some reluctance
to countenance unconditionally instrumenting cond_resched() (see
http://lwn.net/Articles/603252/), so this commit creates a separate
cond_resched_rcu_qs() that may be used in place of cond_resched() in
locations prone to long-duration in-kernel looping.

This commit currently instruments only RCU-tasks.  Future possibilities
include also instrumenting RCU, RCU-bh, and RCU-sched in order to reduce
IPI usage.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

bde6c3aa

rcu: Add call_rcu_tasks() · 8315f422

由 Paul E. McKenney 提交于 6月 27, 2014

This commit adds a new RCU-tasks flavor of RCU, which provides
call_rcu_tasks(). This RCU flavor's quiescent states are voluntary
context switch (not preemption!) and userspace execution (not the idle
loop -- use some sort of schedule_on_each_cpu() if you need to handle the
idle tasks. Note that unlike other RCU flavors, these quiescent states
occur in tasks, not necessarily CPUs. Includes fixes from Steven Rostedt.

This RCU flavor is assumed to have very infrequent latency-tolerant
updaters. This assumption permits significant simplifications, including
a single global callback list protected by a single global lock, along
with a single task-private linked list containing all tasks that have not
yet passed through a quiescent state. If experience shows this assumption
to be incorrect, the required additional complexity will be added.
Suggested-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

8315f422

rcu: Uninline rcu_read_lock_held() · 85b39d30

由 Oleg Nesterov 提交于 7月 08, 2014

This commit uninlines rcu_read_lock_held(). According to "size vmlinux"
this saves 28549 in .text:

	- 5541731 3014560 14757888 23314179
	+ 5513182 3026848 14757888 23297918

Note: it looks as if the data grows by 12288 bytes but this is not true,
it does not actually grow. But .data starts with ALIGN(THREAD_SIZE) and
since .text shrinks the padding grows, and thus .data grows too as it
seen by /bin/size. diff System.map:

	- ffffffff81510000 D _sdata
	- ffffffff81510000 D init_thread_union
	+ ffffffff81509000 D _sdata
	+ ffffffff8150c000 D init_thread_union

Perhaps we can change vmlinux.lds.S to .data itself, so that /bin/size
can't "wrongly" report that .data grows if .text shinks.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

85b39d30

rcu: Return bool type in rcu_lockdep_current_cpu_online() · 521d24ee

由 Pranith Kumar 提交于 7月 08, 2014

Return true instead of 1 in rcu_lockdep_current_cpu_online() as this
has bool as return type.
Signed-off-by: NPranith Kumar <bobby.prani@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

521d24ee

10 7月, 2014 2 次提交

rcu: Handle obsolete references to TINY_PREEMPT_RCU · ab74fdfd

由 Paul E. McKenney 提交于 5月 04, 2014

Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>

ab74fdfd

rcu: Document deadlock-avoidance information for rcu_read_unlock() · f27bc487

由 Paul E. McKenney 提交于 5月 04, 2014

Reported-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>

f27bc487

24 6月, 2014 2 次提交

rcu: Reduce overhead of cond_resched() checks for RCU · 4a81e832

由 Paul E. McKenney 提交于 6月 20, 2014

Commit ac1bea85 (Make cond_resched() report RCU quiescent states)
fixed a problem where a CPU looping in the kernel with but one runnable
task would give RCU CPU stall warnings, even if the in-kernel loop
contained cond_resched() calls.  Unfortunately, in so doing, it introduced
performance regressions in Anton Blanchard's will-it-scale "open1" test.
The problem appears to be not so much the increased cond_resched() path
length as an increase in the rate at which grace periods complete, which
increased per-update grace-period overhead.

This commit takes a different approach to fixing this bug, mainly by
moving the RCU-visible quiescent state from cond_resched() to
rcu_note_context_switch(), and by further reducing the check to a
simple non-zero test of a single per-CPU variable.  However, this
approach requires that the force-quiescent-state processing send
resched IPIs to the offending CPUs.  These will be sent only once
the grace period has reached an age specified by the boot/sysfs
parameter rcutree.jiffies_till_sched_qs, or once the grace period
reaches an age halfway to the point at which RCU CPU stall warnings
will be emitted, whichever comes first.
Reported-by: NDave Hansen <dave.hansen@intel.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Lameter <cl@gentwo.org>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
[ paulmck: Made rcu_momentary_dyntick_idle() as suggested by the
  ktest build robot.  Also fixed smp_mb() comment as noted by
  Oleg Nesterov. ]

Merge with e552592e (Reduce overhead of cond_resched() checks for RCU)
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

4a81e832

rcu: Export debug_init_rcu_head() and and debug_init_rcu_head() · 546a9d85

由 Paul E. McKenney 提交于 6月 19, 2014

Currently, call_rcu() relies on implicit allocation and initialization
for the debug-objects handling of RCU callbacks.  If you hammer the
kernel hard enough with Sasha's modified version of trinity, you can end
up with the sl*b allocators recursing into themselves via this implicit
call_rcu() allocation.

This commit therefore exports the debug_init_rcu_head() and
debug_rcu_head_free() functions, which permits the allocators to allocated
and pre-initialize the debug-objects information, so that there no longer
any need for call_rcu() to do that initialization, which in turn prevents
the recursion into the memory allocators.
Reported-by: NSasha Levin <sasha.levin@oracle.com>
Suggested-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Looks-good-to: Christoph Lameter <cl@linux.com>

546a9d85

20 5月, 2014 1 次提交

rcu: Provide API to suppress stall warnings while sysrc runs · 61f38db3

由 Rik van Riel 提交于 4月 26, 2014

Some sysrq handlers can run for a long time, because they dump a lot
of data onto a serial console. Having RCU stall warnings pop up in
the middle of them only makes the problem worse.

This commit provides rcu_sysrq_start() and rcu_sysrq_end() APIs to
temporarily suppress RCU CPU stall warnings while a sysrq request is
handled.
Signed-off-by: NRik van Riel <riel@redhat.com>
[ paulmck: Fix TINY_RCU build error. ]
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

61f38db3