- 12 12月, 2011 40 次提交
-
-
由 Josh Triplett 提交于
When architectures register CPUs, they indicate whether the CPU allows hotplugging; notably, x86 and ARM don't allow hotplugging CPU 0. Userspace can easily query the hotpluggability of a CPU via sysfs; however, the kernel has no convenient way of accessing that property in an architecture-independent way. While the kernel can simply try it and see, some code needs to distinguish between "hotplug failed" and "hotplug has no hope of working on this CPU"; for example, rcutorture's CPU hotplug tests want to avoid drowning out real hotplug failures with expected failures. Expose this property via a new cpu_is_hotpluggable function, so that the rest of the kernel can access it in an architecture-independent way. Signed-off-by: NJosh Triplett <josh@joshtriplett.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
No point in having two identical rcu_cpu_stall_suppress declarations, so remove the more obscure of the two. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
If there are other CPUs active at a given point in time, then there is a limit to what a given CPU can do to advance the current RCU grace period. Beyond this limit, attempting to force the RCU grace period forward will do nothing but consume energy burning CPU cycles. Therefore, this commit takes an adaptive approach to RCU_FAST_NO_HZ preparations for idle. It pushes the RCU core state machine for two cycles unconditionally, and then it will push from zero to three additional cycles, but only as long as the RCU core has work for this CPU to do immediately. The rcu_pending() function is used to check whether the RCU core has such work. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
The rcu_do_batch() function that invokes callbacks for TREE_RCU and TREE_PREEMPT_RCU normally throttles callback invocation to avoid degrading scheduling latency. However, as long as the CPU would otherwise be idle, there is no downside to continuing to invoke any callbacks that have passed through their grace periods. In fact, processing such callbacks in a timely manner has the benefit of increasing the probability that the CPU can enter the power-saving dyntick-idle mode. Therefore, this commit allows callback invocation to continue beyond the preset limit as long as the scheduler does not have some other task to run and as long as context is that of the idle task or the relevant RCU kthread. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Frederic Weisbecker 提交于
Because tasks don't nest, the ->dyntick_nesting must always be zero upon entry to rcu_idle_enter_common(). Therefore, pass "0" rather than the counter itself. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Frederic Weisbecker 提交于
Because tasks do not nest, rcu_idle_enter() and rcu_idle_exit() do not need to check for nesting. This commit therefore moves nesting checks from rcu_idle_enter_common() to rcu_irq_exit() and from rcu_idle_exit_common() to rcu_irq_enter(). Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
The current implementation of RCU_FAST_NO_HZ prevents CPUs from entering dyntick-idle state if they have RCU callbacks pending. Unfortunately, this has the side-effect of often preventing them from entering this state, especially if at least one other CPU is not in dyntick-idle state. However, the resulting per-tick wakeup is wasteful in many cases: if the CPU has already fully responded to the current RCU grace period, there will be nothing for it to do until this grace period ends, which will frequently take several jiffies. This commit therefore permits a CPU that has done everything that the current grace period has asked of it (rcu_pending() == 0) even if it still as RCU callbacks pending. However, such a CPU posts a timer to wake it up several jiffies later (6 jiffies, based on experience with grace-period lengths). This wakeup is required to handle situations that can result in all CPUs being in dyntick-idle mode, thus failing to ever complete the current grace period. If a CPU wakes up before the timer goes off, then it cancels that timer, thus avoiding spurious wakeups. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
The intent is that a given RCU read-side critical section be confined to a single context. For example, it is illegal to invoke rcu_read_lock() in an exception handler and then invoke rcu_read_unlock() from the context of the task that received the exception. Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
Fixes and workarounds for a number of issues (for example, that in df4012edc) make it safe to once again detect dyntick-idle CPUs on the first pass of force_quiescent_state(), so this commit makes that change. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
Assertions in rcu_init_percpu_data() unknowingly relied on outgoing CPUs being turned off before reaching the idle loop. Unfortunately, when running under kvm/qemu on x86, CPUs really can get to idle before begin shut off. These CPUs are then born in dyntick-idle mode from an RCU perspective, which results in splats in rcu_init_percpu_data() and in RCU wrongly ignoring those CPUs despite them being active. This in turn can cause RCU to end grace periods prematurely, potentially freeing up memory that the newly onlined CPUs were still using. This is most decidedly not what we need to see in an RCU implementation. This commit therefore replaces the assertions in rcu_init_percpu_data() with code that forces RCU's dyntick-idle view of newly onlined CPUs to match reality. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
Re-enable interrupts across calls to quiescent-state functions and also across force_quiescent_state() to reduce latency. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
With the new implementation of RCU_FAST_NO_HZ, it was possible to hang RCU grace periods as follows: o CPU 0 attempts to go idle, cycles several times through the rcu_prepare_for_idle() loop, then goes dyntick-idle when RCU needs nothing more from it, while still having at least on RCU callback pending. o CPU 1 goes idle with no callbacks. Both CPUs can then stay in dyntick-idle mode indefinitely, preventing the RCU grace period from ever completing, possibly hanging the system. This commit therefore prevents CPUs that have RCU callbacks from entering dyntick-idle mode. This approach also eliminates the need for the end-of-grace-period IPIs used previously. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
If a CPU enters dyntick-idle mode with callbacks pending, it will need an IPI at the end of the grace period. However, if it exits dyntick-idle mode before the grace period ends, it will be needlessly IPIed at the end of the grace period. Therefore, this commit clears the per-CPU rcu_awake_at_gp_end flag when a CPU determines that it does not need it. This in turn requires disabling interrupts across much of rcu_prepare_for_idle() in order to avoid having nested interrupts clearing this state out from under us. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
The earlier version would attempt to push callbacks through five times before going into dyntick-idle mode if callbacks remained, but the CPU had done all that it needed to do for the current RCU grace periods. This is wasteful: In most cases, once the CPU has done all that it needs to for the current RCU grace periods, it will make no further progress on the callbacks no matter how many times it loops through the RCU core processing and the idle-entry code. This commit therefore goes to dyntick-idle mode whenever the current CPU has done all it can for the current grace period. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
This commit adds trace_rcu_prep_idle(), which is invoked from rcu_prepare_for_idle() and rcu_wake_cpu() to trace attempts on the part of RCU to force CPUs into dyntick-idle mode. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
This commit updates the trace_rcu_dyntick() header comment to reflect events added by commit 4b4f421. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
An IRC discussion uncovered many conflicting opinions on what types of data may be atomically loaded and stored. This commit therefore calls this out the official set: pointers, longs, ints, and chars (but not shorts). This commit also gives some examples of compiler mischief that can thwart atomicity. Please note that this discussion is relevant to !SMP kernels if CONFIG_PREEMPT=y: preemption can cause almost as much trouble as can SMP. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Jes Sorensen <jes@sgi.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Chris Zankel <chris@zankel.net>
-
由 Frederic Weisbecker 提交于
Those two APIs were provided to optimize the calls of tick_nohz_idle_enter() and rcu_idle_enter() into a single irq disabled section. This way no interrupt happening in-between would needlessly process any RCU job. Now we are talking about an optimization for which benefits have yet to be measured. Let's start simple and completely decouple idle rcu and dyntick idle logics to simplify. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: NJosh Triplett <josh@joshtriplett.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
Running CPU-hotplug operations concurrently with rcutorture has historically been a good way to find bugs in both RCU and CPU hotplug. This commit therefore adds an rcutorture module parameter called "onoff_interval" that causes a randomly selected CPU-hotplug operation to be executed at the specified interval, in seconds. The default value of "onoff_interval" is zero, which disables rcutorture-instigated CPU-hotplug operations. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
Change from direct comparison of ->pid with zero to is_idle_task(). Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org> Acked-by: NChris Metcalf <cmetcalf@tilera.com>
-
由 Paul E. McKenney 提交于
Change from direct comparison of ->pid with zero to is_idle_task(). Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Paul E. McKenney 提交于
Change from direct comparison of ->pid with zero to is_idle_task(). Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Jason Wessel <jason.wessel@windriver.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Paul E. McKenney 提交于
Change from direct comparison of ->pid with zero to is_idle_task(). Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Paul E. McKenney 提交于
Change from direct comparison of ->pid with zero to is_idle_task(). Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Paul E. McKenney 提交于
Commit 908a3283 (Fix idle_cpu()) invalidated some uses of idle_cpu(), which used to say whether or not the CPU was running the idle task, but now instead says whether or not the CPU is running the idle task in the absence of pending wakeups. Although this new implementation gives a better answer to the question "is this CPU idle?", it also invalidates other uses that were made of idle_cpu(). This commit therefore introduces a new is_idle_task() API member that determines whether or not the specified task is one of the idle tasks, allowing open-coded "->pid == 0" sequences to be replaced by something more meaningful. Suggested-by: NJosh Triplett <josh@joshtriplett.org> Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
Currently, if rcutorture is built into the kernel, it must be manually started or started from an init script. This is inconvenient for automated KVM testing, where it is good to be able to fully control rcutorture execution from the kernel parameters. This patch therefore adds a module parameter named "rcutorture_runnable" that defaults to zero ("don't start automatically"), but which can be set to one to cause rcutorture to start up immediately during boot. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
Although it is easy to run rcutorture tests under KVM, there is currently no nice way to run such a test for a fixed time period, collect all of the rcutorture data, and then shut the system down cleanly. This commit therefore adds an rcutorture module parameter named "shutdown_secs" that specified the run duration in seconds, after which rcutorture terminates the test and powers the system down. The default value for "shutdown_secs" is zero, which disables shutdown. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
The new implementation of RCU_FAST_NO_HZ is compatible with preemptible RCU, so this commit removes the Kconfig restriction that previously prohibited this. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Paul E. McKenney 提交于
RCU has traditionally relied on idle_cpu() to determine whether a given CPU is running in the context of an idle task, but commit 908a3283 (Fix idle_cpu()) has invalidated this approach. After commit 908a3283, idle_cpu() will return true if the current CPU is currently running the idle task, and will be doing so for the foreseeable future. RCU instead needs to know whether or not the current CPU is currently running the idle task, regardless of what the near future might bring. This commit therefore switches from idle_cpu() to "current->pid != 0". Reported-by: NWu Fengguang <fengguang.wu@intel.com> Suggested-by: NCarsten Emde <C.Emde@osadl.org> Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Tested-by: NWu Fengguang <fengguang.wu@intel.com> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
Currently, RCU does not permit a CPU to enter dyntick-idle mode if that CPU has any RCU callbacks queued. This means that workloads for which each CPU wakes up and does some RCU updates every few ticks will never enter dyntick-idle mode. This can result in significant unnecessary power consumption, so this patch permits a given to enter dyntick-idle mode if it has callbacks, but only if that same CPU has completed all current work for the RCU core. We determine use rcu_pending() to determine whether a given CPU has completed all current work for the RCU core. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
The current code just complains if the current task is not the idle task. This commit therefore adds printing of the identity of the idle task. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Paul E. McKenney 提交于
The trace_rcu_dyntick() trace event did not print both the old and the new value of the nesting level, and furthermore printed only the low-order 32 bits of it. This could result in some confusion when interpreting trace-event dumps, so this commit prints both the old and the new value, prints the full 64 bits, and also selects the process-entry/exit increment to print nicely in hexadecimal. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Paul E. McKenney 提交于
Update various files in Documentation/RCU to reflect srcu_read_lock_raw() and srcu_read_unlock_raw(). Credit to Peter Zijlstra for suggesting use of the existing _raw suffix instead of the earlier bulkref names. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
The RCU implementations, including SRCU, are designed to be used in a lock-like fashion, so that the read-side lock and unlock primitives must execute in the same context for any given read-side critical section. This constraint is enforced by lockdep-RCU. However, there is a need to enter an SRCU read-side critical section within the context of an exception and then exit in the context of the task that encountered the exception. The cost of this capability is that the read-side operations incur the overhead of disabling interrupts. Note that although the current implementation allows a given read-side critical section to be entered by one task and then exited by another, all known possible implementations that allow this have scalability problems. Therefore, a given read-side critical section must be exited by the same task that entered it, though perhaps from an interrupt or exception handler running within that task's context. But if you are thinking in terms of interrupt handlers, make sure that you have considered the possibility of threaded interrupt handlers. Credit goes to Peter Zijlstra for suggesting use of the existing _raw suffix to indicate disabling lockdep over the earlier "bulkref" names. Requested-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
The PowerPC pSeries platform (CONFIG_PPC_PSERIES=y) enables hypervisor-call tracing for CONFIG_TRACEPOINTS=y kernels. One of the hypervisor calls that is traced is the H_CEDE call in the idle loop that tells the hypervisor that this OS instance no longer needs the current CPU. However, tracing uses RCU, so this combination of kernel configuration variables needs to avoid telling RCU about the current CPU's idleness until after the H_CEDE-entry tracing completes on the one hand, and must tell RCU that the the current CPU is no longer idle before the H_CEDE-exit tracing starts. In all other cases, it suffices to inform RCU of CPU idleness upon idle-loop entry and exit. This commit makes the required adjustments. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Frederic Weisbecker 提交于
On the irq exit path, tick_nohz_irq_exit() may raise a softirq, which action leads to the wake up path and select_task_rq_fair() that makes use of rcu to iterate the domains. This is an illegal use of RCU because we may be in RCU extended quiescent state if we interrupted an RCU-idle window in the idle loop: [ 132.978883] =============================== [ 132.978883] [ INFO: suspicious RCU usage. ] [ 132.978883] ------------------------------- [ 132.978883] kernel/sched_fair.c:1707 suspicious rcu_dereference_check() usage! [ 132.978883] [ 132.978883] other info that might help us debug this: [ 132.978883] [ 132.978883] [ 132.978883] rcu_scheduler_active = 1, debug_locks = 0 [ 132.978883] RCU used illegally from extended quiescent state! [ 132.978883] 2 locks held by swapper/0: [ 132.978883] #0: (&p->pi_lock){-.-.-.}, at: [<ffffffff8105a729>] try_to_wake_up+0x39/0x2f0 [ 132.978883] #1: (rcu_read_lock){.+.+..}, at: [<ffffffff8105556a>] select_task_rq_fair+0x6a/0xec0 [ 132.978883] [ 132.978883] stack backtrace: [ 132.978883] Pid: 0, comm: swapper Tainted: G W 3.0.0+ #178 [ 132.978883] Call Trace: [ 132.978883] <IRQ> [<ffffffff810a01f6>] lockdep_rcu_suspicious+0xe6/0x100 [ 132.978883] [<ffffffff81055c49>] select_task_rq_fair+0x749/0xec0 [ 132.978883] [<ffffffff8105556a>] ? select_task_rq_fair+0x6a/0xec0 [ 132.978883] [<ffffffff812fe494>] ? do_raw_spin_lock+0x54/0x150 [ 132.978883] [<ffffffff810a1f2d>] ? trace_hardirqs_on+0xd/0x10 [ 132.978883] [<ffffffff8105a7c3>] try_to_wake_up+0xd3/0x2f0 [ 132.978883] [<ffffffff81094f98>] ? ktime_get+0x68/0xf0 [ 132.978883] [<ffffffff8105aa35>] wake_up_process+0x15/0x20 [ 132.978883] [<ffffffff81069dd5>] raise_softirq_irqoff+0x65/0x110 [ 132.978883] [<ffffffff8108eb65>] __hrtimer_start_range_ns+0x415/0x5a0 [ 132.978883] [<ffffffff812fe3ee>] ? do_raw_spin_unlock+0x5e/0xb0 [ 132.978883] [<ffffffff8108ed08>] hrtimer_start+0x18/0x20 [ 132.978883] [<ffffffff8109c9c3>] tick_nohz_stop_sched_tick+0x393/0x450 [ 132.978883] [<ffffffff810694f2>] irq_exit+0xd2/0x100 [ 132.978883] [<ffffffff81829e96>] do_IRQ+0x66/0xe0 [ 132.978883] [<ffffffff81820d53>] common_interrupt+0x13/0x13 [ 132.978883] <EOI> [<ffffffff8103434b>] ? native_safe_halt+0xb/0x10 [ 132.978883] [<ffffffff810a1f2d>] ? trace_hardirqs_on+0xd/0x10 [ 132.978883] [<ffffffff810144ea>] default_idle+0xba/0x370 [ 132.978883] [<ffffffff810147fe>] amd_e400_idle+0x5e/0x130 [ 132.978883] [<ffffffff8100a9f6>] cpu_idle+0xb6/0x120 [ 132.978883] [<ffffffff817f217f>] rest_init+0xef/0x150 [ 132.978883] [<ffffffff817f20e2>] ? rest_init+0x52/0x150 [ 132.978883] [<ffffffff81ed9cf3>] start_kernel+0x3da/0x3e5 [ 132.978883] [<ffffffff81ed9346>] x86_64_start_reservations+0x131/0x135 [ 132.978883] [<ffffffff81ed944d>] x86_64_start_kernel+0x103/0x112 Fix this by calling rcu_idle_enter() after tick_nohz_irq_exit(). Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Frederic Weisbecker 提交于
Interrupts notify the idle exit state before calling irq_enter(). But the notifier code calls rcu_read_lock() and this is not allowed while rcu is in an extended quiescent state. We need to wait for irq_enter() -> rcu_idle_exit() to be called before doing so otherwise this results in a grumpy RCU: [ 0.099991] WARNING: at include/linux/rcupdate.h:194 __atomic_notifier_call_chain+0xd2/0x110() [ 0.099991] Hardware name: AMD690VM-FMH [ 0.099991] Modules linked in: [ 0.099991] Pid: 0, comm: swapper Not tainted 3.0.0-rc6+ #255 [ 0.099991] Call Trace: [ 0.099991] <IRQ> [<ffffffff81051c8a>] warn_slowpath_common+0x7a/0xb0 [ 0.099991] [<ffffffff81051cd5>] warn_slowpath_null+0x15/0x20 [ 0.099991] [<ffffffff817d6fa2>] __atomic_notifier_call_chain+0xd2/0x110 [ 0.099991] [<ffffffff817d6ff1>] atomic_notifier_call_chain+0x11/0x20 [ 0.099991] [<ffffffff81001873>] exit_idle+0x43/0x50 [ 0.099991] [<ffffffff81020439>] smp_apic_timer_interrupt+0x39/0xa0 [ 0.099991] [<ffffffff817da253>] apic_timer_interrupt+0x13/0x20 [ 0.099991] <EOI> [<ffffffff8100ae67>] ? default_idle+0xa7/0x350 [ 0.099991] [<ffffffff8100ae65>] ? default_idle+0xa5/0x350 [ 0.099991] [<ffffffff8100b19b>] amd_e400_idle+0x8b/0x110 [ 0.099991] [<ffffffff810cb01f>] ? rcu_enter_nohz+0x8f/0x160 [ 0.099991] [<ffffffff810019a0>] cpu_idle+0xb0/0x110 [ 0.099991] [<ffffffff817a7505>] rest_init+0xe5/0x140 [ 0.099991] [<ffffffff817a7468>] ? rest_init+0x48/0x140 [ 0.099991] [<ffffffff81cc5ca3>] start_kernel+0x3d1/0x3dc [ 0.099991] [<ffffffff81cc5321>] x86_64_start_reservations+0x131/0x135 [ 0.099991] [<ffffffff81cc5412>] x86_64_start_kernel+0xed/0xf4 Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Henroid <andrew.d.henroid@intel.com> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Frederic Weisbecker 提交于
The idle notifier, called by enter_idle(), enters into rcu read side critical section but at that time we already switched into the RCU-idle window (rcu_idle_enter() has been called). And it's illegal to use rcu_read_lock() in that state. This results in rcu reporting its bad mood: [ 1.275635] WARNING: at include/linux/rcupdate.h:194 __atomic_notifier_call_chain+0xd2/0x110() [ 1.275635] Hardware name: AMD690VM-FMH [ 1.275635] Modules linked in: [ 1.275635] Pid: 0, comm: swapper Not tainted 3.0.0-rc6+ #252 [ 1.275635] Call Trace: [ 1.275635] [<ffffffff81051c8a>] warn_slowpath_common+0x7a/0xb0 [ 1.275635] [<ffffffff81051cd5>] warn_slowpath_null+0x15/0x20 [ 1.275635] [<ffffffff817d6f22>] __atomic_notifier_call_chain+0xd2/0x110 [ 1.275635] [<ffffffff817d6f71>] atomic_notifier_call_chain+0x11/0x20 [ 1.275635] [<ffffffff810018a0>] enter_idle+0x20/0x30 [ 1.275635] [<ffffffff81001995>] cpu_idle+0xa5/0x110 [ 1.275635] [<ffffffff817a7465>] rest_init+0xe5/0x140 [ 1.275635] [<ffffffff817a73c8>] ? rest_init+0x48/0x140 [ 1.275635] [<ffffffff81cc5ca3>] start_kernel+0x3d1/0x3dc [ 1.275635] [<ffffffff81cc5321>] x86_64_start_reservations+0x131/0x135 [ 1.275635] [<ffffffff81cc5412>] x86_64_start_kernel+0xed/0xf4 [ 1.275635] ---[ end trace a22d306b065d4a66 ]--- Fix this by entering rcu extended quiescent state later, just before the CPU goes to sleep. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
由 Frederic Weisbecker 提交于
It is assumed that rcu won't be used once we switch to tickless mode and until we restart the tick. However this is not always true, as in x86-64 where we dereference the idle notifiers after the tick is stopped. To prepare for fixing this, add two new APIs: tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu(). If no use of RCU is made in the idle loop between tick_nohz_enter_idle() and tick_nohz_exit_idle() calls, the arch must instead call the new *_norcu() version such that the arch doesn't need to call rcu_idle_enter() and rcu_idle_exit(). Otherwise the arch must call tick_nohz_enter_idle() and tick_nohz_exit_idle() and also call explicitly: - rcu_idle_enter() after its last use of RCU before the CPU is put to sleep. - rcu_idle_exit() before the first use of RCU after the CPU is woken up. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Frederic Weisbecker 提交于
The tick_nohz_stop_sched_tick() function, which tries to delay the next timer tick as long as possible, can be called from two places: - From the idle loop to start the dytick idle mode - From interrupt exit if we have interrupted the dyntick idle mode, so that we reprogram the next tick event in case the irq changed some internal state that requires this action. There are only few minor differences between both that are handled by that function, driven by the ts->inidle cpu variable and the inidle parameter. The whole guarantees that we only update the dyntick mode on irq exit if we actually interrupted the dyntick idle mode, and that we enter in RCU extended quiescent state from idle loop entry only. Split this function into: - tick_nohz_idle_enter(), which sets ts->inidle to 1, enters dynticks idle mode unconditionally if it can, and enters into RCU extended quiescent state. - tick_nohz_irq_exit() which only updates the dynticks idle mode when ts->inidle is set (ie: if tick_nohz_idle_enter() has been called). To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed into tick_nohz_idle_exit(). This simplifies the code and micro-optimize the irq exit path (no need for local_irq_save there). This also prepares for the split between dynticks and rcu extended quiescent state logics. We'll need this split to further fix illegal uses of RCU in extended quiescent states in the idle loop. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-