- 06 4月, 2018 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Add a new pointer argument to cpuidle_select() and to the ->select cpuidle governor callback to allow a boolean value indicating whether or not the tick should be stopped before entering the selected state to be returned from there. Make the ladder governor ignore that pointer (to preserve its current behavior) and make the menu governor return 'false" through it if: (1) the idle exit latency is constrained at 0, or (2) the selected state is a polling one, or (3) the expected idle period duration is within the tick period range. In addition to that, the correction factor computations in the menu governor need to take the possibility that the tick may not be stopped into account to avoid artificially small correction factor values. To that end, add a mechanism to record tick wakeups, as suggested by Peter Zijlstra, and use it to modify the menu_update() behavior when tick wakeup occurs. Namely, if the CPU is woken up by the tick and the return value of tick_nohz_get_sleep_length() is not within the tick boundary, the predicted idle duration is likely too short, so make menu_update() try to compensate for that by updating the governor statistics as though the CPU was idle for a long time. Since the value returned through the new argument pointer of cpuidle_select() is not used by its caller yet, this change by itself is not expected to alter the functionality of the code. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
-
- 29 3月, 2018 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Add a new attribute group called "s2idle" under the sysfs directory of each cpuidle state that supports the ->enter_s2idle callback and put two new attributes, "usage" and "time", into that group to represent the number of times the given state was requested for suspend-to-idle and the total time spent in suspend-to-idle after requesting that state, respectively. That will allow diagnostic information related to suspend-to-idle to be collected without enabling advanced debug features and analyzing dmesg output. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 09 11月, 2017 2 次提交
-
-
由 Gaurav Jindal 提交于
Clean up cpuidle_enable_device() to avoid doing an assignment in an expression evaluated as an argument of if (), which also makes the code in question more readable. Signed-off-by: NGaurav Jindal <gauravjindal1104@gmail.com> [ rjw: Subject & changelog ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Gaurav Jindal 提交于
Do not fetch per CPU drv if cpuidle_curr_governor is NULL to avoid useless per CPU processing. Signed-off-by: NGaurav Jindal <gauravjindal1104@gmail.com> [ rjw: Subject & changelog ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 28 9月, 2017 1 次提交
-
-
由 Nicholas Piggin 提交于
When failing to enter broadcast timer mode for an idle state that requires it, a new state is selected that does not require broadcast, but the broadcast variable remains set. This causes tick_broadcast_exit to be called despite not having entered broadcast mode. This causes the WARN_ON_ONCE(!irqs_disabled()) to trigger in some cases. It does not appear to cause problems for code today, but seems to violate the interface so should be fixed. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 11 8月, 2017 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Rename the ->enter_freeze cpuidle driver callback to ->enter_s2idle to make it clear that it is used for entering suspend-to-idle and rename the related functions, variables and so on accordingly. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 15 5月, 2017 1 次提交
-
-
由 Peter Zijlstra 提交于
Ville reported that on his Core2, which has TSC stop in idle, we would always report very short idle durations. He tracked this down to commit: e93e59ce ("cpuidle: Replace ktime_get() with local_clock()") which replaces ktime_get() with local_clock(). Add a sched_clock_idle_wakeup_event() call, which will re-sync the clock with ktime_get_ns() when TSC is unstable and no-op otherwise. Reported-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Tested-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Fixes: e93e59ce ("cpuidle: Replace ktime_get() with local_clock()") Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 02 5月, 2017 1 次提交
-
-
由 Li, Fei 提交于
In case of there is no cpuidle devices registered, dev will be null, and panic will be triggered like below; In this patch, add checking of dev before usage, like that done in cpuidle_idle_call. Panic without fix: [ 184.961328] BUG: unable to handle kernel NULL pointer dereference at (null) [ 184.961328] IP: cpuidle_use_deepest_state+0x30/0x60 ... [ 184.961328] play_idle+0x8d/0x210 [ 184.961328] ? __schedule+0x359/0x8e0 [ 184.961328] ? _raw_spin_unlock_irqrestore+0x28/0x50 [ 184.961328] ? kthread_queue_delayed_work+0x41/0x80 [ 184.961328] clamp_idle_injection_func+0x64/0x1e0 Fixes: bb8313b6 (cpuidle: Allow enforcing deepest idle state selection) Signed-off-by: NLi, Fei <fei.li@intel.com> Tested-by: NShi, Feng <fengx.shi@intel.com> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: 4.10+ <stable@vger.kernel.org> # 4.10+ Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 02 3月, 2017 1 次提交
-
-
由 Ingo Molnar 提交于
We are going to split <linux/sched/clock.h> out of <linux/sched.h>, which will have to be picked up from other headers and .c files. Create a trivial placeholder <linux/sched/clock.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 06 12月, 2016 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Since cpuidle_use_deepest_state() is not static, add a proper kerneldoc comment to it to document its purpose. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 29 11月, 2016 1 次提交
-
-
由 Jacob Pan 提交于
When idle injection is used to cap power, we need to override the governor's choice of idle states. For this reason, make it possible the deepest idle state selection to be enforced by setting a flag on a given CPU to achieve the maximum potential power draw reduction. Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com> [ rjw: Subject & changelog ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 04 7月, 2016 1 次提交
-
-
由 Shreyas B. Prabhu 提交于
Snooze is a poll idle state in powernv and pseries platforms. Snooze has a timeout so that if a CPU stays in snooze for more than target residency of the next available idle state, then it would exit thereby giving chance to the cpuidle governor to re-evaluate and promote the CPU to a deeper idle state. Therefore whenever snooze exits due to this timeout, its last_residency will be target_residency of the next deeper state. Commit e93e59ce "cpuidle: Replace ktime_get() with local_clock()" changed the math around last_residency calculation. Specifically, while converting last_residency value from nano- to microseconds, it carries out right shift by 10. Because of that, in snooze timeout exit scenarios last_residency calculated is roughly 2.3% less than target_residency of the next available state. This pattern is picked up by get_typical_interval() in the menu governor and therefore expected_interval in menu_select() is frequently less than the target_residency of any state other than snooze. Due to this we are entering snooze at a higher rate, thereby affecting the single thread performance. Fix this by using more precise division via ktime_us_delta(). Fixes: e93e59ce "cpuidle: Replace ktime_get() with local_clock()" Reported-by: NAnton Blanchard <anton@samba.org> Bisected-by: NShilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Acked-by: NBalbir Singh <bsingharora@gmail.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 18 5月, 2016 1 次提交
-
-
由 Daniel Lezcano 提交于
Commit 0b89e9aa (cpuidle: delay enabling interrupts until all coupled CPUs leave idle) rightfully fixed a regression by letting the coupled idle state framework to handle local interrupt enabling when the CPU is exiting an idle state. The current code checks if the idle state is coupled and, if so, it will let the coupled code to enable interrupts. This way, it can decrement the ready-count before handling the interrupt. This mechanism prevents the other CPUs from waiting for a CPU which is handling interrupts. But the check is done against the state index returned by the back end driver's ->enter functions which could be different from the initial index passed as parameter to the cpuidle_enter_state() function. entered_state = target_state->enter(dev, drv, index); [ ... ] if (!cpuidle_state_is_coupled(drv, entered_state)) local_irq_enable(); [ ... ] If the 'index' is referring to a coupled idle state but the 'entered_state' is *not* coupled, then the interrupts are enabled again. All CPUs blocked on the sync barrier may busy loop longer if the CPU has interrupts to handle before decrementing the ready-count. That's consuming more energy than saving. Fixes: 0b89e9aa (cpuidle: delay enabling interrupts until all coupled CPUs leave idle) Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Cc: 3.15+ <stable@vger.kernel.org> # 3.15+ [ rjw: Subject & changelog ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 26 4月, 2016 1 次提交
-
-
由 Daniel Lezcano 提交于
The ktime_get() can have a non negligeable overhead, use local_clock() instead. In order to test the difference between ktime_get() and local_clock(), a quick hack has been added to trigger, via debugfs, 10000 times a call to ktime_get() and local_clock() and measure the elapsed time. Then the average value, the min and max is computed for each call. From userspace, the test above was called 100 times every 2 seconds. So, ktime_get() and local_clock() have been called 1000000 times in total. The results are: ktime_get(): ============ * average: 101 ns (stddev: 27.4) * maximum: 38313 ns * minimum: 65 ns local_clock(): ============== * average: 60 ns (stddev: 9.8) * maximum: 13487 ns * minimum: 46 ns The local_clock() is faster and more stable. Even if it is a drop in the ocean, changing the ktime_get() by the local_clock() allows to save 80ns at idle time (entry + exit). And in some circumstances, especially when there are several CPUs racing for the clock access, we save tens of microseconds. The idle duration resulting from a diff is converted from nanosec to microsec. This could be done with integer division (div 1000) - which is an expensive operation or by 10 bits shifting (div 1024) - which is fast but unprecise. The following table gives some results at the limits. ------------------------------------------ | nsec | div(1000) | div(1024) | ------------------------------------------ | 1e3 | 1 usec | 976 nsec | ------------------------------------------ | 1e6 | 1000 usec | 976 usec | ------------------------------------------ | 1e9 | 1000000 usec | 976562 usec | ------------------------------------------ There is a linear deviation of 2.34%. This loss of precision is acceptable in the context of the resulting diff which is used for statistics. These ones are processed to guess estimate an approximation of the duration of the next idle period which ends up into an idle state selection. The selection criteria takes into account the next duration based on large intervals, represented by the idle state's target residency. The 2^10 division is enough because the approximation regarding the 1e3 division is lost in all the approximations done for the next idle duration computation. Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> [ rjw: Subject ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 09 4月, 2016 1 次提交
-
-
由 Dave Gerlach 提交于
Currently the 'registered' member of the cpuidle_device struct is set to 1 during cpuidle_register_device. In this same function there are checks to see if the device is already registered to prevent duplicate calls to register the device, but this value is never set to 0 even on unregister of the device. Because of this, any attempt to call cpuidle_register_device after a call to cpuidle_unregister_device will fail which shouldn't be the case. To prevent this, set registered to 0 when the device is unregistered. Fixes: c878a52d (cpuidle: Check if device is already registered) Signed-off-by: NDave Gerlach <d-gerlach@ti.com> Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 22 1月, 2016 1 次提交
-
-
由 Sudeep Holla 提交于
Commit 51164251 "sched / idle: Drop default_idle_call() fallback from call_cpuidle()" made find_deepest_state() return non-negative value and check all the states with index > 0. Also as a result, find_deepest_state() returns 0 even when enter_freeze callbacks are not implemented and enter_freeze_proper() is called which ends up crashing the kernel. This patch updates the check for index > 0 in cpuidle_enter_freeze and cpuidle_idle_call(when idle_should_freeze is true) to restore the suspend-to-idle functionality in absence of enter_freeze callback. Fixes: 51164251 "sched / idle: Drop default_idle_call() fallback from call_cpuidle()" Signed-off-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 19 1月, 2016 1 次提交
-
-
由 Rafael J. Wysocki 提交于
After commit 9c4b2867 (cpuidle: menu: Fix menu_select() for CPUIDLE_DRIVER_STATE_START == 0) it is clear that menu_select() cannot return negative values. Moreover, ladder_select_state() will never return a negative value too, so make find_deepest_state() return non-negative values too and drop the default_idle_call() fallback from call_cpuidle(). This eliminates one branch from the idle loop and makes the governors and find_deepest_state() handle the case when all states have been disabled from sysfs consistently. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NIngo Molnar <mingo@kernel.org> Tested-by: NSudeep Holla <sudeep.holla@arm.com>
-
- 28 8月, 2015 1 次提交
-
-
由 Xunlei Pang 提交于
For cpuidle_state_is_coupled(), 'dev' is not used, so remove it. Signed-off-by: NXunlei Pang <pang.xunlei@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 21 7月, 2015 1 次提交
-
-
由 Lucas Stach 提交于
Make sure to stop tracing only once we are past a point where all latency tracing events have been processed (irqs are not enabled again). This has the slight advantage of capturing more latency related events in the idle path, but most importantly it makes sure that latency tracing doesn't get re-enabled inadvertently when new events are coming in. This makes the irqsoff latency tracer useful again, as we stop capturing CPU sleep time as IRQ latency. Signed-off-by: NLucas Stach <l.stach@pengutronix.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel@pengutronix.de Cc: patchwork-lst@pengutronix.de Link: http://lkml.kernel.org/r/1437410090-3747-1-git-send-email-l.stach@pengutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 10 7月, 2015 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Put tick_freeze() under RCU_NONIDLE() to prevent RCU from complaining about suspicious RCU usage in idle by trace_suspend_resume() called from there. While at it, fix a comment related to another usage of RCU_NONIDLE() in enter_freeze_proper(). Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 30 5月, 2015 1 次提交
-
-
由 Rafael J. Wysocki 提交于
The CPUIDLE_DRIVER_STATE_START symbol is defined as 1 only if CONFIG_ARCH_HAS_CPU_RELAX is set, otherwise it is defined as 0. However, if CONFIG_ARCH_HAS_CPU_RELAX is set, the first (index 0) entry in the cpuidle driver's table of states is overwritten with the default "poll" entry by the core. The "state" defined by the "poll" entry doesn't provide ->enter_dead and ->enter_freeze callbacks and its exit_latency is 0. For this reason, it is not necessary to use CPUIDLE_DRIVER_STATE_START in cpuidle_play_dead() (->enter_dead is NULL, so the "poll state" will be skipped by the loop). It also is arguably unuseful to return states with exit_latency equal to 0 from find_deepest_state(), so the function can be modified to start the loop from index 0 and the "poll state" will be skipped by it as a result of the check against latency_req. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com>
-
- 19 5月, 2015 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Since idle_should_freeze() is defined to always return 'false' for CONFIG_SUSPEND unset, all of the code depending on it in cpuidle_idle_call() is not necessary in that case. Make that code depend on CONFIG_SUSPEND too to avoid building it when it is not going to be used. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NThomas Gleixner <tglx@linutronix.de>
-
- 15 5月, 2015 3 次提交
-
-
由 Rafael J. Wysocki 提交于
If tick_broadcast_enter() fails in cpuidle_enter_state(), try to find another idle state to enter instead of invoking default_idle_call() immediately and returning -EBUSY which should increase the chances of saving some energy in those cases. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com> Tested-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com> Tested-by: NSudeep Holla <sudeep.holla@arm.com> Acked-by: NKevin Hilman <khilman@linaro.org>
-
由 Rafael J. Wysocki 提交于
The check of the cpuidle_enter() return value against -EBUSY made in call_cpuidle() will not be necessary any more if cpuidle_enter_state() calls default_idle_call() directly when it is about to return -EBUSY, so make that happen and eliminate the check. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com> Tested-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com> Tested-by: NSudeep Holla <sudeep.holla@arm.com> Acked-by: NKevin Hilman <khilman@linaro.org>
-
由 Rafael J. Wysocki 提交于
Introduce a wrapper function around idle_set_state() called sched_idle_set_state() that will pass this_rq() to it as the first argument and make cpuidle_enter_state() call the new function before and after entering the target state. At the same time, remove direct invocations of idle_set_state() from call_cpuidle(). This will allow the invocation of default_idle_call() to be moved from call_cpuidle() to cpuidle_enter_state() safely and call_cpuidle() to be simplified a bit as a result. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com> Tested-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com> Tested-by: NSudeep Holla <sudeep.holla@arm.com> Acked-by: NKevin Hilman <khilman@linaro.org>
-
- 10 5月, 2015 1 次提交
-
-
由 Rafael J. Wysocki 提交于
The kerneldoc comment for cpuidle_enter_state() doesn't match the function's header any more, so fix it. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 05 5月, 2015 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Avoid calling the governor's ->reflect method if the state index passed to cpuidle_reflect() is negative. This allows the analogous check to be dropped from menu_reflect(), so do that too, and ensures that arbitrary error codes can be passed to cpuidle_reflect() as the index with no adverse consequences. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
-
- 29 4月, 2015 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Commit 335f4919 (sched/idle: Use explicit broadcast oneshot control function) replaced clockevents_notify() invocations in cpuidle_idle_call() with direct calls to tick_broadcast_enter() and tick_broadcast_exit(), but it overlooked the fact that interrupts were already enabled before calling the latter which led to functional breakage on systems using idle states with the CPUIDLE_FLAG_TIMER_STOP flag set. Fix that by moving the invocations of tick_broadcast_enter() and tick_broadcast_exit() down into cpuidle_enter_state() where interrupts are still disabled when tick_broadcast_exit() is called. Also ensure that interrupts will be disabled before running tick_broadcast_exit() even if they have been enabled by the idle state's ->enter callback. Trigger a WARN_ON_ONCE() in that case, as we generally don't want that to happen for states with CPUIDLE_FLAG_TIMER_STOP set. Fixes: 335f4919 (sched/idle: Use explicit broadcast oneshot control function) Reported-and-tested-by: NLinus Walleij <linus.walleij@linaro.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Reported-and-tested-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 03 4月, 2015 1 次提交
-
-
Thomas Schlichter reports the following issue on his Samsung NC20: "The C-states C1 and C2 to the OS when connected to AC, and additionally provides the C3 C-state when disconnected from AC. However, the number of C-states shown in sysfs is fixed to the number of C-states present at boot. If I boot with AC connected, I always only see the C-states up to C2 even if I disconnect AC. The reason is commit 130a5f69 (ACPI / cpuidle: remove dev->state_count setting). It removes the update of dev->state_count, but sysfs uses exactly this variable to show the C-states. The fix is to use drv->state_count in sysfs. As this is currently the last user of dev->state_count, this variable can be completely removed." Remove dev->state_count as per the above. Reported-by: NThomas Schlichter <thomas.schlichter@web.de> Signed-off-by: NBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: NKyungmin Park <kyungmin.park@samsung.com> Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Cc: 3.14+ <stable@vger.kernel.org> # 3.14+ [ rjw: Changelog ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 06 3月, 2015 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Commit 38106313 (PM / sleep: Re-implement suspend-to-idle handling) overlooked the fact that entering some sufficiently deep idle states by CPUs may cause their local timers to stop and in those cases it is necessary to switch over to a broadcast timer prior to entering the idle state. If the cpuidle driver in use does not provide the new ->enter_freeze callback for any of the idle states, that problem affects suspend-to-idle too, but it is not taken into account after the changes made by commit 38106313. Fix that by changing the definition of cpuidle_enter_freeze() and re-arranging of the code in cpuidle_idle_call(), so the former does not call cpuidle_enter() any more and the fallback case is handled by cpuidle_idle_call() directly. Fixes: 38106313 (PM / sleep: Re-implement suspend-to-idle handling) Reported-and-tested-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
-
- 01 3月, 2015 2 次提交
-
-
由 Rafael J. Wysocki 提交于
Modify cpuidle_enter_freeze() to do the sanity checks done by cpuidle_select() to avoid crashing the suspend-to-idle code path in case something is missing. Fixes: 38106313 (PM / sleep: Re-implement suspend-to-idle handling) Original-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
-
由 Rafael J. Wysocki 提交于
Disabling interrupts at the end of cpuidle_enter_freeze() is not useful, because its caller, cpuidle_idle_call(), re-enables them right away after invoking it. To avoid that unnecessary back and forth dance with interrupts, make cpuidle_enter_freeze() enable interrupts after calling enter_freeze_proper() and drop the local_irq_disable() at its end, so that all of the code paths in it end up with interrupts enabled. Then, cpuidle_idle_call() will not need to re-enable interrupts after calling cpuidle_enter_freeze() any more, because the latter will return with interrupts enabled, in analogy with cpuidle_enter(). Reported-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
-
- 16 2月, 2015 1 次提交
-
-
由 Rafael J. Wysocki 提交于
The efficiency of suspend-to-idle depends on being able to keep CPUs in the deepest available idle states for as much time as possible. Ideally, they should only be brought out of idle by system wakeup interrupts. However, timer interrupts occurring periodically prevent that from happening and it is not practical to chase all of the "misbehaving" timers in a whack-a-mole fashion. A much more effective approach is to suspend the local ticks for all CPUs and the entire timekeeping along the lines of what is done during full suspend, which also helps to keep suspend-to-idle and full suspend reasonably similar. The idea is to suspend the local tick on each CPU executing cpuidle_enter_freeze() and to make the last of them suspend the entire timekeeping. That should prevent timer interrupts from triggering until an IO interrupt wakes up one of the CPUs. It needs to be done with interrupts disabled on all of the CPUs, though, because otherwise the suspended clocksource might be accessed by an interrupt handler which might lead to fatal consequences. Unfortunately, the existing ->enter callbacks provided by cpuidle drivers generally cannot be used for implementing that, because some of them re-enable interrupts temporarily and some idle entry methods cause interrupts to be re-enabled automatically on exit. Also some of these callbacks manipulate local clock event devices of the CPUs which really shouldn't be done after suspending their ticks. To overcome that difficulty, introduce a new cpuidle state callback, ->enter_freeze, that will be guaranteed (1) to keep interrupts disabled all the time (and return with interrupts disabled) and (2) not to touch the CPU timer devices. Modify cpuidle_enter_freeze() to look for the deepest available idle state with ->enter_freeze present and to make the CPU execute that callback with suspended tick (and the last of the online CPUs to execute it with suspended timekeeping). Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
-
- 14 2月, 2015 1 次提交
-
-
由 Rafael J. Wysocki 提交于
In preparation for adding support for quiescing timers in the final stage of suspend-to-idle transitions, rework the freeze_enter() function making the system wait on a wakeup event, the freeze_wake() function terminating the suspend-to-idle loop and the mechanism by which deep idle states are entered during suspend-to-idle. First of all, introduce a simple state machine for suspend-to-idle and make the code in question use it. Second, prevent freeze_enter() from losing wakeup events due to race conditions and ensure that the number of online CPUs won't change while it is being executed. In addition to that, make it force all of the CPUs re-enter the idle loop in case they are in idle states already (so they can enter deeper idle states if possible). Next, drop cpuidle_use_deepest_state() and replace use_deepest_state checks in cpuidle_select() and cpuidle_reflect() with a single suspend-to-idle state check in cpuidle_idle_call(). Finally, introduce cpuidle_enter_freeze() that will simply find the deepest idle state available to the given CPU and enter it using cpuidle_enter(). Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
-
- 24 9月, 2014 1 次提交
-
-
由 Daniel Lezcano 提交于
When the cpu enters idle, it stores the cpuidle state pointer in its struct rq instance which in turn could be used to make a better decision when balancing tasks. As soon as the cpu exits its idle state, the struct rq reference is cleared. There are a couple of situations where the idle state pointer could be changed while it is being consulted: 1. For x86/acpi with dynamic c-states, when a laptop switches from battery to AC that could result on removing the deeper idle state. The acpi driver triggers: 'acpi_processor_cst_has_changed' 'cpuidle_pause_and_lock' 'cpuidle_uninstall_idle_handler' 'kick_all_cpus_sync'. All cpus will exit their idle state and the pointed object will be set to NULL. 2. The cpuidle driver is unloaded. Logically that could happen but not in practice because the drivers are always compiled in and 95% of them are not coded to unregister themselves. In any case, the unloading code must call 'cpuidle_unregister_device', that calls 'cpuidle_pause_and_lock' leading to 'kick_all_cpus_sync' as mentioned above. A race can happen if we use the pointer and then one of these two scenarios occurs at the same moment. In order to be safe, the idle state pointer stored in the rq must be used inside a rcu_read_lock section where we are protected with the 'rcu_barrier' in the 'cpuidle_uninstall_idle_handler' function. The idle_get_state() and idle_put_state() accessors should be used to that effect. Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: NNicolas Pitre <nico@linaro.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: linux-pm@vger.kernel.org Cc: linaro-kernel@lists.linaro.org Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 19 9月, 2014 1 次提交
-
-
由 Chuansheng Liu 提交于
Currently kick_all_cpus_sync() or smp_call_function() can not break the polling idle cpu immediately. Instead using wake_up_all_idle_cpus() which can wake up the polling idle cpu quickly is much more helpful for power. Signed-off-by: NChuansheng Liu <chuansheng.liu@intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-pm@vger.kernel.org Cc: changcheng.liu@intel.com Cc: xiaoming.wang@intel.com Cc: souvik.k.chakravarty@intel.com Cc: luto@amacapital.net Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: linux-pm@vger.kernel.org Link: http://lkml.kernel.org/r/1409815075-4180-3-git-send-email-chuansheng.liu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 09 7月, 2014 1 次提交
-
-
由 Sandeep Tripathy 提交于
idle_exit event is the first event after a core exits idle state. So this should be traced before local irq is ebabled. Likewise idle_entry is the last event before a core enters idle state. This will ease visualising the cpu idle state from kernel traces. Signed-off-by: NSandeep Tripathy <sandeep.tripathy@linaro.org> Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org> [rjw: Subject, rebase] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 07 5月, 2014 1 次提交
-
-
由 Rafael J. Wysocki 提交于
If freeze_enter() is called, we want to bypass the current cpuidle governor and always use the deepest available (that is, not disabled) C-state, because we want to save as much energy as reasonably possible then and runtime latency constraints don't matter at that point, since the system is in a sleep state anyway. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: NAubrey Li <aubrey.li@linux.intel.com>
-
- 01 5月, 2014 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Since both cpuidle_enabled() and cpuidle_select() are only called by cpuidle_idle_call(), it is not really useful to keep them separate and combining them will help to avoid complicating cpuidle_idle_call() even further if governors are changed to return error codes sometimes. This code modification shouldn't lead to any functional changes. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 12 3月, 2014 1 次提交
-
-
由 Paul Burton 提交于
As described by a comment at the end of cpuidle_enter_state_coupled it can be inefficient for coupled idle states to return with IRQs enabled since they may proceed to service an interrupt instead of clearing the coupled idle state. Until they have finished & cleared the idle state all CPUs coupled with them will spin rather than being able to enter a safe idle state. Commits e1689795 "cpuidle: Add common time keeping and irq enabling" and 554c06ba "cpuidle: remove en_core_tk_irqen flag" led to the cpuidle_enter_state enabling interrupts for all idle states, including coupled ones, making this inefficiency unavoidable by drivers & the local_irq_enable near the end of cpuidle_enter_state_coupled redundant. This patch avoids enabling interrupts in cpuidle_enter_state after a coupled state has been entered, allowing them to remain disabled until all coupled CPUs have exited the idle state and cpuidle_enter_state_coupled re-enables them. Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: NPaul Burton <paul.burton@imgtec.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-