提交 · 27d50c7eeb0f03c3d3ca72aac4d2dd487ca1f3f0 · openanolis / cloud-kernel

02 3月, 2016 3 次提交

rcu: Make CPU_DYING_IDLE an explicit call · 27d50c7e

由 Thomas Gleixner 提交于 2月 26, 2016

Make the RCU CPU_DYING_IDLE callback an explicit function call, so it gets
invoked at the proper place.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Rik van Riel <riel@redhat.com>
Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/20160226182341.870167933@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

27d50c7e

cpu/hotplug: Make wait for dead cpu completion based · e69aab13

由 Thomas Gleixner 提交于 2月 26, 2016

Kill the busy spinning on the control side and just wait for the hotplugged
cpu to tell that it reached the dead state.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Rik van Riel <riel@redhat.com>
Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/20160226182341.776157858@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

e69aab13

cpu/hotplug: Let upcoming cpu bring itself fully up · 8df3e07e

由 Thomas Gleixner 提交于 2月 26, 2016

Let the upcoming cpu kick the hotplug thread and let itself complete the
bringup. That way the controll side can just wait for the completion or later
when we made the hotplug machinery async not care at all.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Rik van Riel <riel@redhat.com>
Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/20160226182341.697655464@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

8df3e07e

22 1月, 2016 1 次提交

cpuidle: fix fallback mechanism for suspend to idle in absence of enter_freeze · 6f16886b

由 Sudeep Holla 提交于 1月 21, 2016

Commit 51164251 "sched / idle: Drop default_idle_call() fallback
from call_cpuidle()" made find_deepest_state() return non-negative
value and check all the states with index > 0. Also as a result,
find_deepest_state() returns 0 even when enter_freeze callbacks are not
implemented and enter_freeze_proper() is called which ends up crashing
the kernel.

This patch updates the check for index > 0 in cpuidle_enter_freeze and
cpuidle_idle_call(when idle_should_freeze is true) to restore the
suspend-to-idle functionality in absence of enter_freeze callback.

Fixes: 51164251 "sched / idle: Drop default_idle_call() fallback from call_cpuidle()"
Signed-off-by: NSudeep Holla <sudeep.holla@arm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

6f16886b

19 1月, 2016 1 次提交

sched / idle: Drop default_idle_call() fallback from call_cpuidle() · 51164251

由 Rafael J. Wysocki 提交于 1月 16, 2016

After commit 9c4b2867 (cpuidle: menu: Fix menu_select() for
CPUIDLE_DRIVER_STATE_START == 0) it is clear that menu_select()
cannot return negative values.  Moreover, ladder_select_state()
will never return a negative value too, so make find_deepest_state()
return non-negative values too and drop the default_idle_call()
fallback from call_cpuidle().

This eliminates one branch from the idle loop and makes the governors
and find_deepest_state() handle the case when all states have been
disabled from sysfs consistently.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NIngo Molnar <mingo@kernel.org>
Tested-by: NSudeep Holla <sudeep.holla@arm.com>

51164251

15 1月, 2016 1 次提交

vmstat: make vmstat_updater deferrable again and shut down on idle · 0eb77e98

由 Christoph Lameter 提交于 1月 14, 2016

Currently the vmstat updater is not deferrable as a result of commit
ba4877b9 ("vmstat: do not use deferrable delayed work for
vmstat_update").  This in turn can cause multiple interruptions of the
applications because the vmstat updater may run at

Make vmstate_update deferrable again and provide a function that folds
the differentials when the processor is going to idle mode thus
addressing the issue of the above commit in a clean way.

Note that the shepherd thread will continue scanning the differentials
from another processor and will reenable the vmstat workers if it
detects any changes.

Fixes: ba4877b9 ("vmstat: do not use deferrable delayed work for vmstat_update")
Signed-off-by: NChristoph Lameter <cl@linux.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0eb77e98

12 10月, 2015 1 次提交

sched, tracing: Stop/start critical timings around the idle=poll idle loop · 9babcd79

由 Daniel Bristot de Oliveira 提交于 10月 08, 2015

When using idle=poll, the preemptoff tracer is always showing
the idle task as the culprit for long latencies. That happens
because critical timings are not stopped before idle loop. This
patch stops critical timings before entering the idle loop,
starting it again after the idle loop.

This problem does not affect the irqsoff tracer because
interruptions are enabled before entering the idle loop.
Signed-off-by: NDaniel Bristot de Oliveira <bristot@redhat.com>
Reviewed-by: NLuis Claudio R. Goncalves <lgoncalv@redhat.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/10fc3705874aef11dbe152a068b591a7be1899b4.1444314899.git.bristot@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

9babcd79

21 7月, 2015 1 次提交

sched/idle: Move latency tracing stop/start calls deeper inside the idle loop · 63caae84

由 Lucas Stach 提交于 7月 20, 2015

Make sure to stop tracing only once we are past a point where
all latency tracing events have been processed (irqs are not
enabled again). This has the slight advantage of capturing more
latency related events in the idle path, but most importantly it
makes sure that latency tracing doesn't get re-enabled
inadvertently when new events are coming in.

This makes the irqsoff latency tracer useful again, as we stop
capturing CPU sleep time as IRQ latency.
Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel@pengutronix.de
Cc: patchwork-lst@pengutronix.de
Link: http://lkml.kernel.org/r/1437410090-3747-1-git-send-email-l.stach@pengutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

63caae84

15 5月, 2015 2 次提交

sched / idle: Call default_idle_call() from cpuidle_enter_state() · 827a5aef

由 Rafael J. Wysocki 提交于 5月 10, 2015

The check of the cpuidle_enter() return value against -EBUSY
made in call_cpuidle() will not be necessary any more if
cpuidle_enter_state() calls default_idle_call() directly when it
is about to return -EBUSY, so make that happen and eliminate the
check.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: NSudeep Holla <sudeep.holla@arm.com>
Acked-by: NKevin Hilman <khilman@linaro.org>

827a5aef

sched / idle: Call idle_set_state() from cpuidle_enter_state() · faad3849

由 Rafael J. Wysocki 提交于 5月 10, 2015

Introduce a wrapper function around idle_set_state() called
sched_idle_set_state() that will pass this_rq() to it as the
first argument and make cpuidle_enter_state() call the new
function before and after entering the target state.

At the same time, remove direct invocations of idle_set_state()
from call_cpuidle().

This will allow the invocation of default_idle_call() to be
moved from call_cpuidle() to cpuidle_enter_state() safely
and call_cpuidle() to be simplified a bit as a result.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: NSudeep Holla <sudeep.holla@arm.com>
Acked-by: NKevin Hilman <khilman@linaro.org>

faad3849

05 5月, 2015 2 次提交

sched / idle: Eliminate the "reflect" check from cpuidle_idle_call() · bcf6ad8a

由 Rafael J. Wysocki 提交于 5月 04, 2015

Since cpuidle_reflect() should only be called if the idle state
to enter was selected by cpuidle_select(), there is the "reflect"
variable in cpuidle_idle_call() whose value is used to determine
whether or not that is the case.

However, if the entire code run between the conditional setting
"reflect" and the call to cpuidle_reflect() is moved to a separate
function, it will be possible to call that new function in both
branches of the conditional, in which case cpuidle_reflect() will
only need to be called from one of them too and the "reflect"
variable won't be necessary any more.

This eliminates one check made by cpuidle_idle_call() on the majority
of its invocations, so change the code as described.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>

bcf6ad8a

sched / idle: Move the default idle call code to a separate function · 82f66327

由 Rafael J. Wysocki 提交于 5月 04, 2015

Move the code under the "use_default" label in cpuidle_idle_call()
into a separate (new) function.

This just allows the subsequent changes to be more stratightforward.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>

82f66327

29 4月, 2015 1 次提交

cpuidle: Run tick_broadcast_exit() with disabled interrupts · df8d9eea

由 Rafael J. Wysocki 提交于 4月 29, 2015

Commit 335f4919 (sched/idle: Use explicit broadcast oneshot
control function) replaced clockevents_notify() invocations in
cpuidle_idle_call() with direct calls to tick_broadcast_enter()
and tick_broadcast_exit(), but it overlooked the fact that
interrupts were already enabled before calling the latter which
led to functional breakage on systems using idle states with the
CPUIDLE_FLAG_TIMER_STOP flag set.

Fix that by moving the invocations of tick_broadcast_enter()
and tick_broadcast_exit() down into cpuidle_enter_state() where
interrupts are still disabled when tick_broadcast_exit() is
called.  Also ensure that interrupts will be disabled before
running tick_broadcast_exit() even if they have been enabled by
the idle state's ->enter callback.  Trigger a WARN_ON_ONCE() in
that case, as we generally don't want that to happen for states
with CPUIDLE_FLAG_TIMER_STOP set.

Fixes: 335f4919 (sched/idle: Use explicit broadcast oneshot control function)
Reported-and-tested-by: NLinus Walleij <linus.walleij@linaro.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Reported-and-tested-by: NSudeep Holla <sudeep.holla@arm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

df8d9eea

03 4月, 2015 1 次提交

sched/idle: Use explicit broadcast oneshot control function · 335f4919

由 Thomas Gleixner 提交于 4月 03, 2015

Replace the clockevents_notify() call with an explicit function call.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/6422336.RMm7oUHcXh@vostro.rjw.lanSigned-off-by: NIngo Molnar <mingo@kernel.org>

335f4919

13 3月, 2015 2 次提交

rcu: Handle outgoing CPUs on exit from idle loop · 88428cc5

由 Paul E. McKenney 提交于 1月 28, 2015

This commit informs RCU of an outgoing CPU just before that CPU invokes
arch_cpu_idle_dead() during its last pass through the idle loop (via a
new CPU_DYING_IDLE notifier value). This change means that RCU need not
deal with outgoing CPUs passing through the scheduler after informing
RCU that they are no longer online. Note that removing the CPU from
the rcu_node ->qsmaskinit bit masks is done at CPU_DYING_IDLE time,
and orphaning callbacks is still done at CPU_DEAD time, the reason being
that at CPU_DEAD time we have another CPU that can adopt them.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

88428cc5

cpu: Make CPU-offline idle-loop transition point more precise · 528a25b0

由 Paul E. McKenney 提交于 1月 28, 2015

This commit uses a per-CPU variable to make the CPU-offline code path
through the idle loop more precise, so that the outgoing CPU is
guaranteed to make it into the idle loop before it is powered off.
This commit is in preparation for putting the RCU offline-handling
code on this code path, which will eliminate the magic one-jiffy
wait that RCU uses as the maximum time for an outgoing CPU to get
all the way through the scheduler.

The magic one-jiffy wait for incoming CPUs remains a separate issue.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

528a25b0

06 3月, 2015 1 次提交

cpuidle / sleep: Use broadcast timer for states that stop local timer · ef2b22ac

由 Rafael J. Wysocki 提交于 3月 02, 2015

Commit 38106313 (PM / sleep: Re-implement suspend-to-idle handling)
overlooked the fact that entering some sufficiently deep idle states
by CPUs may cause their local timers to stop and in those cases it
is necessary to switch over to a broadcast timer prior to entering
the idle state. If the cpuidle driver in use does not provide
the new ->enter_freeze callback for any of the idle states, that
problem affects suspend-to-idle too, but it is not taken into account
after the changes made by commit 38106313.

Fix that by changing the definition of cpuidle_enter_freeze() and
re-arranging of the code in cpuidle_idle_call(), so the former does
not call cpuidle_enter() any more and the fallback case is handled
by cpuidle_idle_call() directly.

Fixes: 38106313 (PM / sleep: Re-implement suspend-to-idle handling)
Reported-and-tested-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>

ef2b22ac

03 3月, 2015 1 次提交

cpuidle: Clean up fallback handling in cpuidle_idle_call() · dfcacc15

由 Rafael J. Wysocki 提交于 3月 02, 2015

Move the fallback code path in cpuidle_idle_call() to the end of the
function to avoid jumping to a label in an if () branch.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

dfcacc15

01 3月, 2015 1 次提交

idle / sleep: Avoid excessive disabling and enabling interrupts · 01e04f46

由 Rafael J. Wysocki 提交于 2月 27, 2015

Disabling interrupts at the end of cpuidle_enter_freeze() is not
useful, because its caller, cpuidle_idle_call(), re-enables them
right away after invoking it.

To avoid that unnecessary back and forth dance with interrupts,
make cpuidle_enter_freeze() enable interrupts after calling
enter_freeze_proper() and drop the local_irq_disable() at its
end, so that all of the code paths in it end up with interrupts
enabled.  Then, cpuidle_idle_call() will not need to re-enable
interrupts after calling cpuidle_enter_freeze() any more, because
the latter will return with interrupts enabled, in analogy with
cpuidle_enter().
Reported-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>

01e04f46

14 2月, 2015 1 次提交

PM / sleep: Re-implement suspend-to-idle handling · 38106313

由 Rafael J. Wysocki 提交于 2月 12, 2015

In preparation for adding support for quiescing timers in the final
stage of suspend-to-idle transitions, rework the freeze_enter()
function making the system wait on a wakeup event, the freeze_wake()
function terminating the suspend-to-idle loop and the mechanism by
which deep idle states are entered during suspend-to-idle.

First of all, introduce a simple state machine for suspend-to-idle
and make the code in question use it.

Second, prevent freeze_enter() from losing wakeup events due to race
conditions and ensure that the number of online CPUs won't change
while it is being executed.  In addition to that, make it force
all of the CPUs re-enter the idle loop in case they are in idle
states already (so they can enter deeper idle states if possible).

Next, drop cpuidle_use_deepest_state() and replace use_deepest_state
checks in cpuidle_select() and cpuidle_reflect() with a single
suspend-to-idle state check in cpuidle_idle_call().

Finally, introduce cpuidle_enter_freeze() that will simply find the
deepest idle state available to the given CPU and enter it using
cpuidle_enter().
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>

38106313

31 1月, 2015 1 次提交

sched/idle: Add missing checks to the exit condition of cpu_idle_poll() · ff6f2d29

由 Preeti U Murthy 提交于 1月 21, 2015

cpu_idle_poll() is entered into when either the cpu_idle_force_poll is set or
tick_check_broadcast_expired() returns true. The exit condition from
cpu_idle_poll() is tif_need_resched().

However this does not take into account scenarios where cpu_idle_force_poll
changes or tick_check_broadcast_expired() returns false, without setting
the resched flag. So a cpu will be caught in cpu_idle_poll() needlessly,
thereby wasting power. Add an explicit check on cpu_idle_force_poll and
tick_check_broadcast_expired() to the exit condition of cpu_idle_poll()
to avoid this.
Signed-off-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150121105655.15279.59626.stgit@preeti.in.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

ff6f2d29

24 9月, 2014 1 次提交

sched: Let the scheduler see CPU idle states · 442bf3aa

由 Daniel Lezcano 提交于 9月 04, 2014

When the cpu enters idle, it stores the cpuidle state pointer in its
struct rq instance which in turn could be used to make a better decision
when balancing tasks.

As soon as the cpu exits its idle state, the struct rq reference is
cleared.

There are a couple of situations where the idle state pointer could be changed
while it is being consulted:

1. For x86/acpi with dynamic c-states, when a laptop switches from battery
   to AC that could result on removing the deeper idle state. The acpi driver
   triggers:
	'acpi_processor_cst_has_changed'
		'cpuidle_pause_and_lock'
			'cpuidle_uninstall_idle_handler'
				'kick_all_cpus_sync'.

All cpus will exit their idle state and the pointed object will be set to
NULL.

2. The cpuidle driver is unloaded. Logically that could happen but not
in practice because the drivers are always compiled in and 95% of them are
not coded to unregister themselves.  In any case, the unloading code must
call 'cpuidle_unregister_device', that calls 'cpuidle_pause_and_lock'
leading to 'kick_all_cpus_sync' as mentioned above.

A race can happen if we use the pointer and then one of these two scenarios
occurs at the same moment.

In order to be safe, the idle state pointer stored in the rq must be
used inside a rcu_read_lock section where we are protected with the
'rcu_barrier' in the 'cpuidle_uninstall_idle_handler' function. The
idle_get_state() and idle_put_state() accessors should be used to that
effect.
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: NNicolas Pitre <nico@linaro.org>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: linux-pm@vger.kernel.org
Cc: linaro-kernel@lists.linaro.org
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

442bf3aa

09 7月, 2014 1 次提交

cpuidle: move idle traces to cpuidle_enter_state() · 30fe6884

由 Sandeep Tripathy 提交于 7月 02, 2014

idle_exit event is the first event after a core exits
idle state. So this should be traced before local irq
is ebabled. Likewise idle_entry is the last event before
a core enters idle state. This will ease visualising the
cpu idle state from kernel traces.
Signed-off-by: NSandeep Tripathy <sandeep.tripathy@linaro.org>
Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
[rjw: Subject, rebase]
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

30fe6884

05 7月, 2014 1 次提交

sched/idle: Drop !! while calculating 'broadcast' · 89abb5ad

由 Viresh Kumar 提交于 6月 24, 2014

We don't need 'broadcast' to be set to 'zero or one', but to 'zero or non-zero'
and so the extra operation to convert it to 'zero or one' can be skipped.

Also change type of 'broadcast' to unsigned int, i.e. type of
drv->states[*].flags.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Cc: linaro-kernel@lists.linaro.org
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/0dfbe2976aa108c53e08d3477ea90f6360c1f54c.1403584026.git.viresh.kumar@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

89abb5ad

05 6月, 2014 2 次提交

sched/idle: Optimize try-to-wake-up IPI · e3baac47

由 Peter Zijlstra 提交于 6月 04, 2014

[ This series reduces the number of IPIs on Andy's workload by something like
  99%. It's down from many hundreds per second to very few.

  The basic idea behind this series is to make TIF_POLLING_NRFLAG be a
  reliable indication that the idle task is polling.  Once that's done,
  the rest is reasonably straightforward. ]

When enqueueing tasks on remote LLC domains, we send an IPI to do the
work 'locally' and avoid bouncing all the cachelines over.

However, when the remote CPU is idle (and polling, say x86 mwait), we
don't need to send an IPI, we can simply kick the TIF word to wake it
up and have the 'idle' loop do the work.

So when _TIF_POLLING_NRFLAG is set, but _TIF_NEED_RESCHED is not (yet)
set, set _TIF_NEED_RESCHED and avoid sending the IPI.
Much-requested-by: NAndy Lutomirski <luto@amacapital.net>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
[Edited by Andy Lutomirski, but this is mostly Peter Zijlstra's code.]
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Cc: nicolas.pitre@linaro.org
Cc: daniel.lezcano@linaro.org
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: umgwanakikbuti@gmail.com
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/ce06f8b02e7e337be63e97597fc4b248d3aa6f9b.1401902905.git.luto@amacapital.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

e3baac47

sched/idle: Clear polling before descheduling the idle thread · 82c65d60

由 Andy Lutomirski 提交于 6月 04, 2014

Currently, the only real guarantee provided by the polling bit is
that, if you hold rq->lock and the polling bit is set, then you can
set need_resched to force a reschedule.

The only reason the lock is needed is that the idle thread might not
be running at all when setting its need_resched bit, and rq->lock
keeps it pinned.

This is easy to fix: just clear the polling bit before scheduling.
Now the idle thread's polling bit is only ever set when
rq->curr == rq->idle.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: nicolas.pitre@linaro.org
Cc: daniel.lezcano@linaro.org
Cc: umgwanakikbuti@gmail.com
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/b2059fcb4c613d520cb503b6fad6e47033c7c203.1401902905.git.luto@amacapital.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

82c65d60

08 5月, 2014 3 次提交

sched/idle: Make cpuidle_idle_call() void · 08c373e5

由 Rafael J. Wysocki 提交于 4月 21, 2014

The only value ever returned by cpuidle_idle_call() is 0 and its
only caller ignores that value anyway, so make it void.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/4717784.WmVEpDoliM@vostro.rjw.lanSigned-off-by: NIngo Molnar <mingo@kernel.org>

08c373e5

sched/idle: Reflow cpuidle_idle_call() · 37352273

由 Peter Zijlstra 提交于 4月 11, 2014

Apply goto to reduce lines and nesting levels.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-cc6vb0snt3sr7op6rlbfeqfh@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

37352273

sched/idle: Delay clearing the polling bit · c444117f

由 Peter Zijlstra 提交于 4月 11, 2014

With the generic idle functions assuming !polling we should only clear
the polling bit at the very last opportunity in order to avoid
spurious IPIs.

Ideally we'd flip the default to polling, but that means auditing all
arch idle functions.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-vq7719foqzf6z5h4j7eh7f9e@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

c444117f

01 5月, 2014 1 次提交

cpuidle: Combine cpuidle_enabled() with cpuidle_select() · 52c324f8

由 Rafael J. Wysocki 提交于 5月 01, 2014

Since both cpuidle_enabled() and cpuidle_select() are only called by
cpuidle_idle_call(), it is not really useful to keep them separate
and combining them will help to avoid complicating cpuidle_idle_call()
even further if governors are changed to return error codes sometimes.

This code modification shouldn't lead to any functional changes.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

52c324f8

11 3月, 2014 4 次提交

sched/idle: Add more comments to the code · a1d028bd

由 Daniel Lezcano 提交于 3月 03, 2014

The idle main function is a complex and a critical function. Added more
comments to the code.
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: NNicolas Pitre <nico@linaro.org>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: tglx@linutronix.de
Cc: rjw@rjwysocki.net
Cc: preeti@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1393832934-11625-5-git-send-email-daniel.lezcano@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

a1d028bd

sched/idle: Move idle conditions in cpuidle_idle main function · 8ca3c642

由 Daniel Lezcano 提交于 3月 03, 2014

This patch moves the condition before entering idle into the cpuidle main
function located in idle.c. That simplify the idle mainloop functions and
increase the readibility of the conditions to enter truly idle.

This patch is code reorganization and does not change the behavior of the
function.
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: tglx@linutronix.de
Cc: rjw@rjwysocki.net
Cc: nicolas.pitre@linaro.org
Cc: preeti@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1393832934-11625-4-git-send-email-daniel.lezcano@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

8ca3c642

sched/idle: Reorganize the idle loop · c8cc7d4d

由 Daniel Lezcano 提交于 3月 03, 2014

Now that we have the main cpuidle function in idle.c, move some code from
the idle mainloop to this function for the sake of clarity.

That removes if then else indentation difficult to follow when looking at the
code. This patch does not change the current behavior.
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: NNicolas Pitre <nico@linaro.org>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: tglx@linutronix.de
Cc: rjw@rjwysocki.net
Cc: preeti@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1393832934-11625-3-git-send-email-daniel.lezcano@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

c8cc7d4d

cpuidle/idle: Move the cpuidle_idle_call function to idle.c · 30cdd69e

由 Daniel Lezcano 提交于 3月 03, 2014

The cpuidle_idle_call does nothing more than calling the three individuals
function and is no longer used by any arch specific code but only in the
cpuidle framework code.

We can move this function into the idle task code to ensure better
proximity to the scheduler code.
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: rjw@rjwysocki.net
Cc: preeti@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1393832934-11625-2-git-send-email-daniel.lezcano@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

30cdd69e

27 2月, 2014 1 次提交

sched/idle: Remove stale old file · 06d50c65

由 Peter Zijlstra 提交于 2月 24, 2014

Commit cf37b6b4 ("sched/idle: Move cpu/idle.c to sched/idle.c")
said to simply move a file; somehow it got mangled and created an old
version of the file and forgot to remove the old file.

Fix this fail; add the lost change and remove the now identical old
file.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: rjw@rjwysocki.net
Cc: nicolas.pitre@linaro.org
Cc: preeti@linux.vnet.ibm.com
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/20140224172207.GC9987@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

06d50c65

11 2月, 2014 2 次提交

sched/idle: Move cpu/idle.c to sched/idle.c · cf37b6b4

由 Nicolas Pitre 提交于 1月 26, 2014

Integration of cpuidle with the scheduler requires that the idle loop be
closely integrated with the scheduler proper. Moving cpu/idle.c into the
sched directory will allow for a smoother integration, and eliminate a
subdirectory which contained only one source file.
Signed-off-by: NNicolas Pitre <nico@linaro.org>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.LFD.2.11.1401301102210.1652@knanqh.ubzrSigned-off-by: NIngo Molnar <mingo@kernel.org>

cf37b6b4

sched/idle: Move the cpuidle entry point to the generic idle loop · af8cd8ef

由 Nicolas Pitre 提交于 1月 29, 2014

In order to integrate cpuidle with the scheduler, we must have a better
proximity in the core code with what cpuidle is doing and not delegate
such interaction to arch code.

Architectures implementing arch_cpu_idle() should simply enter
a cheap idle mode in the absence of a proper cpuidle driver.

In both cases i.e. whether it is a cpuidle driver or the default
arch_cpu_idle(), the calling convention expects IRQs to be disabled
on entry and enabled on exit. There is a warning in place already but
let's add a forced IRQ enable here as well.  This will allow for
removing the forced IRQ enable some implementations do locally and
allowing for the warning to trig.
Signed-off-by: NNicolas Pitre <nico@linaro.org>
Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Olof Johansson <olof@lixom.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.11.1401291526320.1652@knanqh.ubzrSigned-off-by: NIngo Molnar <mingo@kernel.org>

af8cd8ef

14 1月, 2014 1 次提交

sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding · 8cb75e0c

由 Peter Zijlstra 提交于 11月 20, 2013

With various drivers wanting to inject idle time; we get people
calling idle routines outside of the idle loop proper.

Therefore we need to be extra careful about not missing
TIF_NEED_RESCHED -> PREEMPT_NEED_RESCHED propagations.

While looking at this, I also realized there's a small window in the
existing idle loop where we can miss TIF_NEED_RESCHED; when it hits
right after the tif_need_resched() test at the end of the loop but
right before the need_resched() test at the start of the loop.

So move preempt_fold_need_resched() out of the loop where we're
guaranteed to have TIF_NEED_RESCHED set.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-x9jgh45oeayzajz2mjt0y7d6@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

8cb75e0c

25 9月, 2013 2 次提交

sched: Add NEED_RESCHED to the preempt_count · f27dde8d

由 Peter Zijlstra 提交于 8月 14, 2013

In order to combine the preemption and need_resched test we need to
fold the need_resched information into the preempt_count value.

Since the NEED_RESCHED flag is set across CPUs this needs to be an
atomic operation, however we very much want to avoid making
preempt_count atomic, therefore we keep the existing TIF_NEED_RESCHED
infrastructure in place but at 3 sites test it and fold its value into
preempt_count; namely:

 - resched_task() when setting TIF_NEED_RESCHED on the current task
 - scheduler_ipi() when resched_task() sets TIF_NEED_RESCHED on a
                   remote task it follows it up with a reschedule IPI
                   and we can modify the cpu local preempt_count from
                   there.
 - cpu_idle_loop() for when resched_task() found tsk_is_polling().

We use an inverted bitmask to indicate need_resched so that a 0 means
both need_resched and !atomic.

Also remove the barrier() in preempt_enable() between
preempt_enable_no_resched() and preempt_check_resched() to avoid
having to reload the preemption value and allow the compiler to use
the flags of the previuos decrement. I couldn't come up with any sane
reason for this barrier() to be there as preempt_enable_no_resched()
already has a barrier() before doing the decrement.
Suggested-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-7a7m5qqbn5pmwnd4wko9u6da@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

f27dde8d

sched, idle: Fix the idle polling state logic · ea811747

由 Peter Zijlstra 提交于 9月 11, 2013

Mike reported that commit 7d1a9417 ("x86: Use generic idle loop")
regressed several workloads and caused excessive reschedule
interrupts.

The patch in question failed to notice that the x86 code had an
inverted sense of the polling state versus the new generic code (x86:
default polling, generic: default !polling).

Fix the two prominent x86 mwait based idle drivers and introduce a few
new generic polling helpers (fixing the wrong smp_mb__after_clear_bit
usage).

Also switch the idle routines to using tif_need_resched() which is an
immediate TIF_NEED_RESCHED test as opposed to need_resched which will
end up being slightly different.
Reported-by: NMike Galbraith <bitbucket@online.de>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: lenb@kernel.org
Cc: tglx@linutronix.de
Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

ea811747

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功