Commits · 5e083c5c1f94464104194ad0d29aa0c3af2b4655 · openanolis / cloud-kernel

18 Mar, 2020 1 commit

cpuidle: menu: Do not update last_state_idx in menu_select() · 5e083c5c

Authored by Rafael J. Wysocki on Oct 02, 2018

commit eb40a380bff28f84b6583bba6786b46ef26ef548 upstream

It is not necessary to update data->last_state_idx in menu_select()
as it is only used in menu_update(), which only runs when
data->needs_update is set and that is set only when updating
data->last_state_idx in menu_reflect().

Accordingly, drop the update of data->last_state_idx from
menu_select() and get rid of the (now redundant) "out" label
from it.

No intentional behavior changes.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Yihao Wu <wuyihao@linux.alibaba.com>
Acked-by: Michael Wang <yun.wang@linux.alibaba.com>

5e083c5c

27 Dec, 2019 1 commit

sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD · 4ca637b4

Authored by Johannes Weiner on Oct 26, 2018

commit 8508cf3ffad4defa202b303e5b6379efc4cd9054 upstream.

There are several definitions of those functions/macros in places that
mess with fixed-point load averages.  Provide an official version.
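
As an illustration of the fixed-point format these macros operate on (a sketch, not the patch itself): the constants below follow what include/linux/sched/loadavg.h is understood to define, with 11 fractional bits, and should be treated as assumptions here. The helper format_load() is hypothetical.

```c
#include <stdio.h>

/* Assumed to match the kernel's definitions: 11 bits of fraction. */
#define FSHIFT   11
#define FIXED_1  (1UL << FSHIFT)

/* Integer part of a fixed-point load value. */
#define LOAD_INT(x)  ((x) >> FSHIFT)
/* First two decimal digits of the fractional part. */
#define LOAD_FRAC(x) LOAD_INT(((x) & (FIXED_1 - 1)) * 100)

/* Hypothetical helper: render a fixed-point load as "I.FF". */
static void format_load(unsigned long load, char *buf, size_t len)
{
    snprintf(buf, len, "%lu.%02lu", LOAD_INT(load), LOAD_FRAC(load));
}
```

With these definitions a raw value of 3072 (1.5 * FIXED_1) is rendered as "1.50".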

[akpm@linux-foundation.org: fix missed conversion in block/blk-iolatency.c]
Link: http://lkml.kernel.org/r/20180828172258.3185-5-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Daniel Drake <drake@endlessm.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <jweiner@fb.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Enderborg <peter.enderborg@sony.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[Joseph: use stat.mean instead of stat->rqs.mean to solve conflict]
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>

Conflicts:
    block/blk-iolatency.c

4ca637b4

24 Nov, 2019 1 commit

cpuidle: menu: Fix wakeup statistics updates for polling state · 27ab8f16

Authored by Rafael J. Wysocki on Oct 02, 2018

[ Upstream commit 5f26bdceb9c0a5e6c696aa2899d077cd3ae93413 ]

If the CPU exits the "polling" state due to the time limit in the
loop in poll_idle(), this is not a real wakeup and it just means
that the "polling" state selection was not adequate.  The governor
mispredicted short idle duration, but had a more suitable state been
selected, the CPU might have spent more time in it.  In fact, there
is no reason to expect that there would have been a wakeup event
earlier than the next timer in that case.

Handling such cases as regular wakeups in menu_update() may cause the
menu governor to make suboptimal decisions going forward, but ignoring
them altogether would not be correct either, because every time
menu_select() is invoked, it makes a separate new attempt to predict
the idle duration taking distinct time to the closest timer event as
input and the outcomes of all those attempts should be recorded.

For this reason, make menu_update() always assume that if the
"polling" state was exited due to the time limit, the next proper
wakeup event for the CPU would be the next timer event (not
including the tick).
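
The rule in the paragraph above can be sketched roughly as follows. This is an illustration only, not the kernel's menu_update(); the function and parameter names are hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: choose the idle duration that gets recorded in the
 * governor statistics. If the "polling" state ended because poll_idle() hit
 * its time limit, no real wakeup happened, so assume the CPU would have
 * stayed idle until the next timer event.
 */
static unsigned int recorded_idle_us(bool polling_state, bool poll_time_limit,
                                     unsigned int measured_us,
                                     unsigned int next_timer_us)
{
    if (polling_state && poll_time_limit)
        return next_timer_us;

    return measured_us;
}
```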

Fixes: a37b969a "cpuidle: poll_state: Add time limit to poll_idle()"
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

27ab8f16

25 Aug, 2018 1 commit

cpuidle: menu: Retain tick when shallow state is selected · 757ab15c

Authored by Rafael J. Wysocki on Aug 21, 2018

The case addressed by commit 5ef499cd (cpuidle: menu: Handle
stopped tick more aggressively) in the stopped tick case is present
when the tick has not been stopped yet too. Namely, if only two CPU
idle states, shallow state A with target residency significantly
below the tick boundary and deep state B with target residency
significantly above it, are available and the predicted idle
duration is above the tick boundary, but below the target residency
of state B, state A will be selected and the CPU may spend an indefinite
amount of time in it, which is not quite energy-efficient.

However, if the tick has not been stopped yet and the governor is
about to select a shallow idle state for the CPU even though the idle
duration predicted by it is above the tick boundary, it should be
fine to wake up the CPU early, so the tick can be retained then and
the governor will have a chance to select a deeper state when it runs
next time.

[Note that when this really happens, it will make the idle duration
predictor believe that the CPU might be idle longer than predicted,
which will make it more likely to predict longer idle durations going
forward, but that will also cause deeper idle states to be selected
going forward, on average, which is what's needed here.]

Fixes: 87c9fe6e (cpuidle: menu: Avoid selecting shallow states with stopped tick)
Reported-by: Leo Yan <leo.yan@linaro.org>
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+: 5ef499cd (cpuidle: menu: Handle ...)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

757ab15c

20 Aug, 2018 1 commit

cpuidle: menu: Handle stopped tick more aggressively · 5ef499cd

Authored by Rafael J. Wysocki on Aug 14, 2018

Commit 87c9fe6e (cpuidle: menu: Avoid selecting shallow states
with stopped tick) missed the case when the target residencies of
deep idle states of CPUs are above the tick boundary which may cause
the CPU to get stuck in a shallow idle state for a long time.

Say there are two CPU idle states available: one shallow, with the
target residency much below the tick boundary and one deep, with
the target residency significantly above the tick boundary.  In
that case, if the tick has been stopped already and the expected
next timer event is relatively far in the future, the governor will
assume the idle duration to be equal to TICK_USEC and it will select
the idle state for the CPU accordingly.  However, that will cause the
shallow state to be selected even though it would have been more
energy-efficient to select the deep one.

To address this issue, modify the governor to always use the time
till the closest timer event instead of the predicted idle duration
if the latter is less than the tick period length and the tick has
been stopped already.  Also make it extend the search for a matching
idle state if the tick is stopped to avoid settling on a shallow
state if deep states with target residencies above the tick period
length are available.

In addition, make it always indicate that the tick should be stopped
if it has been stopped already for consistency.
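
A minimal sketch of the two rules described above, assuming a 4000 us tick period purely for illustration. The helper names are hypothetical and the real menu_select() is considerably more involved.

```c
#include <stdbool.h>

/* Assumed tick period in microseconds (HZ=250); illustration only. */
#define TICK_USEC 4000u

/*
 * With the tick already stopped, a prediction shorter than one tick period
 * is not trusted: fall back to the time till the closest timer event.
 */
static unsigned int effective_idle_us(unsigned int predicted_us,
                                      unsigned int next_timer_us,
                                      bool tick_stopped)
{
    if (tick_stopped && predicted_us < TICK_USEC)
        return next_timer_us;

    return predicted_us;
}

/* If the tick has been stopped already, keep reporting "stop the tick". */
static bool stop_tick_hint(bool tick_stopped, bool governor_wants_stop)
{
    return tick_stopped || governor_wants_stop;
}
```

The longer effective duration is what lets the state search reach deep states whose target residency lies above the tick period.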

Fixes: 87c9fe6e (cpuidle: menu: Avoid selecting shallow states with stopped tick)
Reported-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

5ef499cd

17 Aug, 2018 1 commit

cpuidle: menu: Update stale polling override comment · 50f7ccc6

Authored by Rafael J. Wysocki on Aug 16, 2018

The comment explaining why the menu governor sometimes uses idle
state 1 instead of idle state 0 as the first one is stale (among
other things, it mentions a user setting that is no longer present),
so update it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

50f7ccc6

15 Aug, 2018 1 commit

cpuidle: menu: Fix white space · f390c5eb

Authored by Rafael J. Wysocki on Aug 14, 2018

Fix some damaged white space in menu_select().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

f390c5eb

31 May, 2018 2 commits

cpuidle: governors: Consolidate PM QoS handling · 0fc784fb

Authored by Rafael J. Wysocki on May 30, 2018

There is some code duplication related to the PM QoS handling between
the existing cpuidle governors, so move that code to a common helper
function and call that from the governors.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

0fc784fb

cpuidle: governors: Drop redundant checks related to PM QoS · cf7eeea9

Authored by Rafael J. Wysocki on May 30, 2018

PM_QOS_RESUME_LATENCY_NO_CONSTRAINT is defined as the 32-bit integer
maximum, so it is not necessary to test the return value of
dev_pm_qos_raw_read_value() against it directly in the menu and
ladder cpuidle governors.

Drop these redundant checks.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cf7eeea9

09 Apr, 2018 2 commits

cpuidle: menu: Avoid selecting shallow states with stopped tick · 87c9fe6e

Authored by Rafael J. Wysocki on Apr 05, 2018

If the scheduler tick has been stopped already and the governor
selects a shallow idle state, the CPU can spend a long time in that
state if the selection is based on an inaccurate prediction of idle
time.  That effect turns out to be relevant, so it needs to be
mitigated.

To that end, modify the menu governor to discard the result of the
idle time prediction if the tick is stopped and the predicted idle
time is less than the tick period length, unless the tick timer is
going to expire soon.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

87c9fe6e

cpuidle: menu: Refine idle state selection for running tick · 296bb1e5

Authored by Rafael J. Wysocki on Apr 05, 2018

If the tick isn't stopped, the target residency of the state selected
by the menu governor may be greater than the actual time to the next
tick and that means lost energy.

To avoid that, make tick_nohz_get_sleep_length() return the current
time to the next event (before stopping the tick) in addition to the
estimated one via an extra pointer argument and make menu_select()
use that value to refine the state selection when necessary.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

296bb1e5

06 Apr, 2018 1 commit

cpuidle: Return nohz hint from cpuidle_select() · 45f1ff59

Authored by Rafael J. Wysocki on Mar 22, 2018

Add a new pointer argument to cpuidle_select() and to the ->select
cpuidle governor callback to allow a boolean value indicating
whether or not the tick should be stopped before entering the
selected state to be returned from there.

Make the ladder governor ignore that pointer (to preserve its
current behavior) and make the menu governor return 'false' through
it if:
 (1) the idle exit latency is constrained at 0, or
 (2) the selected state is a polling one, or
 (3) the expected idle period duration is within the tick period
     range.
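
The three conditions listed above could be expressed along these lines. This is a sketch under the assumption that "within the tick period range" means "shorter than one tick period"; the function and parameter names are hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical helper returning the nohz hint: true means "stop the tick
 * before entering the selected state".
 */
static bool menu_stop_tick_hint(int latency_req_us, bool state_is_polling,
                                unsigned int expected_idle_us,
                                unsigned int tick_us)
{
    if (latency_req_us == 0)          /* (1) exit latency constrained at 0 */
        return false;
    if (state_is_polling)             /* (2) polling state selected */
        return false;
    if (expected_idle_us < tick_us)   /* (3) idle period within one tick */
        return false;

    return true;
}
```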

In addition to that, the correction factor computations in the menu
governor need to take the possibility that the tick may not be
stopped into account to avoid artificially small correction factor
values.  To that end, add a mechanism to record tick wakeups, as
suggested by Peter Zijlstra, and use it to modify the menu_update()
behavior when tick wakeup occurs.  Namely, if the CPU is woken up by
the tick and the return value of tick_nohz_get_sleep_length() is not
within the tick boundary, the predicted idle duration is likely too
short, so make menu_update() try to compensate for that by updating
the governor statistics as though the CPU was idle for a long time.

Since the value returned through the new argument pointer of
cpuidle_select() is not used by its caller yet, this change by
itself is not expected to alter the functionality of the code.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

45f1ff59

08 Nov, 2017 1 commit

PM / QoS: Fix device resume latency framework · 0759e80b

Authored by Rafael J. Wysocki on Nov 07, 2017

The special value of 0 for device resume latency PM QoS means
"no restriction", but there are two problems with that.

First, device resume latency PM QoS requests with 0 as the
value are always put in front of requests with positive
values in the priority lists used internally by the PM QoS
framework, causing 0 to be chosen as an effective constraint
value.  However, that 0 is then interpreted as "no restriction"
effectively overriding the other requests with specific
restrictions which is incorrect.

Second, the users of device resume latency PM QoS have no
way to specify that *any* resume latency at all should be
avoided, which is an artificial limitation in general.

To address these issues, modify device resume latency PM QoS to
use S32_MAX as the "no constraint" value and 0 as the "no
latency at all" one and rework its users (the cpuidle menu
governor, the genpd QoS governor and the runtime PM framework)
to follow these changes.
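
To see why the new encoding composes cleanly, consider a simplified aggregation that takes the minimum over all requests. This is a sketch, not the PM QoS implementation; NO_CONSTRAINT stands in for the kernel's S32_MAX and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for S32_MAX, the "no constraint" value after this change. */
#define NO_CONSTRAINT INT32_MAX

/*
 * Effective resume latency constraint: the smallest requested value.
 * "No constraint" is the largest value, so it never overrides a real
 * restriction, and 0 naturally means "no latency at all".
 */
static int32_t effective_resume_latency(const int32_t *req, size_t n)
{
    int32_t min = NO_CONSTRAINT;

    for (size_t i = 0; i < n; i++)
        if (req[i] < min)
            min = req[i];

    return min;
}
```

Under the old convention, a request of 0 would win the same minimum and then be read as "no restriction", which is the first problem described above.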

Also add a special "n/a" value to the corresponding user space I/F
to allow user space to indicate that it cannot accept any resume
latencies at all for the given device.

Fixes: 85dc0b8a (PM / QoS: Make it possible to expose PM QoS latency constraints)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=197323
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Tero Kristo <t-kristo@ti.com>
Reviewed-by: Ramesh Thomas <ramesh.thomas@intel.com>

0759e80b

01 Nov, 2017 1 commit

Revert "PM / QoS: Fix device resume latency PM QoS" · d5919dcc

Authored by Rafael J. Wysocki on Oct 31, 2017

This reverts commit 0cc2b4e5 (PM / QoS: Fix device resume latency PM
QoS) as it introduced regressions on multiple systems and the fix-up
in commit 2a9a86d5 (PM / QoS: Fix default runtime_pm device resume
latency) does not address all of them.

The original problem that commit 0cc2b4e5 was attempting to fix
will be addressed later.

Fixes: 0cc2b4e5 (PM / QoS: Fix device resume latency PM QoS)
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

d5919dcc

24 Oct, 2017 1 commit

PM / QoS: Fix device resume latency PM QoS · 0cc2b4e5

Authored by Rafael J. Wysocki on Oct 24, 2017

The special value of 0 for device resume latency PM QoS means
"no restriction", but there are two problems with that.

First, device resume latency PM QoS requests with 0 as the
value are always put in front of requests with positive
values in the priority lists used internally by the PM QoS
framework, causing 0 to be chosen as an effective constraint
value.  However, that 0 is then interpreted as "no restriction"
effectively overriding the other requests with specific
restrictions which is incorrect.

Second, the users of device resume latency PM QoS have no
way to specify that *any* resume latency at all should be
avoided, which is an artificial limitation in general.

To address these issues, modify device resume latency PM QoS to
use S32_MAX as the "no constraint" value and 0 as the "no
latency at all" one and rework its users (the cpuidle menu
governor, the genpd QoS governor and the runtime PM framework)
to follow these changes.

Also add a special "n/a" value to the corresponding user space I/F
to allow user space to indicate that it cannot accept any resume
latencies at all for the given device.

Fixes: 85dc0b8a (PM / QoS: Make it possible to expose PM QoS latency constraints)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=197323
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alex Shi <alex.shi@linaro.org>
Cc: All applicable <stable@vger.kernel.org>

0cc2b4e5

30 Aug, 2017 1 commit

cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol · dc2251bf

Authored by Rafael J. Wysocki on Aug 23, 2017

On some architectures the first (index 0) idle state is a polling
one and it doesn't really save energy, so there is the
CPUIDLE_DRIVER_STATE_START symbol allowing some pieces of
cpuidle code to avoid using that state.

However, this makes the code rather hard to follow.  It is better
to explicitly avoid the polling state, so add a new cpuidle state
flag CPUIDLE_FLAG_POLLING to mark it and make the relevant code
check that flag for the first state instead of using the
CPUIDLE_DRIVER_STATE_START symbol.

In the ACPI processor driver that cannot always rely on the state
flags (like before the states table has been set up) define
a new internal symbol ACPI_IDLE_STATE_START equivalent to the
CPUIDLE_DRIVER_STATE_START one and drop the latter.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>

dc2251bf

30 Jun, 2017 1 commit

cpuidle: menu: allow state 0 to be disabled · 3ed09c94

Authored by Nicholas Piggin on Jun 26, 2017

The menu driver does not allow state0 to be disabled completely.
If it is disabled but other enabled states don't meet latency
requirements, it is still used.

Fix this by starting with the first enabled idle state. Fall back
to state 0 if no idle states are enabled (arguably this should be
-EINVAL if it is attempted, but this is the minimal fix).
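
The starting-point rule described above amounts to something like the following sketch. The function name and the plain boolean array are hypothetical simplifications of the driver and per-device state tables.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: index of the first enabled idle state, falling back
 * to state 0 when every state is disabled.
 */
static int first_enabled_state(const bool *disabled, int count)
{
    for (int i = 0; i < count; i++)
        if (!disabled[i])
            return i;

    return 0;
}
```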

Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

3ed09c94

02 Mar, 2017 2 commits

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/stat.h> · 03441a34

Authored by Ingo Molnar on Feb 08, 2017

We are going to split <linux/sched/stat.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/stat.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

03441a34

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/loadavg.h> · 4f17722c

Authored by Ingo Molnar on Feb 08, 2017

We are going to split <linux/sched/loadavg.h> out of <linux/sched.h>, which
will have to be picked up from a couple of .c files.

Create a trivial placeholder <linux/sched/loadavg.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

4f17722c

27 Feb, 2017 1 commit

cpuidle: menu: Avoid taking spinlock for accessing QoS values · 6dbf5cea

Authored by Rafael J. Wysocki on Feb 24, 2017

After commit 9908859a (cpuidle/menu: add per CPU PM QoS resume
latency consideration) the cpuidle menu governor calls
dev_pm_qos_read_value() on CPU devices to read the current resume
latency QoS constraint values for them.  That function takes a spinlock
to prevent the device's power.qos pointer from becoming NULL during
the access which is a problem for the RT patchset where spinlocks are
converted into mutexes and the idle loop stops working.

However, it is not even necessary for the menu governor to take
that spinlock, because the power.qos pointer accessed under it
cannot be modified during the access anyway.

For this reason, introduce a "raw" routine for accessing device
QoS resume latency constraints without locking and use it in the
menu governor.

Fixes: 9908859a (cpuidle/menu: add per CPU PM QoS resume latency consideration)
Acked-by: Alex Shi <alex.shi@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

6dbf5cea

30 Jan, 2017 2 commits

cpuidle/menu: add per CPU PM QoS resume latency consideration · 9908859a

Authored by Alex Shi on Jan 12, 2017

There may be special requirements on CPU response time; for example, if an
interrupt is pinned to a CPU, that CPU should not go into excessively
deep idle states.  For this reason, add a mechanism for adding
PM QoS resume latency constraints for individual CPUs and modify the
menu governor to take them into account.

To that end, extend the device PM QoS pm_qos_resume_latency attribute
to CPUs, which is possible, because the exit latency for CPUs is
effectively equivalent to the resume latency for devices.

Signed-off-by: Alex Shi <alex.shi@linaro.org>
Acked-by: Rik van Riel <riel@redhat.com>
[ rjw : Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

9908859a

cpuidle/menu: stop seeking deeper idle if current state is deep enough · 8e37e1a2

Authored by Alex Shi on Jan 12, 2017

Obsolete commit 71abbbf8 (cpuidle: extend cpuidle and menu governor
to handle dynamic states) wanted to introduce dynamic C-states, but that
idea was dropped long ago.  The nonsense deeper C-state checking
remained, though.

Since both target_residency and exit_latency are longer for deeper
idle states, there's no need to waste CPU time on useless checks.

Signed-off-by: Alex Shi <alex.shi@linaro.org>
Acked-by: Rik van Riel <riel@redhat.com>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

8e37e1a2

21 Oct, 2016 1 commit

cpuidle: governors: Remove remaining old module code · e5f1b245

Authored by Daniel Lezcano on Oct 05, 2016

The governors' code uses try_module_get() and module_put() to refcount
the governor's module. But the governors are not compiled as modules.

The refcount does not prevent switching the governor or unloading
a module, as the governors aren't compiled as modules. The code is
pointless, so remove it.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

e5f1b245

21 Mar, 2016 1 commit

cpuidle: menu: Fall back to polling if next timer event is near · 0c313cb2

Authored by Rafael J. Wysocki on Mar 20, 2016

Commit a9ceb78b (cpuidle,menu: use interactivity_req to disable
polling) changed the behavior of the fallback state selection part
of menu_select() so it looks at interactivity_req instead of
data->next_timer_us when it makes its decision.  That effectively
caused polling to be used more often as fallback idle which led to
significant increases of energy consumption in some cases.

Commit e132b9b3 (cpuidle: menu: use high confidence factors
only when considering polling) changed that logic again to be more
predictable, but that didn't help with the increased energy
consumption problem.

For this reason, go back to making decisions on which state to fall
back to based on data->next_timer_us which is the time we know for
sure something will happen rather than a prediction (which may be
inaccurate and turns out to be so often enough to be problematic).
However, take the target residency of the first proper idle state
(C1) into account, so that state is not used as the fallback one
if its target residency is greater than data->next_timer_us.

Fixes: a9ceb78b (cpuidle,menu: use interactivity_req to disable polling)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-and-tested-by: Doug Smythies <dsmythies@telus.net>

0c313cb2

17 Mar, 2016 1 commit

cpuidle: menu: use high confidence factors only when considering polling · e132b9b3

Authored by Rik van Riel on Mar 16, 2016

The menu governor uses five different factors to pick the
idle state:
 - the user configured latency_req
 - the time until the next timer (next_timer_us)
 - the typical sleep interval, as measured recently
 - an estimate of sleep time by dividing next_timer_us by an observed factor
 - a load corrected version of the above, divided again by load

Only the first three items are known with enough confidence that
we can use them to consider polling, instead of an actual CPU
idle state, because the cost of being wrong about polling can be
excessive power use.

The latter two are used in the menu governor's main selection
loop, and can result in choosing a shallower idle state when
the system is expected to be busy again soon.

This pushes a busy system in the "performance" direction of
the performance<>power tradeoff, when choosing between idle
states, but stays more strictly on the "power" state when
deciding between polling and C1.

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

e132b9b3

17 Feb, 2016 2 commits

cpuidle: menu: help gcc generate slightly better code · 3b99669b

Authored by Rasmus Villemoes on Feb 16, 2016

We know that the avg variable actually ends up holding a 32 bit
quantity, since it's an average of such numbers. It is only a u64
because it is temporarily used to hold the sum. Making it an actual
u32 allows gcc to generate slightly better code, e.g. when computing
the square, it can do a 32x32->64 multiply.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

3b99669b

cpuidle: menu: avoid expensive square root computation · 7024b18c

Authored by Rasmus Villemoes on Feb 16, 2016

Computing the integer square root is a rather expensive operation, at
least compared to doing a 64x64 -> 64 multiply (avg*avg) and, on 64
bit platforms, doing an extra comparison to a constant (variance <=
U64_MAX/36).

On 64 bit platforms, this does mean that we add a restriction on the
range of the variance where we end up using the estimate (since
previously the stddev <= ULONG_MAX was a tautology), but on the other
hand, we extend the range quite substantially on 32 bit platforms - in
both cases, we now allow standard deviations up to 715 seconds, which
is for example guaranteed if all observations are less than 1430
seconds.
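
The idea can be sketched as follows, assuming the test being replaced has the form avg > 6 * stddev (the factor 6 is inferred from the constant 36 above). Squaring both sides gives avg * avg > 36 * variance, which needs no square root as long as 36 * variance cannot overflow. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Square-root-free form of "avg > 6 * stddev".
 * variance is stddev squared; avg fits in 32 bits, so avg * avg fits in 64.
 */
static bool avg_exceeds_six_stddev(uint32_t avg, uint64_t variance)
{
    /* Guard against overflow of variance * 36. */
    if (variance > UINT64_MAX / 36)
        return false;

    return (uint64_t)avg * avg > variance * 36;
}
```

The overflow guard is where the 715 second figure comes from: the square root of U64_MAX/36 is roughly 7.16e8 microseconds.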

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

7024b18c

19 Jan, 2016 1 commit

cpuidle: menu: Avoid pointless checks in menu_select() · 5bb1729c

Authored by Rafael J. Wysocki on Jan 16, 2016

If menu_select() cannot find a suitable state to return, it will
return the state index stored in data->last_state_idx.  This
means that it is pointless to look at the states whose indices
are less than or equal to data->last_state_idx in the main loop,
so don't do that.

Given that those checks are done on every idle state selection, this
change can save quite a bit of completely unnecessary overhead.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>

5bb1729c

15 Jan, 2016 1 commit

cpuidle: menu: Fix menu_select() for CPUIDLE_DRIVER_STATE_START == 0 · 9c4b2867

Authored by Rafael J. Wysocki on Jan 14, 2016

Commit a9ceb78b (cpuidle,menu: use interactivity_req to disable
polling) exposed a bug in menu_select() causing it to return -1
on systems with CPUIDLE_DRIVER_STATE_START equal to zero, although
it should have returned 0.  As a result, idle states are not entered
by CPUs on those systems.

Namely, on the systems in question data->last_state_idx is initially
equal to -1 and the above commit modified the condition that would
have caused it to be changed to 0 to be less likely to trigger which
exposed the problem.  However, setting data->last_state_idx initially
to -1 doesn't make sense at all and on the affected systems it should
always be set to CPUIDLE_DRIVER_STATE_START (i.e. 0) unconditionally,
so make that happen.

Fixes: a9ceb78b (cpuidle,menu: use interactivity_req to disable polling)
Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

9c4b2867

17 Nov, 2015 3 commits

cpuidle,menu: smooth out measured_us calculation · efddfd90

Authored by Rik van Riel on Nov 03, 2015

The cpuidle state tables contain the maximum exit latency for each
cpuidle state. On x86, that is the exit latency for when the entire
package goes into that same idle state.

However, a lot of the time we only go into the core idle state,
not the package idle state. This means we see a much smaller exit
latency.

We have no way to detect whether we went into the core or package
idle state while idle, and that is ok.

However, the current menu_update logic does have the potential to
trip up the repeating pattern detection in get_typical_interval.
If the system is experiencing an exit latency near the idle state's
exit latency, some of the samples will have exit_us subtracted,
while others will not. This turns a repeating pattern into mush,
potentially breaking get_typical_interval.

Furthermore, for smaller sleep intervals, we know the chance that
all the cores in the package went to the same idle state are fairly
small. Dividing the measured_us by two, instead of subtracting the
full exit latency when hitting a small measured_us, will reduce the
error.
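
A sketch of the adjustment described above. The exact cut-off for a "small" measured_us is not stated in this message; twice the exit latency is assumed here because it makes the two branches meet at the same value. The function name is hypothetical.

```c
/*
 * Hypothetical helper: correct a measured idle time for exit latency.
 * Long sleeps get the full exit latency subtracted; short ones are halved
 * instead (threshold of 2 * exit latency is an assumption).
 */
static unsigned int adjust_measured_us(unsigned int measured_us,
                                       unsigned int exit_latency_us)
{
    if (measured_us > 2 * exit_latency_us)
        return measured_us - exit_latency_us;

    return measured_us / 2;
}
```

With this threshold the result is continuous: at exactly twice the exit latency, both branches yield the exit latency itself, so nearby samples no longer jump apart.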

Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

efddfd90

cpuidle,menu: use interactivity_req to disable polling · a9ceb78b

Authored by Rik van Riel on Nov 03, 2015

The menu governor carefully figures out how much time we typically
sleep for an estimated sleep interval, or whether there is a repeating
pattern going on, and corrects that estimate for the CPU load.

Then it proceeds to ignore that information when determining whether
or not to consider polling. This is not a big deal on most x86 CPUs,
which have very low C1 latencies, and the patch should not have any
effect on those CPUs.

However, certain CPUs (eg. Atom) have much higher C1 latencies, and
it would be good to not waste performance and power on those CPUs if
we are expecting a very low wakeup latency.

Disable polling based on the estimated interactivity requirement, not
on the time to the next timer interrupt.

Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

a9ceb78b

cpuidle,x86: increase forced cut-off for polling to 20us · 7884084f

Authored by Rik van Riel on Nov 03, 2015

The cpuidle menu governor has a forced cut-off for polling at 5us,
in order to deal with firmware that gives the OS bad information
on cpuidle states, leading to the system spending way too much time
in polling.

However, at least one x86 CPU family (Atom) has chips that have
a 20us break-even point for C1. Forcing the polling cut-off to
less than that wastes performance and power.

Increase the polling cut-off to 20us.

Systems with a lower C1 latency will be found in the states table by
the menu governor, which will pick those states as appropriate.

Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

7884084f

05 May, 2015 1 commit

cpuidle: Check the sign of index in cpuidle_reflect() · a802ea96

Authored by Rafael J. Wysocki on May 04, 2015

Avoid calling the governor's ->reflect method if the state index
passed to cpuidle_reflect() is negative.

This allows the analogous check to be dropped from menu_reflect(),
so do that too, and ensures that arbitrary error codes can be
passed to cpuidle_reflect() as the index with no adverse
consequences.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

a802ea96

17 Apr, 2015 1 commit

cpuidle: menu: use DIV_ROUND_CLOSEST_ULL() · ee3c86f3

Authored by Javi Merino on Apr 16, 2015

Now that the kernel provides DIV_ROUND_CLOSEST_ULL(), drop the internal
implementation and use the kernel one.
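
For reference, rounding division of this kind is typically done by adding half the divisor before dividing. The function below is a userspace sketch of that technique, not the kernel macro itself; its name is hypothetical.

```c
#include <stdint.h>

/*
 * Divide x by a non-zero divisor, rounding to the closest integer
 * (halves round up). Assumes x + divisor / 2 does not overflow.
 */
static uint64_t div_round_closest_u64(uint64_t x, uint32_t divisor)
{
    return (x + divisor / 2) / divisor;
}
```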

Signed-off-by: Javi Merino <javi.merino@arm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ee3c86f3

17 Dec, 2014 1 commit

cpuidle: menu: Better idle duration measurement without using CPUIDLE_FLAG_TIME_INVALID · 4108b3d9

Authored by Len Brown on Dec 16, 2014

When menu sees CPUIDLE_FLAG_TIME_INVALID, it ignores its timestamps,
and assumes that idle lasted as long as the time till next predicted
timer expiration.

But if an interrupt was seen and serviced before that duration,
it would actually be more accurate to use the measured time
rather than rounding up to the next predicted timer expiration.

And if an interrupt is seen and serviced such that the measured time
exceeds the time till next predicted timer expiration, then
truncating to that expiration is the right thing to do --
since we can never stay idle past that timer expiration.

So the code can do a better job without
checking for CPUIDLE_FLAG_TIME_INVALID.
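
In other words, the recorded idle time can simply be the measured time capped at the time till the next predicted timer expiration. A minimal sketch, with a hypothetical function name:

```c
/*
 * Use the measured idle time, truncated to the next timer expiration,
 * since the CPU cannot stay idle past that point.
 */
static unsigned int clamp_measured_us(unsigned int measured_us,
                                      unsigned int next_timer_us)
{
    return measured_us < next_timer_us ? measured_us : next_timer_us;
}
```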

Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

4108b3d9

13 Nov, 2014 1 commit

cpuidle: Invert CPUIDLE_FLAG_TIME_VALID logic · b82b6cca

Authored by Daniel Lezcano on Nov 12, 2014

The only place where the time is invalid is when the ACPI_CSTATE_FFH entry
method is not set. Otherwise for all the drivers, the time can be correctly
measured.

Instead of duplicating the CPUIDLE_FLAG_TIME_VALID flag in all the drivers
for all the states, just invert the logic by replacing it by the flag
CPUIDLE_FLAG_TIME_INVALID, hence we can set this flag only for the acpi idle
driver, remove the former flag from all the drivers and invert the logic with
this flag in the different governors.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

b82b6cca

27 Aug, 2014 1 commit

drivers/cpuidle: Replace __get_cpu_var uses for address calculation · 229b6863

Authored by Christoph Lameter on Aug 17, 2014

All of these are for address calculation. Replace with
this_cpu_ptr().

Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
[cpufreq changes]
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

229b6863

07 Aug, 2014 3 commits

cpuidle: menu: Lookup CPU runqueues less · 372ba8cb

Authored by Mel Gorman on Aug 06, 2014

The menu governor makes separate lookups of the CPU runqueue to get
load and number of IO waiters but it can be done with a single lookup.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

372ba8cb

cpuidle: menu: Call nr_iowait_cpu less times · 64b4ca5c

Authored by Mel Gorman on Aug 06, 2014

menu_select() via inline functions calls nr_iowait_cpu() twice as much
as necessary.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

64b4ca5c

cpuidle: menu: Use ktime_to_us instead of reinventing the wheel · 107d4f46

Authored by Mel Gorman on Aug 06, 2014

The ktime_to_us implementation is slightly better than the one implemented
in menu.c. Use it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

107d4f46
