- 09 March 2016, 9 commits
-
-
Committed by Rafael J. Wysocki
Since it is possible to obtain a pointer to struct dbs_governor from a pointer to the struct governor embedded in it with the help of container_of(), the additional gov pointer in struct dbs_data isn't really necessary. Drop that pointer and make the code using it reach the dbs_governor object via policy->governor.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
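As a rough illustration of the container_of() pattern this change relies on (the structure layout and helper name below are assumptions for illustration, not the actual patch):

    #include <linux/cpufreq.h>
    #include <linux/kernel.h>

    /* hypothetical layout: the cpufreq governor interface embedded in
     * the common governor structure */
    struct dbs_governor_example {
            struct cpufreq_governor gov;
            /* ... common governor data ... */
    };

    /* recover the outer structure from policy->governor */
    static inline struct dbs_governor_example *
    dbs_governor_of(struct cpufreq_policy *policy)
    {
            return container_of(policy->governor,
                                struct dbs_governor_example, gov);
    }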
-
Committed by Rafael J. Wysocki
Since it is possible to obtain a pointer to struct dbs_governor from a pointer to the struct governor embedded in it via container_of(), the second argument of cpufreq_governor_init() is not necessary. Accordingly, cpufreq_governor_dbs() doesn't need its second argument either, and the ->governor callbacks for both the ondemand and conservative governors may be set to cpufreq_governor_dbs() directly. Make that happen.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
-
Committed by Rafael J. Wysocki
The ondemand and conservative governors are represented by struct common_dbs_data, whose name doesn't reflect the purpose it is used for, so rename it to struct dbs_governor and rename variables of that type accordingly. No functional changes.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
-
Committed by Rafael J. Wysocki
For the ondemand and conservative governors (generally, governors that use the common code in cpufreq_governor.c), there are two static data structures representing the governor: the struct governor structure (the interface to the cpufreq core) and the struct common_dbs_data one (the interface to the cpufreq_governor.c code). There's no fundamental reason why those two structures have to be separate.

Moreover, if the struct governor one is included into struct common_dbs_data, it will be possible to reach the latter from the policy via its policy->governor pointer, so it won't be necessary to pass a separate pointer to it around. For this reason, embed struct governor in struct common_dbs_data.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
-
Committed by Rafael J. Wysocki
Do not pass struct dbs_data pointers to the family of functions implementing governor operations in cpufreq_governor.c, as they can take that pointer from policy->governor by themselves. The cpufreq_governor_init() case is slightly more complicated, since policy->governor may be NULL when it is invoked, but then it can reach the pointer in question via its cdata argument just fine.

While at it, rework cpufreq_governor_dbs() to avoid a pointless policy_governor check in the CPUFREQ_GOV_POLICY_INIT case.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
-
Committed by Rafael J. Wysocki
Every governor relying on the common code in cpufreq_governor.c has to provide its own mutex in struct common_dbs_data. However, there actually is no need to have a separate mutex per governor for this purpose; they may use the same global mutex just fine. Accordingly, introduce a single common mutex for that and drop the mutex field from struct common_dbs_data.

That at least will ensure that the mutex is always present and initialized regardless of what the particular governors do. Another benefit is that the common code does not need a pointer to a governor-related structure to get to the mutex, which sometimes helps. Finally, it makes the code generally easier to follow.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
-
Committed by Rafael J. Wysocki
Instead of using a per-CPU deferrable timer for queuing up governor work items, register a utilization update callback that will be invoked from the scheduler on utilization changes. The sampling rate is still the same as what was used for the deferrable timers, and the added irq_work overhead should be offset by the eliminated timer overhead, so in theory the functional impact of this patch should not be significant.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
-
Committed by Rafael J. Wysocki
Instead of using a per-CPU deferrable timer for utilization sampling and P-state adjustments, register a utilization update callback that will be invoked from the scheduler on utilization changes. The sampling rate is still the same as what was used for the deferrable timers, so the functional impact of this patch should not be significant.

Based on an earlier patch from Srinivas Pandruvada.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
-
Committed by Rafael J. Wysocki
Introduce a mechanism by which parts of the cpufreq subsystem ("setpolicy" drivers or the core) can register callbacks to be executed from cpufreq_update_util(), which is invoked by the scheduler's update_load_avg() on CPU utilization changes. This allows the "setpolicy" drivers to dispense with their timers and do all of the computations they need, as well as frequency/voltage adjustments, in the update_load_avg() code path, among other things.

The update_load_avg() changes were suggested by Peter Zijlstra.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
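A purely conceptual sketch of this callback-registration pattern; every identifier below is hypothetical and not the kernel's actual API. The idea is a per-CPU callback pointer that the scheduler invokes on utilization updates, replacing governor timers:

    #define EXAMPLE_NR_CPUS 8

    struct example_update_util {
            void (*func)(struct example_update_util *hook,
                         unsigned long long time_ns);
    };

    static struct example_update_util *example_hooks[EXAMPLE_NR_CPUS];

    /* cpufreq side: install (or clear with NULL) the hook for a CPU;
     * the real code publishes the pointer in an RCU-safe way */
    static void example_set_update_util(int cpu, struct example_update_util *hook)
    {
            example_hooks[cpu] = hook;
    }

    /* scheduler side: called from the load-tracking path for @cpu */
    static void example_run_update_util(int cpu, unsigned long long now_ns)
    {
            struct example_update_util *hook = example_hooks[cpu];

            if (hook)
                    hook->func(hook, now_ns);
    }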
-
- 05 February 2016, 1 commit
-
-
Committed by Rafael J. Wysocki
The preprocessor magic used for setting the default cpufreq governor (and for using the performance governor as a fallback one, for that matter) is really nasty, so replace it with __weak functions and overrides.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
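A minimal sketch of the weak-function override pattern, with illustrative names rather than the actual cpufreq symbols:

    struct cpufreq_governor;

    /* weak default in the core (the kernel's __weak macro expands to
     * this attribute); a governor built into the kernel can supply a
     * strong definition in its own file that replaces this one at
     * link time */
    struct cpufreq_governor * __attribute__((weak)) example_default_governor(void)
    {
            return (struct cpufreq_governor *)0;
    }

The core can then simply call example_default_governor() and fall back to a safe choice when it returns NULL, instead of selecting the governor through #ifdef chains.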
-
- 28 January 2016, 4 commits
-
-
Committed by Arnd Bergmann
gcc warns quite a bit about values returned from allocate_resources() in cpufreq-dt.c:

  cpufreq-dt.c: In function 'cpufreq_init':
  cpufreq-dt.c:327:6: error: 'cpu_dev' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  cpufreq-dt.c:197:17: note: 'cpu_dev' was declared here
  cpufreq-dt.c:376:2: error: 'cpu_clk' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  cpufreq-dt.c:199:14: note: 'cpu_clk' was declared here
  cpufreq-dt.c: In function 'dt_cpufreq_probe':
  cpufreq-dt.c:461:2: error: 'cpu_clk' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  cpufreq-dt.c:447:14: note: 'cpu_clk' was declared here

The problem is that it's slightly hard for gcc to follow return codes across PTR_ERR() calls. This patch uses explicit assignments to the "ret" variable to make it easier for gcc to verify that the code is actually correct, without the need to add a bogus initialization.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
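A hedged sketch of the pattern being described (hypothetical helper, not the cpufreq-dt code): assigning the error to ret explicitly keeps the data flow simple enough for gcc to see that the clock is only used on success.

    #include <linux/clk.h>
    #include <linux/err.h>

    static int example_get_clk(struct device *dev, struct clk **clk_out)
    {
            struct clk *clk;
            int ret;

            clk = clk_get(dev, NULL);
            if (IS_ERR(clk)) {
                    ret = PTR_ERR(clk);     /* explicit, easy to track */
                    return ret;
            }

            *clk_out = clk;
            return 0;
    }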
-
Committed by Arnd Bergmann
There are two definitions of pxa_cpufreq_change_voltage, with slightly different prototypes after one of them had its argument marked 'const'. Now the other one (for !CONFIG_REGULATOR) produces a harmless warning:

  drivers/cpufreq/pxa2xx-cpufreq.c: In function 'pxa_set_target':
  drivers/cpufreq/pxa2xx-cpufreq.c:291:36: warning: passing argument 1 of 'pxa_cpufreq_change_voltage' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
    ret = pxa_cpufreq_change_voltage(&pxa_freq_settings[idx]);
                                     ^
  drivers/cpufreq/pxa2xx-cpufreq.c:205:12: note: expected 'struct pxa_freqs *' but argument is of type 'const struct pxa_freqs *'
   static int pxa_cpufreq_change_voltage(struct pxa_freqs *pxa_freq)
              ^

This changes the prototype in the same way as the other one, which avoids the warning.

Fixes: 03c22990 (cpufreq: pxa: make pxa_freqs arrays const)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.2+ <stable@vger.kernel.org> # 4.2+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Gautham R Shenoy
Currently next_policy() explicitly checks whether a policy is the last policy in cpufreq_policy_list. Use the standard list_is_last() primitive instead.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
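For reference, a tiny hedged example of list_is_last() from <linux/list.h>, with a hypothetical structure standing in for the real policy type:

    #include <linux/list.h>

    struct example_policy {
            struct list_head node;
    };

    /* true when @p is the final entry on the list headed by @head */
    static bool example_is_last_policy(struct example_policy *p,
                                       struct list_head *head)
    {
            return list_is_last(&p->node, head);
    }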
-
Committed by Viresh Kumar
There is a race discovered by Juri, where we are able to:
- create and read a sysfs file before policy->governor_data has been set to a non-NULL value, OR
- set policy->governor_data to NULL and read a sysfs file before it is destroyed.

And so a crash like this is reported:

  Unable to handle kernel NULL pointer dereference at virtual address 0000000c
  pgd = edfc8000
  [0000000c] *pgd=bfc8c835
  Internal error: Oops: 17 [#1] SMP ARM
  Modules linked in:
  CPU: 4 PID: 1730 Comm: cat Not tainted 4.5.0-rc1+ #463
  Hardware name: ARM-Versatile Express
  task: ee8e8480 ti: ee930000 task.ti: ee930000
  PC is at show_ignore_nice_load_gov_pol+0x24/0x34
  LR is at show+0x4c/0x60
  pc : [<c058f1bc>]  lr : [<c058ae88>]  psr: a0070013
  sp : ee931dd0  ip : ee931de0  fp : ee931ddc
  r10: ee4bc290  r9 : 00001000  r8 : ef2cb000
  r7 : ee4bc200  r6 : ef2cb000  r5 : c0af57b0  r4 : ee4bc2e0
  r3 : 00000000  r2 : 00000000  r1 : c0928df4  r0 : ef2cb000
  Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
  Control: 10c5387d  Table: adfc806a  DAC: 00000051
  Process cat (pid: 1730, stack limit = 0xee930210)
  Stack: (0xee931dd0 to 0xee932000)
  1dc0:                                     ee931dfc ee931de0 c058ae88 c058f1a4
  1de0: edce3bc0 c07bfca4 edce3ac0 00001000 ee931e24 ee931e00 c01fcb90 c058ae48
  1e00: 00000001 edce3bc0 00000000 00000001 ee931e50 ee8ff480 ee931e34 ee931e28
  1e20: c01fb33c c01fcb0c ee931e8c ee931e38 c01a5210 c01fb314 ee931e9c ee931e48
  1e40: 00000000 edce3bf0 befe4a00 ee931f78 00000000 00000000 000001e4 00000000
  1e60: c00545a8 edce3ac0 00001000 00001000 befe4a00 ee931f78 00000000 00001000
  1e80: ee931ed4 ee931e90 c01fbed8 c01a5038 ed085a58 00020000 00000000 00000000
  1ea0: c0ad72e4 ee931f78 ee8ff488 ee8ff480 c077f3fc 00001000 befe4a00 ee931f78
  1ec0: 00000000 00001000 ee931f44 ee931ed8 c017c328 c01fbdc4 00001000 00000000
  1ee0: ee8ff480 00001000 ee931f44 ee931ef8 c017c65c c03deb10 ee931fac ee931f08
  1f00: c0009270 c001f290 c0a8d968 ef2cb000 ef2cb000 ee8ff480 00000020 ee8ff480
  1f20: ee8ff480 befe4a00 00001000 ee931f78 00000000 00000000 ee931f74 ee931f48
  1f40: c017d1ec c017c2f8 c019c724 c019c684 ee8ff480 ee8ff480 00001000 befe4a00
  1f60: 00000000 00000000 ee931fa4 ee931f78 c017d2a8 c017d160 00000000 00000000
  1f80: 000a9f20 00001000 befe4a00 00000003 c000ffe4 ee930000 00000000 ee931fa8
  1fa0: c000fe40 c017d264 000a9f20 00001000 00000003 befe4a00 00001000 00000000
  Unable to handle kernel NULL pointer dereference at virtual address 0000000c
  1fc0: 000a9f20 00001000 befe4a00 00000003 00000000 00000000 00000003 00000001
  pgd = edfc4000
  [0000000c] *pgd=bfcac835
  1fe0: 00000000 befe49dc 000197f8 b6e35dfc 60070010 00000003 3065b49d 134ac2c9
  [<c058f1bc>] (show_ignore_nice_load_gov_pol) from [<c058ae88>] (show+0x4c/0x60)
  [<c058ae88>] (show) from [<c01fcb90>] (sysfs_kf_seq_show+0x90/0xfc)
  [<c01fcb90>] (sysfs_kf_seq_show) from [<c01fb33c>] (kernfs_seq_show+0x34/0x38)
  [<c01fb33c>] (kernfs_seq_show) from [<c01a5210>] (seq_read+0x1e4/0x4e4)
  [<c01a5210>] (seq_read) from [<c01fbed8>] (kernfs_fop_read+0x120/0x1a0)
  [<c01fbed8>] (kernfs_fop_read) from [<c017c328>] (__vfs_read+0x3c/0xe0)
  [<c017c328>] (__vfs_read) from [<c017d1ec>] (vfs_read+0x98/0x104)
  [<c017d1ec>] (vfs_read) from [<c017d2a8>] (SyS_read+0x50/0x90)
  [<c017d2a8>] (SyS_read) from [<c000fe40>] (ret_fast_syscall+0x0/0x1c)
  Code: e5903044 e1a00001 e3081df4 e34c1092 (e593300c)
  ---[ end trace 5994b9a5111f35ee ]---

Fix that by making sure policy->governor_data is updated at the right places only.

Cc: 4.2+ <stable@vger.kernel.org> # 4.2+
Reported-and-tested-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 05 January 2016, 2 commits
-
-
Committed by Andrzej Hajda
The function can return negative values, so its result should be assigned to a signed type. The problem has been detected using the proposed semantic patch scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci.

Link: http://permalink.gmane.org/gmane.linux.kernel/2038576
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
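A small hedged illustration of this bug class (hypothetical functions, not the patched driver): a negative error stored in an unsigned variable makes the error check unreachable.

    #include <linux/errno.h>
    #include <linux/printk.h>

    static int example_may_fail(void)
    {
            return -EINVAL;
    }

    static void example_check(void)
    {
            unsigned int bad = example_may_fail();  /* bad < 0 is never true */
            int good = example_may_fail();          /* sign is preserved */

            if (good < 0)
                    pr_err("example_may_fail() returned %d\n", good);
            (void)bad;
    }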
-
Committed by Chen Yu
It is reported that, with CONFIG_HZ_PERIODIC=y, the CPU stays at the lowest frequency even if the usage goes to 100%; neither the ondemand nor the conservative governor works, although performance and userspace work as expected. With CONFIG_NO_HZ_FULL=y, everything goes well.

This problem is caused by improper calculation of the idle_time when the load is extremely high (near 100%). Firstly, cpufreq_governor uses get_cpu_idle_time to get the total idle time for a specific CPU, then:

1. If the system is configured with CONFIG_NO_HZ_FULL, the idle time is returned by ktime_get, which is always increasing, so it's OK.
2. However, if the system is configured with CONFIG_HZ_PERIODIC, get_cpu_idle_time is not guaranteed to be always increasing, because it leverages get_cpu_idle_time_jiffy to calculate the idle_time. Consider the following scenario:

  At T1:
    idle_tick_1 = total_tick_1 - user_tick_1

  sample period (80ms)...

  At T2 (T2 = T1 + 80ms):
    idle_tick_2 = total_tick_2 - user_tick_2

Currently the algorithm uses (idle_tick_2 - idle_tick_1) to get the delta idle_time during the past sample period, however it CANNOT guarantee that idle_tick_2 >= idle_tick_1, especially when CPU load is high. (Yes, total_tick_2 >= total_tick_1 and user_tick_2 >= user_tick_1, but how about idle_tick_2 and idle_tick_1? No guarantee.) So the governor might get a negative value for the idle_time of the past sample period, which can mislead the system into thinking the idle time is very big (once converted to unsigned int) and the busy time is nearly zero, which causes the governor to always choose the lowest cpufreq; hence this problem.

In theory there are two solutions:
1. The logic should not rely on the idle tick during every sample period, but be based on the busy tick directly, as this is how 'top' is implemented.
2. Or the logic must make sure that the idle_time is strictly increasing during each sample period, so there would be no negative idle_time anymore. This solution requires minimal modification to the current code, and this patch uses method 2.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=69821
Reported-by: Jan Fikar <j.fikar@gmail.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
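A hedged sketch of the idea behind method 2 (not the literal patch): if the idle time handed to the governor is derived from two values that only ever grow (total wall time and busy time), the per-sample idle delta cannot underflow.

    /* both arguments are monotonically increasing counters for a CPU,
     * and busy time never grows faster than wall time */
    static unsigned long long example_idle_time(unsigned long long wall_us,
                                                unsigned long long busy_us)
    {
            return wall_us - busy_us;
    }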
-
- 02 January 2016, 1 commit
-
-
Committed by Pi-Cheng Chen
Modify the mt8173-cpufreq driver to get OPP-sharing information and set up the OPP table provided by the operating-points-v2 bindings.

Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 01 January 2016, 3 commits
-
-
Committed by Rafael J. Wysocki
Notice that the boost_supported field in struct cpufreq_driver is redundant, because the driver's ->set_boost callback may be left unset if "boost" is not supported. Moreover, the only driver populating the ->set_boost callback is acpi_cpufreq, so make it avoid populating that callback if "boost" is not supported, rework the core to check ->set_boost instead of boost_supported to verify "boost" support, and drop boost_supported, which isn't used any more.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
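Roughly, the check described above amounts to something like this hypothetical helper (a sketch, not the actual core code):

    #include <linux/cpufreq.h>

    /* "boost" is supported when the driver provides a ->set_boost callback */
    static bool example_boost_supported(struct cpufreq_driver *drv)
    {
            return drv && drv->set_boost;
    }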
-
Committed by Rafael J. Wysocki
The store_boost() routine is only used by store_cpb(), so move the code from it directly into that function and rename _store_boost() to set_boost() to make its name reflect the name of the driver callback pointing to it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
-
Committed by Rafael J. Wysocki
cpufreq_boost_supported() is not used outside of cpufreq.c, so make it static. While at it, refactor it as a one-liner (which it really is).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
-
- 28 December 2015, 2 commits
-
-
Committed by Markus Elfring
The cpu_set_cclk() function has only been used in a single source file so far. Indicate this by giving it the corresponding (static) linkage specifier.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Markus Elfring
The return type "unsigned long" was used by the cpu_set_cclk() function, while the type "int" is what the clk_set_rate() function provides. Let us make this usage consistent. This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
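For illustration (hypothetical wrapper, not the driver's code): clk_set_rate() returns an int that may be a negative error code, so a wrapper around it should propagate that type unchanged.

    #include <linux/clk.h>

    static int example_set_cclk(struct clk *cclk, unsigned long rate_hz)
    {
            return clk_set_rate(cclk, rate_hz);  /* int, possibly negative */
    }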
-
- 24 December 2015, 1 commit
-
-
Committed by Dan Carpenter
The "domain" variable needs to be signed for the error handling to work.

Fixes: 8def3103 (cpufreq: arm_big_little: add SCPI interface driver)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 17 December 2015, 1 commit
-
-
Committed by Stewart Smith
Long ago, only in the lab, there were OPALv1 and OPALv2. Now there is just OPALv3, with nobody ever expecting anything on pre-OPALv3 to be cared about or supported by mainline kernels. So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL exclusively.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 12 December 2015, 4 commits
-
-
Committed by Lee Jones
The bootloader is charged with the responsibility to provide platform-specific Dynamic Voltage and Frequency Scaling (DVFS) information via Device Tree. This driver takes the supplied configuration and registers it with the new generic OPP framework, to then be used with CPUFreq.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Prarit Bhargava
Commit 785ee278 ("cpufreq: intel_pstate: Fix limits->max_perf rounding error") hardcodes the value of FRAC_BITS. This patch fixes that minor issue.

Fixes: 785ee278 (cpufreq: intel_pstate: Fix limits->max_perf rounding error)
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
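For context, fixed-point helpers of the kind intel_pstate builds on look roughly like the following; the names and the 8-bit fraction are assumptions for illustration. The point of the fix is to express shifts and scaling through the macro rather than a hardcoded constant.

    #include <linux/types.h>

    #define EXAMPLE_FRAC_BITS 8
    #define example_int_tofp(x) ((s64)(x) << EXAMPLE_FRAC_BITS)
    #define example_fp_toint(x) ((x) >> EXAMPLE_FRAC_BITS)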
-
Committed by Pi-Cheng Chen
Since the return value of the cpufreq driver's ->init() is not propagated to the device driver model now, move resource allocation into ->probe() to handle -EPROBE_DEFER properly.

Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Arnd Bergmann
This driver is the only one that calls regulator_sync_voltage(), but it can currently be built with CONFIG_REGULATOR disabled, producing this build error:

  drivers/cpufreq/tegra124-cpufreq.c: In function 'tegra124_cpu_switch_to_pllx':
  drivers/cpufreq/tegra124-cpufreq.c:68:2: error: implicit declaration of function 'regulator_sync_voltage' [-Werror=implicit-function-declaration]
    regulator_sync_voltage(priv->vdd_cpu_reg);

My first attempt was to implement a helper for regulator_sync_voltage, but Mark Brown explained:

  We don't do this for *all* regulator API functions - there's some where
  using them strongly suggests that there is actually a dependency on the
  regulator API. This does seem like it might be falling into the
  specialist category [...] Looking at the code I'm pretty unclear on what
  the authors think the use of _sync_voltage() is doing in the first place
  so it may be even better to just remove the call. It seems to have been
  included in the first commit so there's no changelog explaining things
  and there's no comment either. I'd *expect* it to be a noop as far as I
  can see.

This adds the dependency to make the driver always build successfully or not be enabled at all. Alternatively, we could investigate whether the driver should stop calling regulator_sync_voltage instead.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 10 December 2015, 12 commits
-
-
Committed by Philippe Longepe
In cases where we have many IOs, the global load becomes low and the load algorithm will decrease the requested P-state. Because of that, the IO overheads will increase and impact the IO performance. To improve IO-bound work, we can count the io-wait time as busy time when calculating CPU busy. This change uses get_cpu_iowait_time_us() to obtain the IO wait time value and converts that time into the number of cycles spent waiting on IO at the TSC rate. At the moment, this trick is only used for Atom.

Signed-off-by: Philippe Longepe <philippe.longepe@intel.com>
Signed-off-by: Stephane Gasparini <stephane.gasparini@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
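As a hedged sketch of that conversion (illustrative helper, not the driver code): with the TSC rate expressed in kHz, which is cycles per millisecond, a microsecond interval maps to cycles as below.

    /* cycles = (us / 1e6) * (tsc_khz * 1000) = us * tsc_khz / 1000 */
    static unsigned long long example_iowait_cycles(unsigned long long iowait_us,
                                                    unsigned long long tsc_khz)
    {
            return iowait_us * tsc_khz / 1000;
    }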
-
Committed by Philippe Longepe
The current function to calculate CPU utilization uses the average P-state ratio (APERF/MPERF) scaled by the ratio of the current P-state to the max available non-turbo one. This leads to an overestimation of utilization, which causes higher-performance P-states to be selected more often, and that leads to increased energy consumption. This is a problem for low-power systems, so it is better to use a different utilization calculation algorithm for them.

Namely, the Percent Busy value (or load) can be estimated as the ratio of the MPERF counter, which runs at a constant rate only during active periods (C0), to the time stamp counter (TSC), which also runs (at the same rate) during idle. That is:

  Percent Busy = 100 * (delta_mperf / delta_tsc)

Use this algorithm for platforms with SoCs based on the Airmont and Silvermont Atom cores.

Signed-off-by: Philippe Longepe <philippe.longepe@intel.com>
Signed-off-by: Stephane Gasparini <stephane.gasparini@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
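The same estimate in 64-bit integer math, as a hedged sketch with a hypothetical helper name:

    #include <linux/math64.h>

    static unsigned int example_percent_busy(u64 delta_mperf, u64 delta_tsc)
    {
            if (!delta_tsc)
                    return 0;

            return (unsigned int)div64_u64(100 * delta_mperf, delta_tsc);
    }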
-
Committed by Philippe Longepe
Target systems using different CPUs have different power and performance requirements. They may use different algorithms to get the next P-state based on their power or performance preference. For example, power-constrained systems may not want to use high-performance P-states as aggressively as a full-size desktop or a server platform. A server platform may want to run close to the max to achieve better performance, while laptop-like systems may prefer sacrificing performance for longer battery life.

For the above reasons, modify intel_pstate to allow the target P-state selection algorithm to depend on the CPU ID.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Philippe Longepe <philippe.longepe@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Pi-Cheng Chen
Sometimes the regulator_get_voltage() call returns negative values for various reasons (e.g. an underlying I2C bus timeout). Add a check for the return values and fail out early.

Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Pi-Cheng Chen
Remove the redundant regulator_get_voltage() call used to get the Vsram value, since it will be obtained later at the beginning of the voltage tracking loop.

Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Pi-Cheng Chen
Add CPUFREQ_HAVE_GOVERNOR_PER_POLICY to have an individual set of tunables for each cluster of MT8173.

Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Hongtao Jia
Register the qoriq cpufreq driver as a cooling device, based on the thermal device tree framework. When the temperature crosses the passive trip point, cpufreq is used to throttle CPUs.

Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Jacob Tanenbaum
The cpufreq documentation specifies policy->cpuinfo.transition_latency as

  the time it takes on this CPU to switch between two frequencies in
  nanoseconds (if appropriate, else specify CPUFREQ_ETERNAL)

but currently pcc-cpufreq does not expose the value and sets it to zero. Change the pcc-cpufreq driver and its documentation to conform to the default value specified in Documentation/cpu-freq/cpu-drivers.txt.

Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Punit Agrawal
Register passive cooling devices when initialising cpufreq on big.LITTLE systems. If the device tree provides a dynamic power coefficient for the CPUs, then the bound cooling device will support the extensions that allow it to be used with all the existing thermal governors, including the power allocator governor.

A cooling device will be created per individual frequency domain and can be bound to thermal zones via the thermal DT bindings.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Punit Agrawal
Support registering cooling devices with a dynamic power coefficient where provided by the device tree. This allows OF-registered cooling devices to be used with the power_allocator thermal governor.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Rafael J. Wysocki
It is possible to get rid of the timer_lock spinlock used by the governor timer function for synchronization, but a couple of races need to be avoided.

The first race is between multiple dbs_timer_handler() instances that may be running in parallel with each other on different CPUs. Namely, one of them has to queue up the work item, but it cannot be queued up more than once. To achieve that, atomic_inc_return() can be used on the skip_work field of struct cpu_common_dbs_info.

The second race is between an already running dbs_timer_handler() and gov_cancel_work(). In that case the dbs_timer_handler() might not notice the skip_work incrementation in gov_cancel_work() and it might queue up its work item after gov_cancel_work() had returned (and that work item would corrupt skip_work going forward). To prevent that from happening, gov_cancel_work() can be made to wait for the timer function to complete (on all CPUs) right after skip_work has been incremented.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
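A hedged sketch of the single-queuer idea only; the counter and helper names below are assumptions, and the real synchronization in the patch is more involved.

    #include <linux/atomic.h>

    static atomic_t example_skip_work = ATOMIC_INIT(0);

    /* timer handler: queue the work item only if we are the first to
     * raise the counter; a concurrent canceller raises it too, so a
     * racing handler sees a value greater than one and backs off */
    static bool example_timer_should_queue_work(void)
    {
            return atomic_inc_return(&example_skip_work) == 1;
    }

    /* work item completion: drop our count again */
    static void example_work_done(void)
    {
            atomic_dec(&example_skip_work);
    }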
-
Committed by Viresh Kumar
Currently update_sampling_rate() runs over each online CPU and cancels/queues timers on all policy->cpus every time. This should be done just once for any CPU belonging to a policy. Create a cpumask and keep clearing it as and when we process policies, so that we don't have to traverse all CPUs of the same policy again.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
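A hedged sketch of that bookkeeping (hypothetical helper, not the patch itself): start from a copy of the online mask and clear a whole policy's CPUs once that policy has been processed, so none of them is visited again.

    #include <linux/cpumask.h>

    static void example_mark_policy_processed(struct cpumask *to_visit,
                                              const struct cpumask *policy_cpus)
    {
            cpumask_andnot(to_visit, to_visit, policy_cpus);
    }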
-