1. 03 June 2016, 1 commit
    • cpufreq: governor: Get rid of governor events · e788892b
      Committed by Rafael J. Wysocki
      The design of the cpufreq governor API is not very straightforward,
      as struct cpufreq_governor provides only one callback to be invoked
      from different code paths for different purposes.  The purpose it is
      invoked for is determined by its second "event" argument, causing it
      to act as a "callback multiplexer" of sorts.
      
      Unfortunately, that leads to extra complexity in governors, some of
      which implement the ->governor() callback as a switch statement
      that simply checks the event argument and invokes a separate function
      to handle that specific event.
      
      That extra complexity can be eliminated by replacing the all-purpose
      ->governor() callback with a family of callbacks that carry out specific
      governor operations: initialization and exit, start and stop, and policy
      limits updates.  That also turns out to reduce the code size, so do it.
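
      A minimal sketch of the shape of this change (simplified declarations,
      not the exact kernel definitions):

       struct cpufreq_policy;  /* opaque here */

       /* Before: one multiplexed callback, dispatched on an "event" code. */
       struct old_style_governor {
               int (*governor)(struct cpufreq_policy *policy, unsigned int event);
       };

       /* After: one dedicated callback per governor operation. */
       struct new_style_governor {
               int  (*init)(struct cpufreq_policy *policy);
               void (*exit)(struct cpufreq_policy *policy);
               int  (*start)(struct cpufreq_policy *policy);
               void (*stop)(struct cpufreq_policy *policy);
               void (*limits)(struct cpufreq_policy *policy);
       };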
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
  2. 30 May 2016, 4 commits
  3. 28 May 2016, 1 commit
    • remove lots of IS_ERR_VALUE abuses · 287980e4
      Committed by Arnd Bergmann
      Most users of IS_ERR_VALUE() in the kernel are wrong, as they
      pass an 'int' into a function that takes an 'unsigned long'
      argument. This happens to work because the type is sign-extended
      on 64-bit architectures before it gets converted into an
      unsigned type.
      
      However, anything that passes an 'unsigned short' or 'unsigned int'
      argument into IS_ERR_VALUE() is guaranteed to be broken, as are
      8-bit integers and types that are wider than 'unsigned long'.
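
      A small user-space illustration of the failure mode (simplified macro
      without unlikely(); the behaviour shown is for LP64, i.e. 64-bit Linux):

       #include <stdio.h>

       #define MAX_ERRNO 4095
       #define IS_ERR_VALUE(x) ((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

       int main(void)
       {
               int err = -22;           /* like -EINVAL: sign-extends when widened */
               unsigned int uerr = -22; /* same value, but zero-extends to 0xffffffea */

               printf("int:          %d\n", IS_ERR_VALUE(err));  /* 1: error detected */
               printf("unsigned int: %d\n", IS_ERR_VALUE(uerr)); /* 0: error missed   */
               return 0;
       }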
      
      Andrzej Hajda has already fixed a lot of the worst abusers that
      were causing actual bugs, but it would be nice to prevent any
      users that are not passing 'unsigned long' arguments.
      
      This patch changes all users of IS_ERR_VALUE() that I could find
      on 32-bit ARM randconfig builds and x86 allmodconfig. For the
      moment, this doesn't change the definition of IS_ERR_VALUE()
      because there are probably still architecture specific users
      elsewhere.
      
      Almost all the warnings I got are for files that are better off
      using 'if (err)' or 'if (err < 0)'.
      The only legitimate user I could find that we get a warning for
      is the (32-bit only) freescale fman driver, so I did not remove
      the IS_ERR_VALUE() there but changed the type to 'unsigned long'.
      For 9pfs, I just worked around one user whose calling conventions
      are so obscure that I did not dare change the behavior.
      
      I was using this definition for testing:
      
       #define IS_ERR_VALUE(x) ((unsigned long*)NULL == (typeof (x)*)NULL && \
             unlikely((unsigned long long)(x) >= (unsigned long long)(typeof(x))-MAX_ERRNO))
      
      which ends up making all 16-bit or wider types work correctly with
      the most plausible interpretation of what IS_ERR_VALUE() was supposed
      to return according to its users, but also causes a compile-time
      warning for any users that do not pass an 'unsigned long' argument.
      
      I suggested this approach earlier this year, but back then we ended
      up deciding to just fix the users that are obviously broken. After
      the initial warning that caused me to get involved in the discussion
      (fs/gfs2/dir.c) showed up again in the mainline kernel, Linus
      asked me to send the whole thing again.
      
      [ Updated the 9p parts as per Al Viro  - Linus ]
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andrzej Hajda <a.hajda@samsung.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: https://lkml.org/lkml/2016/1/7/363
      Link: https://lkml.org/lkml/2016/5/27/486
      Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> # For nvmem part
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 18 May 2016, 4 commits
  5. 17 May 2016, 1 commit
  6. 13 May 2016, 5 commits
  7. 12 May 2016, 5 commits
  8. 11 May 2016, 2 commits
    • cpufreq: powernv: del_timer_sync when global and local pstate are equal · 0bc10b93
      Committed by Akshay Adiga
      When the global and local pstates are equal in a powernv_target_index()
      call, we don't queue a timer.  However, a timer may already be queued
      from an earlier call, which would make it fire one additional time for
      no purpose, so cancel any such pending timer with del_timer_sync().
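
      A sketch of the idea (simplified; the identifier names are approximate
      rather than taken verbatim from the driver):

       if (gpstate_idx != local_pstate_idx)
               /* Still ramping down the global pstate: (re)arm the timer. */
               queue_gpstate_timer(gpstates);
       else
               /* Nothing left to ramp down: cancel any pending timer. */
               del_timer_sync(&gpstates->timer);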
      Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: powernv: Move smp_call_function_any() out of irq safe block · 1fd3ff28
      Committed by Akshay Adiga
      Fix a WARN_ON triggered by calling smp_call_function_any() with
      interrupts disabled, introduced by the patch ('cpufreq: powernv:
      Ramp-down global pstate slower than local-pstate')
      https://patchwork.ozlabs.org/patch/612058/
      
       WARNING: CPU: 0 PID: 4 at kernel/smp.c:291
      smp_call_function_single+0x170/0x180
      
       Call Trace:
       [c0000007f648f9f0] [c0000007f648fa90] 0xc0000007f648fa90 (unreliable)
       [c0000007f648fa30] [c0000000001430e0] smp_call_function_any+0x170/0x1c0
       [c0000007f648fa90] [c0000000007b4b00]
      powernv_cpufreq_target_index+0xe0/0x250
       [c0000007f648fb00] [c0000000007ac9dc]
      __cpufreq_driver_target+0x20c/0x3d0
       [c0000007f648fbc0] [c0000000007b1b4c] od_dbs_timer+0xcc/0x260
       [c0000007f648fc10] [c0000000007b3024] dbs_work_handler+0x54/0xa0
       [c0000007f648fc50] [c0000000000c49a8] process_one_work+0x1d8/0x590
       [c0000007f648fce0] [c0000000000c4e08] worker_thread+0xa8/0x660
       [c0000007f648fd80] [c0000000000cca88] kthread+0x108/0x130
       [c0000007f648fe30] [c0000000000095e8] ret_from_kernel_thread+0x5c/0x74
      
      - Calling smp_call_function_any() with interrupts disabled (via
        spin_lock_irqsave) can deadlock, because smp_call_function_any()
        relies on an IPI completing.  smp_call_function_any() detects this
        condition, hence the WARN_ON.
      
      - The spinlock (gpstates->lock) is only used to synchronize access to
        global_pstate_info between the timer irq handler and target_index
        calls, and the timer irq handler only uses trylock, so it cannot
        deadlock.  Hence the spinlock does not need to be irq safe.
      
      - Since smp_call_function_any() is a blocking call and does not access
        global_pstate_info, the critical section can be shortened by moving
        smp_call_function_any() after the lock is released, as sketched below.
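
      A sketch of the reordering (simplified; set_pstate() and freq_data stand
      in for the driver's actual helper and argument):

       /* Before: the cross-CPU call runs inside the irq-disabled section. */
       spin_lock_irqsave(&gpstates->lock, flags);
       smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
       /* ... update global_pstate_info, maybe queue the ramp-down timer ... */
       spin_unlock_irqrestore(&gpstates->lock, flags);

       /* After: update the shared state under a plain spin_lock, then make
        * the blocking cross-CPU call with the lock dropped and irqs enabled. */
       spin_lock(&gpstates->lock);
       /* ... update global_pstate_info, maybe queue the ramp-down timer ... */
       spin_unlock(&gpstates->lock);
       smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);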
      Reported-by: Abdul Haleem <abdhalee@linux.vnet.linux.com>
      Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  9. 10 May 2016, 1 commit
  10. 07 May 2016, 1 commit
  11. 06 May 2016, 1 commit
    • cpufreq: governor: Fix handling of special cases in dbs_update() · 9485e4ca
      Committed by Rafael J. Wysocki
      As reported in KBZ 69821:
      
      "With CONFIG_HZ_PERIODIC=y cpu stays at the lowest frequcency 800MHz
       even if usage goes to 100%, frequency does not scale up, the governor
       in use is ondemand. Neither works conservative. Performance and
       userspace governors work as expected.
      
       With CONFIG_NO_HZ_IDLE or CONFIG_NO_HZ_FULL cpu scales up with ondemand
       as expected."
      
      Analysis carried out by Chen Yu leads to the conclusion that the
      observed issue is due to idle_time in dbs_update() representing a
      negative number in which case the function will return 0 as the load
      (unless load is greater than 0 for another CPU sharing the policy),
      although that need not be the right choice.
      
      Indeed, idle_time representing a negative number means that during
      the last sampling interval the CPU was almost 100% busy on the rough
      average, so 100 should be returned as the load in that case.
      
      Modify the code accordingly and rearrange it to clarify the handling
      of all of the special cases in it.  While at it, also avoid returning
      zero as the load if time_elapsed is 0 (it doesn't really make sense
      to return 0 then).
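
      A simplified sketch of the resulting special-case handling (the real
      dbs_update() also tracks per-CPU prev_load and other corner cases):

       if (unlikely(!time_elapsed)) {
               /* Sampled again too soon: reuse the previous load, not 0. */
               load = j_cdbs->prev_load;
       } else if (time_elapsed >= idle_time) {
               load = 100 * (time_elapsed - idle_time) / time_elapsed;
       } else {
               /*
                * idle_time effectively negative: the CPU was busy for
                * (almost) the whole interval, so report full load.
                */
               load = (int)idle_time < 0 ? 100 : 0;
       }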
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=69821
      Tested-by: Chen Yu <yu.c.chen@intel.com>
      Tested-by: Timo Valtoaho <timo.valtoaho@gmail.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
  12. 05 May 2016, 4 commits
  13. 04 May 2016, 1 commit
    • intel_pstate: Fix intel_pstate_get() · 6d45b719
      Committed by Rafael J. Wysocki
      After commit 8fa520af "intel_pstate: Remove freq calculation from
      intel_pstate_calc_busy()" intel_pstate_get() calls get_avg_frequency()
      to compute the average frequency, which is problematic for two reasons.
      
      First, intel_pstate_get() may be invoked before the driver reads the
      CPU feedback registers for the first time and if that happens,
      get_avg_frequency() will attempt to divide by zero.
      
      Second, the get_avg_frequency() call in intel_pstate_get() is racy
      with respect to intel_pstate_sample() and it may end up returning
      completely meaningless values for this reason.
      
      Moreover, after commit 7349ec04 "intel_pstate: Move
      intel_pstate_calc_busy() into get_target_pstate_use_performance()"
      sample.core_pct_busy is never computed on Atom, but it is used in
      intel_pstate_adjust_busy_pstate() in that case too.
      
      To address those problems notice that if sample.core_pct_busy
      was used in the average frequency computation carried out by
      get_avg_frequency(), both the divide by zero problem and the
      race with respect to intel_pstate_sample() would be avoided.
      
      Accordingly, move the invocation of intel_pstate_calc_busy() from
      get_target_pstate_use_performance() to intel_pstate_update_util(),
      which also will take care of the uninitialized sample.core_pct_busy
      on Atom, and modify get_avg_frequency() to use sample.core_pct_busy
      as per the above.
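
      An illustrative (non-fixed-point) version of the reworked computation;
      the field names are approximate and the real driver uses its fixed-point
      helpers:

       static inline unsigned int get_avg_frequency(struct cpudata *cpu)
       {
               /*
                * core_pct_busy is the APERF/MPERF-derived busy percentage
                * computed by intel_pstate_calc_busy() at sample time, so no
                * feedback registers are read here and no division by a
                * not-yet-populated sample can occur.
                */
               return cpu->sample.core_pct_busy *
                      cpu->pstate.max_pstate_physical * cpu->pstate.scaling / 100;
       }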
      Reported-by: kernel test robot <ying.huang@linux.intel.com>
      Link: http://marc.info/?l=linux-kernel&m=146226437623173&w=4
      Fixes: 8fa520af "intel_pstate: Remove freq calculation from intel_pstate_calc_busy()"
      Fixes: 7349ec04 "intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()"
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  14. 02 May 2016, 1 commit
  15. 01 May 2016, 1 commit
  16. 28 April 2016, 7 commits
    • cpufreq: st: enable selective initialization based on the platform · 2482bc31
      Committed by Sudeep Holla
      The sti-cpufreq driver registers the cpufreq-dt driver unconditionally,
      which causes problems in a multi-platform build.  For example, on the
      Vexpress TC2 platform we get the following errors on boot:
      
      cpu cpu0: OPP-v2 not supported
      cpu cpu0: Not doing voltage scaling
      cpu: dev_pm_opp_of_cpumask_add_table: couldn't find opp table
      	for cpu:0, -19
      cpu cpu0: dev_pm_opp_get_max_volt_latency: Invalid regulator (-6)
      ...
      arm_big_little: bL_cpufreq_register: Failed registering platform driver:
      		vexpress-spc, err: -17
      
      The actual platform driver then fails to initialise, because cpufreq-dt
      has already been probed successfully, which is incorrect.  This can
      happen on any platform that does not use cpufreq-dt in a multi-platform
      build.
      
      This patch adds a check so that the driver is only initialised on the
      platforms it supports.
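
      A sketch of the added guard (the compatible strings and the init helper
      name are illustrative, not lifted from the driver):

       static int __init sti_cpufreq_init(void)
       {
               /* Not an ST platform: bail out before cpufreq-dt gets registered. */
               if (!of_machine_is_compatible("st,stih407") &&
                   !of_machine_is_compatible("st,stih410"))
                       return -ENODEV;

               return sti_cpufreq_do_init();   /* hypothetical existing init path */
       }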
      
      Fixes: ab0ea257 (cpufreq: st: Provide runtime initialised driver for ST's platforms)
      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Acked-by: Lee Jones <lee.jones@linaro.org>
      Cc: 4.5+ <stable@vger.kernel.org> # 4.5+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: mvebu: Move cpufreq code into drivers/cpufreq/ · 9f123def
      Committed by Viresh Kumar
      Move the mvebu cpufreq bits into the drivers/cpufreq/ directory, where
      they really belong.
      
      Compile tested only.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: dt: Kill platform-data · eb96924a
      Committed by Viresh Kumar
      There are no more users of platform-data for the cpufreq-dt driver, so
      get rid of it.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: dt: Identify cpu-sharing for platforms without operating-points-v2 · 1530b996
      Committed by Viresh Kumar
      Existing platforms that do not support operating-points-v2 can
      explicitly tell the OPP core that some of the CPUs share OPP tables,
      with the help of dev_pm_opp_set_sharing_cpus().
      
      For such platforms, explicitly ask the OPP core for the list of CPUs
      sharing the OPP table with the current CPU device before falling back
      to platform data.
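
      A simplified sketch of the lookup order described above (variable names
      approximate):

       /* Prefer OPP-sharing information from operating-points-v2 bindings. */
       ret = dev_pm_opp_of_get_sharing_cpus(cpu_dev, policy->cpus);
       if (ret) {
               /*
                * No operating-points-v2: the platform may still have told the
                * OPP core about sharing via dev_pm_opp_set_sharing_cpus().
                * Only if that fails too do we fall back to platform data.
                */
               if (dev_pm_opp_get_sharing_cpus(cpu_dev, policy->cpus))
                       fallback_to_pdata = true;       /* illustrative flag */
       }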
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: governor: Change confusing struct field and variable names · b4f4b4b3
      Committed by Rafael J. Wysocki
      The name of the prev_cpu_wall field in struct cpu_dbs_info is
      confusing, because it doesn't represent wall time, but the previous
      update time as returned by get_cpu_idle_time() (that may be the
      current value of jiffies_64 in some cases, for example).
      
      Moreover, the names of some related variables in dbs_update() take
      that confusion further.
      
      Rename all of those things to make their names reflect the purpose
      more accurately.  While at it, drop unnecessary parens from one of
      the updated expressions.
      
      No functional changes.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Acked-by: Chen Yu <yu.c.chen@intel.com>
    • cpufreq: intel_pstate: Enable PPC enforcement for servers · 2b3ec765
      Committed by Srinivas Pandruvada
      For platforms which are controlled via a remote node manager, enable
      _PPC by default.  These platforms are mostly categorized as enterprise
      or performance servers, and they need to go through certification tests
      that exercise control via _PPC.  The relative risk of enabling this by
      default is low, as such systems are less likely to have a broken _PSS
      table.
      Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: intel_pstate: Adjust policy->max · 3be9200d
      Committed by Srinivas Pandruvada
      When policy->max is changed via _PPC or sysfs to a value above the max
      non-turbo frequency, it does not really change the resulting performance
      on some processors.  When policy->max corresponds to a P-State ratio
      above the turbo activation ratio, the processor can choose any P-State
      up to max turbo.  So the user or _PPC setting has no effect, but it can
      cause undesirable side effects like:
      - Showing a reduced max percentage in Intel P-State sysfs
      - Reduced max performance under certain boundary conditions:
      The requested max scaling frequency, whether set via _PPC or via cpufreq
      sysfs, is converted into a fixed-point max percent scale.  In the
      majority of cases this results in the correct max, but not 100% of the
      time.  If _PPC is requested at a point where the calculation leads to a
      lower max, this can result in a lower P-State than expected and will
      impact performance.
      Example of this condition using a Broadwell laptop with config TDP:
      
      ACPI _PSS table from a Broadwell laptop
      2301000 2300000 2200000 2000000 1900000 1800000 1700000 1500000 1400000
      1300000 1100000 1000000 900000 800000 600000 500000
      
      The actual results, obtained by disabling config TDP so that we get what
      is requested at or below 2300000 kHz:
      
      scaling_max_freq   Max Requested P-State   Resultant scaling max
      ------------------------------------------------------------------
      2400000            18                      2900000 (max turbo)
      2300000            17                      2300000 (max physical non turbo)
      2200000            15                      2100000
      2100000            15                      2100000
      2000000            13                      1900000
      1900000            13                      1900000
      1800000            12                      1800000
      1700000            11                      1700000
      1600000            10                      1600000
      1500000            f                       1500000
      1400000            e                       1400000
      1300000            d                       1300000
      1200000            c                       1200000
      1100000            a                       1000000
      1000000            a                       1000000
      900000             9                       900000
      800000             8                       800000
      700000             7                       700000
      600000             6                       600000
      500000             5                       500000
      ------------------------------------------------------------------
      
      Now set the config TDP level 1 ratio to 0x0b (equivalent to 1100000 kHz)
      in the BIOS (not every system will let you adjust this).  The turbo
      activation ratio will be set to one less than that, i.e. 0x0a, so any
      request above 1000000 kHz should land in the turbo region, assuming no
      thermal limits.
      Here _PPC will request a max of 1100000 kHz (which should still result
      in turbo, as this is above the turbo activation ratio, up to the max
      allowable turbo frequency), but the actual calculation results in a max
      ceiling P-State of 0x0a.  So under any load condition this driver will
      not request turbo P-States, which is a huge performance hit.
      
      When the config TDP feature is on and _PPC points to a frequency above
      the turbo activation ratio, the performance can still reach max turbo,
      so there is no need to treat this as a reduced frequency in the
      set_policy callback.
      
      With this change, when config TDP is active (detected by checking
      whether the physical max non-turbo ratio is higher than the current max
      non-turbo ratio), any request above the current max non-turbo frequency
      is treated as a request for full performance.
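
      A sketch of that check (names approximate, not the exact driver code):

       /*
        * max_pstate_physical > max_pstate means config TDP is active.  A
        * policy->max above the current max non-turbo frequency then already
        * lands in the turbo region, so treat it as full performance instead
        * of deriving a lower max percentage from it.
        */
       if (cpu->pstate.max_pstate_physical > cpu->pstate.max_pstate &&
           policy->max > cpu->pstate.max_pstate * cpu->pstate.scaling)
               policy->max = policy->cpuinfo.max_freq;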
      Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      [ rjw : Minor cleanups ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>