提交 · 036399782bf51dafb932b680b260936b2b5f8dd6 · openeuler / Kernel

23 5月, 2018 1 次提交

cpufreq: Rename cpufreq_can_do_remote_dvfs() · 03639978

由 Viresh Kumar 提交于 5月 22, 2018

This routine checks if the CPU running this code belongs to the policy
of the target CPU or if not, can it do remote DVFS for it remotely. But
the current name of it implies as if it is only about doing remote
updates.

Rename it to make it more relevant.
Suggested-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

03639978

10 4月, 2018 1 次提交

cpufreq: Drop cpufreq_table_validate_and_show() · 2dd0df84

由 Viresh Kumar 提交于 4月 03, 2018

This isn't used anymore. Remove the helper and update documentation
accordingly.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

2dd0df84

28 2月, 2018 1 次提交

cpufreq: Validate frequency table in the core · d417e069

由 Viresh Kumar 提交于 2月 22, 2018

By design, cpufreq drivers are responsible for calling
cpufreq_frequency_table_cpuinfo() from their ->init()
callbacks to validate the frequency table.

However, if a cpufreq driver is buggy and fails to do so properly, it
lead to unexpected behavior of the driver or the cpufreq core at a
later point in time.  It would be better if the core could
validate the frequency table during driver initialization.

To that end, introduce cpufreq_table_validate_and_sort() and make
the cpufreq core call it right after invoking the ->init() callback
of the driver and destroy the cpufreq policy if the table is invalid.

For the time being the validation of the table happens twice, once
from the driver and then from the core.  The individual drivers will
be updated separately to drop table validation if they don't need it
for other reasons.

The frequency table is marked "sorted" or "unsorted" by the new helper
now instead of in cpufreq_table_validate_and_show(), as it should only
be done after validating the table (which the drivers won't do going
forward).
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject/changelog ]
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

d417e069

08 2月, 2018 1 次提交

cpufreq: Add and use cpufreq_for_each_{valid_,}entry_idx() · ffd81dcf

由 Dominik Brodowski 提交于 1月 30, 2018

Pointer subtraction is slow and tedious. Therefore, replace all instances
where cpufreq_for_each_{valid_,}entry loops contained such substractions
with an iteration macro providing an index to the frequency_table entry.
Suggested-by: NAl Viro <viro@ZenIV.linux.org.uk>
Link: http://lkml.kernel.org/r/20180120020237.GM13338@ZenIV.linux.org.ukAcked-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

ffd81dcf

16 11月, 2017 1 次提交

x86 / CPU: Always show current CPU frequency in /proc/cpuinfo · 7d5905dc

由 Rafael J. Wysocki 提交于 11月 15, 2017

After commit 890da9cf (Revert "x86: do not use cpufreq_quick_get()
for /proc/cpuinfo "cpu MHz"") the "cpu MHz" number in /proc/cpuinfo
on x86 can be either the nominal CPU frequency (which is constant)
or the frequency most recently requested by a scaling governor in
cpufreq, depending on the cpufreq configuration. That is somewhat
inconsistent and is different from what it was before 4.13, so in
order to restore the previous behavior, make it report the current
CPU frequency like the scaling_cur_freq sysfs file in cpufreq.

To that end, modify the /proc/cpuinfo implementation on x86 to use
aperfmperf_snapshot_khz() to snapshot the APERF and MPERF feedback
registers, if available, and use their values to compute the CPU
frequency to be reported as "cpu MHz".

However, do that carefully enough to avoid accumulating delays that
lead to unacceptable access times for /proc/cpuinfo on systems with
many CPUs. Run aperfmperf_snapshot_khz() once on all CPUs
asynchronously at the /proc/cpuinfo open time, add a single delay
upfront (if necessary) at that point and simply compute the current
frequency while running show_cpuinfo() for each individual CPU.

Also, to avoid slowing down /proc/cpuinfo accesses too much, reduce
the default delay between consecutive APERF and MPERF reads to 10 ms,
which should be sufficient to get large enough numbers for the
frequency computation in all cases.

Fixes: 890da9cf (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"")
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Tested-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NIngo Molnar <mingo@kernel.org>

7d5905dc

03 10月, 2017 1 次提交

cpufreq: provide default frequency-invariance setter function · e7d5459d

由 Dietmar Eggemann 提交于 9月 26, 2017

Frequency-invariant accounting support based on the ratio of current
frequency and maximum supported frequency is an optional feature an arch
can implement.

Since there are cpufreq drivers (e.g. cpufreq-dt) which can be build for
different arch's a default implementation of the frequency-invariance
setter function arch_set_freq_scale() is needed.

This default implementation is an empty weak function which will be
overwritten by a strong function in case the arch provides one.

The setter function passes the cpumask of related (to the frequency
change) cpus (online and offline cpus), the (new) current frequency and
the maximum supported frequency.
Signed-off-by: NDietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

e7d5459d

08 8月, 2017 1 次提交

cpufreq: Simplify cpufreq_can_do_remote_dvfs() · d6344d4b

由 Rafael J. Wysocki 提交于 8月 04, 2017

The if () in cpufreq_can_do_remote_dvfs() is superfluous, so drop
it and simply return the value of the expression under it.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

d6344d4b

01 8月, 2017 2 次提交

cpufreq: Process remote callbacks from any CPU if the platform permits · 99d14d0e

由 Viresh Kumar 提交于 7月 28, 2017

On many platforms, CPUs can do DVFS across cpufreq policies. i.e CPU
from policy-A can change frequency of CPUs belonging to policy-B.

This is quite common in case of ARM platforms where we don't
configure any per-cpu register.

Add a flag to identify such platforms and update
cpufreq_can_do_remote_dvfs() to allow remote callbacks if this flag is
set.

Also enable the flag for cpufreq-dt driver which is used only on ARM
platforms currently.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

99d14d0e

sched: cpufreq: Allow remote cpufreq callbacks · 674e7541

由 Viresh Kumar 提交于 7月 28, 2017

With Android UI and benchmarks the latency of cpufreq response to
certain scheduling events can become very critical. Currently, callbacks
into cpufreq governors are only made from the scheduler if the target
CPU of the event is the same as the current CPU. This means there are
certain situations where a target CPU may not run the cpufreq governor
for some time.

One testcase to show this behavior is where a task starts running on
CPU0, then a new task is also spawned on CPU0 by a task on CPU1. If the
system is configured such that the new tasks should receive maximum
demand initially, this should result in CPU0 increasing frequency
immediately. But because of the above mentioned limitation though, this
does not occur.

This patch updates the scheduler core to call the cpufreq callbacks for
remote CPUs as well.

The schedutil, ondemand and conservative governors are updated to
process cpufreq utilization update hooks called for remote CPUs where
the remote CPU is managed by the cpufreq policy of the local CPU.

The intel_pstate driver is updated to always reject remote callbacks.

This is tested with couple of usecases (Android: hackbench, recentfling,
galleryfling, vellamo, Ubuntu: hackbench) on ARM hikey board (64 bit
octa-core, single policy). Only galleryfling showed minor improvements,
while others didn't had much deviation.

The reason being that this patch only targets a corner case, where
following are required to be true to improve performance and that
doesn't happen too often with these tests:

- Task is migrated to another CPU.
- The task has high demand, and should take the target CPU to higher
  OPPs.
- And the target CPU doesn't call into the cpufreq governor until the
  next tick.

Based on initial work from Steve Muckle.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

674e7541

26 7月, 2017 2 次提交

cpufreq: Add CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING cpufreq driver flag · fe829ed8

由 Viresh Kumar 提交于 7月 19, 2017

The policy->transition_latency field is used for multiple purposes
today and its not straight forward at all. This is how it is used:

A. Set the correct transition_latency value.

B. Set it to CPUFREQ_ETERNAL because:
   1. We don't want automatic dynamic switching (with
      ondemand/conservative) to happen at all.
   2. We don't know the transition latency.

This patch handles the B.1. case in a more readable way. A new flag for
the cpufreq drivers is added to disallow use of cpufreq governors which
have dynamic_switching flag set.

All the current cpufreq drivers which are setting transition_latency
unconditionally to CPUFREQ_ETERNAL are updated to use it. They don't
need to set transition_latency anymore.

There shouldn't be any functional change after this patch.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

fe829ed8

cpufreq: Replace "max_transition_latency" with "dynamic_switching" · ed4676e2

由 Viresh Kumar 提交于 7月 19, 2017

There is no limitation in the ondemand or conservative governors which
disallow the transition_latency to be greater than 10 ms.

The max_transition_latency field is rather used to disallow automatic
dynamic frequency switching for platforms which didn't wanted these
governors to run.

Replace max_transition_latency with a boolean (dynamic_switching) and
check for transition_latency == CPUFREQ_ETERNAL along with that. This
makes it pretty straight forward to read/understand now.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

ed4676e2

22 7月, 2017 2 次提交

cpufreq: Use transition_delay_us for legacy governors as well · aa7519af

由 Viresh Kumar 提交于 7月 19, 2017

The policy->transition_delay_us field is used only by the schedutil
governor currently, and this field describes how fast the driver wants
the cpufreq governor to change CPUs frequency. It should rather be a
common thing across all governors, as it doesn't have any schedutil
dependency here.

Create a new helper cpufreq_policy_transition_delay_us() to get the
transition delay across all governors.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

aa7519af

cpufreq: governor: Drop min_sampling_rate · 2d045036

由 Viresh Kumar 提交于 7月 19, 2017

The cpufreq core and governors aren't supposed to set a limit on how
fast we want to try changing the frequency. This is currently done for
the legacy governors with help of min_sampling_rate.

At worst, we may end up setting the sampling rate to a value lower than
the rate at which frequency can be changed and then one of the CPUs in
the policy will be only changing frequency for ever.

But that is something for the user to decide and there is no need to
have special handling for such cases in the core. Leave it for the user
to figure out.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

2d045036

27 6月, 2017 1 次提交

x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF · f8475cef

由 Len Brown 提交于 6月 23, 2017

The goal of this change is to give users a uniform and meaningful
result when they read /sys/...cpufreq/scaling_cur_freq
on modern x86 hardware, as compared to what they get today.

Modern x86 processors include the hardware needed
to accurately calculate frequency over an interval --
APERF, MPERF, and the TSC.

Here we provide an x86 routine to make this calculation
on supported hardware, and use it in preference to any
driver driver-specific cpufreq_driver.get() routine.

MHz is computed like so:

MHz = base_MHz * delta_APERF / delta_MPERF

MHz is the average frequency of the busy processor
over a measurement interval.  The interval is
defined to be the time between successive invocations
of aperfmperf_khz_on_cpu(), which are expected to to
happen on-demand when users read sysfs attribute
cpufreq/scaling_cur_freq.

As with previous methods of calculating MHz,
idle time is excluded.

base_MHz above is from TSC calibration global "cpu_khz".

This x86 native method to calculate MHz returns a meaningful result
no matter if P-states are controlled by hardware or firmware
and/or if the Linux cpufreq sub-system is or is-not installed.

When this routine is invoked more frequently, the measurement
interval becomes shorter.  However, the code limits re-computation
to 10ms intervals so that average frequency remains meaningful.

Discerning users are encouraged to take advantage of
the turbostat(8) utility, which can gracefully handle
concurrent measurement intervals of arbitrary length.
Signed-off-by: NLen Brown <len.brown@intel.com>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

f8475cef

28 5月, 2017 1 次提交

cpufreq: create cpufreq_table_count_valid_entries() · 55d85293

由 Viresh Kumar 提交于 4月 25, 2017

We need such a routine at two places already, lets create one.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: NLukasz Luba <lukasz.luba@arm.com>
Tested-by: NLukasz Luba <lukasz.luba@arm.com>
Signed-off-by: NEduardo Valentin <edubezval@gmail.com>

55d85293

18 4月, 2017 1 次提交

cpufreq: schedutil: Use policy-dependent transition delays · 1b72e7fd

由 Rafael J. Wysocki 提交于 4月 11, 2017

Make the schedutil governor take the initial (default) value of the
rate_limit_us sysfs attribute from the (new) transition_delay_us
policy parameter (to be set by the scaling driver).

That will allow scaling drivers to make schedutil use smaller default
values of rate_limit_us and reduce the default average time interval
between consecutive frequency changes.

Make intel_pstate set transition_delay_us to 500.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

1b72e7fd

04 2月, 2017 3 次提交

cpufreq: Fix typos in comments · 565ebe80

由 Viresh Kumar 提交于 2月 03, 2017

- s/freqnency/frequency/
- s/accomodating/accommodating/
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

565ebe80

cpufreq: Remove CPUFREQ_START notifier event · 052f573f

由 Viresh Kumar 提交于 1月 05, 2017

Its not used anymore, remove it.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

052f573f

cpufreq: Remove policy create/remove notifiers · f9f41e3e

由 Viresh Kumar 提交于 1月 05, 2017

Those were added by:

commit fcd7af91 ("cpufreq: stats: handle cpufreq_unregister_driver()
and suspend/resume properly")

but aren't used anymore since:

commit 1aefc75b ("cpufreq: stats: Make the stats code non-modular").

Remove them. Also remove the redundant parameter to the respective
routines.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

f9f41e3e

21 11月, 2016 1 次提交

cpufreq: Make cpufreq_update_policy() void · 30248fef

由 Rafael J. Wysocki 提交于 11月 18, 2016

The return value of cpufreq_update_policy() is never used, so make
it void.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

30248fef

11 11月, 2016 1 次提交

cpufreq: stats: New sysfs attribute for clearing statistics · ee7930ee

由 Markus Mayer 提交于 11月 07, 2016

Allow CPUfreq statistics to be cleared by writing anything to
/sys/.../cpufreq/stats/reset.
Signed-off-by: NMarkus Mayer <mmayer@broadcom.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

ee7930ee

20 10月, 2016 1 次提交

cpufreq: fix overflow in cpufreq_table_find_index_dl() · c6fe46a7

由 Sergey Senozhatsky 提交于 10月 18, 2016

'best' is always less or equals to 'pos', so `best - pos' returns
a negative value which is then getting casted to `unsigned int'
and passed to __cpufreq_driver_target()->acpi_cpufreq_target()
for policy->freq_table selection. This results in

 BUG: unable to handle kernel paging request at ffff881019b469f8
 IP: [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
 PGD 267f067
 PUD 0

 Oops: 0000 [#1] PREEMPT SMP
 CPU: 6 PID: 70 Comm: kworker/6:1 Not tainted 4.9.0-rc1-next-20161017-dbg-dirty
 Workqueue: events dbs_work_handler
 task: ffff88041b808000 task.stack: ffff88041b810000
 RIP: 0010:[<ffffffffa00356c1>]  [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
 RSP: 0018:ffff88041b813c60  EFLAGS: 00010282
 RAX: ffff880419b46a00 RBX: ffff88041b848400 RCX: ffff880419b20f80
 RDX: 00000000001dff38 RSI: 00000000ffffffff RDI: ffff88041b848400
 RBP: ffff88041b813cb0 R08: 0000000000000006 R09: 0000000000000040
 R10: ffffffff8207f9e0 R11: ffffffff8173595b R12: 0000000000000000
 R13: ffff88041f1dff38 R14: 0000000000262900 R15: 0000000bfffffff4
 FS:  0000000000000000(0000) GS:ffff88041f000000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffff881019b469f8 CR3: 000000041a2d3000 CR4: 00000000001406e0
 Stack:
  ffff88041b813cb0 ffffffff813347f9 ffff88041b813ca0 ffffffff81334663
  ffff88041f1d4bc0 ffff88041b848400 0000000000000000 0000000000000000
  0000000000262900 0000000000000000 ffff88041b813d00 ffffffff813355dc
 Call Trace:
  [<ffffffff813347f9>] ? cpufreq_freq_transition_begin+0xf1/0xfc
  [<ffffffff81334663>] ? get_cpu_idle_time+0x97/0xa6
  [<ffffffff813355dc>] __cpufreq_driver_target+0x3b6/0x44e
  [<ffffffff81336ca3>] cs_dbs_timer+0x11a/0x135
  [<ffffffff81336fda>] dbs_work_handler+0x39/0x62
  [<ffffffff81057823>] process_one_work+0x280/0x4a5
  [<ffffffff81058719>] worker_thread+0x24f/0x397
  [<ffffffff810584ca>] ? rescuer_thread+0x30b/0x30b
  [<ffffffff81418380>] ? nl80211_get_key+0x29/0x36a
  [<ffffffff8105d2b7>] kthread+0xfc/0x104
  [<ffffffff8107ceea>] ? put_lock_stats.isra.9+0xe/0x20
  [<ffffffff8105d1bb>] ? kthread_create_on_node+0x3f/0x3f
  [<ffffffff814b2092>] ret_from_fork+0x22/0x30
 Code: 56 4d 6b ff 0c 41 55 41 54 53 48 83 ec 28 48 8b 15 ad 1e 00 00 44 8b 41
 08 48 8b 87 c8 00 00 00 49 89 d5 4e 03 2c c5 80 b2 78 81 <46> 8b 74 38 04 45
 3b 75 00 75 11 31 c0 83 39 00 0f 84 1c 01 00
 RIP  [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
  RSP <ffff88041b813c60>
 CR2: ffff881019b469f8
 ---[ end trace 16d9fc7a17897d37 ]---

[ rjw: In some cases this bug may also cause incorrect frequencies to
  be selected by cpufreq governors. ]

Fixes: 899bb664 (cpufreq: skip invalid entries when searching the frequency)
Link: http://marc.info/?l=linux-kernel&m=147672030714331&w=2Reported-and-tested-by: NSedat Dilek <sedat.dilek@gmail.com>
Reported-and-tested-by: NJörg Otte <jrg.otte@gmail.com>
Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Cc: 4.8+ <stable@vger.kernel.org> # 4.8+
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

c6fe46a7

13 10月, 2016 1 次提交

cpufreq: skip invalid entries when searching the frequency · 899bb664

由 Aaro Koskinen 提交于 10月 12, 2016

Skip invalid entries when searching the frequency. This fixes cpufreq
at least on loongson2 MIPS board.

Fixes: da0c6dc0 (cpufreq: Handle sorted frequency tables more efficiently)
Signed-off-by: NAaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Cc: 4.8+ <stable@vger.kernel.org> # 4.8+
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

899bb664

21 7月, 2016 1 次提交

cpufreq: add cpufreq_driver_resolve_freq() · e3c06236

由 Steve Muckle 提交于 7月 13, 2016

Cpufreq governors may need to know what a particular target frequency
maps to in the driver without necessarily wanting to set the frequency.
Support this operation via a new cpufreq API,
cpufreq_driver_resolve_freq(). This API returns the lowest driver
frequency equal or greater than the target frequency
(CPUFREQ_RELATION_L), subject to any policy (min/max) or driver
limitations. The mapping is also cached in the policy so that a
subsequent fast_switch operation can avoid repeating the same lookup.

The API will call a new cpufreq driver callback, resolve_freq(), if it
has been registered by the driver. Otherwise the frequency is resolved
via cpufreq_frequency_table_target(). Rather than require ->target()
style drivers to provide a resolve_freq() callback it is left to the
caller to ensure that the driver implements this callback if necessary
to use cpufreq_driver_resolve_freq().
Suggested-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NSteve Muckle <smuckle@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

e3c06236

07 7月, 2016 1 次提交

cpufreq: Handle sorted frequency tables more efficiently · da0c6dc0

由 Viresh Kumar 提交于 6月 27, 2016

cpufreq drivers aren't required to provide a sorted frequency table
today, and even the ones which provide a sorted table aren't handled
efficiently by cpufreq core.

This patch adds infrastructure to verify if the freq-table provided by
the drivers is sorted or not, and use efficient helpers if they are
sorted.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

da0c6dc0

09 6月, 2016 3 次提交

cpufreq: Return index from cpufreq_frequency_table_target() · d218ed77

由 Viresh Kumar 提交于 6月 03, 2016

This routine can't fail unless the frequency table is invalid and
doesn't contain any valid entries.

Make it return the index and WARN() in case it is used for an invalid
table.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

d218ed77

cpufreq: Drop freq-table param to cpufreq_frequency_table_target() · 7ab4aabb

由 Viresh Kumar 提交于 6月 03, 2016

The policy already has this pointer set, use it instead.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

7ab4aabb

cpufreq: Remove cpufreq_frequency_get_table() · f8bfc116

由 Viresh Kumar 提交于 6月 03, 2016

Most of the callers of cpufreq_frequency_get_table() already have the
pointer to a valid 'policy' structure and they don't really need to go
through the per-cpu variable first and then a check to validate the
frequency, in order to find the freq-table for the policy.

Directly use the policy->freq_table field instead for them.

Only one user of that API is left after above changes, cpu_cooling.c and
it accesses the freq_table in a racy way as the policy can get freed in
between.

Fix it by using cpufreq_cpu_get() properly.

Since there are no more users of cpufreq_frequency_get_table() left, get
rid of it.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: Javi Merino <javi.merino@arm.com> (cpu_cooling.c)
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

f8bfc116

03 6月, 2016 4 次提交

cpufreq: stats: Make the stats code non-modular · 1aefc75b

由 Rafael J. Wysocki 提交于 5月 31, 2016

The modularity of cpufreq_stats is quite problematic.

First off, the usage of policy notifiers for the initialization
and cleanup in the cpufreq_stats module is inherently racy with
respect to CPU offline/online and the initialization and cleanup
of the cpufreq driver.

Second, fast frequency switching (used by the schedutil governor)
cannot be enabled if any transition notifiers are registered, so
if the cpufreq_stats module (that registers a transition notifier
for updating transition statistics) is loaded, the schedutil governor
cannot use fast frequency switching.

On the other hand, allowing cpufreq_stats to be built as a module
doesn't really add much value.  Arguably, there's not much reason
for that code to be modular at all.

For the above reasons, make the cpufreq stats code non-modular,
modify the core to invoke functions provided by that code directly
and drop the notifiers from it.

Make the stats sysfs attributes appear empty if fast frequency
switching is enabled as the statistics will not be updated in that
case anyway (and returning -EBUSY from those attributes breaks
powertop).

While at it, clean up Kconfig help for the CPU_FREQ_STAT and
CPU_FREQ_STAT_DETAILS options.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

1aefc75b

cpufreq: Drop the 'initialized' field from struct cpufreq_governor · 9a15fb2c

由 Rafael J. Wysocki 提交于 5月 18, 2016

The 'initialized' field in struct cpufreq_governor is only used by
the conservative governor (as a usage counter) and the way that
happens is far from straightforward and arguably incorrect.

Namely, the value of 'initialized' is checked by
cpufreq_dbs_governor_init() and cpufreq_dbs_governor_exit() and
the results of those checks are passed (as the second argument) to
the ->init() and ->exit() callbacks in struct dbs_governor.  Those
callbacks are only implemented by the ondemand and conservative
governors and ondemand doesn't use their second argument at all.
In turn, the conservative governor uses it to decide whether or not
to either register or unregister a transition notifier.

That whole mechanism is not only unnecessarily convoluted, but also
racy, because the 'initialized' field of struct cpufreq_governor is
updated in cpufreq_init_governor() and cpufreq_exit_governor() under
policy->rwsem which doesn't help if one of these functions is run
twice in parallel for different policies (which isn't impossible in
principle), for example.

Instead of it, add a proper usage counter to the conservative
governor and update it from cs_init() and cs_exit() which is
guaranteed to be non-racy, as those functions are only called
under gov_dbs_data_mutex which is global.

With that in place, drop the 'initialized' field from struct
cpufreq_governor as it is not used any more.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

9a15fb2c

cpufreq: governor: Create cpufreq_policy_apply_limits() · bf2be2de

由 Viresh Kumar 提交于 5月 18, 2016

Create a new helper to avoid code duplication across governors.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

bf2be2de

cpufreq: governor: Get rid of governor events · e788892b

由 Rafael J. Wysocki 提交于 6月 02, 2016

The design of the cpufreq governor API is not very straightforward,
as struct cpufreq_governor provides only one callback to be invoked
from different code paths for different purposes.  The purpose it is
invoked for is determined by its second "event" argument, causing it
to act as a "callback multiplexer" of sorts.

Unfortunately, that leads to extra complexity in governors, some of
which implement the ->governor() callback as a switch statement
that simply checks the event argument and invokes a separate function
to handle that specific event.

That extra complexity can be eliminated by replacing the all-purpose
->governor() callback with a family of callbacks to carry out specific
governor operations: initialization and exit, start and stop and policy
limits updates.  That also turns out to reduce the code size too, so
do it.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

e788892b

09 4月, 2016 1 次提交

cpufreq: Call cpufreq_disable_fast_switch() in sugov_exit() · 6c9d9c81

由 Rafael J. Wysocki 提交于 4月 07, 2016

Due to differences in the cpufreq core's handling of runtime CPU
offline and nonboot CPUs disabling during system suspend-to-RAM,
fast frequency switching gets disabled after a suspend-to-RAM and
resume cycle on all of the nonboot CPUs.

To prevent that from happening, move the invocation of
cpufreq_disable_fast_switch() from cpufreq_exit_governor() to
sugov_exit(), as the schedutil governor is the only user of fast
frequency switching today anyway.

That simply prevents cpufreq_disable_fast_switch() from being called
without invoking the ->governor callback for the CPUFREQ_GOV_POLICY_EXIT
event (which happens during system suspend now).

Fixes: b7898fda (cpufreq: Support for fast frequency switching)
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

6c9d9c81

02 4月, 2016 3 次提交

cpufreq: Support for fast frequency switching · b7898fda

由 Rafael J. Wysocki 提交于 3月 30, 2016

Modify the ACPI cpufreq driver to provide a method for switching
CPU frequencies from interrupt context and update the cpufreq core
to support that method if available.

Introduce a new cpufreq driver callback, ->fast_switch, to be
invoked for frequency switching from interrupt context by (future)
governors supporting that feature via (new) helper function
cpufreq_driver_fast_switch().

Add two new policy flags, fast_switch_possible, to be set by the
cpufreq driver if fast frequency switching can be used for the
given policy and fast_switch_enabled, to be set by the governor
if it is going to use fast frequency switching for the given
policy.  Also add a helper for setting the latter.

Since fast frequency switching is inherently incompatible with
cpufreq transition notifiers, make it possible to set the
fast_switch_enabled only if there are no transition notifiers
already registered and make the registration of new transition
notifiers fail if fast_switch_enabled is set for at least one
policy.

Implement the ->fast_switch callback in the ACPI cpufreq driver
and make it set fast_switch_possible during policy initialization
as appropriate.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

b7898fda

cpufreq: Move governor symbols to cpufreq.h · 379480d8

由 Rafael J. Wysocki 提交于 3月 22, 2016

Move definitions of symbols related to transition latency and
sampling rate to include/linux/cpufreq.h so they can be used by
(future) goverernors located outside of drivers/cpufreq/.

No functional changes.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

379480d8

cpufreq: Move governor attribute set headers to cpufreq.h · 66893b6a

由 Rafael J. Wysocki 提交于 3月 22, 2016

Move definitions and function headers related to struct gov_attr_set
to include/linux/cpufreq.h so they can be used by (future) goverernors
located outside of drivers/cpufreq/.

No functional changes.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

66893b6a

11 3月, 2016 1 次提交

cpufreq: Move scheduler-related code to the sched directory · adaf9fcd

由 Rafael J. Wysocki 提交于 3月 10, 2016

Create cpufreq.c under kernel/sched/ and move the cpufreq code
related to the scheduler to that file and to sched.h.

Redefine cpufreq_update_util() as a static inline function to avoid
function calls at its call sites in the scheduler code (as suggested
by Peter Zijlstra).

Also move the definition of struct update_util_data and declaration
of cpufreq_set_update_util_data() from include/linux/cpufreq.h to
include/linux/sched.h.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>

adaf9fcd

09 3月, 2016 3 次提交

cpufreq: Remove 'policy->governor_enabled' · 242aa883

由 Viresh Kumar 提交于 2月 22, 2016

The entire sequence of events (like INIT/START or STOP/EXIT) for which
cpufreq_governor() is called, is guaranteed to be protected by
policy->rwsem now.

The additional checks that were added earlier (as we were forced to drop
policy->rwsem before calling cpufreq_governor() for EXIT event), aren't
required anymore.

Over that, they weren't sufficient really. They just take care of
START/STOP events, but not INIT/EXIT and the state machine was never
maintained properly by them.

Kill the unnecessary checks and policy->governor_enabled field.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

242aa883

Revert "cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT" · 68e80dae

由 Viresh Kumar 提交于 2月 09, 2016

Earlier, when the struct freq-attr was used to represent governor
attributes, the standard cpufreq show/store sysfs attribute callbacks
were applied to the governor tunable attributes and they always acquire
the policy->rwsem lock before carrying out the operation. That could
have resulted in an ABBA deadlock if governor tunable attributes are
removed under policy->rwsem while one of them is being accessed
concurrently (if sysfs attributes removal wins the race, it will wait
for the access to complete with policy->rwsem held while the attribute
callback will block on policy->rwsem indefinitely).

We attempted to address this issue by dropping policy->rwsem around
governor tunable attributes removal (that is, around invocations of the
->governor callback with the event arg equal to CPUFREQ_GOV_POLICY_EXIT)
in cpufreq_set_policy(), but that opened up race conditions that had not
been possible with policy->rwsem held all the time.

The previous commit, "cpufreq: governor: New sysfs show/store callbacks
for governor tunables", fixed the original ABBA deadlock by adding new
governor specific show/store callbacks.

We don't have to drop rwsem around invocations of governor event
CPUFREQ_GOV_POLICY_EXIT anymore, and original fix can be reverted now.

Fixes: 955ef483 (cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT)
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Reported-by: NJuri Lelli <juri.lelli@arm.com>
Tested-by: NJuri Lelli <juri.lelli@arm.com>
Tested-by: NShilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

68e80dae

cpufreq: Add mechanism for registering utilization update callbacks · 34e2c555

由 Rafael J. Wysocki 提交于 2月 15, 2016

Introduce a mechanism by which parts of the cpufreq subsystem
("setpolicy" drivers or the core) can register callbacks to be
executed from cpufreq_update_util() which is invoked by the
scheduler's update_load_avg() on CPU utilization changes.

This allows the "setpolicy" drivers to dispense with their timers
and do all of the computations they need and frequency/voltage
adjustments in the update_load_avg() code path, among other things.

The update_load_avg() changes were suggested by Peter Zijlstra.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NIngo Molnar <mingo@kernel.org>

34e2c555

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功