提交 · c27a0ccc3c715c55fea6709eab2f9c6f551fcfaa · openeuler / Kernel

02 9月, 2020 2 次提交

cpufreq: intel_pstate: Update cached EPP in the active mode · c27a0ccc

由 Rafael J. Wysocki 提交于 8月 27, 2020

Make intel_pstate update the cached EPP value when setting the EPP
via sysfs in the active mode just like it is the case in the passive
mode, for consistency, but also for the benefit of subsequent
changes.

No intentional functional impact.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

c27a0ccc

cpufreq: intel_pstate: Refuse to turn off with HWP enabled · 43298db3

由 Rafael J. Wysocki 提交于 8月 20, 2020

After commit f6ebbcf0 ("cpufreq: intel_pstate: Implement passive
mode with HWP enabled") it is possible to change the driver status
to "off" via sysfs with HWP enabled, which effectively causes the
driver to unregister itself, but HWP remains active and it forces the
minimum performance, so even if another cpufreq driver is loaded,
it will not be able to control the CPU frequency.

For this reason, make the driver refuse to change the status to
"off" with HWP enabled.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

43298db3

11 8月, 2020 1 次提交

cpufreq: intel_pstate: Implement passive mode with HWP enabled · f6ebbcf0

由 Rafael J. Wysocki 提交于 8月 06, 2020

Allow intel_pstate to work in the passive mode with HWP enabled and
make it set the HWP minimum performance limit (HWP floor) to the
P-state value given by the target frequency supplied by the cpufreq
governor, so as to prevent the HWP algorithm and the CPU scheduler
from working against each other, at least when the schedutil governor
is in use, and update the intel_pstate documentation accordingly.

Among other things, this allows utilization clamps to be taken
into account, at least to a certain extent, when intel_pstate is
in use and makes it more likely that sufficient capacity for
deadline tasks will be provided.

After this change, the resulting behavior of an HWP system with
intel_pstate in the passive mode should be close to the behavior
of the analogous non-HWP system with intel_pstate in the passive
mode, except that the HWP algorithm is generally allowed to make the
CPU run at a frequency above the floor P-state set by intel_pstate in
the entire available range of P-states, while without HWP a CPU can
run in a P-state above the requested one if the latter falls into the
range of turbo P-states (referred to as the turbo range) or if the
P-states of all CPUs in one package are coordinated with each other
at the hardware level.

[Note that in principle the HWP floor may not be taken into account
 by the processor if it falls into the turbo range, in which case the
 processor has a license to choose any P-state, either below or above
 the HWP floor, just like a non-HWP processor in the case when the
 target P-state falls into the turbo range.]

With this change applied, intel_pstate in the passive mode assumes
complete control over the HWP request MSR and concurrent changes of
that MSR (eg. via the direct MSR access interface) are overridden by
it.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: NFrancisco Jerez <currojerez@riseup.net>

f6ebbcf0

04 8月, 2020 1 次提交

cpufreq: intel_pstate: Fix cpuinfo_max_freq when MSR_TURBO_RATIO_LIMIT is 0 · 4daca379

由 Srinivas Pandruvada 提交于 8月 03, 2020

The MSR_TURBO_RATIO_LIMIT can be 0. This is not an error. User can update
this MSR via BIOS settings on some systems or can use msr tools to update.
Also some systems boot with value = 0.

This results in display of cpufreq/cpuinfo_max_freq wrong. This value
will be equal to cpufreq/base_frequency, even though turbo is enabled.

But platform will still function normally in HWP mode as we get max
1-core frequency from the MSR_HWP_CAPABILITIES. This MSR is already used
to calculate cpu->pstate.turbo_freq, which is used for to set
policy->cpuinfo.max_freq. But some other places cpu->pstate.turbo_pstate
is used. For example to set policy->max.

To fix this, also update cpu->pstate.turbo_pstate when updating
cpu->pstate.turbo_freq.
Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

4daca379

31 7月, 2020 2 次提交

cpufreq: intel_pstate: Fix EPP setting via sysfs in active mode · de002c55

由 Rafael J. Wysocki 提交于 7月 28, 2020

Because intel_pstate_set_energy_pref_index() reads and writes the
MSR_HWP_REQUEST register without using the cached value of it used by
intel_pstate_hwp_boost_up() and intel_pstate_hwp_boost_down(), those
functions may overwrite the value written by it and so the EPP value
set via sysfs may be lost.

To avoid that, make intel_pstate_set_energy_pref_index() take the
cached value of MSR_HWP_REQUEST just like the other two routines
mentioned above and update it with the new EPP value coming from
user space in addition to updating the MSR.

Note that the MSR itself still needs to be updated too in case
hwp_boost is unset or the boosting mechanism is not active at the
EPP change time.

Fixes: e0efd5be ("cpufreq: intel_pstate: Add HWP boost utility and sched util hooks")
Reported-by: NFrancisco Jerez <currojerez@riseup.net>
Cc: 4.18+ <stable@vger.kernel.org> # 4.18+: 3da97d4db8ee cpufreq: intel_pstate: Rearrange ...
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: NFrancisco Jerez <currojerez@riseup.net>

de002c55

cpufreq: intel_pstate: Rearrange the storing of new EPP values · 3a957176

由 Rafael J. Wysocki 提交于 7月 27, 2020

Move the locking away from intel_pstate_set_energy_pref_index()
into its only caller and drop the (now redundant) return_pref label
from it.

Also move the "raw" EPP value check into the caller of that function,
so as to do it before acquiring the mutex, and reduce code duplication
related to the "raw" EPP values processing somewhat.

No intentional functional impact.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: NFrancisco Jerez <currojerez@riseup.net>

3a957176

16 7月, 2020 2 次提交

cpufreq: intel_pstate: Avoid enabling HWP if EPP is not supported · 7aa10312

由 Rafael J. Wysocki 提交于 7月 14, 2020

Although there are processors supporting hardware-managed P-states
(HWP) without the energy-performance preference (EPP) feature, they
are not expected to be run with HWP enabled (the BIOS should disable
HWP on those systems).  Missing EPP support generally indicates an
incomplete HWP implementation and so it is better to avoid using
HWP on those systems in production.

However, intel_pstate currently enables HWP on such systems, which
is questionable, so prevent it from doing that by making it check
EPP support before enabling HWP and avoid enabling it if EPP is not
supported by the processor at hand.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

7aa10312

cpufreq: intel_pstate: Clean up aperf_mperf_shift description · 23a522e3

由 Rafael J. Wysocki 提交于 7月 15, 2020

The kerneldoc description of the aperf_mperf_shift field in
struct global_params is unclear and there is a typo in it, so
simplify it and clean it up.
Reported-by: NLee Jones <lee.jones@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: NLee Jones <lee.jones@linaro.org>

23a522e3

15 7月, 2020 1 次提交

cpufreq: intel_pstate: Supply struct attribute description for get_aperf_mperf_shift() · 8f23d1f1

由 Lee Jones 提交于 7月 15, 2020

Fixes the following W=1 kernel build warning(s):

 drivers/cpufreq/intel_pstate.c:293: warning: Function parameter or member 'get_aperf_mperf_shift' not described in 'pstate_funcs'
Suggested-by: N"Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: NLee Jones <lee.jones@linaro.org>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
[ rjw: Remove line break ]
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

8f23d1f1

13 7月, 2020 2 次提交

cpufreq: intel_pstate: Fix active mode setting from command line · 39a188b8

由 Rafael J. Wysocki 提交于 7月 13, 2020

If intel_pstate starts in the passive mode by default (that happens
when the processor in the system doesn't support HWP), passing
intel_pstate=active in the kernel command line doesn't work, so
fix that.

Fixes: 33aa46f2 ("cpufreq: intel_pstate: Use passive mode by default without HWP")
Reported-by: NDoug Smythies <dsmythies@telus.net>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NDoug Smythies <dsmythies@telus.net>

39a188b8

cpufreq: intel_pstate: Fix static checker warning for epp variable · 3ff79754

由 Srinivas Pandruvada 提交于 7月 09, 2020

Fix warning for:
drivers/cpufreq/intel_pstate.c:731 store_energy_performance_preference()
error: uninitialized symbol 'epp'.

This warning is for a case, when energy_performance_preference attribute
matches pre defined strings. In this case the value of raw epp will not
be used to set EPP bits in MSR_HWP_REQUEST. So initializing with any
value is fine.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

3ff79754

02 7月, 2020 2 次提交

cpufreq: intel_pstate: Allow raw energy performance preference value · f473bf39

由 Srinivas Pandruvada 提交于 6月 26, 2020

Currently using attribute "energy_performance_preference", user space can
write one of the four per-defined preference string. These preference
strings gets mapped to a hard-coded Energy-Performance Preference (EPP) or
Energy-Performance Bias (EPB) knob.

These four values are supposed to cover broad spectrum of use cases, but
are not uniformly distributed in the range. There are number of cases,
where this is not enough. For example:

Suppose user wants more performance when connected to AC. Instead of using
default "balance performance", the "performance" setting can be used. This
changes EPP value from 0x80 to 0x00. But setting EPP to 0, results in
electrical and thermal issues on some platforms. This results in
aggressive throttling, which causes a drop in performance. But some value
between 0x80 and 0x00 results in better performance. But that value can't
be fixed as the power curve is not linear. In some cases just changing EPP
from 0x80 to 0x75 is enough to get significant performance gain.

Similarly on battery the default "balance_performance" mode can be
aggressive in power consumption. But picking up the next choice
"balance power" results in too much loss of performance, which results in
bad user experience in use cases like "Google Hangout". It was observed
that some value between these two EPP is optimal.

This change allows fine grain EPP tuning for platform like Chromebook or
for users who wants to fine tune power and performance.
Here based on the product and use cases, different EPP values can be set.
This change is similar to the change done for:
/sys/devices/system/cpu/cpu*/power/energy_perf_bias
where user has choice to write a predefined string or raw value.

The change itself is trivial. When user preference doesn't match
predefined string preferences and value is an unsigned integer and in
range, use that value for EPP. When the EPP feature is not present
writing raw value is not supported.
Suggested-by: NLen Brown <lenb@kernel.org>
Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

f473bf39

cpufreq: intel_pstate: Allow enable/disable energy efficiency · ed7bde7a

由 Srinivas Pandruvada 提交于 6月 26, 2020

By default intel_pstate the driver disables energy efficiency by setting
MSR_IA32_POWER_CTL bit 19 for Kaby Lake desktop CPU model in HWP mode.
This CPU model is also shared by Coffee Lake desktop CPUs. This allows
these systems to reach maximum possible frequency. But this adds power
penalty, which some customers don't want. They want some way to enable/
disable dynamically.

So, add an additional attribute "energy_efficiency" under
/sys/devices/system/cpu/intel_pstate/ for these CPU models. This allows
to read and write bit 19 ("Disable Energy Efficiency Optimization") in
the MSR IA32_POWER_CTL.

This attribute is present in both HWP and non-HWP mode as this has an
effect in both modes. Refer to Intel Software Developer's manual for
details.

The scope of this bit is package wide. Also these systems are single
package systems. So read/write MSR on the current CPU is enough.

The energy efficiency (EE) bit setting needs to be preserved during
suspend/resume and CPU offline/online operation. To do this:
- Restoring the EE setting from the cpufreq resume() callback, if there
is change from the system default.
- By default, don't disable EE from cpufreq init() callback for matching
CPU models. Since the scope is package wide and is a single package
system, move the disable EE calls from init() callback to
intel_pstate_init() function, which is called only once.
Suggested-by: NLen Brown <lenb@kernel.org>
Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

ed7bde7a

23 6月, 2020 1 次提交

cpufreq: intel_pstate: Add one more OOB control bit · 589bab6b

由 Srinivas Pandruvada 提交于 6月 12, 2020

Add one more bit for OOB (Out Of Band) enabling of P-states.

If OOB handling of P-states is enabled, intel_pstate shouldn't load.
Currently, only "BIT(8) == 1" of the MSR MSR_MISC_PWR_MGMT is
considered as OOB, but "BIT(18) == 1" needs to be taken into
consideration as OOB condition too.
Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Add an empty code line, edit subject and changelog ]
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

589bab6b

27 4月, 2020 1 次提交

cpufreq: intel_pstate: Only mention the BIOS disabling turbo mode once · 8c539776

由 Chris Wilson 提交于 4月 10, 2020

Make a note of the first time we discover the turbo mode has been
disabled by the BIOS, as otherwise we complain every time we try to
update the mode.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

8c539776

17 4月, 2020 1 次提交

cpufreq: intel_pstate: Use passive mode by default without HWP · 33aa46f2

由 Rafael J. Wysocki 提交于 3月 25, 2020

After recent changes allowing scale-invariant utilization to be
used on x86, the schedutil governor on top of intel_pstate in the
passive mode should be on par with (or better than) the active mode
"powersave" algorithm of intel_pstate on systems in which
hardware-managed P-states (HWP) are not used, so it should not be
necessary to use the internal scaling algorithm in those cases.

Accordingly, modify intel_pstate to start in the passive mode by
default if the processor at hand does not support HWP of if the driver
is requested to avoid using HWP through the kernel command line.

Among other things, that will allow utilization clamps and the
support for RT/DL tasks in the schedutil governor to be utilized on
systems in which intel_pstate is used.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

33aa46f2

27 3月, 2020 1 次提交

cpufreq: intel_pstate: Simplify intel_pstate_cpu_init() · 5ac54113

由 Rafael J. Wysocki 提交于 3月 25, 2020

The initial policy value set by intel_pstate_cpu_init() depends on
whether or not CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is set, but
that is not necessary, because the core will set the policy to
"performance" in cpufreq_init_policy() if the default governor is
"performance" anyway.

Accordingly, change intel_pstate_cpu_init() to always set policy
to CPUFREQ_POLICY_POWERSAVE initially to provide a valid fallback
value to cpufreq_init_policy() in case the default cpufreq governor
is neither "powersave" nor "performance".
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

5ac54113

25 3月, 2020 2 次提交

cpufreq/intel_pstate: Fix wrong macro conversion · d9782807

由 Thomas Gleixner 提交于 3月 25, 2020

The feature flag hwp_support_ids are supposed to match on is
X86_FEATURE_HWP, not X86_FEATURE_APERFMPERF. Fix it.

 [ bp: Write commit message. ]

Fixes: b11d77fa ("cpufreq: Convert to new X86 CPU match macros")
Reported-by: Nkernel test robot <rong.a.chen@intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200324060124.GC11705@shao2-debian

d9782807

cpufreq: Convert to new X86 CPU match macros · b11d77fa

由 Thomas Gleixner 提交于 3月 24, 2020

The new macro set has a consistent namespace and uses C99 initializers
instead of the grufty C89 ones.

Get rid the of most local macro wrappers for consistency. The ones which
make sense for readability are renamed to X86_MATCH*.

In the centrino driver this also removes the two extra duplicates of family
6 model 13 which have no value at all.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lkml.kernel.org/r/87eetheu88.fsf@nanos.tec.linutronix.de

b11d77fa

14 3月, 2020 1 次提交

cpufreq: intel_pstate: Consolidate policy verification · d5a2a6bb

由 Rafael J. Wysocki 提交于 3月 06, 2020

There is still some code duplication between intel_pstate_verify_policy()
and intel_cpufreq_verify_policy(), so avoid it by moving the common
code into a separate function and calling it from both these places.

No intentional functional impact.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

d5a2a6bb

29 1月, 2020 1 次提交

x86/intel_pstate: Handle runtime turbo disablement/enablement in frequency invariance · 918229cd

由 Giovanni Gherdovich 提交于 1月 22, 2020

On some platforms such as the Dell XPS 13 laptop the firmware disables turbo
when the machine is disconnected from AC, and viceversa it enables it again
when it's reconnected. In these cases a _PPC ACPI notification is issued.

The scheduler needs to know freq_max for frequency-invariant calculations.
To account for turbo availability to come and go, record freq_max at boot as
if turbo was available and store it in a helper variable. Use a setter
function to swap between freq_base and freq_max every time turbo goes off or on.
Signed-off-by: NGiovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lkml.kernel.org/r/20200122151617.531-7-ggherdovich@suse.cz

918229cd

27 1月, 2020 1 次提交

cpufreq: Avoid creating excessively large stack frames · 1e4f63ae

由 Rafael J. Wysocki 提交于 1月 26, 2020

In the process of modifying a cpufreq policy, the cpufreq core makes
a copy of it including all of the internals which is stored on the
CPU stack. Because struct cpufreq_policy is relatively large, this
may cause the size of the stack frame to exceed the 2 KB limit and
so the GCC complains when -Wframe-larger-than= is used.

In fact, it is not necessary to copy the entire policy structure
in order to modify it, however.

First, because cpufreq_set_policy() obtains the min and max policy
limits from frequency QoS now, it is not necessary to pass the limits
to it from the callers. The only things that need to be passed to it
from there are the new governor pointer or (if there is a built-in
governor in the driver) the "policy" value representing the governor
choice. They both can be passed as individual arguments, though, so
make cpufreq_set_policy() take them this way and rework its callers
accordingly. This avoids making copies of cpufreq policies in the
callers of cpufreq_set_policy().

Second, cpufreq_set_policy() still needs to pass the new policy
data to the ->verify() callback of the cpufreq driver whose task
is to sanitize the min and max policy limits. It still does not
need to make a full copy of struct cpufreq_policy for this purpose,
but it needs to pass a few items from it to the driver in case they
are needed (different drivers have different needs in that respect
and all of them have to be covered). For this reason, introduce
struct cpufreq_policy_data to hold copies of the members of
struct cpufreq_policy used by the existing ->verify() driver
callbacks and pass a pointer to a temporary structure of that
type to ->verify() (instead of passing a pointer to full struct
cpufreq_policy to it).

While at it, notice that intel_pstate and longrun don't really need
to verify the "policy" value in struct cpufreq_policy, so drop those
check from them to avoid copying "policy" into struct
cpufreq_policy_data (which allows it to be slightly smaller).

Also while at it fix up white space in a couple of places and make
cpufreq_set_policy() static (as it can be so).

Fixes: 3000ce3c ("cpufreq: Use per-policy frequency QoS")
Link: https://lore.kernel.org/linux-pm/CAMuHMdX6-jb1W8uC2_237m8ctCpsnGp=JCxqt8pCWVqNXHmkVg@mail.gmail.comReported-by: Nkbuild test robot <lkp@intel.com>
Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

1e4f63ae

13 1月, 2020 1 次提交

cpufreq: intel_pstate: fix spelling mistake: "Whethet" -> "Whether" · 731e6b97

由 Harry Pan 提交于 1月 13, 2020

Fix a spelling typo in the comment, no function change.
Signed-off-by: NHarry Pan <harry.pan@intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

731e6b97

08 11月, 2019 1 次提交

cpufreq: intel_pstate: Fix invalid EPB setting · c31432fa

由 Srinivas Pandruvada 提交于 10月 31, 2019

The max value of EPB can only be 0x0F. Attempting to set more than that
triggers an "unchecked MSR access error" warning which happens in
intel_pstate_hwp_force_min_perf() called via cpufreq stop_cpu().

However, it is not even necessary to touch the EPB from intel_pstate,
because it is restored on every CPU online by the intel_epb.c code,
so let that code do the right thing and drop the redundant (and
incorrect) EPB update from intel_pstate.

Fixes: af3b7379 ("cpufreq: intel_pstate: Force HWP min perf before offline")
Reported-by: NQian Cai <cai@lca.pw>
Cc: 5.2+ <stable@vger.kernel.org> # 5.2+
Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Changelog ]
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

c31432fa

06 11月, 2019 1 次提交

cpufreq: intel_pstate: Fix plain int as pointer warning from sparse · 8d2eecea

由 Jamal Shareef 提交于 11月 04, 2019

Fix sparse warning: Using plain integer as NULL pointer.

Replace assignment of 0 to pointers with NULL assignment.
Signed-off-by: NJamal Shareef <jamal.k.shareef@gmail.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

8d2eecea

21 10月, 2019 1 次提交

cpufreq: Use per-policy frequency QoS · 3000ce3c

由 Rafael J. Wysocki 提交于 10月 16, 2019

Replace the CPU device PM QoS used for the management of min and max
frequency constraints in cpufreq (and its users) with per-policy
frequency QoS to avoid problems with cpufreq policies covering
more then one CPU.

Namely, a cpufreq driver is registered with the subsys interface
which calls cpufreq_add_dev() for each CPU, starting from CPU0, so
currently the PM QoS notifiers are added to the first CPU in the
policy (i.e. CPU0 in the majority of cases).

In turn, when the cpufreq driver is unregistered, the subsys interface
doing that calls cpufreq_remove_dev() for each CPU, starting from CPU0,
and the PM QoS notifiers are only removed when cpufreq_remove_dev() is
called for the last CPU in the policy, say CPUx, which as a rule is
not CPU0 if the policy covers more than one CPU. Then, the PM QoS
notifiers cannot be removed, because CPUx does not have them, and
they are still there in the device PM QoS notifiers list of CPU0,
which prevents new PM QoS notifiers from being registered for CPU0
on the next attempt to register the cpufreq driver.

The same issue occurs when the first CPU in the policy goes offline
before unregistering the driver.

After this change it does not matter which CPU is the policy CPU at
the driver registration time and whether or not it is online all the
time, because the frequency QoS is per policy and not per CPU.

Fixes: 67d874c3 ("cpufreq: Register notifiers with the PM QoS framework")
Reported-by: NDmitry Osipenko <digetx@gmail.com>
Tested-by: NDmitry Osipenko <digetx@gmail.com>
Reported-by: NSudeep Holla <sudeep.holla@arm.com>
Tested-by: NSudeep Holla <sudeep.holla@arm.com>
Diagnosed-by: NViresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f
Link: https://lore.kernel.org/linux-pm/20191017094612.6tbkwoq4harsjcqv@vireshk-i7/T/#m30d48cc23b9a80467fbaa16e30f90b3828a5a29bSigned-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

3000ce3c

28 8月, 2019 4 次提交

x86/intel: Aggregate microserver naming · 5ebb34ed

由 Peter Zijlstra 提交于 8月 27, 2019

Currently big microservers have _XEON_D while small microservers have
_X, Make it uniformly: _D.

for i in `git grep -l "\(INTEL_FAM6_\|VULNWL_INTEL\|INTEL_CPU_FAM6\).*_\(X\|XEON_D\)"`
do
	sed -i -e 's/\(\(INTEL_FAM6_\|VULNWL_INTEL\|INTEL_CPU_FAM6\).*ATOM.*\)_X/\1_D/g' \
	       -e 's/\(\(INTEL_FAM6_\|VULNWL_INTEL\|INTEL_CPU_FAM6\).*\)_XEON_D/\1_D/g' ${i}
done
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20190827195122.677152989@infradead.org

5ebb34ed

x86/intel: Aggregate big core graphics naming · 5e741407

由 Peter Zijlstra 提交于 8月 27, 2019

Currently big core clients with extra graphics on have:

 - _G
 - _GT3E

Make it uniformly: _G

for i in `git grep -l "\(INTEL_FAM6_\|VULNWL_INTEL\|INTEL_CPU_FAM6\).*_GT3E"`
do
	sed -i -e 's/\(\(INTEL_FAM6_\|VULNWL_INTEL\|INTEL_CPU_FAM6\).*\)_GT3E/\1_G/g' ${i}
done
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20190827195122.622802314@infradead.org

5e741407

x86/intel: Aggregate big core mobile naming · af239c44

由 Peter Zijlstra 提交于 8月 27, 2019

Currently big core mobile chips have either:

 - _L
 - _ULT
 - _MOBILE

Make it uniformly: _L.

for i in `git grep -l "\(INTEL_FAM6_\|VULNWL_INTEL\|INTEL_CPU_FAM6\).*_\(MOBILE\|ULT\)"`
do
	sed -i -e 's/\(\(INTEL_FAM6_\|VULNWL_INTEL\|INTEL_CPU_FAM6\).*\)_\(MOBILE\|ULT\)/\1_L/g' ${i}
done
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190827195122.568978530@infradead.org

af239c44

x86/intel: Aggregate big core client naming · c66f78a6

由 Peter Zijlstra 提交于 8月 27, 2019

Currently the big core client models either have:

 - no OPTDIFF
 - _CORE
 - _DESKTOP

Make it uniformly: 'no OPTDIFF'.

for i in `git grep -l "\(INTEL_FAM6_\|VULNWL_INTEL\|INTEL_CPU_FAM6\).*_\(CORE\|DESKTOP\)"`
do
	sed -i -e 's/\(\(INTEL_FAM6_\|VULNWL_INTEL\|INTEL_CPU_FAM6\).*\)_\(CORE\|DESKTOP\)/\1/g' ${i}
done
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190827195122.513945586@infradead.org

c66f78a6

21 8月, 2019 1 次提交

cpufreq: intel_pstate: Implement QoS supported freq constraints · da5c504c

由 Viresh Kumar 提交于 8月 09, 2019

Intel pstate driver exposes min_perf_pct and max_perf_pct sysfs files,
which can be used to force a limit on the min/max P state of the driver.
Though these files eventually control the min/max frequencies that the
CPUs will run at, they don't make a change to policy->min/max values.

When the values of these files are changed (in passive mode of the
driver), it leads to calling ->limits() callback of the cpufreq
governors, like schedutil. On a call to it the governors shall
forcefully update the frequency to come within the limits. Since the
limits, i.e. policy->min/max, aren't updated by the driver, the
governors fails to get the target freq within limit and sometimes aborts
the update believing that the frequency is already set to the target
value.

This patch implements the QoS supported frequency constraints to update
policy->min/max values whenever min_perf_pct or max_perf_pct files are
updated. This is only done for the passive mode as of now, as the driver
is already working fine in active mode.

Fixes: ecd28842 ("cpufreq: schedutil: Don't set next_freq to UINT_MAX")
Reported-by: NDoug Smythies <dsmythies@telus.net>
Tested-by: NDoug Smythies <dsmythies@telus.net>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

da5c504c

09 7月, 2019 1 次提交

cpufreq: intel_pstate: Reuse refresh_frequency_limits() · c57b25bd

由 Viresh Kumar 提交于 7月 04, 2019

The implementation of intel_pstate_update_max_freq() is quite similar to
refresh_frequency_limits(), lets reuse it.

Finding minimum of policy->user_policy.max and policy->cpuinfo.max_freq
in intel_pstate_update_max_freq() is redundant as cpufreq_set_policy()
will call the ->verify() callback of intel-pstate driver, which will do
this comparison anyway and so dropping it from
intel_pstate_update_max_freq() doesn't harm.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

c57b25bd

05 6月, 2019 1 次提交

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 · b886d83c

由 Thomas Gleixner 提交于 6月 01, 2019

Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation version 2 of the license

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 315 file(s).
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NAllison Randal <allison@lohutok.net>
Reviewed-by: NArmijn Hemel <armijn@tjaldur.nl>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

b886d83c

08 4月, 2019 2 次提交

drivers/cpufreq: Convert some slow-path static_cpu_has() callers to boot_cpu_has() · 108ec36b

由 Borislav Petkov 提交于 3月 30, 2019

Using static_cpu_has() is pointless on those paths, convert them to the
boot_cpu_has() variant.

No functional changes.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

108ec36b

cpufreq: intel_pstate: Update max frequency on global turbo changes · 9083e498

由 Rafael J. Wysocki 提交于 3月 26, 2019

While the cpuinfo.max_freq value doesn't really matter for
intel_pstate in the active mode, in the passive mode it is used by
governors as the maximum physical frequency of the CPU and the
results of governor computations generally depend on it.  Also it
is made available to user space via sysfs and it should match the
current HW configuration.

For this reason, make intel_pstate update cpuinfo.max_freq for all
CPUs if it detects a global change of turbo frequency settings from
"disable" to "enable" or the other way associated with a _PPC change
notification from the platform firmware.

Note that policy_is_inactive(),  cpufreq_cpu_acquire(),
cpufreq_cpu_release(), and cpufreq_set_policy() need to be made
available to it for this purpose.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200759Reported-by: NGabriele Mazzotta <gabriele.mzt@gmail.com>
Tested-by: NGabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

9083e498

02 4月, 2019 2 次提交

cpufreq: intel_pstate: Driver-specific handling of _PPC updates · 5a25e3f7

由 Rafael J. Wysocki 提交于 3月 26, 2019

In some cases, the platform firmware disables or enables turbo
frequencies for all CPUs globally before triggering a _PPC change
notification for one of them.  Obviously, that global change affects
all CPUs, not just the notified one, and it needs to be acted upon by
cpufreq.

The intel_pstate driver is able to detect such global changes of
the settings, but it also needs to update policy limits for all
CPUs if that happens, in particular if turbo frequencies are
enabled globally - to allow them to be used.

For this reason, introduce a new cpufreq driver callback to be
invoked on _PPC notifications, if present, instead of simply
calling cpufreq_update_policy() for the notified CPU and make
intel_pstate use it to trigger policy updates for all CPUs
in the system if global settings change.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200759Reported-by: NGabriele Mazzotta <gabriele.mzt@gmail.com>
Tested-by: NGabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>

5a25e3f7

cpufreq/intel_pstate: Load only on Intel hardware · 4ab52646

由 Borislav Petkov 提交于 4月 01, 2019

This driver is Intel-only so loading on anything which is not Intel is
pointless. Prevent it from doing so.

While at it, correct the "not supported" print statement to say CPU
"model" which is what that test does.

Fixes: 076b862c (cpufreq: intel_pstate: Add reasons for failure and debug messages)
Suggested-by: NErwan Velu <e.velu@criteo.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NThomas Renninger <trenn@suse.de>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

4ab52646

26 3月, 2019 1 次提交

cpufreq: intel_pstate: Also use CPPC nominal_perf for base_frequency · 92a3e426

由 Srinivas Pandruvada 提交于 3月 25, 2019

The ACPI specification states that if the "Guaranteed Performance
Register" is not implemented, the OSPM assumes guaranteed performance
to always be equal to nominal performance.

So for invalid or unimplemented guaranteed performance register, use
nominal performance as guaranteed performance.

This change will fall back to nominal_perf when guranteed_perf is
invalid.  If nominal_perf is also invalid or not present, fall back
to the existing implementation, which is to read from HWP Capabilities
MSR.

Fixes: 86d333a8 ("cpufreq: intel_pstate: Add base_frequency attribute")
Suggested-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 4.20+ <stable@vger.kernel.org> # 4.20+
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

92a3e426

12 3月, 2019 1 次提交

cpufreq: intel_pstate: Fix up iowait_boost computation · 8e3b4039

由 Rafael J. Wysocki 提交于 3月 11, 2019

After commit b8bd1581 ("cpufreq: intel_pstate: Rework iowait
boosting to be less aggressive") the handling of the case when
the SCHED_CPUFREQ_IOWAIT flag is set again after a few iterations of
intel_pstate_update_util() is a bit inconsistent, because the
new value of cpu->iowait_boost may be lower than ONE_EIGHTH_FP
if it was set before, but has not dropped down to zero just yet.

Fix that up by ensuring that the new value of cpu->iowait_boost
will always be at least ONE_EIGHTH_FP then.

Fixes: b8bd1581 ("cpufreq: intel_pstate: Rework iowait boosting to be less aggressive")
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

8e3b4039

18 2月, 2019 1 次提交

cpufreq: intel_pstate: Rework iowait boosting to be less aggressive · b8bd1581

由 Rafael J. Wysocki 提交于 2月 07, 2019

The current iowait boosting mechanism in intel_pstate_update_util()
is quite aggressive, as it goes to the maximum P-state right away,
and may cause excessive amounts of energy to be used, which is not
desirable and arguably isn't necessary too.

Follow commit a5a0809b ("cpufreq: schedutil: Make iowait boost
more energy efficient") that reworked the analogous iowait boost
mechanism in the schedutil governor and make the iowait boosting
in intel_pstate_update_util() work along the same lines.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

b8bd1581

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功