- 10 Apr, 2016 1 commit
-
-
Committed by Rafael J. Wysocki
Jörg Otte reports that commit a4675fbc (cpufreq: intel_pstate: Replace timers with utilization update callbacks) caused the CPUs in his Haswell-based system to stay in the very high frequency region even if the system is completely idle. That turns out to be an existing problem in the intel_pstate driver's P-state selection algorithm for Core processors. Namely, all decisions made by that algorithm are based on the average frequency of the CPU between sampling events and on the P-state requested on the last invocation, so it may get stuck at a very high frequency even if the utilization of the CPU is very low (in fact, it may get stuck in an inadequate P-state regardless of the CPU utilization). The only way to kick it out of that limbo is a sufficiently long idle period (3 times longer than the prescribed sampling interval), but if that doesn't happen often enough (e.g. due to a timing change like after the above commit), the P-state of the CPU may be inadequate pretty much all the time.
To address the most egregious manifestations of that issue, reset the core_busy value used to determine the next P-state to request if the utilization of the CPU, determined with the help of the MPERF feedback register and the TSC, is below 1%.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=115771
Reported-and-tested-by: Jörg Otte <jrg.otte@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
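The 1% check described in this commit can be illustrated with a minimal sketch. It is not the driver's code: the function name and parameters are hypothetical, and resetting the busy value to zero is an assumption about what "reset" means here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the driver's code): return the busy value to
 * feed into P-state selection. delta_mperf and delta_tsc are the counter
 * deltas over one sampling interval.
 */
static int32_t effective_core_busy(uint64_t delta_mperf, uint64_t delta_tsc,
                                   int32_t core_busy)
{
    assert(delta_tsc != 0);
    /* mperf / tsc < 1%  <=>  mperf * 100 < tsc (no division needed). */
    if (delta_mperf * 100 < delta_tsc)
        return 0; /* assumed reset value */
    return core_busy;
}
```

The multiplication form avoids a division; it presumes the deltas are small enough that `delta_mperf * 100` does not overflow 64 bits, which holds for realistic sampling intervals.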
-
- 05 Apr, 2016 2 commits
-
-
Committed by Srinivas Pandruvada
No code change. Only added kernel-doc style comments for structures.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Srinivas Pandruvada
When the user sets the performance policy using the cpufreq interface, it is possible that, because of policy->max limits, the actual performance is still limited. But the current implementation will silently switch the policy to powersave and start using powersave limits. If the user modifies any limits using intel_pstate sysfs, this actually changes the powersave limits. The current implementation tracks limits under the powersave and performance policies using two different variables. When policy->max is less than policy->cpuinfo.max_freq, only the powersave limits variable is used. This fix causes the performance limits variable to always be used when the policy is performance.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 02 Apr, 2016 1 commit
-
-
Committed by Rafael J. Wysocki
The initialization of intel_pstate for a given CPU involves populating the fields of its struct cpudata that represent the previous sample, but currently that is done in a problematic way. Namely, intel_pstate_init_cpu() makes an extra call to intel_pstate_sample() so it reads the current register values that will be used to populate the "previous sample" record during the next invocation of intel_pstate_sample(). However, after commit a4675fbc (cpufreq: intel_pstate: Replace timers with utilization update callbacks) that doesn't work for last_sample_time, because the time value is passed to intel_pstate_sample() as an argument now. Passing 0 to it from intel_pstate_init_cpu() is problematic, because that causes cpu->last_sample_time == 0 to be visible in get_target_pstate_use_performance() (and hence the extra cpu->last_sample_time > 0 check in there) and effectively allows the first invocation of intel_pstate_sample() from intel_pstate_update_util() to happen immediately after the initialization, which may lead to a significant "turn on" effect in the governor algorithm.
To mitigate that issue, rework the initialization to avoid the extra intel_pstate_sample() call from intel_pstate_init_cpu(). Instead, make intel_pstate_sample() return false if it has been called with cpu->sample.time equal to zero, which will make intel_pstate_update_util() skip the sample in that case, and reset cpu->sample.time from intel_pstate_set_update_util_hook() to make the algorithm start properly every time the hook is set.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 31 Mar, 2016 1 commit
-
-
Committed by Rafael J. Wysocki
The utilization update hook in the intel_pstate driver is set too early, as it should only be set after the policy has been fully initialized by the core. That may cause intel_pstate_update_util() to use incorrect data and put the CPUs into incorrect P-states as a result. To prevent that from happening, make intel_pstate_set_policy() set the utilization update hook instead of intel_pstate_init_cpu() so intel_pstate_update_util() only runs when all things have been initialized as appropriate.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 20 Mar, 2016 1 commit
-
-
Committed by Rafael J. Wysocki
After commit a4675fbc (cpufreq: intel_pstate: Replace timers with utilization update callbacks) wrmsrl_on_cpu() cannot be called in the intel_pstate_adjust_busy_pstate() path as that is executed with disabled interrupts. However, atom_set_pstate() called from there via intel_pstate_set_pstate() uses wrmsrl_on_cpu() to update the IA32_PERF_CTL MSR, which triggers the WARN_ON_ONCE() in smp_call_function_single().
The reason why wrmsrl_on_cpu() is used by atom_set_pstate() is that intel_pstate_set_pstate() calling it is also invoked during the initialization and cleanup of the driver, and in those cases it is not guaranteed to be run on the CPU that is being updated. However, in the case when intel_pstate_set_pstate() is called by intel_pstate_adjust_busy_pstate(), wrmsrl() can be used to update the register safely. Moreover, intel_pstate_set_pstate() already contains code that is only executed if the function is called by intel_pstate_adjust_busy_pstate() and there is a special argument passed to it because of that.
To fix the problem at hand, rearrange the code taking the above observations into account.
First, replace the ->set() callback in struct pstate_funcs with a ->get_val() one that will return the value to be written to the IA32_PERF_CTL MSR without updating the register.
Second, split intel_pstate_set_pstate() into two functions: intel_pstate_update_pstate(), to be called by intel_pstate_adjust_busy_pstate(), that will contain all of the intel_pstate_set_pstate() code which only needs to be executed in that case and will use wrmsrl() to update the MSR (after obtaining the value to write to it from the ->get_val() callback), and intel_pstate_set_min_pstate(), to be invoked during the initialization and cleanup, that will set the P-state to the minimum one and will update the MSR using wrmsrl_on_cpu().
Finally, move the code shared between intel_pstate_update_pstate() and intel_pstate_set_min_pstate() to a new static inline function intel_pstate_record_pstate() and make them both call it.
Of course, that unifies the handling of the IA32_PERF_CTL MSR writes between Atom and Core.
Fixes: a4675fbc (cpufreq: intel_pstate: Replace timers with utilization update callbacks)
Reported-and-tested-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 11 Mar, 2016 5 commits
-
-
Committed by Rafael J. Wysocki
If the current value of MPERF or the current value of TSC is the same as the previous one, respectively, intel_pstate_sample() bails out early and skips the sample. However, intel_pstate_adjust_busy_pstate() is still called in that case, which is not correct, so modify intel_pstate_sample() to return a bool value indicating whether or not the sample has been taken and use it to decide whether or not to call intel_pstate_adjust_busy_pstate(). While at it, remove redundant parentheses from the MPERF/TSC check in intel_pstate_sample().
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
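The control flow this commit describes (skip the adjustment step when no sample was taken) might look roughly like the sketch below. The struct, the function names and the `adjustments` counter are illustrative stand-ins, not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU state; the real struct cpudata is much larger. */
struct cpu_state {
    uint64_t prev_mperf;
    uint64_t prev_tsc;
    unsigned int adjustments; /* counts how often the adjust step ran */
};

/* Returns true only if a new sample was actually recorded. */
static bool take_sample(struct cpu_state *c, uint64_t mperf, uint64_t tsc)
{
    assert(c != NULL);
    if (mperf == c->prev_mperf || tsc == c->prev_tsc)
        return false; /* a counter did not advance: skip the sample */
    c->prev_mperf = mperf;
    c->prev_tsc = tsc;
    return true;
}

/* The caller uses the return value to decide whether to adjust. */
static void update_util(struct cpu_state *c, uint64_t mperf, uint64_t tsc)
{
    if (take_sample(c, mperf, tsc))
        c->adjustments++; /* stand-in for the P-state adjustment */
}
```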
-
Committed by Philippe Longepe
Use a helper function to compute the average pstate and call it only where it is needed (only when tracing or in intel_pstate_get).
Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Philippe Longepe
The cpu_load algorithm doesn't need to invoke intel_pstate_calc_busy(), so move that call from intel_pstate_sample() to get_target_pstate_use_performance().
Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Philippe Longepe
mul_fp(int_tofp(A), B) expands to: ((A << FRAC_BITS) * B) >> FRAC_BITS, so the same result can be obtained via simple multiplication A * B. Apply this observation to max_perf * limits->max_perf and max_perf * limits->min_perf in intel_pstate_get_min_max().
Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
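The identity used in this commit can be checked with a small stand-alone sketch. The helpers below are modeled on the expansion quoted in the message; FRAC_BITS = 8 is an assumption made for the example, and the definitions are not copied from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 8 /* assumed fixed-point precision for this example */

/* Convert a non-negative integer to fixed point. */
static int32_t int_tofp(int32_t x)
{
    assert(x >= 0);
    return x << FRAC_BITS;
}

/* Fixed-point multiply: widen, multiply, then drop the extra fraction bits. */
static int32_t mul_fp(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * (int64_t)y) >> FRAC_BITS);
}
```

For A = 3 and B = 640, `mul_fp(int_tofp(3), 640)` computes (768 * 640) >> 8 = 1920, which equals 3 * 640, so the shift pair cancels as the message states.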
-
Committed by Philippe Longepe
pid->setpoint and pid->deadband can be initialized in fixed point, so we can avoid the int_tofp in pid_calc.
Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 09 Mar, 2016 2 commits
-
-
Committed by Rafael J. Wysocki
Use the observation that cpufreq_update_util() is only called by the scheduler with rq->lock held, so the callers of cpufreq_set_update_util_data() can use synchronize_sched() instead of synchronize_rcu() to wait for cpufreq_update_util() to complete. Moreover, if they are updated to do that, rcu_read_(un)lock() calls in cpufreq_update_util() might be replaced with rcu_read_(un)lock_sched(), respectively, but those aren't really necessary, because the scheduler calls that function from RCU-sched read-side critical sections already.
In addition to that, if cpufreq_set_update_util_data() checks the func field in the struct update_util_data before setting the per-CPU pointer to it, the data->func check may be dropped from cpufreq_update_util() as well.
Make the above changes to reduce the overhead from cpufreq_update_util() in the scheduler paths invoking it and to make the cleanup after removing its callbacks somewhat less heavy-weight.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
-
Committed by Rafael J. Wysocki
Instead of using a per-CPU deferrable timer for utilization sampling and P-states adjustments, register a utilization update callback that will be invoked from the scheduler on utilization changes. The sampling rate is still the same as what was used for the deferrable timers, so the functional impact of this patch should not be significant. Based on an earlier patch from Srinivas Pandruvada.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
-
- 27 Feb, 2016 2 commits
-
-
Committed by Srinivas Pandruvada
Disable HWP interrupt notification before enabling HWP. Since we don't have HWP interrupt handling for possible performance interrupts, there is not much use in enabling HWP interrupts.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Srinivas Pandruvada
If the processor supports HWP, enable it by default without checking for the CPU model. This allows HWP to be enabled on all supported processors without driver changes.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 23 Feb, 2016 1 commit
-
-
Committed by Viresh Kumar
The intel-pstate driver is using intel_pstate_hwp_set() from two separate paths, i.e. the ->set_policy() callback and the sysfs update path for the files present in the /sys/devices/system/cpu/intel_pstate/ directory.
While an update to the sysfs path applies to all the CPUs being managed by the driver (which essentially means all the online CPUs), the update via the ->set_policy() callback applies to a smaller group of CPUs managed by the policy for which ->set_policy() is called.
And so, intel_pstate_hwp_set() should update frequencies of only the CPUs that are part of the policy->cpus mask when it is called from the ->set_policy() callback.
In order to do that, add a parameter (cpumask) to intel_pstate_hwp_set() and apply the frequency changes only to the concerned CPUs.
For the ->set_policy() path, we are only concerned about policy->cpus, and so the policy->rwsem lock taken by the core prior to calling ->set_policy() is enough to take care of any races. The larger lock acquired by get_online_cpus() is required only for the updates to sysfs files.
Add another routine, intel_pstate_hwp_set_online_cpus(), and call it from the sysfs update paths.
This also fixes a lockdep splat reported recently, where policy->rwsem and get_online_cpus() could have been acquired in any order, causing an ABBA deadlock. The sequence of events leading to that was:
    intel_pstate_init(...)
      ...cpufreq_online(...)
        down_write(&policy->rwsem); // Locks policy->rwsem
        ...
        cpufreq_init_policy(policy);
          ...intel_pstate_hwp_set();
            get_online_cpus(); // Temporarily locks cpu_hotplug.lock
        ...
        up_write(&policy->rwsem);

    pm_suspend(...)
      ...disable_nonboot_cpus()
        _cpu_down()
          cpu_hotplug_begin(); // Locks cpu_hotplug.lock
          __cpu_notify(CPU_DOWN_PREPARE, ...);
            ...cpufreq_offline_prepare();
              down_write(&policy->rwsem); // Locks policy->rwsem
Reported-and-tested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 12 Dec, 2015 1 commit
-
-
Committed by Prarit Bhargava
785ee278 ("cpufreq: intel_pstate: Fix limits->max_perf rounding error") hardcodes the value of FRAC_BITS. This patch fixes that minor issue.
Fixes: 785ee278 (cpufreq: intel_pstate: Fix limits->max_perf rounding error)
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 10 Dec, 2015 3 commits
-
-
Committed by Philippe Longepe
In cases where we have many IOs, the global load becomes low and the load algorithm will decrease the requested P-state. Because of that, the IO overhead will increase and hurt IO performance. To improve IO-bound work, we can count the io-wait time as busy time when calculating CPU busy. This change uses get_cpu_iowait_time_us() to obtain the IO wait time value and converts that time into the number of cycles spent waiting on IO at the TSC rate. At the moment, this trick is only used for Atom.
Signed-off-by: Philippe Longepe <philippe.longepe@intel.com>
Signed-off-by: Stephane Gasparini <stephane.gasparini@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
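A rough sketch of the conversion described in this commit follows: io-wait time in microseconds is turned into cycles at the TSC rate and then counted as busy cycles. The function names and the `tsc_khz` parameter are assumptions for illustration; the driver derives the rate differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical conversion: tsc_khz is cycles per millisecond, so
 * cycles = microseconds * tsc_khz / 1000.
 */
static uint64_t iowait_us_to_cycles(uint64_t delta_iowait_us, uint64_t tsc_khz)
{
    return delta_iowait_us * tsc_khz / 1000;
}

/* Busy percentage with the io-wait cycles counted as busy cycles. */
static uint32_t percent_busy_with_iowait(uint64_t delta_mperf,
                                         uint64_t delta_iowait_cycles,
                                         uint64_t delta_tsc)
{
    assert(delta_tsc != 0);
    return (uint32_t)(100 * (delta_mperf + delta_iowait_cycles) / delta_tsc);
}
```

With a 2 GHz TSC (tsc_khz = 2000000), 1000 us of io-wait corresponds to 2000000 cycles. Adding them to 500000 active cycles over a 10000000-cycle interval raises the busy value from 5% to 25%.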
-
Committed by Philippe Longepe
The current function to calculate CPU utilization uses the average P-state ratio (APERF/MPERF) scaled by the ratio of the current P-state to the max available non-turbo one. This leads to an overestimation of utilization, which causes higher-performance P-states to be selected more often, and that leads to increased energy consumption. This is a problem for low-power systems, so it is better to use a different utilization calculation algorithm for them.
Namely, the Percent Busy value (or load) can be estimated as the ratio of the MPERF counter, which runs at a constant rate only during active periods (C0), to the time stamp counter (TSC), which also runs (at the same rate) during idle. That is:
    Percent Busy = 100 * (delta_mperf / delta_tsc)
Use this algorithm for platforms with SoCs based on the Airmont and Silvermont Atom cores.
Signed-off-by: Philippe Longepe <philippe.longepe@intel.com>
Signed-off-by: Stephane Gasparini <stephane.gasparini@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
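The Percent Busy formula from this commit, written as integer arithmetic. This is only a sketch: the function name is made up, and the driver itself works in fixed point rather than plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* Percent Busy = 100 * delta_mperf / delta_tsc, truncated to an integer. */
static uint32_t percent_busy(uint64_t delta_mperf, uint64_t delta_tsc)
{
    assert(delta_tsc != 0); /* a zero TSC delta would divide by zero */
    /* Multiply first so the integer division keeps two digits of precision. */
    return (uint32_t)(100 * delta_mperf / delta_tsc);
}
```

A CPU that spent a quarter of the interval in C0 yields `percent_busy(250, 1000)`, i.e. 25.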
-
Committed by Philippe Longepe
Target systems using different CPUs have different power and performance requirements. They may use different algorithms to get the next P-state based on their power or performance preference. For example, power-constrained systems may not want to use high-performance P-states as aggressively as a full-size desktop or a server platform. A server platform may want to run close to the max to achieve better performance, while laptop-like systems may prefer sacrificing performance for longer battery life. For the above reasons, modify intel_pstate to allow the target P-state selection algorithm to depend on the CPU ID.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Philippe Longepe <philippe.longepe@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 26 Nov, 2015 1 commit
-
-
Committed by Alexandra Yates
If hardware-driven P-state selection (HWP) is enabled, the "performance" mode of intel_pstate should only allow the processor to use the highest-performance P-state available. That is not the case currently, so make it actually happen.
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Alexandra Yates <alexandra.yates@linux.intel.com>
[ rjw: Subject and changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 24 Nov, 2015 2 commits
-
-
Committed by Prarit Bhargava
A rounding error was found in the calculation of limits->max_perf in intel_pstate_set_policy(), which is used to calculate the max and min pstate values in intel_pstate_get_min_max(). In that code, limits->max_perf is truncated to 2 hex digits such that, for example, 0x169 was incorrectly calculated to 0x16 instead of 0x17. This resulted in the pstate being set one level too low. This patch rounds the value of limits->max_perf up instead of down so that the correct max pstate can be reached.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Prarit Bhargava
I have an Intel (6,63) processor with a "marketing" frequency (from /proc/cpuinfo) of 2100MHz, and a max turbo frequency of 2600MHz. I can execute
    cpupower frequency-set -g powersave --min 1200MHz --max 2100MHz
and the max_freq_pct is set to 80. When adding load to the system I noticed that the CPU frequency only reached 2000MHz and not 2100MHz as expected. This is because limits->max_policy_pct is calculated as 2100 * 100 / 2600 = 80.7 and is rounded down to 80 when it should be rounded up to 81. This patch adds a DIV_ROUND_UP() which will return the correct value.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
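The arithmetic in this report can be reproduced with a short sketch. DIV_ROUND_UP is defined locally here so the example stands alone, and the two helper functions are hypothetical names, not driver code.

```c
#include <assert.h>

/* Local definition for the example: integer division rounding up. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Policy max as a percentage of the turbo max, truncating (old behavior). */
static int max_policy_pct_truncated(int policy_max_mhz, int turbo_max_mhz)
{
    assert(turbo_max_mhz > 0);
    return policy_max_mhz * 100 / turbo_max_mhz;
}

/* Same percentage, rounded up as the patch describes. */
static int max_policy_pct_rounded_up(int policy_max_mhz, int turbo_max_mhz)
{
    assert(turbo_max_mhz > 0);
    return DIV_ROUND_UP(policy_max_mhz * 100, turbo_max_mhz);
}
```

For 2100MHz out of 2600MHz, truncation gives 80 while rounding up gives 81, which is the difference between topping out at 2000MHz and reaching 2100MHz.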
-
- 19 Nov, 2015 4 commits
-
-
Committed by Philippe Longepe
There are two flavors of Atom cores to be supported by intel_pstate, Silvermont and Airmont, so make the driver distinguish between them by adding separate frequency tables. Separate the CPU default parameters for each of them and match the CPU IDs against them as appropriate.
Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
Signed-off-by: Stephane Gasparini <stephane.gasparini@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Subject and changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Philippe Longepe
Rename symbol and function names starting with "BYT" or "byt" to start with "ATOM" or "atom", respectively, so as to make it clear that they may apply to Atom in general and not just to Baytrail (the goal is to support several Atom architectures eventually). This should not lead to any functional changes.
Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
Signed-off-by: Stephane Gasparini <stephane.gasparini@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Rafael J. Wysocki
Revert commit 37afb000 (cpufreq: intel_pstate: Use ACPI perf configuration) that is reported to cause a regression on a system where invalid data are returned by the ACPI _PSS object. Since that commit makes assumptions regarding the _PSS output correctness that may turn out to be overly optimistic in general, there is a concern that it may introduce regressions on more systems, so it's better to revert it now and we'll revisit the underlying issue in the next cycle with a more robust solution.
Conflicts:
    drivers/cpufreq/intel_pstate.c
Fixes: 37afb000 (cpufreq: intel_pstate: Use ACPI perf configuration)
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Rafael J. Wysocki
Revert commit 4ef45148 (cpufreq: intel_pstate: Avoid calculation for max/min) as it depends on commit 37afb000 (cpufreq: intel_pstate: Use ACPI perf configuration) that causes problems and needs to be reverted.
Conflicts:
    drivers/cpufreq/intel_pstate.c
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 02 Nov, 2015 1 commit
-
-
Committed by Prarit Bhargava
When booting an HWP-enabled system, the kernel displays one "HWP enabled" message for each CPU. The messages are superfluous since HWP is globally enabled across all CPUs. This patch also adds an informational message when HWP is disabled via intel_pstate=no_hwp.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 17 Oct, 2015 1 commit
-
-
Committed by Prarit Bhargava
Systems that initialize the intel_pstate driver with the performance governor and then switch to the powersave governor will not transition to lower CPU frequencies until /sys/devices/system/cpu/intel_pstate/min_perf_pct is set to a low value.
The behavior of governor switching changed after commit a0475992 ("[cpufreq] intel_pstate: honor user space min_perf_pct override on resume"). The commit introduced tracking of performance percentage changes via sysfs in order to restore userspace changes during suspend/resume. The problem occurs because the global values of the newly introduced max_sysfs_pct and min_sysfs_pct are not lowered on the governor change, and this causes the powersave governor to inherit the performance governor's settings.
A simple change would have been to reset max_sysfs_pct to 100 and min_sysfs_pct to 0 on a governor change, which fixes the problem with governor switching. However, since we cannot break userspace[1], the fix is now to give each governor its own limits storage area so that governor-specific changes are tracked.
I successfully tested this by booting with both the performance governor and the powersave governor by default, and switching between the two governors (while monitoring /sys/devices/system/cpu/intel_pstate/ values, and looking at the output of cpupower frequency-info). Suspend/resume testing was performed by Doug Smythies.
[1] Systems which suspend/resume using the unmaintained pm-utils package will always transition to the performance governor before the suspend and after the resume. This means a system using the powersave governor will go from powersave to performance, then suspend/resume, then performance to powersave. The simple change during governor changes would have been overwritten when the governor changed before and after the suspend/resume. I have submitted https://bugzilla.redhat.com/show_bug.cgi?id=1271225 against Fedora to remove the 94cpufreq file that causes the problem. It should be noted that pm-utils is obsoleted by newer versions of systemd.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 16 Oct, 2015 1 commit
-
-
Committed by Srinivas Pandruvada
This is a workaround for the KNL platform, where in some cases the MPERF counter will not have an updated value before the next read of MSR_IA32_MPERF. In this case a divide by zero will occur. This change ignores the current sample for the busy calculation in this case.
Fixes: b34ef932 (intel_pstate: Knights Landing support)
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Cc: 4.1+ <stable@vger.kernel.org> # 4.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 15 Oct, 2015 4 commits
-
-
Committed by Srinivas Pandruvada
When requested by cpufreq to set the policy, look into _PSS and get the control values, instead of using max/min perf calculations. These calculations miss the next control state in boundary conditions.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Srinivas Pandruvada
Use ACPI _PSS to limit the Intel P-state turbo, max and min ratios. This driver uses acpi processor perf lib calls to register performance. The following logic is used to adjust the Intel P-state driver limits:
- If there is no turbo entry in _PSS, then disable Intel P-state turbo and limit to the non-turbo max
- If the non-turbo max ratio is more than the _PSS max non-turbo value, then set the max non-turbo ratio to the _PSS non-turbo max
- If the min ratio is less than the _PSS min, then change the min ratio to match the _PSS min
- Scale the _PSS turbo frequency to the max turbo frequency based on the control value.
This feature can be disabled by using the kernel parameter: intel_pstate=no_acpi
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Srinivas Pandruvada
Systems with configurable TDP have multiple max non-turbo P-states. The Intel P-state driver uses the max non-turbo P-state for scaling. But using the real max non-turbo P-state causes underestimation of the next P-state. So use the physical max non-turbo P-state, as before, for scaling.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Srinivas Pandruvada
After Ivybridge, the max non-turbo ratio obtained from the platform info MSR is not always the guaranteed P1 on client platforms. The max non-turbo activation ratio (TAR) determines the max for the current level of TDP. The ratio in platform info is the physical max. The TAR MSR can be locked, so updating this value is not possible on all platforms. This change gets this ratio from MSR TURBO_ACTIVATION_RATIO if available, but also does some sanity checking to make sure that this value is correct. The sanity check involves reading the TDP ratio for the current TDP control value when the platform has configurable TDP present and matching TAR with this.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 10 Sep, 2015 2 commits
-
-
Committed by Kristen Carlson Accardi
PCT_TO_HWP does not take the actual range of pstates exported by HWP_CAPABILITIES into account, and is broken on most platforms. Remove the macro and set the min and max pstate for HWP by determining the range and adjusting by the min and max percent limits values.
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
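One plausible reading of "determining the range and adjusting by the percent limits" is sketched below. The function name and the exact scaling are assumptions made for illustration; the message does not spell out the formula, so this should not be taken as the driver's implementation.

```c
#include <assert.h>

/*
 * Hypothetical mapping of a percent limit onto the P-state range
 * [hw_min, hw_max], where hw_min and hw_max stand for the bounds
 * reported by the hardware capabilities.
 */
static int hwp_pstate_for_pct(int hw_min, int hw_max, int pct)
{
    int range = hw_max - hw_min;

    assert(range >= 0 && pct >= 0 && pct <= 100);
    /* 0% maps to hw_min, 100% maps to hw_max, linear in between. */
    return hw_min + range * pct / 100;
}
```

With a hardware range of 8 to 40, a 50% limit maps to P-state 24 under this reading, whereas a macro that ignores the range would scale from zero instead of from the hardware minimum.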
-
Committed by Chen Yu
In the current code, max_perf_pct might be smaller than min_perf_pct due to improper user input:
    $ grep . /sys/devices/system/cpu/intel_pstate/m*_perf_pct
    /sys/devices/system/cpu/intel_pstate/max_perf_pct:100
    /sys/devices/system/cpu/intel_pstate/min_perf_pct:100
    $ echo 80 > /sys/devices/system/cpu/intel_pstate/max_perf_pct
    $ grep . /sys/devices/system/cpu/intel_pstate/m*_perf_pct
    /sys/devices/system/cpu/intel_pstate/max_perf_pct:80
    /sys/devices/system/cpu/intel_pstate/min_perf_pct:100
Fix this problem in 2 steps:
1. Normalize the user input to [min_policy, max_policy].
2. Make sure max_perf_pct >= min_perf_pct, as suggested by Seiichi Ikarashi.
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
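The two steps listed in this commit can be sketched as follows. The function and parameter names are hypothetical, chosen to mirror the wording of the message rather than the driver's fields.

```c
#include <assert.h>

/*
 * Hypothetical normalization of a user-supplied max_perf_pct:
 *   1. clamp the input to [min_policy_pct, max_policy_pct]
 *   2. never let the result drop below min_perf_pct
 */
static int normalize_max_perf_pct(int input, int min_policy_pct,
                                  int max_policy_pct, int min_perf_pct)
{
    int pct = input;

    assert(min_policy_pct <= max_policy_pct);
    if (pct < min_policy_pct)
        pct = min_policy_pct;
    if (pct > max_policy_pct)
        pct = max_policy_pct;
    if (pct < min_perf_pct)
        pct = min_perf_pct;
    return pct;
}
```

Replaying the session from the message: with min_perf_pct at 100, writing 80 to max_perf_pct now yields 100 rather than leaving max below min.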
-
- 07 Aug, 2015 2 commits
-
-
Committed by Ethan Zhao
Append more Oracle X86 servers that have their own power management: SUN FIRE X4275 M3, SUN FIRE X4170 M3 and SUN FIRE X6-2.
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Kristen Carlson Accardi
Whitelist the SKL-S processor.
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 01 Aug, 2015 1 commit
-
-
Committed by Chen Yu
Coverity scanning performed on intel_pstate.c shows a possible overflow when doing a left shift:
    val = pstate << 8;
Since pstate is of type integer while val is a u64, left shifting pstate might lead to a loss of upper bits. Say, if pstate equals 0x40000000, after pstate << 8 we will get zero assigned to val. Although pstate is unlikely to be that big, this patch casts the left operand to u64 before performing the left shift, to silence the Coverity complaint.
Reported-by: Coquard, Christophe <christophe.coquard@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
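The effect of the cast can be demonstrated in isolation. This sketch deliberately uses uint32_t instead of the signed int from the message, because overflowing a signed shift is undefined behavior in C, while the unsigned version wraps in a well-defined way; the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Shift performed in 32 bits, then widened: the upper bits are already lost. */
static uint64_t shift_then_widen(uint32_t pstate)
{
    return pstate << 8;
}

/* Widen first, then shift: all bits survive in the 64-bit result. */
static uint64_t widen_then_shift(uint32_t pstate)
{
    assert(sizeof(uint64_t) > sizeof(uint32_t));
    return (uint64_t)pstate << 8;
}
```

For the value from the message, `shift_then_widen(0x40000000)` is 0, while `widen_then_shift(0x40000000)` is 0x4000000000.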
-
- 27 Jul, 2015 1 commit
-
-
Committed by Lukasz Anaczkowski
Scaling for Knights Landing is the same as the default scaling (100000). When Knights Landing support was added to the pstate driver, this parameter was omitted, resulting in a kernel panic during boot.
Fixes: b34ef932 (intel_pstate: Knights Landing support)
Reported-by: Yasuaki Ishimatsu <yishimat@redhat.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Cc: 4.1+ <stable@vger.kernel.org> # 4.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-