- 26 3月, 2019 1 次提交
-
-
由 Srinivas Pandruvada 提交于
The ACPI specification states that if the "Guaranteed Performance Register" is not implemented, the OSPM assumes guaranteed performance to always be equal to nominal performance. So for invalid or unimplemented guaranteed performance register, use nominal performance as guaranteed performance. This change will fall back to nominal_perf when guranteed_perf is invalid. If nominal_perf is also invalid or not present, fall back to the existing implementation, which is to read from HWP Capabilities MSR. Fixes: 86d333a8 ("cpufreq: intel_pstate: Add base_frequency attribute") Suggested-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: 4.20+ <stable@vger.kernel.org> # 4.20+ Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 12 3月, 2019 1 次提交
-
-
由 Rafael J. Wysocki 提交于
After commit b8bd1581 ("cpufreq: intel_pstate: Rework iowait boosting to be less aggressive") the handling of the case when the SCHED_CPUFREQ_IOWAIT flag is set again after a few iterations of intel_pstate_update_util() is a bit inconsistent, because the new value of cpu->iowait_boost may be lower than ONE_EIGHTH_FP if it was set before, but has not dropped down to zero just yet. Fix that up by ensuring that the new value of cpu->iowait_boost will always be at least ONE_EIGHTH_FP then. Fixes: b8bd1581 ("cpufreq: intel_pstate: Rework iowait boosting to be less aggressive") Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 18 2月, 2019 3 次提交
-
-
由 Rafael J. Wysocki 提交于
The current iowait boosting mechanism in intel_pstate_update_util() is quite aggressive, as it goes to the maximum P-state right away, and may cause excessive amounts of energy to be used, which is not desirable and arguably isn't necessary too. Follow commit a5a0809b ("cpufreq: schedutil: Make iowait boost more energy efficient") that reworked the analogous iowait boost mechanism in the schedutil governor and make the iowait boosting in intel_pstate_update_util() work along the same lines. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Rafael J. Wysocki 提交于
There is only one caller of intel_pstate_get_base_pstate() and it is more straightforward to carry out the computation directly in the caller, so do that and drop intel_pstate_get_base_pstate(). No intentional changes of behavior. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Rafael J. Wysocki 提交于
After commit 1a4fe38a ("cpufreq: intel_pstate: Remove max/min fractions to limit performance") the initial value of the pstate local variable in intel_pstate_max_within_limits() and the initial value of the max_pstate local variable in intel_pstate_prepare_request() are both immediately discarded, so initialize both these variables to their target values upfront. No intentional changes of behavior. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 13 2月, 2019 1 次提交
-
-
由 Erwan Velu 提交于
The init code path has several exceptions where the driver can decide not to load. As CONFIG_X86_INTEL_PSTATE is generally set to Y, the return code is not reachable. The initialization code is neither verbose of the reason why it did choose to prematurely exit, so it is difficult for a user to determine, on a given platform, why the driver didn't load properly. This patch is about reporting to the user the reason/context of why the driver failed to load. That is a precious hint when debugging a platform. Signed-off-by: NErwan Velu <e.velu@criteo.com> [ rjw: Subject & changelog, minor fixups ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 29 1月, 2019 1 次提交
-
-
由 Viresh Kumar 提交于
The cpufreq_global_kobject is created using kobject_create_and_add() helper, which assigns the kobj_type as dynamic_kobj_ktype and show/store routines are set to kobj_attr_show() and kobj_attr_store(). These routines pass struct kobj_attribute as an argument to the show/store callbacks. But all the cpufreq files created using the cpufreq_global_kobject expect the argument to be of type struct attribute. Things work fine currently as no one accesses the "attr" argument. We may not see issues even if the argument is used, as struct kobj_attribute has struct attribute as its first element and so they will both get same address. But this is logically incorrect and we should rather use struct kobj_attribute instead of struct global_attr in the cpufreq core and drivers and the show/store callbacks should take struct kobj_attribute as argument instead. This bug is caught using CFI CLANG builds in android kernel which catches mismatch in function prototypes for such callbacks. Reported-by: NDonghee Han <dh.han@samsung.com> Reported-by: NSangkyu Kim <skwith.kim@samsung.com> Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 30 11月, 2018 1 次提交
-
-
由 Srinivas Pandruvada 提交于
Force HWP Request MAX = HWP Request MIN = HWP Capability MIN and EPP to 0xFF. In this way the performance limits on the offlined CPU will not influence performance limits on its sibling CPU, which is still online. If the sibling CPU is calling for higher performance, it will impact the max core performance. Here core performance will follow higher of the performance requests from each sibling. Reported-and-tested-by: NChen Yu <yu.c.chen@intel.com> Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 28 11月, 2018 1 次提交
-
-
由 Paul E. McKenney 提交于
Now that synchronize_rcu() waits for preempt-disable regions of code as well as RCU read-side critical sections, synchronize_sched() can be replaced by synchronize_rcu(). This commit therefore makes this change. Signed-off-by: NPaul E. McKenney <paulmck@linux.ibm.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Len Brown <lenb@kernel.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: <linux-pm@vger.kernel.org>
-
- 26 10月, 2018 1 次提交
-
-
由 Dominik Brodowski 提交于
While at it, add a few comments which config options #ifdef and #else statements refer to. Fixes: 86d333a8 (cpufreq: intel_pstate: Add base_frequency attribute) Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net> Acked-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 16 10月, 2018 1 次提交
-
-
由 Srinivas Pandruvada 提交于
Expose base_frequency to user space via cpufreq sysfs when HWP is in use. This HWP base frequency is read from the ACPI _CPC object if present, or from the HWP Capabilities MSR otherwise. On the majority of the HWP platforms the _CPC object will point to the HWP Capabilities MSR using the "Functional Fixed Hardware" address space type. The address space type also can simply be ACPI_TYPE_INTEGER, however, in which case the platform firmware can set its value at the initialization time based on the system constraints. Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Changelog ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 02 10月, 2018 1 次提交
-
-
由 Peter Zijlstra 提交于
Going primarily by: https://en.wikipedia.org/wiki/List_of_Intel_Atom_microprocessors with additional information gleaned from other related pages; notably: - Bonnell shrink was called Saltwell - Moorefield is the Merriefield refresh which makes it Airmont The general naming scheme is: FAM6_ATOM_UARCH_SOCTYPE for i in `git grep -l FAM6_ATOM` ; do sed -i -e 's/ATOM_PINEVIEW/ATOM_BONNELL/g' \ -e 's/ATOM_LINCROFT/ATOM_BONNELL_MID/' \ -e 's/ATOM_PENWELL/ATOM_SALTWELL_MID/g' \ -e 's/ATOM_CLOVERVIEW/ATOM_SALTWELL_TABLET/g' \ -e 's/ATOM_CEDARVIEW/ATOM_SALTWELL/g' \ -e 's/ATOM_SILVERMONT1/ATOM_SILVERMONT/g' \ -e 's/ATOM_SILVERMONT2/ATOM_SILVERMONT_X/g' \ -e 's/ATOM_MERRIFIELD/ATOM_SILVERMONT_MID/g' \ -e 's/ATOM_MOOREFIELD/ATOM_AIRMONT_MID/g' \ -e 's/ATOM_DENVERTON/ATOM_GOLDMONT_X/g' \ -e 's/ATOM_GEMINI_LAKE/ATOM_GOLDMONT_PLUS/g' ${i} done Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: dave.hansen@linux.intel.com Cc: len.brown@intel.com Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 06 8月, 2018 1 次提交
-
-
由 Srinivas Pandruvada 提交于
When HWP is active turbo active ratio is not used, so we should allow policy max frequency above turbo activation ratio to be set. When HWP is not active, then any policy max frequency above turbo activation ratio can result upto max one-core turbo frequency. This fix helps better thermal control in turbo region when other methods like "Running Average Power Limit" is not available to use. Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 31 7月, 2018 1 次提交
-
-
由 Srinivas Pandruvada 提交于
Dynamic boosting of HWP performance on IO wake showed significant improvement to IO workloads. This series was intended for Skylake Xeon platforms only and feature was enabled by default based on CPU model number. But some Xeon platforms reused the Skylake desktop CPU model number. This caused some undesirable side effects to some graphics workloads. Since they are heavily IO bound, the increase in CPU performance decreased the power available for GPU to do its computing and hence decrease in graphics benchmark performance. For example on a Skylake desktop, GpuTest benchmark showed average FPS reduction from 529 to 506. This change makes sure that HWP boost feature is only enabled for Skylake server platforms by using ACPI FADT preferred PM Profile. If some desktop users wants to get benefit of boost, they can still enable boost from intel_pstate sysfs attribute "hwp_dynamic_boost". Fixes: 41ab43c9 (cpufreq: intel_pstate: enable boost for Skylake Xeon) Link: https://bugs.freedesktop.org/show_bug.cgi?id=107410Reported-by: NEero Tamminen <eero.t.tamminen@intel.com> Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: NFrancisco Jerez <currojerez@riseup.net> Acked-by: NMel Gorman <mgorman@techsingularity.net> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 19 7月, 2018 1 次提交
-
-
由 Srinivas Pandruvada 提交于
On HWP platforms with Turbo 3.0, the HWP capability max ratio shows the maximum ratio of that core, which can be different than other cores. If we show the correct maximum frequency in cpufreq sysfs via cpuinfo_max_freq and scaling_max_freq then, user can know which cores can run faster for pinning some high priority tasks. Currently the max turbo frequency is shown as max frequency, which is the max of all cores, even if some cores can't reach that frequency even for single threaded workload. But it is possible that max ratio in HWP capabilities is set as 0xFF or some high invalid value (E.g. One KBL NUC). Since the actual performance can never exceed 1 core turbo frequency from MSR TURBO_RATIO_LIMIT, we use this as a bound check. Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 18 7月, 2018 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Currently, intel_pstate doesn't register if _PSS is not present on HP Proliant systems, because it expects the firmware to take over CPU performance scaling in that case. However, if ACPI PCCH is present, the firmware expects the kernel to use it for CPU performance scaling and the pcc-cpufreq driver is loaded for that. Unfortunately, the firmware interface used by that driver is not scalable for fundamental reasons, so pcc-cpufreq is way suboptimal on systems with more than just a few CPUs. In fact, it is better to avoid using it at all. For this reason, modify intel_pstate to look for ACPI PCCH if _PSS is not present and register if it is there. Also prevent the pcc-cpufreq driver from trying to initialize itself if intel_pstate has been registered already. Fixes: fbbcdc07 (intel_pstate: skip the driver if ACPI has power mgmt option) Reported-by: NAndreas Herrmann <aherrmann@suse.com> Reviewed-by: NAndreas Herrmann <aherrmann@suse.com> Acked-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Tested-by: NAndreas Herrmann <aherrmann@suse.com> Cc: 4.16+ <stable@vger.kernel.org> # 4.16+ Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 02 7月, 2018 1 次提交
-
-
由 Xie Yisheng 提交于
match_string() returns the index of an array for a matching string, which can be used instead of open coded variant. Reviewed-by: NAndy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: NYisheng Xie <xieyisheng1@huawei.com> Acked-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 19 6月, 2018 1 次提交
-
-
由 Srinivas Pandruvada 提交于
When scaling max/min settings are changed, internally they are converted to a ratio using the max turbo 1 core turbo frequency. This works fine when 1 core max is same irrespective of the core. But under Turbo 3.0, this will not be the case. For example: Core 0: max turbo pstate: 43 (4.3GHz) Core 1: max turbo pstate: 45 (4.5GHz) In this case 1 core turbo ratio will be maximum of all, so it will be 45 (4.5GHz). Suppose scaling max is set to 4GHz (ratio 40) for all cores ,then on core one it will be = max_state * policy->max / max_freq; = 43 * (4000000/4500000) = 38 (3.8GHz) = 38 which is 200MHz less than the desired. On core2, it will be correctly set to ratio 40 (4GHz). Same holds true for scaling min frequency limit. So this requires usage of correct turbo max frequency for core one, which in this case is 4.3GHz. So we need to adjust per CPU cpu->pstate.turbo_freq using the maximum HWP ratio of that core. This change uses the HWP capability of a core to adjust max turbo frequency. But since Broadwell HWP doesn't use ratios in the HWP capabilities, we have to use legacy max 1 core turbo ratio. This is not a problem as the HWP capabilities don't differ among cores in Broadwell. We need to check for non Broadwell CPU model for applying this change, though. Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: 4.6+ <stable@vger.kernel.org> # 4.6+ Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 13 6月, 2018 1 次提交
-
-
由 Kees Cook 提交于
The vzalloc() function has no 2-factor argument form, so multiplication factors need to be wrapped in array_size(). This patch replaces cases of: vzalloc(a * b) with: vzalloc(array_size(a, b)) as well as handling cases of: vzalloc(a * b * c) with: vzalloc(array3_size(a, b, c)) This does, however, attempt to ignore constant size factors like: vzalloc(4 * 1024) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( vzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | vzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( vzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | vzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | vzalloc( - sizeof(char) * (COUNT) + COUNT , ...) | vzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | vzalloc( - sizeof(u8) * COUNT + COUNT , ...) | vzalloc( - sizeof(__u8) * COUNT + COUNT , ...) | vzalloc( - sizeof(char) * COUNT + COUNT , ...) | vzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( vzalloc( - sizeof(TYPE) * (COUNT_ID) + array_size(COUNT_ID, sizeof(TYPE)) , ...) | vzalloc( - sizeof(TYPE) * COUNT_ID + array_size(COUNT_ID, sizeof(TYPE)) , ...) | vzalloc( - sizeof(TYPE) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(TYPE)) , ...) | vzalloc( - sizeof(TYPE) * COUNT_CONST + array_size(COUNT_CONST, sizeof(TYPE)) , ...) | vzalloc( - sizeof(THING) * (COUNT_ID) + array_size(COUNT_ID, sizeof(THING)) , ...) | vzalloc( - sizeof(THING) * COUNT_ID + array_size(COUNT_ID, sizeof(THING)) , ...) | vzalloc( - sizeof(THING) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(THING)) , ...) | vzalloc( - sizeof(THING) * COUNT_CONST + array_size(COUNT_CONST, sizeof(THING)) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ vzalloc( - SIZE * COUNT + array_size(COUNT, SIZE) , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( vzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | vzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | vzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | vzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | vzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | vzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | vzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | vzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( vzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | vzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | vzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | vzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | vzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | vzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( vzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | vzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | vzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | vzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | vzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | vzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | vzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | vzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( vzalloc(C1 * C2 * C3, ...) | vzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants. @@ expression E1, E2; constant C1, C2; @@ ( vzalloc(C1 * C2, ...) | vzalloc( - E1 * E2 + array_size(E1, E2) , ...) ) Signed-off-by: NKees Cook <keescook@chromium.org>
-
- 08 6月, 2018 1 次提交
-
-
由 Srinivas Pandruvada 提交于
Enable HWP boost on Skylake server and workstations. Reported-by: NMel Gorman <mgorman@techsingularity.net> Tested-by: NGiovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 06 6月, 2018 3 次提交
-
-
由 Srinivas Pandruvada 提交于
A new attribute is added to intel_pstate sysfs to enable/disable HWP dynamic performance boost. Reported-by: NMel Gorman <mgorman@techsingularity.net> Tested-by: NGiovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Srinivas Pandruvada 提交于
This change uses SCHED_CPUFREQ_IOWAIT flag to boost HWP performance. Since SCHED_CPUFREQ_IOWAIT flag is set frequently, we don't start boosting steps unless we see two consecutive flags in two ticks. This avoids boosting due to IO because of regular system activities. To avoid synchronization issues, the actual processing of the flag is done on the local CPU callback. Reported-by: NMel Gorman <mgorman@techsingularity.net> Tested-by: NGiovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Srinivas Pandruvada 提交于
Added two utility functions to HWP boost up gradually and boost down to the default cached HWP request values. Boost up: Boost up updates HWP request minimum value in steps. This minimum value can reach upto at HWP request maximum values depends on how frequently, this boost up function is called. At max, boost up will take three steps to reach the maximum, depending on the current HWP request levels and HWP capabilities. For example, if the current settings are: If P0 (Turbo max) = P1 (Guaranteed max) = min No boost at all. If P0 (Turbo max) > P1 (Guaranteed max) = min Should result in one level boost only for P0. If P0 (Turbo max) = P1 (Guaranteed max) > min Should result in two level boost: (min + p1)/2 and P1. If P0 (Turbo max) > P1 (Guaranteed max) > min Should result in three level boost: (min + p1)/2, P1 and P0. We don't set any level between P0 and P1 as there is no guarantee that they will be honored. Boost down: After the system is idle for hold time of 3ms, the HWP request is reset to the default value from HWP init or user modified one via sysfs. Caching of HWP Request and Capabilities Store the HWP request value last set using MSR_HWP_REQUEST and read MSR_HWP_CAPABILITIES. This avoid reading of MSRs in the boost utility functions. These boost utility functions calculated limits are based on the latest HWP request value, which can be modified by setpolicy() callback. So if user space modifies the minimum perf value, that will be accounted for every time the boost up is called. There will be case when there can be contention with the user modified minimum perf, in that case user value will gain precedence. For example just before HWP_REQUEST MSR is updated from setpolicy() callback, the boost up function is called via scheduler tick callback. Here the cached MSR value is already the latest and limits are updated based on the latest user limits, but on return the MSR write callback called from setpolicy() callback will update the HWP_REQUEST value. This will be used till next time the boost up function is called. In addition add a variable to control HWP dynamic boosting. When HWP dynamic boost is active then set the HWP specific update util hook. The contents in the utility hooks will be filled in the subsequent patches. Reported-by: NMel Gorman <mgorman@techsingularity.net> Tested-by: NGiovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 15 5月, 2018 1 次提交
-
-
由 Doug Smythies 提交于
Allow use of the trace_pstate_sample trace function when the intel_pstate driver is in passive mode. Since the core_busy and scaled_busy fields are not used, and it might be desirable to know which path through the driver was used, either intel_cpufreq_target or intel_cpufreq_fast_switch, re-task the core_busy field as a flag indicator. The user can then use the intel_pstate_tracer.py utility to summarize and plot the trace. Note: The core_busy feild still goes by that name in include/trace/events/power.h and within the intel_pstate_tracer.py script and csv file headers, but it is graphed as "performance", and called core_avg_perf now in the intel_pstate driver. Sometimes, in passive mode, the driver is not called for many tens or even hundreds of seconds. The user needs to understand, and not be confused by, this limitation. Signed-off-by: NDoug Smythies <dsmythies@telus.net> Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 10 4月, 2018 1 次提交
-
-
由 Rafael J. Wysocki 提交于
The intel_pstate driver doesn't use debugfs any more, so drop linux/debugfs.h from the list of included headers in it. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
-
- 08 2月, 2018 1 次提交
-
-
由 Chen Yu 提交于
When maxcpus=1 is in the kernel command line, the BP is responsible for re-enabling the HWP - because currently only the APs invoke intel_pstate_hwp_enable() during their online process - which might put the system into unstable state after resume. Fix this by enabling the HWP explicitly on BP during resume. Reported-by: NDoug Smythies <dsmythies@telus.net> Suggested-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NYu Chen <yu.c.chen@intel.com> [ rjw: Subject/changelog, minor modifications ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 12 1月, 2018 2 次提交
-
-
由 Srinivas Pandruvada 提交于
Currently intel_pstate can function only in HWP mode on Skylake servers. When HWP feature is not enabled on the processor then acpi-cpufreq is driver is used. Based on the power and performance tests using intel_pstate scaling algorithm the results are comparable. But intel_pstate brings in additional features: - Display of turbo frequency range, which many users like to see - Place limits in the turbo frequency range when platform allows Since these tests are done only using non PID algorithm introduced in kernel version 4.14, this patch is not a backport candidate. So each user has to carefully weigh the benefits before he backports. Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Srinivas Pandruvada 提交于
Since core_funcs and bxt_funcs have same set of callbacks, replace bxt_funcs with core_funcs. Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 29 8月, 2017 1 次提交
-
-
由 Toshi Kani 提交于
Convert to use acpi_match_platform_list() for the platform check. There is no change in functionality. Signed-off-by: NToshi Kani <toshi.kani@hpe.com> Acked-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 18 8月, 2017 1 次提交
-
-
由 Sudeep Holla 提交于
policy->cpu is copied into policy->cpus in cpufreq_online() before calling into cpufreq_driver->init(). So there's no need to set the same in the individual driver init() functions again. This patch removes the redundant setting of policy->cpu in policy->cpus in intel_pstate and cppc drivers. Reported-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NSudeep Holla <sudeep.holla@arm.com> Acked-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 11 8月, 2017 1 次提交
-
-
由 Doug Smythies 提交于
The intel_pstate CPU frequency scaling driver has always calculated CPU frequency incorrectly. Recent changes have eliminted most of the issues, however the frequency reported in the trace buffer, if used, is incorrect. It remains desireable that cpu->pstate.scaling still be a nice round number for things such as when setting max and min frequencies. So the proposal is to just fix the reported frequency in the trace data. Fixes what remains of [1]. Link: https://bugzilla.kernel.org/show_bug.cgi?id=96521 # [1] Signed-off-by: NDoug Smythies <dsmythies@telus.net> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 10 8月, 2017 2 次提交
-
-
由 Rafael J. Wysocki 提交于
The names of the INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL symbol and the get_target_pstate_use_cpu_load() function don't need to be so long any more, so make them shorter. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Rafael J. Wysocki 提交于
Since there is only one P-state selection routine in intel_pstate now, make intel_pstate_adjust_pstate() call it directly and drop the target_pstate argument from that function. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 04 8月, 2017 1 次提交
-
-
由 Srinivas Pandruvada 提交于
In the current implementation, the response latency between seeing SCHED_CPUFREQ_IOWAIT set and the actual P-state adjustment can be up to 10ms. It can be reduced by bumping up the P-state to the max at the time SCHED_CPUFREQ_IOWAIT is passed to intel_pstate_update_util(). With this change, the IO performance improves significantly. For a simple "grep -r . linux" (Here linux is the kernel source folder) with caches dropped every time on a Broadwell Xeon workstation with per-core P-states, the user and system time is shorter by as much as 30% - 40%. The same performance difference was not observed on clients that don't support per-core P-state. Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Changelog ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 01 8月, 2017 2 次提交
-
-
由 Viresh Kumar 提交于
With Android UI and benchmarks the latency of cpufreq response to certain scheduling events can become very critical. Currently, callbacks into cpufreq governors are only made from the scheduler if the target CPU of the event is the same as the current CPU. This means there are certain situations where a target CPU may not run the cpufreq governor for some time. One testcase to show this behavior is where a task starts running on CPU0, then a new task is also spawned on CPU0 by a task on CPU1. If the system is configured such that the new tasks should receive maximum demand initially, this should result in CPU0 increasing frequency immediately. But because of the above mentioned limitation though, this does not occur. This patch updates the scheduler core to call the cpufreq callbacks for remote CPUs as well. The schedutil, ondemand and conservative governors are updated to process cpufreq utilization update hooks called for remote CPUs where the remote CPU is managed by the cpufreq policy of the local CPU. The intel_pstate driver is updated to always reject remote callbacks. This is tested with couple of usecases (Android: hackbench, recentfling, galleryfling, vellamo, Ubuntu: hackbench) on ARM hikey board (64 bit octa-core, single policy). Only galleryfling showed minor improvements, while others didn't had much deviation. The reason being that this patch only targets a corner case, where following are required to be true to improve performance and that doesn't happen too often with these tests: - Task is migrated to another CPU. - The task has high demand, and should take the target CPU to higher OPPs. - And the target CPU doesn't call into the cpufreq governor until the next tick. Based on initial work from Steve Muckle. Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Acked-by: NSaravana Kannan <skannan@codeaurora.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Rafael J. Wysocki 提交于
After commit 62611cb9 (intel_pstate: delete scheduler hook in HWP mode) the INTEL_PSTATE_HWP_SAMPLING_INTERVAL is not used anywhere in the code, so drop it. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 28 7月, 2017 1 次提交
-
-
由 Rafael J. Wysocki 提交于
The ->get callback in the intel_pstate structure was mostly there for the scaling_cur_freq sysfs attribute to work, but after commit f8475cef (x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF) that attribute uses arch_freq_get_on_cpu() provided by the x86 arch code on all processors supported by intel_pstate, so it doesn't need the ->get callback from the driver any more. Moreover, the very presence of the ->get callback in the intel_pstate structure causes the cpuinfo_cur_freq attribute to be present when intel_pstate operates in the active mode, which is bogus, because the role of that attribute is to return the current CPU frequency as seen by the hardware. For intel_pstate, though, this is just an average frequency and not really current, but computed for the previous sampling interval (the actual current frequency may be way different at the point this value is obtained by reading from cpuinfo_cur_freq), and after commit 82b4e03e (intel_pstate: skip scheduler hook when in "performance" mode) the value in cpuinfo_cur_freq may be stale or just 0, depending on the driver's operation mode. In fact, however, on the hardware supported by intel_pstate there is no way to read the current CPU frequency from it, so the cpuinfo_cur_freq attribute should not be present at all when this driver is in use. For this reason, drop intel_pstate_get() and clear the ->get callback pointer pointing to it, so that the cpuinfo_cur_freq is not present for intel_pstate in the active mode any more. Fixes: 82b4e03e (intel_pstate: skip scheduler hook when in "performance" mode) Reported-by: NHuaisheng Ye <yehs1@lenovo.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
-
- 27 7月, 2017 2 次提交
-
-
由 Rafael J. Wysocki 提交于
All systems use the same P-state selection "powersave" algorithm in the active mode if HWP is not used, so there's no need to provide a pointer for it in struct pstate_funcs any more. Drop ->update_util from struct pstate_funcs and make intel_pstate_set_update_util_hook() use intel_pstate_update_util() directly. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Rafael J. Wysocki 提交于
All systems with a defined ACPI preferred profile that are not "servers" have been using the load-based P-state selection algorithm in intel_pstate since 4.12-rc1 (mobile systems and laptops have been using it since 4.10-rc1) and no problems with it have been reported to date. In particular, no regressions with respect to the PID-based P-state selection have been reported. Also testing indicates that the P-state selection algorithm based on CPU load is generally on par with the PID-based algorithm performance-wise, and for some workloads it turns out to be better than the other one, while being more straightforward and easier to understand at the same time. Moreover, the PID-based P-state selection algorithm in intel_pstate is known to be unstable in some situation and generally problematic, the issues with it are hard to address and it has become a significant maintenance burden. For these reasons, make intel_pstate use the "powersave" P-state selection algorithm based on CPU load in the active mode on all systems and drop the PID-based P-state selection code along with all things related to it from the driver. Also update the documentation accordingly. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 26 7月, 2017 1 次提交
-
-
由 Viresh Kumar 提交于
The transition_latency field isn't used for drivers with ->setpolicy() callback present and there is no point setting it from the drivers. Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-