1. 05 Nov, 2021 3 commits
  2. 26 Oct, 2021 1 commit
  3. 05 Oct, 2021 1 commit
    • cpufreq: intel_pstate: Process HWP Guaranteed change notification · 57577c99
      Authored by Srinivas Pandruvada
      It is possible that HWP guaranteed ratio is changed in response to
      change in power and thermal limits. For example when Intel Speed Select
      performance profile is changed or there is change in TDP, hardware can
      send notifications. It is possible that the guaranteed ratio is
      increased. This creates an issue when turbo is disabled, as the old
      limits set in MSR_HWP_REQUEST are still lower and hardware will clip
      to older limits.
      
      This change enables the HWP interrupt and processes HWP interrupts. When
      the guaranteed ratio changes, it calls cpufreq_update_policy() so that the
      driver callbacks are invoked to update to the new HWP limits. This call
      is made from a delayed work item with a 10 ms delay to avoid frequent updates.
      
      Although the scope of IA32_HWP_INTERRUPT is per logical CPU, on some
      platforms the interrupt is generated on all CPUs. This is particularly a
      problem during initialization, when the driver has not yet allocated
      data for other CPUs. So this change uses a cpumask of enabled CPUs and
      processes interrupts on those CPUs only.
      
      When the cpufreq offline() or suspend() callback is called, the HWP interrupt
      is disabled on those CPUs and any pending work item is cancelled.
      
      A spin lock is used to protect data and processing shared with the interrupt
      handler. The READ_ONCE() and WRITE_ONCE() macros are used to designate
      shared data, even though the spin lock acts as an optimization barrier here.
      Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Tested-by: pablomh@gmail.com
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      57577c99
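The filtering and deferral described in commit 57577c99 can be modelled in a few lines. This is only a toy simulation in Python, not driver code: the class `HwpNotifier`, its method names, and the explicit millisecond clock are all hypothetical, and the list `updates` merely stands in for calls to cpufreq_update_policy(). The 10 ms window and the idea of an enabled-CPU mask come from the commit message.

```python
class HwpNotifier:
    """Toy model: accept interrupts only on enabled CPUs, defer the update."""

    DELAY_MS = 10  # deferral window named in the commit message

    def __init__(self):
        self.enabled_cpus = set()  # stands in for the cpumask of enabled CPUs
        self.pending = {}          # cpu -> time (ms) at which the work runs
        self.updates = []          # CPUs whose policy update actually ran

    def enable(self, cpu):
        self.enabled_cpus.add(cpu)

    def disable(self, cpu):
        # Models offline()/suspend(): stop accepting interrupts for this CPU
        # and cancel any work item that is still pending.
        self.enabled_cpus.discard(cpu)
        self.pending.pop(cpu, None)

    def interrupt(self, cpu, now_ms):
        # Interrupts on CPUs without allocated per-CPU data are ignored.
        if cpu not in self.enabled_cpus:
            return False
        # Schedule once; interrupts arriving inside the window are coalesced.
        self.pending.setdefault(cpu, now_ms + self.DELAY_MS)
        return True

    def run_due_work(self, now_ms):
        for cpu, due in sorted(self.pending.items()):
            if due <= now_ms:
                del self.pending[cpu]
                self.updates.append(cpu)  # stands in for cpufreq_update_policy()
```

In this model a second interrupt on the same CPU within the window does not schedule a second update, which is the "avoid frequent updates" behaviour the message describes.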
  4. 14 Sep, 2021 1 commit
  5. 08 Sep, 2021 1 commit
    • cpufreq: intel_pstate: hybrid: Rework HWP calibration · 46573fd6
      Authored by Rafael J. Wysocki
      The current HWP calibration for hybrid processors in intel_pstate is
      fragile, because it depends too much on the information provided by
      the platform firmware via CPPC which may not be reliable enough.  It
      also need not be so complicated.
      
      In order to improve that mechanism and make it more resistant to
      platform firmware issues, make it only use the CPPC nominal_perf
      values to compute the HWP-to-frequency scaling factors for all
      CPUs and possibly use the HWP_CAP highest_perf values to recompute
      them if the ones derived from the CPPC nominal_perf values alone
      appear to be too high.
      
      Namely, fetch CPC.nominal_perf for all CPUs present in the system,
      find the minimum one and use it as a reference for computing all of
      the CPUs' scaling factors (using the observation that for the CPUs
      having the minimum CPC.nominal_perf the HWP range of available
      performance levels should be the same as the range of available
      "legacy" P-states and so the HWP-to-frequency scaling factor for
      them should be the same as the corresponding scaling factor used
      for representing the P-state values in kHz).
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Tested-by: Zhang Rui <rui.zhang@intel.com>
      46573fd6
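The reference-based calibration in commit 46573fd6 can be illustrated with a short Python sketch. The commit message does not spell out the formula, so the one below is an assumption: it is simply one formula that satisfies the stated property that CPUs with the minimum CPC.nominal_perf get the same scaling factor as legacy P-states. The function name, the dictionary input, and the round-up choice are all hypothetical.

```python
def hybrid_scaling_factors(nominal_perf, perf_ctl_scaling_khz):
    """Return an assumed HWP-to-frequency scaling factor (kHz per level) per CPU.

    nominal_perf: dict mapping CPU number -> CPC.nominal_perf (assumed input).
    perf_ctl_scaling_khz: kHz per legacy P-state step (the known factor).
    """
    ref = min(nominal_perf.values())  # the minimum serves as the reference
    factors = {}
    for cpu, perf in nominal_perf.items():
        # Assumed formula: scale the legacy factor by ref / perf, rounding up.
        # For CPUs with perf == ref this yields perf_ctl_scaling_khz exactly.
        factors[cpu] = -(-perf_ctl_scaling_khz * ref // perf)
    return factors
```

With made-up inputs such as `{0: 40, 1: 30}` and 100000 kHz per P-state, CPU 1 (the reference) keeps 100000 and CPU 0 gets a smaller per-level step, which matches the intuition that a wider HWP range maps to finer frequency steps.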
  6. 07 Sep, 2021 1 commit
  7. 26 Aug, 2021 1 commit
    • cpufreq: intel_pstate: Process HWP Guaranteed change notification · d0e936ad
      Authored by Srinivas Pandruvada
      It is possible that HWP guaranteed ratio is changed in response to
      change in power and thermal limits. For example when Intel Speed Select
      performance profile is changed or there is change in TDP, hardware can
      send notifications. It is possible that the guaranteed ratio is
      increased. This creates an issue when turbo is disabled, as the old
      limits set in MSR_HWP_REQUEST are still lower and hardware will clip
      to older limits.
      
      This change enables the HWP interrupt and processes HWP interrupts. When
      the guaranteed ratio changes, it calls cpufreq_update_policy() so that the
      driver callbacks are invoked to update to the new HWP limits. This call
      is made from a delayed work item with a 10 ms delay to avoid frequent updates.
      Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      d0e936ad
  8. 05 Aug, 2021 1 commit
  9. 01 Jul, 2021 1 commit
  10. 07 Jun, 2021 1 commit
  11. 22 May, 2021 4 commits
    • cpufreq: intel_pstate: Add Cometlake support in no-HWP mode · 706c5328
      Authored by Giovanni Gherdovich
      Users may disable HWP in firmware, in which case intel_pstate wouldn't load
      unless the CPU model is explicitly supported.
      
      See also commit d8de7a44 ("cpufreq: intel_pstate: Add Skylake servers
      support").
      Suggested-by: Doug Smythies <dsmythies@telus.net>
      Tested-by: Doug Smythies <dsmythies@telus.net>
      Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      706c5328
    • cpufreq: intel_pstate: Add Icelake servers support in no-HWP mode · fbdc21e9
      Authored by Giovanni Gherdovich
      Users may disable HWP in firmware, in which case intel_pstate wouldn't load
      unless the CPU model is explicitly supported.
      
      Add ICELAKE_X to the list of CPUs that can register intel_pstate while not
      advertising the HWP capability. Without this change, an ICELAKE_X in no-HWP
      mode could only use the acpi_cpufreq frequency scaling driver.
      
      See also commit d8de7a44 ("cpufreq: intel_pstate: Add Skylake servers
      support").
      Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      fbdc21e9
    • cpufreq: intel_pstate: hybrid: CPU-specific scaling factor · eb3693f0
      Authored by Rafael J. Wysocki
      The scaling factor between HWP performance levels and CPU frequency
      may be different for different types of CPUs in a hybrid processor
      and in general the HWP performance levels need not correspond to
      "P-states" representing values that would be written to
      MSR_IA32_PERF_CTL if HWP was disabled.
      
      However, the policy limits control in cpufreq is defined in terms
      of CPU frequency, so it is necessary to map the frequency limits set
      through that interface to HWP performance levels with reasonable
      accuracy and the behavior of that interface on hybrid processors
      has to be compatible with its behavior on non-hybrid ones.
      
      To address this problem, use the observations that (1) on hybrid
      processors the sysfs interface can operate by mapping frequency
      to "P-states" and translating those "P-states" to specific HWP
      performance levels of the given CPU and (2) the scaling factor
      between the MSR_IA32_PERF_CTL "P-states" and CPU frequency can be
      regarded as a known value.  Moreover, the mapping between the
      HWP performance levels and CPU frequency can be assumed to be
      linear and such that HWP performance level 0 corresponds to the
      frequency value of 0, so it is only necessary to know the
      frequency corresponding to one specific HWP performance level
      to compute the scaling factor applicable to all of them.
      
      One possibility is to take the nominal performance value from CPPC,
      if available, and use cpu_khz as the corresponding frequency.  If
      the CPPC capabilities interface is not there or the nominal
      performance value provided by it is out of range, though, something
      else needs to be done.
      
      Namely, the guaranteed performance level either from CPPC or from
      MSR_HWP_CAPABILITIES can be used instead, but the corresponding
      frequency needs to be determined.  That can be done by computing the
      product of the (known) scaling factor between the MSR_IA32_PERF_CTL
      P-states and CPU frequency (the PERF_CTL scaling factor) and the
      P-state value referred to as the "TDP ratio".
      
      If the HWP-to-frequency scaling factor value obtained in one of the
      ways above turns out to be equal to the PERF_CTL scaling factor, it
      can be assumed that the number of HWP performance levels is equal to
      the number of P-states and the given CPU can be handled as though
      this was not a hybrid processor.
      
      Otherwise, one more adjustment may still need to be made, because the
      HWP-to-frequency scaling factor computed so far may not be accurate
      enough (e.g. because the CPPC information does not match the exact
      behavior of the processor).  Specifically, in that case the frequency
      corresponding to the highest HWP performance value from
      MSR_HWP_CAPABILITIES (computed as the product of that value and the
      HWP-to-frequency scaling factor) cannot exceed the frequency that
      corresponds to the maximum 1-core turbo P-state value from
      MSR_TURBO_RATIO_LIMIT (computed as the product of that value and the
      PERF_CTL scaling factor) and the HWP-to-frequency scaling factor may
      need to be adjusted accordingly.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      eb3693f0
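The derivation in commit eb3693f0 rests on a linear map through the origin, so one (performance level, frequency) pair fixes the factor. The Python sketch below walks through that arithmetic under stated assumptions: the function names are hypothetical, the rounding directions are my choice rather than the driver's, and all input numbers in the usage note are invented for illustration.

```python
def scaling_from_reference(ref_perf, ref_freq_khz):
    """kHz per HWP level, given one known (level, frequency) pair.

    The pair may be (CPPC nominal perf, cpu_khz) or, as a fallback,
    (guaranteed perf, PERF_CTL scaling * TDP ratio), as the message describes.
    Rounding up here is an assumption of this sketch.
    """
    return -(-ref_freq_khz // ref_perf)


def cap_scaling_to_turbo(scaling_khz, hwp_highest, turbo_pstate,
                         perf_ctl_scaling_khz):
    """Shrink the factor if the highest HWP level would exceed 1-core turbo."""
    turbo_freq = turbo_pstate * perf_ctl_scaling_khz
    if hwp_highest * scaling_khz > turbo_freq:
        # Floor division keeps hwp_highest * result at or below turbo_freq.
        scaling_khz = turbo_freq // hwp_highest
    return scaling_khz
```

For example, a reference pair of level 40 at 3200000 kHz gives 80000 kHz per level. If the highest level were 65 and the 1-core turbo P-state 50 with 100000 kHz per P-state, 65 levels at 80000 kHz would overshoot 5000000 kHz, so the factor is reduced.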
    • cpufreq: intel_pstate: hybrid: Avoid exposing two global attributes · c3d175e4
      Authored by Rafael J. Wysocki
      The turbo_pct and num_pstates sysfs attributes represent CPU
      properties that may be different for different types of CPUs in
      a hybrid processor, so avoid exposing them in that case.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      c3d175e4
  12. 10 May, 2021 1 commit
  13. 09 Apr, 2021 1 commit
    • cpufreq: intel_pstate: Simplify intel_pstate_update_perf_limits() · b989bc0f
      Authored by Rafael J. Wysocki
      Because pstate.max_freq is always equal to the product of
      pstate.max_pstate and pstate.scaling and, analogously,
      pstate.turbo_freq is always equal to the product of
      pstate.turbo_pstate and pstate.scaling, the result of the
      max_policy_perf computation in intel_pstate_update_perf_limits() is
      always equal to the quotient of policy_max and pstate.scaling,
      regardless of whether or not turbo is disabled.  Analogously, the
      result of min_policy_perf in intel_pstate_update_perf_limits() is
      always equal to the quotient of policy_min and pstate.scaling.
      
      Accordingly, intel_pstate_update_perf_limits() need not check
      whether or not turbo is enabled at all and in order to compute
      max_policy_perf and min_policy_perf it can always divide policy_max
      and policy_min, respectively, by pstate.scaling.  Make it do so.
      
      While at it, move the definition and initialization of the
      turbo_max local variable to the code branch using it.
      
      No intentional functional impact.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Tested-by: Chen Yu <yu.c.chen@intel.com>
      b989bc0f
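The arithmetic behind commit b989bc0f can be checked numerically. The Python sketch below is not the removed kernel code: the "longer" form is a hypothetical reconstruction from the commit message (scale policy_max by the ratio of the maximum state to the maximum frequency), and the function names are invented. It shows that, given max_freq = max_state * scaling, the turbo check drops out.

```python
def max_policy_perf_with_turbo_check(policy_max, max_pstate, turbo_pstate,
                                     scaling, turbo_disabled):
    # Hypothetical reconstruction of the longer computation.
    max_state = max_pstate if turbo_disabled else turbo_pstate
    max_freq = max_state * scaling  # invariant stated in the commit message
    return max_state * policy_max // max_freq


def max_policy_perf_simplified(policy_max, scaling):
    # max_state cancels out, so the turbo setting no longer matters.
    return policy_max // scaling
```

Because the common factor max_state cancels exactly under integer floor division, both functions return the same value for any positive inputs, regardless of whether turbo is disabled.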
  14. 24 Mar, 2021 1 commit
    • cpufreq: intel_pstate: Clean up frequency computations · de5bcf40
      Authored by Rafael J. Wysocki
      Notice that some computations related to frequency in intel_pstate
      can be simplified if (a) intel_pstate_get_hwp_max() updates the
      relevant members of struct cpudata by itself and (b) the "turbo
      disabled" check is moved from it to its callers, so modify the code
      accordingly and while at it rename intel_pstate_get_hwp_max() to
      intel_pstate_get_hwp_cap() which better reflects its purpose and
      provide a simplified variant of it, __intel_pstate_get_hwp_cap(),
      suitable for the initialization path.
      
      No intentional functional impact.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Tested-by: Chen Yu <yu.c.chen@intel.com>
      de5bcf40
  15. 23 Jan, 2021 1 commit
  16. 13 Jan, 2021 4 commits
  17. 08 Jan, 2021 2 commits
    • cpufreq: intel_pstate: remove obsolete functions · c4151604
      Authored by Lukas Bulwahn
      percent_fp() was used in intel_pstate_pid_reset(), which was removed in
      commit 9d0ef7af ("cpufreq: intel_pstate: Do not use PID-based P-state
      selection") and hence, percent_fp() is unused since then.
      
      percent_ext_fp() was last used in intel_pstate_update_perf_limits(), which
      was refactored in commit 1a4fe38a ("cpufreq: intel_pstate: Remove
      max/min fractions to limit performance"), and hence, percent_ext_fp() is
      unused since then.
      
      make CC=clang W=1 points out those unused functions:
      
      drivers/cpufreq/intel_pstate.c:79:23: warning: unused function 'percent_fp' [-Wunused-function]
      static inline int32_t percent_fp(int percent)
                            ^
      
      drivers/cpufreq/intel_pstate.c:94:23: warning: unused function 'percent_ext_fp' [-Wunused-function]
      static inline int32_t percent_ext_fp(int percent)
                            ^
      
      Remove those obsolete functions.
      Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      c4151604
    • cpufreq: intel_pstate: Use HWP capabilities in intel_cpufreq_adjust_perf() · 17ffd358
      Authored by Rafael J. Wysocki
      If turbo P-states cannot be used, either due to the configuration of
      the processor, or because intel_pstate is not allowed to use them,
      the maximum available P-state with HWP enabled corresponds to the
      HWP_CAP.GUARANTEED value which is not static.  It can be adjusted by
      an out-of-band agent or during an Intel Speed Select performance
      level change, so long as it remains less than or equal to
      HWP_CAP.MAX.
      
      However, if turbo P-states cannot be used, intel_cpufreq_adjust_perf()
      always uses pstate.max_pstate (set during the initialization of the
      driver only) as the maximum available P-state, so it may miss a change
      of the HWP_CAP.GUARANTEED value.
      
      Prevent that from happening by modifying intel_cpufreq_adjust_perf()
      to always read the "guaranteed" and "maximum turbo" performance
      levels from the cached HWP_CAP value.
      
      Fixes: a365ab6b ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback")
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      17ffd358
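What "reading the guaranteed and maximum turbo levels from the cached HWP_CAP value" in commit 17ffd358 amounts to can be sketched as a field extraction. This Python snippet assumes the IA32_HWP_CAPABILITIES layout as I understand it from Intel's documentation (highest performance in bits 7:0, guaranteed performance in bits 15:8); the function names and the sample register value are hypothetical.

```python
def hwp_highest_perf(hwp_cap):
    # Bits 7:0 of the capabilities value (assumed layout).
    return hwp_cap & 0xFF


def hwp_guaranteed_perf(hwp_cap):
    # Bits 15:8 of the capabilities value (assumed layout).
    return (hwp_cap >> 8) & 0xFF


def max_available_perf(cached_hwp_cap, turbo_usable):
    """Pick the upper bound from the cached register value, not from a
    constant captured once at driver initialization."""
    if turbo_usable:
        return hwp_highest_perf(cached_hwp_cap)
    return hwp_guaranteed_perf(cached_hwp_cap)
```

The point of the sketch is only that the bound follows the cached register value: if the guaranteed field changes and the cache is refreshed, the no-turbo maximum changes with it.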
  18. 31 Dec, 2020 1 commit
  19. 21 Dec, 2020 1 commit
    • cpufreq: intel_pstate: Use most recent guaranteed performance values · e40ad84c
      Authored by Rafael J. Wysocki
      When turbo has been disabled by the BIOS, but HWP_CAP.GUARANTEED is
      changed later, user space may want to take advantage of this increased
      guaranteed performance.
      
      HWP_CAP.GUARANTEED is not a static value.  It can be adjusted by an
      out-of-band agent or during an Intel Speed Select performance level
      change.  The HWP_CAP.MAX is still the maximum achievable performance
      with turbo disabled by the BIOS, so HWP_CAP.GUARANTEED can still
      change as long as it remains less than or equal to HWP_CAP.MAX.
      
      When HWP_CAP.GUARANTEED is changed, the sysfs base_frequency
      attribute shows the most recent guaranteed frequency value. This
      attribute can be used by user space software to update the scaling
      min/max limits of the CPU.
      
      Currently, the ->setpolicy() callback already uses the latest
      HWP_CAP values when setting HWP_REQ, but the ->verify() callback will
      restrict the user settings to the old guaranteed performance value
      which prevents user space from making use of the extra CPU capacity
      theoretically available to it after increasing HWP_CAP.GUARANTEED.
      
      To address this, read HWP_CAP in intel_pstate_verify_cpu_policy()
      to obtain the maximum P-state that can be used and use that to
      confine the policy max limit instead of using the cached and
      possibly stale pstate.max_freq value for this purpose.
      
      For consistency, update intel_pstate_update_perf_limits() to use the
      maximum available P-state returned by intel_pstate_get_hwp_max() to
      compute the maximum frequency instead of using the return value of
      intel_pstate_get_max_freq() which, again, may be stale.
      
      This issue is a side-effect of fixing the scaling frequency limits in
      commit eacc9c5a ("cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max()
      for turbo disabled") which corrected the setting of the reduced scaling
      frequency values, but caused stale HWP_CAP.GUARANTEED to be used in
      the case at hand.
      
      Fixes: eacc9c5a ("cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max() for turbo disabled")
      Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: 5.8+ <stable@vger.kernel.org> # 5.8+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      e40ad84c
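The stale-limit problem fixed by commit e40ad84c is easy to see with numbers. The Python sketch below is a simplified model of a verify step, with a hypothetical function name and invented values; it only shows how clamping against a value captured earlier hides an increase in the guaranteed level, while clamping against a freshly read value does not.

```python
def verify_policy_max(requested_max_khz, guaranteed_perf, scaling_khz):
    """Clamp a user-requested max frequency to what the guaranteed
    performance level allows (turbo assumed disabled in this model)."""
    return min(requested_max_khz, guaranteed_perf * scaling_khz)


# Invented example: the guaranteed level rises from 24 to 28 at run time.
stale_guaranteed = 24
fresh_guaranteed = 28
request = 2_800_000  # kHz, what user space asks for after the change

clamped_with_stale = verify_policy_max(request, stale_guaranteed, 100_000)
clamped_with_fresh = verify_policy_max(request, fresh_guaranteed, 100_000)
```

With the stale value the request is cut back to the old ceiling; with the fresh value it is accepted as is, which is the behaviour the commit message asks for.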
  20. 16 Dec, 2020 1 commit
  21. 12 Dec, 2020 1 commit
  22. 11 Nov, 2020 1 commit
    • cpufreq: intel_pstate: Take CPUFREQ_GOV_STRICT_TARGET into account · fcb3a1ab
      Authored by Rafael J. Wysocki
      Make intel_pstate take the new CPUFREQ_GOV_STRICT_TARGET governor
      flag into account when it operates in the passive mode with HWP
      enabled, so as to fix the "powersave" governor behavior in that
      case (currently, HWP is allowed to scale the performance all the
      way up to the policy max limit when the "powersave" governor is
      used, but it should be constrained to the policy min limit then).
      
      Fixes: f6ebbcf0 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: 5.9+ <stable@vger.kernel.org> # 5.9+: 9a2a9ebc cpufreq: Introduce governor flags
      Cc: 5.9+ <stable@vger.kernel.org> # 5.9+: 218f6687 cpufreq: Introduce CPUFREQ_GOV_STRICT_TARGET
      Cc: 5.9+ <stable@vger.kernel.org> # 5.9+: ea9364bb cpufreq: Add strict_target to struct cpufreq_policy
      fcb3a1ab
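The effect of the strict-target flag in commit fcb3a1ab can be shown as a choice of HWP limits. This Python sketch is an illustration under assumptions: the function name is hypothetical, using the target as the HWP minimum is my reading of passive mode rather than something the message states, and the performance values are invented.

```python
def hwp_limits(target_perf, policy_max_perf, strict_target):
    """Return (min, max) HWP limits for a governor-selected target.

    With a strict target the hardware may not exceed the target, so the
    max limit collapses onto it; otherwise HWP may scale up to policy max.
    """
    max_perf = target_perf if strict_target else policy_max_perf
    return target_perf, max_perf
```

For a "powersave"-style governor whose target equals the policy min limit (say level 8, with a policy max of 40), the strict case yields (8, 8), while the non-strict case yields (8, 40) and leaves HWP free to run at the policy max, which is the behaviour the commit message calls out as wrong for that governor.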
  23. 28 Oct, 2020 1 commit
    • cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode · e0be38ed
      Authored by Rafael J. Wysocki
      If the cpufreq policy max limit is changed when intel_pstate operates
      in the passive mode with HWP enabled and the "powersave" governor is
      used on top of it, the HWP max limit is not updated as appropriate.
      
      Namely, in the "powersave" governor case, the target P-state
      is always equal to the policy min limit, so if the latter does
      not change, intel_cpufreq_adjust_hwp() is not invoked to update
      the HWP Request MSR due to the "target_pstate != old_pstate" check
      in intel_cpufreq_update_pstate(), so the HWP max limit is not
      updated as a result.
      
      Also, if the CPUFREQ_NEED_UPDATE_LIMITS flag is not set for the
      driver and the target frequency does not change along with the
      policy max limit, the "target_freq == policy->cur" check in
      __cpufreq_driver_target() prevents the driver's ->target() callback
      from being invoked at all, so the HWP max limit is not updated.
      
      To prevent that occurring, set the CPUFREQ_NEED_UPDATE_LIMITS flag
      in the intel_cpufreq driver structure if HWP is enabled and modify
      intel_cpufreq_update_pstate() to do the "target_pstate != old_pstate"
      check only in the non-HWP case and let intel_cpufreq_adjust_hwp()
      always run in the HWP case (it will update HWP Request only if the
      cached value of the register is different from the new one including
      the limits, so if neither the target P-state value nor the max limit
      changes, the register write will still be avoided).
      
      Fixes: f6ebbcf0 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
      Reported-by: Zhang Rui <rui.zhang@intel.com>
      Cc: 5.9+ <stable@vger.kernel.org> # 5.9+: 1c534352 cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS ...
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Tested-by: Zhang Rui <rui.zhang@intel.com>
      e0be38ed
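The parenthetical in commit e0be38ed, that the register is written only when the cached value differs from the new one including the limits, can be modelled as follows. This Python sketch is a toy: the class and its counters are hypothetical, and the packing of min into bits 7:0 and max into bits 15:8 follows the IA32_HWP_REQUEST layout as I understand it from Intel's documentation.

```python
class HwpRequestCache:
    """Toy model of a cached HWP request value with write avoidance."""

    def __init__(self):
        self.cached = None  # last value "written" to the register
        self.writes = 0     # number of simulated register writes

    def update(self, min_perf, max_perf):
        # Compare the whole value, limits included, not just the target.
        value = (min_perf & 0xFF) | ((max_perf & 0xFF) << 8)
        if value == self.cached:
            return False    # nothing changed, skip the write
        self.cached = value
        self.writes += 1
        return True
```

Calling `update(8, 40)` twice performs one write, and a following `update(8, 30)` performs another even though the target (the min field here) is unchanged, which is the case a "target changed?" check alone would miss.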
  24. 16 Oct, 2020 1 commit
    • cpufreq: intel_pstate: Delete intel_pstate sysfs if failed to register the driver · cdc1719c
      Authored by Chen Yu
      There is a corner case that if the intel_pstate driver fails to be
      registered (might be due to invalid MSR access) and acpi_cpufreq
      takes over, the intel_pstate sysfs interface is still populated
      which may confuse user space (turbostat for example):
      
      grep . /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
      acpi-cpufreq
      
      grep . /sys/devices/system/cpu/intel_pstate/*
      /sys/devices/system/cpu/intel_pstate/max_perf_pct:0
      /sys/devices/system/cpu/intel_pstate/min_perf_pct:0
      grep: /sys/devices/system/cpu/intel_pstate/no_turbo: Resource temporarily unavailable
      grep: /sys/devices/system/cpu/intel_pstate/num_pstates: Resource temporarily unavailable
      /sys/devices/system/cpu/intel_pstate/status:off
      grep: /sys/devices/system/cpu/intel_pstate/turbo_pct: Resource temporarily unavailable
      
      The mere presence of the intel_pstate sysfs interface does not mean
      that intel_pstate is in use (for example, echo "off" to "status"),
      but it should not be created in the failing case.
      
      Fix this issue by deleting the intel_pstate sysfs if the driver
      registration fails.
      Reported-by: Wendy Wang <wendy.wang@intel.com>
      Suggested-by: Zhang Rui <rui.zhang@intel.com>
      Signed-off-by: Chen Yu <yu.c.chen@intel.com>
      Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      [ rjw: Refactor code to avoid jumps, change function name, changelog edits ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      cdc1719c
  25. 30 Sep, 2020 1 commit
  26. 02 Sep, 2020 6 commits