1. 23 5月, 2018 1 次提交
  2. 10 4月, 2018 1 次提交
  3. 28 2月, 2018 1 次提交
    • V
      cpufreq: Validate frequency table in the core · d417e069
      Viresh Kumar 提交于
      By design, cpufreq drivers are responsible for calling
      cpufreq_frequency_table_cpuinfo() from their ->init()
      callbacks to validate the frequency table.
      
      However, if a cpufreq driver is buggy and fails to do so properly, it
      lead to unexpected behavior of the driver or the cpufreq core at a
      later point in time.  It would be better if the core could
      validate the frequency table during driver initialization.
      
      To that end, introduce cpufreq_table_validate_and_sort() and make
      the cpufreq core call it right after invoking the ->init() callback
      of the driver and destroy the cpufreq policy if the table is invalid.
      
      For the time being the validation of the table happens twice, once
      from the driver and then from the core.  The individual drivers will
      be updated separately to drop table validation if they don't need it
      for other reasons.
      
      The frequency table is marked "sorted" or "unsorted" by the new helper
      now instead of in cpufreq_table_validate_and_show(), as it should only
      be done after validating the table (which the drivers won't do going
      forward).
      Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
      [ rjw: Subject/changelog ]
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      d417e069
  4. 08 2月, 2018 1 次提交
  5. 16 11月, 2017 1 次提交
    • R
      x86 / CPU: Always show current CPU frequency in /proc/cpuinfo · 7d5905dc
      Rafael J. Wysocki 提交于
      After commit 890da9cf (Revert "x86: do not use cpufreq_quick_get()
      for /proc/cpuinfo "cpu MHz"") the "cpu MHz" number in /proc/cpuinfo
      on x86 can be either the nominal CPU frequency (which is constant)
      or the frequency most recently requested by a scaling governor in
      cpufreq, depending on the cpufreq configuration.  That is somewhat
      inconsistent and is different from what it was before 4.13, so in
      order to restore the previous behavior, make it report the current
      CPU frequency like the scaling_cur_freq sysfs file in cpufreq.
      
      To that end, modify the /proc/cpuinfo implementation on x86 to use
      aperfmperf_snapshot_khz() to snapshot the APERF and MPERF feedback
      registers, if available, and use their values to compute the CPU
      frequency to be reported as "cpu MHz".
      
      However, do that carefully enough to avoid accumulating delays that
      lead to unacceptable access times for /proc/cpuinfo on systems with
      many CPUs.  Run aperfmperf_snapshot_khz() once on all CPUs
      asynchronously at the /proc/cpuinfo open time, add a single delay
      upfront (if necessary) at that point and simply compute the current
      frequency while running show_cpuinfo() for each individual CPU.
      
      Also, to avoid slowing down /proc/cpuinfo accesses too much, reduce
      the default delay between consecutive APERF and MPERF reads to 10 ms,
      which should be sufficient to get large enough numbers for the
      frequency computation in all cases.
      
      Fixes: 890da9cf (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"")
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      7d5905dc
  6. 03 10月, 2017 1 次提交
  7. 08 8月, 2017 1 次提交
  8. 01 8月, 2017 2 次提交
    • V
      cpufreq: Process remote callbacks from any CPU if the platform permits · 99d14d0e
      Viresh Kumar 提交于
      On many platforms, CPUs can do DVFS across cpufreq policies. i.e CPU
      from policy-A can change frequency of CPUs belonging to policy-B.
      
      This is quite common in case of ARM platforms where we don't
      configure any per-cpu register.
      
      Add a flag to identify such platforms and update
      cpufreq_can_do_remote_dvfs() to allow remote callbacks if this flag is
      set.
      
      Also enable the flag for cpufreq-dt driver which is used only on ARM
      platforms currently.
      Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
      Acked-by: NSaravana Kannan <skannan@codeaurora.org>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      99d14d0e
    • V
      sched: cpufreq: Allow remote cpufreq callbacks · 674e7541
      Viresh Kumar 提交于
      With Android UI and benchmarks the latency of cpufreq response to
      certain scheduling events can become very critical. Currently, callbacks
      into cpufreq governors are only made from the scheduler if the target
      CPU of the event is the same as the current CPU. This means there are
      certain situations where a target CPU may not run the cpufreq governor
      for some time.
      
      One testcase to show this behavior is where a task starts running on
      CPU0, then a new task is also spawned on CPU0 by a task on CPU1. If the
      system is configured such that the new tasks should receive maximum
      demand initially, this should result in CPU0 increasing frequency
      immediately. But because of the above mentioned limitation though, this
      does not occur.
      
      This patch updates the scheduler core to call the cpufreq callbacks for
      remote CPUs as well.
      
      The schedutil, ondemand and conservative governors are updated to
      process cpufreq utilization update hooks called for remote CPUs where
      the remote CPU is managed by the cpufreq policy of the local CPU.
      
      The intel_pstate driver is updated to always reject remote callbacks.
      
      This is tested with couple of usecases (Android: hackbench, recentfling,
      galleryfling, vellamo, Ubuntu: hackbench) on ARM hikey board (64 bit
      octa-core, single policy). Only galleryfling showed minor improvements,
      while others didn't had much deviation.
      
      The reason being that this patch only targets a corner case, where
      following are required to be true to improve performance and that
      doesn't happen too often with these tests:
      
      - Task is migrated to another CPU.
      - The task has high demand, and should take the target CPU to higher
        OPPs.
      - And the target CPU doesn't call into the cpufreq governor until the
        next tick.
      
      Based on initial work from Steve Muckle.
      Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
      Acked-by: NSaravana Kannan <skannan@codeaurora.org>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      674e7541
  9. 26 7月, 2017 2 次提交
  10. 22 7月, 2017 2 次提交
  11. 27 6月, 2017 1 次提交
    • L
      x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF · f8475cef
      Len Brown 提交于
      The goal of this change is to give users a uniform and meaningful
      result when they read /sys/...cpufreq/scaling_cur_freq
      on modern x86 hardware, as compared to what they get today.
      
      Modern x86 processors include the hardware needed
      to accurately calculate frequency over an interval --
      APERF, MPERF, and the TSC.
      
      Here we provide an x86 routine to make this calculation
      on supported hardware, and use it in preference to any
      driver driver-specific cpufreq_driver.get() routine.
      
      MHz is computed like so:
      
      MHz = base_MHz * delta_APERF / delta_MPERF
      
      MHz is the average frequency of the busy processor
      over a measurement interval.  The interval is
      defined to be the time between successive invocations
      of aperfmperf_khz_on_cpu(), which are expected to to
      happen on-demand when users read sysfs attribute
      cpufreq/scaling_cur_freq.
      
      As with previous methods of calculating MHz,
      idle time is excluded.
      
      base_MHz above is from TSC calibration global "cpu_khz".
      
      This x86 native method to calculate MHz returns a meaningful result
      no matter if P-states are controlled by hardware or firmware
      and/or if the Linux cpufreq sub-system is or is-not installed.
      
      When this routine is invoked more frequently, the measurement
      interval becomes shorter.  However, the code limits re-computation
      to 10ms intervals so that average frequency remains meaningful.
      
      Discerning users are encouraged to take advantage of
      the turbostat(8) utility, which can gracefully handle
      concurrent measurement intervals of arbitrary length.
      Signed-off-by: NLen Brown <len.brown@intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      f8475cef
  12. 28 5月, 2017 1 次提交
  13. 18 4月, 2017 1 次提交
  14. 04 2月, 2017 3 次提交
  15. 21 11月, 2016 1 次提交
  16. 11 11月, 2016 1 次提交
  17. 20 10月, 2016 1 次提交
    • S
      cpufreq: fix overflow in cpufreq_table_find_index_dl() · c6fe46a7
      Sergey Senozhatsky 提交于
      'best' is always less or equals to 'pos', so `best - pos' returns
      a negative value which is then getting casted to `unsigned int'
      and passed to __cpufreq_driver_target()->acpi_cpufreq_target()
      for policy->freq_table selection. This results in
      
       BUG: unable to handle kernel paging request at ffff881019b469f8
       IP: [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
       PGD 267f067
       PUD 0
      
       Oops: 0000 [#1] PREEMPT SMP
       CPU: 6 PID: 70 Comm: kworker/6:1 Not tainted 4.9.0-rc1-next-20161017-dbg-dirty
       Workqueue: events dbs_work_handler
       task: ffff88041b808000 task.stack: ffff88041b810000
       RIP: 0010:[<ffffffffa00356c1>]  [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
       RSP: 0018:ffff88041b813c60  EFLAGS: 00010282
       RAX: ffff880419b46a00 RBX: ffff88041b848400 RCX: ffff880419b20f80
       RDX: 00000000001dff38 RSI: 00000000ffffffff RDI: ffff88041b848400
       RBP: ffff88041b813cb0 R08: 0000000000000006 R09: 0000000000000040
       R10: ffffffff8207f9e0 R11: ffffffff8173595b R12: 0000000000000000
       R13: ffff88041f1dff38 R14: 0000000000262900 R15: 0000000bfffffff4
       FS:  0000000000000000(0000) GS:ffff88041f000000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: ffff881019b469f8 CR3: 000000041a2d3000 CR4: 00000000001406e0
       Stack:
        ffff88041b813cb0 ffffffff813347f9 ffff88041b813ca0 ffffffff81334663
        ffff88041f1d4bc0 ffff88041b848400 0000000000000000 0000000000000000
        0000000000262900 0000000000000000 ffff88041b813d00 ffffffff813355dc
       Call Trace:
        [<ffffffff813347f9>] ? cpufreq_freq_transition_begin+0xf1/0xfc
        [<ffffffff81334663>] ? get_cpu_idle_time+0x97/0xa6
        [<ffffffff813355dc>] __cpufreq_driver_target+0x3b6/0x44e
        [<ffffffff81336ca3>] cs_dbs_timer+0x11a/0x135
        [<ffffffff81336fda>] dbs_work_handler+0x39/0x62
        [<ffffffff81057823>] process_one_work+0x280/0x4a5
        [<ffffffff81058719>] worker_thread+0x24f/0x397
        [<ffffffff810584ca>] ? rescuer_thread+0x30b/0x30b
        [<ffffffff81418380>] ? nl80211_get_key+0x29/0x36a
        [<ffffffff8105d2b7>] kthread+0xfc/0x104
        [<ffffffff8107ceea>] ? put_lock_stats.isra.9+0xe/0x20
        [<ffffffff8105d1bb>] ? kthread_create_on_node+0x3f/0x3f
        [<ffffffff814b2092>] ret_from_fork+0x22/0x30
       Code: 56 4d 6b ff 0c 41 55 41 54 53 48 83 ec 28 48 8b 15 ad 1e 00 00 44 8b 41
       08 48 8b 87 c8 00 00 00 49 89 d5 4e 03 2c c5 80 b2 78 81 <46> 8b 74 38 04 45
       3b 75 00 75 11 31 c0 83 39 00 0f 84 1c 01 00
       RIP  [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
        RSP <ffff88041b813c60>
       CR2: ffff881019b469f8
       ---[ end trace 16d9fc7a17897d37 ]---
      
      [ rjw: In some cases this bug may also cause incorrect frequencies to
        be selected by cpufreq governors. ]
      
      Fixes: 899bb664 (cpufreq: skip invalid entries when searching the frequency)
      Link: http://marc.info/?l=linux-kernel&m=147672030714331&w=2Reported-and-tested-by: NSedat Dilek <sedat.dilek@gmail.com>
      Reported-and-tested-by: NJörg Otte <jrg.otte@gmail.com>
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      Cc: 4.8+ <stable@vger.kernel.org> # 4.8+
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      c6fe46a7
  18. 13 10月, 2016 1 次提交
  19. 21 7月, 2016 1 次提交
    • S
      cpufreq: add cpufreq_driver_resolve_freq() · e3c06236
      Steve Muckle 提交于
      Cpufreq governors may need to know what a particular target frequency
      maps to in the driver without necessarily wanting to set the frequency.
      Support this operation via a new cpufreq API,
      cpufreq_driver_resolve_freq(). This API returns the lowest driver
      frequency equal or greater than the target frequency
      (CPUFREQ_RELATION_L), subject to any policy (min/max) or driver
      limitations. The mapping is also cached in the policy so that a
      subsequent fast_switch operation can avoid repeating the same lookup.
      
      The API will call a new cpufreq driver callback, resolve_freq(), if it
      has been registered by the driver. Otherwise the frequency is resolved
      via cpufreq_frequency_table_target(). Rather than require ->target()
      style drivers to provide a resolve_freq() callback it is left to the
      caller to ensure that the driver implements this callback if necessary
      to use cpufreq_driver_resolve_freq().
      Suggested-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: NSteve Muckle <smuckle@linaro.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      e3c06236
  20. 07 7月, 2016 1 次提交
  21. 09 6月, 2016 3 次提交
  22. 03 6月, 2016 4 次提交
    • R
      cpufreq: stats: Make the stats code non-modular · 1aefc75b
      Rafael J. Wysocki 提交于
      The modularity of cpufreq_stats is quite problematic.
      
      First off, the usage of policy notifiers for the initialization
      and cleanup in the cpufreq_stats module is inherently racy with
      respect to CPU offline/online and the initialization and cleanup
      of the cpufreq driver.
      
      Second, fast frequency switching (used by the schedutil governor)
      cannot be enabled if any transition notifiers are registered, so
      if the cpufreq_stats module (that registers a transition notifier
      for updating transition statistics) is loaded, the schedutil governor
      cannot use fast frequency switching.
      
      On the other hand, allowing cpufreq_stats to be built as a module
      doesn't really add much value.  Arguably, there's not much reason
      for that code to be modular at all.
      
      For the above reasons, make the cpufreq stats code non-modular,
      modify the core to invoke functions provided by that code directly
      and drop the notifiers from it.
      
      Make the stats sysfs attributes appear empty if fast frequency
      switching is enabled as the statistics will not be updated in that
      case anyway (and returning -EBUSY from those attributes breaks
      powertop).
      
      While at it, clean up Kconfig help for the CPU_FREQ_STAT and
      CPU_FREQ_STAT_DETAILS options.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      1aefc75b
    • R
      cpufreq: Drop the 'initialized' field from struct cpufreq_governor · 9a15fb2c
      Rafael J. Wysocki 提交于
      The 'initialized' field in struct cpufreq_governor is only used by
      the conservative governor (as a usage counter) and the way that
      happens is far from straightforward and arguably incorrect.
      
      Namely, the value of 'initialized' is checked by
      cpufreq_dbs_governor_init() and cpufreq_dbs_governor_exit() and
      the results of those checks are passed (as the second argument) to
      the ->init() and ->exit() callbacks in struct dbs_governor.  Those
      callbacks are only implemented by the ondemand and conservative
      governors and ondemand doesn't use their second argument at all.
      In turn, the conservative governor uses it to decide whether or not
      to either register or unregister a transition notifier.
      
      That whole mechanism is not only unnecessarily convoluted, but also
      racy, because the 'initialized' field of struct cpufreq_governor is
      updated in cpufreq_init_governor() and cpufreq_exit_governor() under
      policy->rwsem which doesn't help if one of these functions is run
      twice in parallel for different policies (which isn't impossible in
      principle), for example.
      
      Instead of it, add a proper usage counter to the conservative
      governor and update it from cs_init() and cs_exit() which is
      guaranteed to be non-racy, as those functions are only called
      under gov_dbs_data_mutex which is global.
      
      With that in place, drop the 'initialized' field from struct
      cpufreq_governor as it is not used any more.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      9a15fb2c
    • V
      cpufreq: governor: Create cpufreq_policy_apply_limits() · bf2be2de
      Viresh Kumar 提交于
      Create a new helper to avoid code duplication across governors.
      Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      bf2be2de
    • R
      cpufreq: governor: Get rid of governor events · e788892b
      Rafael J. Wysocki 提交于
      The design of the cpufreq governor API is not very straightforward,
      as struct cpufreq_governor provides only one callback to be invoked
      from different code paths for different purposes.  The purpose it is
      invoked for is determined by its second "event" argument, causing it
      to act as a "callback multiplexer" of sorts.
      
      Unfortunately, that leads to extra complexity in governors, some of
      which implement the ->governor() callback as a switch statement
      that simply checks the event argument and invokes a separate function
      to handle that specific event.
      
      That extra complexity can be eliminated by replacing the all-purpose
      ->governor() callback with a family of callbacks to carry out specific
      governor operations: initialization and exit, start and stop and policy
      limits updates.  That also turns out to reduce the code size too, so
      do it.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      e788892b
  23. 09 4月, 2016 1 次提交
    • R
      cpufreq: Call cpufreq_disable_fast_switch() in sugov_exit() · 6c9d9c81
      Rafael J. Wysocki 提交于
      Due to differences in the cpufreq core's handling of runtime CPU
      offline and nonboot CPUs disabling during system suspend-to-RAM,
      fast frequency switching gets disabled after a suspend-to-RAM and
      resume cycle on all of the nonboot CPUs.
      
      To prevent that from happening, move the invocation of
      cpufreq_disable_fast_switch() from cpufreq_exit_governor() to
      sugov_exit(), as the schedutil governor is the only user of fast
      frequency switching today anyway.
      
      That simply prevents cpufreq_disable_fast_switch() from being called
      without invoking the ->governor callback for the CPUFREQ_GOV_POLICY_EXIT
      event (which happens during system suspend now).
      
      Fixes: b7898fda (cpufreq: Support for fast frequency switching)
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      6c9d9c81
  24. 02 4月, 2016 3 次提交
    • R
      cpufreq: Support for fast frequency switching · b7898fda
      Rafael J. Wysocki 提交于
      Modify the ACPI cpufreq driver to provide a method for switching
      CPU frequencies from interrupt context and update the cpufreq core
      to support that method if available.
      
      Introduce a new cpufreq driver callback, ->fast_switch, to be
      invoked for frequency switching from interrupt context by (future)
      governors supporting that feature via (new) helper function
      cpufreq_driver_fast_switch().
      
      Add two new policy flags, fast_switch_possible, to be set by the
      cpufreq driver if fast frequency switching can be used for the
      given policy and fast_switch_enabled, to be set by the governor
      if it is going to use fast frequency switching for the given
      policy.  Also add a helper for setting the latter.
      
      Since fast frequency switching is inherently incompatible with
      cpufreq transition notifiers, make it possible to set the
      fast_switch_enabled only if there are no transition notifiers
      already registered and make the registration of new transition
      notifiers fail if fast_switch_enabled is set for at least one
      policy.
      
      Implement the ->fast_switch callback in the ACPI cpufreq driver
      and make it set fast_switch_possible during policy initialization
      as appropriate.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      b7898fda
    • R
      cpufreq: Move governor symbols to cpufreq.h · 379480d8
      Rafael J. Wysocki 提交于
      Move definitions of symbols related to transition latency and
      sampling rate to include/linux/cpufreq.h so they can be used by
      (future) goverernors located outside of drivers/cpufreq/.
      
      No functional changes.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      379480d8
    • R
      cpufreq: Move governor attribute set headers to cpufreq.h · 66893b6a
      Rafael J. Wysocki 提交于
      Move definitions and function headers related to struct gov_attr_set
      to include/linux/cpufreq.h so they can be used by (future) goverernors
      located outside of drivers/cpufreq/.
      
      No functional changes.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      66893b6a
  25. 11 3月, 2016 1 次提交
  26. 09 3月, 2016 3 次提交
    • V
      cpufreq: Remove 'policy->governor_enabled' · 242aa883
      Viresh Kumar 提交于
      The entire sequence of events (like INIT/START or STOP/EXIT) for which
      cpufreq_governor() is called, is guaranteed to be protected by
      policy->rwsem now.
      
      The additional checks that were added earlier (as we were forced to drop
      policy->rwsem before calling cpufreq_governor() for EXIT event), aren't
      required anymore.
      
      Over that, they weren't sufficient really. They just take care of
      START/STOP events, but not INIT/EXIT and the state machine was never
      maintained properly by them.
      
      Kill the unnecessary checks and policy->governor_enabled field.
      Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      242aa883
    • V
      Revert "cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT" · 68e80dae
      Viresh Kumar 提交于
      Earlier, when the struct freq-attr was used to represent governor
      attributes, the standard cpufreq show/store sysfs attribute callbacks
      were applied to the governor tunable attributes and they always acquire
      the policy->rwsem lock before carrying out the operation.  That could
      have resulted in an ABBA deadlock if governor tunable attributes are
      removed under policy->rwsem while one of them is being accessed
      concurrently (if sysfs attributes removal wins the race, it will wait
      for the access to complete with policy->rwsem held while the attribute
      callback will block on policy->rwsem indefinitely).
      
      We attempted to address this issue by dropping policy->rwsem around
      governor tunable attributes removal (that is, around invocations of the
      ->governor callback with the event arg equal to CPUFREQ_GOV_POLICY_EXIT)
      in cpufreq_set_policy(), but that opened up race conditions that had not
      been possible with policy->rwsem held all the time.
      
      The previous commit, "cpufreq: governor: New sysfs show/store callbacks
      for governor tunables", fixed the original ABBA deadlock by adding new
      governor specific show/store callbacks.
      
      We don't have to drop rwsem around invocations of governor event
      CPUFREQ_GOV_POLICY_EXIT anymore, and original fix can be reverted now.
      
      Fixes: 955ef483 (cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT)
      Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
      Reported-by: NJuri Lelli <juri.lelli@arm.com>
      Tested-by: NJuri Lelli <juri.lelli@arm.com>
      Tested-by: NShilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      68e80dae
    • R
      cpufreq: Add mechanism for registering utilization update callbacks · 34e2c555
      Rafael J. Wysocki 提交于
      Introduce a mechanism by which parts of the cpufreq subsystem
      ("setpolicy" drivers or the core) can register callbacks to be
      executed from cpufreq_update_util() which is invoked by the
      scheduler's update_load_avg() on CPU utilization changes.
      
      This allows the "setpolicy" drivers to dispense with their timers
      and do all of the computations they need and frequency/voltage
      adjustments in the update_load_avg() code path, among other things.
      
      The update_load_avg() changes were suggested by Peter Zijlstra.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      34e2c555