1. 11 March 2016 (1 commit)
  2. 09 March 2016 (3 commits)
    • cpufreq: Remove 'policy->governor_enabled' · 242aa883
      Committed by Viresh Kumar
      The entire sequence of events (like INIT/START or STOP/EXIT) for which
      cpufreq_governor() is called is now guaranteed to be protected by
      policy->rwsem.
      
      The additional checks that were added earlier (as we were forced to drop
      policy->rwsem before calling cpufreq_governor() for the EXIT event) aren't
      required anymore.
      
      Moreover, they weren't really sufficient: they only took care of the
      START/STOP events, not INIT/EXIT, and the state machine was never
      properly maintained by them.
      
      Kill the unnecessary checks and policy->governor_enabled field.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • Revert "cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT" · 68e80dae
      Committed by Viresh Kumar
      Earlier, when the struct freq-attr was used to represent governor
      attributes, the standard cpufreq show/store sysfs attribute callbacks
      were applied to the governor tunable attributes, and they always acquired
      the policy->rwsem lock before carrying out the operation.  That could
      have resulted in an ABBA deadlock if governor tunable attributes are
      removed under policy->rwsem while one of them is being accessed
      concurrently (if sysfs attributes removal wins the race, it will wait
      for the access to complete with policy->rwsem held while the attribute
      callback will block on policy->rwsem indefinitely).
      
      We attempted to address this issue by dropping policy->rwsem around
      governor tunable attributes removal (that is, around invocations of the
      ->governor callback with the event arg equal to CPUFREQ_GOV_POLICY_EXIT)
      in cpufreq_set_policy(), but that opened up race conditions that had not
      been possible with policy->rwsem held all the time.
      
      The previous commit, "cpufreq: governor: New sysfs show/store callbacks
      for governor tunables", fixed the original ABBA deadlock by adding new
      governor specific show/store callbacks.
      
      We no longer have to drop the rwsem around invocations of the
      CPUFREQ_GOV_POLICY_EXIT governor event, so the original fix can be
      reverted now.
      
      Fixes: 955ef483 (cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT)
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Reported-by: Juri Lelli <juri.lelli@arm.com>
      Tested-by: Juri Lelli <juri.lelli@arm.com>
      Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Add mechanism for registering utilization update callbacks · 34e2c555
      Committed by Rafael J. Wysocki
      Introduce a mechanism by which parts of the cpufreq subsystem
      ("setpolicy" drivers or the core) can register callbacks to be
      executed from cpufreq_update_util() which is invoked by the
      scheduler's update_load_avg() on CPU utilization changes.
      
      This allows the "setpolicy" drivers to dispense with their timers
      and do all of the computations they need and frequency/voltage
      adjustments in the update_load_avg() code path, among other things.
      
      The update_load_avg() changes were suggested by Peter Zijlstra.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Ingo Molnar <mingo@kernel.org>
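
      For illustration, a minimal sketch of how such a callback might be
      registered (the structure layout and the registration helper named below
      are assumptions based on this description, not a verified copy of the
      interface):

        /* assumed callback shape: runs from the scheduler's hot path, must not sleep */
        static void my_update_util(struct update_util_data *data, u64 time,
                                   unsigned long util, unsigned long max)
        {
                /* pick a frequency from util/max and program the hardware */
        }

        static struct update_util_data my_util_hook = {
                .func = my_update_util,
        };

        /* assumed per-CPU registration, e.g. from the driver's init path ... */
        cpufreq_set_update_util_data(cpu, &my_util_hook);
        /* ... and clearing it again (NULL) before the driver goes away */
        cpufreq_set_update_util_data(cpu, NULL);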
  3. 05 February 2016 (1 commit)
  4. 01 January 2016 (2 commits)
  5. 03 December 2015 (1 commit)
  6. 28 October 2015 (3 commits)
  7. 16 September 2015 (1 commit)
  8. 01 September 2015 (4 commits)
  9. 08 August 2015 (1 commit)
  10. 07 August 2015 (1 commit)
    • cpufreq: Allow drivers to enable boost support after registering driver · 44139ed4
      Committed by Viresh Kumar
      In some cases it isn't known at the time of driver registration whether
      the driver needs to support boost frequencies.
      
      For example, while getting boost information from DT with opp-v2
      bindings, we need to parse the bindings for all the CPUs to know if
      turbo/boost OPPs are supported or not.
      
      One way to do that efficiently is to delay supporting boost mode (i.e.
      creating the /sys/devices/system/cpu/cpufreq/boost file) until the OPP
      bindings have been parsed.
      
      At that point, the driver can enable boost support. This can be done at
      ->init(), where the frequency table is created.
      
      To do that, the driver requires a few APIs from the cpufreq core that let
      it do this. This patch provides these APIs.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
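
      For illustration, a driver could flip boost support on from its ->init()
      callback once the OPP data has been parsed; a minimal sketch, assuming the
      new API is cpufreq_enable_boost_support() and using a hypothetical platform
      helper:

        static int my_cpufreq_init(struct cpufreq_policy *policy)
        {
                /* ... parse OPPs and build the frequency table ... */

                if (my_platform_has_boost_opps()) {   /* hypothetical helper */
                        int ret = cpufreq_enable_boost_support();
                        if (ret)
                                return ret;
                }
                return 0;
        }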
  11. 28 July 2015 (1 commit)
    • cpufreq: Avoid attempts to create duplicate symbolic links · 559ed407
      Committed by Rafael J. Wysocki
      After commit 87549141 (cpufreq: Stop migrating sysfs files on
      hotplug) there is a problem with CPUs that share cpufreq policy
      objects with other CPUs and are initially offline.
      
      Say CPU1 shares a policy with CPU0 which is online and is registered
      first.  As part of the registration process, cpufreq_add_dev() is
      called for it.  It creates the policy object and a symbolic link
      to it from the CPU1's sysfs directory.  If CPU1 is registered
      subsequently and it is offline at that time, cpufreq_add_dev() will
      attempt to create a symbolic link to the policy object for it, but
      that link is present already, so a warning about that will be
      triggered.
      
      To avoid that warning, make cpufreq use an additional CPU mask
      containing related CPUs that are actually present for each policy
      object.  That mask is initialized when the policy object is populated
      after its creation (for the first online CPU using it) and it includes
      CPUs from the "policy CPUs" mask returned by the cpufreq driver's
      ->init() callback that are physically present at that time.  Symbolic
      links to the policy are created only for the CPUs in that mask.
      
      If cpufreq_add_dev() is invoked for an offline CPU, it checks the
      new mask and only creates the symlink if the CPU was not in it (the
      CPU is added to the mask at the same time).
      
      In turn, cpufreq_remove_dev() drops the given CPU from the new mask,
      removes its symlink to the policy object and returns, unless it is
      the CPU owning the policy object.  In that case, the policy object
      is moved to a new CPU's sysfs directory or deleted if the CPU being
      removed was the last user of the policy.
      
      While at it, notice that cpufreq_remove_dev() can't fail, because
      its return value is ignored, so make it ignore return values from
      __cpufreq_remove_dev_prepare() and __cpufreq_remove_dev_finish()
      and prevent these functions from aborting on errors returned by
      __cpufreq_governor().  Also drop the now unused sif argument from
      them.
      
      Fixes: 87549141 (cpufreq: Stop migrating sysfs files on hotplug)
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reported-and-tested-by: Russell King <linux@arm.linux.org.uk>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
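
      Conceptually, the symlink handling described above boils down to something
      like the sketch below (the 'real_cpus' field name and the surrounding code
      are assumptions for illustration only):

        /* cpufreq_add_dev() path, CPU may be offline */
        if (cpumask_test_and_set_cpu(cpu, policy->real_cpus))
                return 0;       /* symlink already present, nothing to do */
        return sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");

        /* cpufreq_remove_dev() path, non-owner CPU */
        cpumask_clear_cpu(cpu, policy->real_cpus);
        sysfs_remove_link(&dev->kobj, "cpufreq");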
  12. 23 May 2015 (1 commit)
  13. 15 May 2015 (1 commit)
    • cpufreq: Manage governor usage history with 'policy->last_governor' · 4573237b
      Committed by Viresh Kumar
      The history of which governor was used last is common to all CPUs within a
      policy, so maintaining it per-CPU is certainly not the best approach.
      
      Apart from wasting memory, this also increases the complexity of
      managing this data structure, as it has to be updated for all CPUs.
      
      To make that somewhat simpler, let's store this information in a new
      field 'last_governor' in struct cpufreq_policy and update it on removal
      of the last CPU of a policy.
      
      As a side effect, this also solves an old problem. Consider a system with
      two clusters, 0 and 1, with one policy per cluster.
      
      Cluster 0: CPU0 and 1.
      Cluster 1: CPU2 and 3.
      
       - CPU2 is first brought online, and governor is set to performance
         (default as cpufreq_cpu_governor wasn't set).
       - Governor is changed to ondemand.
       - CPU2 is taken offline and cpufreq_cpu_governor is updated for CPU2.
       - CPU3 is brought online.
       - Because cpufreq_cpu_governor wasn't set for CPU3, the default governor
         performance is picked for CPU3.
      
      This patch fixes the bug, as we now have a single variable to update per
      policy.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
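
      A rough sketch of the idea (the exact field size and the lookup helper
      below are assumptions for illustration):

        /* in struct cpufreq_policy */
        char last_governor[CPUFREQ_NAME_LEN];

        /* when the last CPU of the policy goes offline */
        strncpy(policy->last_governor, policy->governor->name, CPUFREQ_NAME_LEN);

        /* when the policy is initialized again, prefer the remembered governor */
        gov = find_governor(policy->last_governor);   /* hypothetical lookup */
        if (!gov)
                gov = CPUFREQ_DEFAULT_GOVERNOR;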
  14. 24 January 2015 (3 commits)
  15. 30 November 2014 (2 commits)
  16. 21 October 2014 (1 commit)
  17. 09 September 2014 (1 commit)
  18. 21 July 2014 (1 commit)
  19. 18 July 2014 (1 commit)
    • cpufreq: make table sentinel macros unsigned to match use · 2b1987a9
      Committed by Brian W Hart
      Commit 5eeaf1f1 (cpufreq: Fix build error on some platforms that
      use cpufreq_for_each_*) moved function cpufreq_next_valid() to a public
      header.  Warnings are now generated when objects including that header
      are built with -Wsign-compare (as an out-of-tree module might be):
      
      .../include/linux/cpufreq.h: In function ‘cpufreq_next_valid’:
      .../include/linux/cpufreq.h:519:27: warning: comparison between signed
      and unsigned integer expressions [-Wsign-compare]
        while ((*pos)->frequency != CPUFREQ_TABLE_END)
                                 ^
      .../include/linux/cpufreq.h:520:25: warning: comparison between signed
      and unsigned integer expressions [-Wsign-compare]
         if ((*pos)->frequency != CPUFREQ_ENTRY_INVALID)
                               ^
      
      Constants CPUFREQ_ENTRY_INVALID and CPUFREQ_TABLE_END are signed, but
      are used with unsigned member 'frequency' of cpufreq_frequency_table.
      Update the macro definitions to be explicitly unsigned to match their
      use.
      
      This also corrects potentially wrong behavior of clk_rate_table_iter()
      if unsigned long is wider than unsigned int.
      
      Fixes: 5eeaf1f1 (cpufreq: Fix build error on some platforms that use cpufreq_for_each_*)
      Signed-off-by: Brian W Hart <hartb@linux.vnet.ibm.com>
      Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
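
      The fix amounts to making the sentinels explicitly unsigned so they match
      the unsigned 'frequency' member; roughly (a sketch of the change):

        /* before: signed constants compared against an unsigned int member */
        #define CPUFREQ_ENTRY_INVALID   ~0
        #define CPUFREQ_TABLE_END       ~1

        /* after: explicitly unsigned, matching cpufreq_frequency_table.frequency */
        #define CPUFREQ_ENTRY_INVALID   ~0u
        #define CPUFREQ_TABLE_END       ~1u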
  20. 06 June 2014 (1 commit)
    • cpufreq: add support for intermediate (stable) frequencies · 1c03a2d0
      Committed by Viresh Kumar
      Douglas Anderson recently pointed out an interesting problem due to which
      udelay() was expiring earlier than it should.
      
      While transitioning between frequencies, a few platforms may temporarily
      switch to a stable frequency while waiting for the main PLL to stabilize.
      
      For example: When we transition between very low frequencies on exynos, like
      between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz.
      No CPUFREQ notification is sent for that. That means there's a period of time
      when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz
      and 300MHz. And so udelay behaves badly.
      
      To get this fixed in a generic way, introduce another set of callbacks
      get_intermediate() and target_intermediate(), only for drivers with
      target_index() and CPUFREQ_ASYNC_NOTIFICATION unset.
      
      get_intermediate() should return a stable intermediate frequency the platform
      wants to switch to, and target_intermediate() should set the CPU to that
      frequency, before jumping to the frequency corresponding to 'index'. The core
      will take care of sending notifications and the driver doesn't have to handle
      them in target_intermediate() or target_index().
      
      NOTE: ->target_index() should restore to policy->restore_freq in case of
      failures, as the core will send notifications for that.
      Tested-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Reviewed-by: Doug Anderson <dianders@chromium.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
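
      A sketch of how a driver might wire up the two new callbacks (signatures
      assumed from this description; the helpers and constants are hypothetical):

        static unsigned int my_get_intermediate(struct cpufreq_policy *policy,
                                                unsigned int index)
        {
                /* stable frequency to sit on while the main PLL relocks,
                 * or 0 if no intermediate step is needed for this change */
                return MY_SAFE_PLL_FREQ_KHZ;
        }

        static int my_target_intermediate(struct cpufreq_policy *policy,
                                          unsigned int index)
        {
                return my_switch_to_safe_pll();   /* program the intermediate freq */
        }

        static struct cpufreq_driver my_driver = {
                .target_index        = my_target_index,
                .get_intermediate    = my_get_intermediate,
                .target_intermediate = my_target_intermediate,
                /* ... */
        };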
  21. 08 May 2014 (1 commit)
  22. 07 May 2014 (2 commits)
    • PM / OPP: Move cpufreq specific OPP functions out of generic OPP library · a0dd7b79
      Committed by Nishanth Menon
      CPUFreq-specific helper functions for OPP (Operating Performance Points)
      now use generic OPP functions, which allows them to be moved back
      into the CPUFreq framework. This allows for independent modifications
      or future enhancements as needed, isolated to the CPUFreq framework
      alone.
      
      Here, we just move the relevant code and documentation to make this part
      of the CPUFreq infrastructure.
      
      Cc: Kevin Hilman <khilman@deeprootsystems.com>
      Signed-off-by: Nishanth Menon <nm@ti.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
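
      Presumably these are the OPP-to-cpufreq-table helpers; typical driver usage
      looks roughly like this (a sketch under that assumption, not code taken from
      this commit):

        struct cpufreq_frequency_table *freq_table;
        int ret;

        ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &freq_table);
        if (ret)
                return ret;

        /* ... register freq_table with the cpufreq core ... */

        dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);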
    • cpufreq: Catch double invocations of cpufreq_freq_transition_begin/end · ca654dc3
      Committed by Srivatsa S. Bhat
      Some cpufreq drivers were redundantly invoking the _begin() and _end()
      APIs around frequency transitions, and this double invocation (one from
      the cpufreq core and the other from the cpufreq driver) used to result
      in a self-deadlock, leading to system hangs during boot. (The _begin()
      API makes contending callers wait until the previous invocation is
      complete. Hence, the cpufreq driver would end up waiting on itself!).
      
      Now all such drivers have been fixed, but debugging this issue was not
      very straightforward (even lockdep didn't catch this). So let us add a
      debug infrastructure to the cpufreq core to catch such issues more easily
      in the future.
      
      We add a new field called 'transition_task' to the policy structure, to keep
      track of the task which is performing the frequency transition. Using this
      field, we make note of this task during _begin() and print a warning if we
      find a case where the same task is calling _begin() again, before completing
      the previous frequency transition using the corresponding _end().
      
      We have left out ASYNC_NOTIFICATION drivers from this debug infrastructure
      for 2 reasons:
      
      1. At the moment, we have no way to avoid a particular scenario where this
         debug infrastructure can emit false-positive warnings for such drivers.
         The scenario is depicted below:
      
               Task A						Task B
      
          /* 1st freq transition */
          Invoke _begin() {
                  ...
                  ...
          }
      
          Change the frequency
      
          /* 2nd freq transition */
          Invoke _begin() {
      	    ...	//waiting for B to
                  ... //finish _end() for
      	    ... //the 1st transition
      	    ...	      |				Got interrupt for successful
      	    ...	      |				change of frequency (1st one).
      	    ...       |
      	    ...	      |				/* 1st freq transition */
      	    ...	      |				Invoke _end() {
      	    ...	      |					...
      	    ...	      V				}
      	    ...
      	    ...
          }
      
         This scenario is actually deadlock-free because, once Task A changes the
         frequency, it is Task B's responsibility to invoke the corresponding
         _end() for the 1st frequency transition. Hence it is perfectly legal for
         Task A to go ahead and attempt another frequency transition in the meantime.
         (Of course it won't be able to proceed until Task B finishes the 1st _end(),
         but this doesn't cause a deadlock or a hang).
      
         The debug infrastructure cannot handle this scenario and will treat it as
         a deadlock and print a warning. To avoid this, we exclude such drivers
         from the purview of this code.
      
      2. Luckily, we don't _need_ this infrastructure for ASYNC_NOTIFICATION drivers
         at all! The cpufreq core does not automatically invoke the _begin() and
         _end() APIs during frequency transitions in such drivers. Thus, the driver
         alone is responsible for invoking _begin()/_end() and hence there shouldn't
         be any conflicts which lead to double invocations. So, we can skip these
         drivers, since the probability that such drivers will hit this problem is
         extremely low, as outlined above.
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
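
      The check itself is simple; conceptually (a sketch consistent with the
      description above, not the exact code):

        void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
                                           struct cpufreq_freqs *freqs)
        {
                /* same task entering _begin() again before calling _end():
                 * almost certainly a double invocation by the driver */
                WARN_ON(current == policy->transition_task);

                /* ... wait for any ongoing transition, take ownership ... */
                policy->transition_task = current;
        }

        void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
                                         struct cpufreq_freqs *freqs,
                                         int transition_failed)
        {
                /* ... send the POSTCHANGE notification ... */
                policy->transition_task = NULL;
        }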
  23. 30 April 2014 (1 commit)
  24. 07 April 2014 (1 commit)
    • cpufreq: create another field .flags in cpufreq_frequency_table · 7f4b0461
      Committed by Viresh Kumar
      Currently the cpufreq frequency table has two fields: frequency and
      driver_data. driver_data is only for drivers' internal use and the cpufreq
      core shouldn't use it at all. But with the introduction of BOOST
      frequencies, this assumption was broken and we started using it as a flag
      instead.
      
      There are two problems due to this:
      - It is against the description of this field, as the driver's data is now
        used by the core.
      - If drivers fill it with -3 for any frequency, then those frequencies are
        never considered by the cpufreq core, as that value is exactly the same
        as the value of CPUFREQ_BOOST_FREQ, i.e. ~2.
      
      The best way to fix this is to create another field, 'flags', which will be
      used for such flags. This patch does that. Along with that, various drivers
      need modifications due to the change in struct cpufreq_frequency_table.
      Reviewed-by: Gautham R Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
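
      After this change a table entry carries an explicit flags word, roughly:

        struct cpufreq_frequency_table {
                unsigned int flags;        /* e.g. CPUFREQ_BOOST_FREQ */
                unsigned int driver_data;  /* driver-private, untouched by the core */
                unsigned int frequency;    /* kHz, or CPUFREQ_ENTRY_INVALID */
        };

        /* hypothetical driver table marking a boost-only frequency */
        static struct cpufreq_frequency_table my_freq_table[] = {
                { .frequency = 1000000 },
                { .frequency = 1500000, .flags = CPUFREQ_BOOST_FREQ },
                { .frequency = CPUFREQ_TABLE_END },
        };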
  25. 26 March 2014 (2 commits)
    • cpufreq: Make cpufreq_notify_transition & cpufreq_notify_post_transition static · 236a9800
      Committed by Viresh Kumar
      cpufreq_notify_transition() and cpufreq_notify_post_transition() shouldn't
      be called directly by cpufreq drivers anymore, so they should be marked
      static.
      Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Make sure frequency transitions are serialized · 12478cf0
      Committed by Srivatsa S. Bhat
      Whenever we change the frequency of a CPU, we call the PRECHANGE and POSTCHANGE
      notifiers. They must be serialized, i.e. PRECHANGE and POSTCHANGE notifiers
      should strictly alternate, thereby preventing two different sets of PRECHANGE or
      POSTCHANGE notifiers from interleaving arbitrarily.
      
      The following examples illustrate why this is important:
      
      Scenario 1:
      -----------
      A thread reading the value of cpuinfo_cur_freq will call
      __cpufreq_cpu_get()->cpufreq_out_of_sync()->cpufreq_notify_transition()
      
      The ondemand governor can decide to change the frequency of the CPU at the same
      time and hence it can end up sending the notifications via ->target().
      
      If the notifiers are not serialized, the following sequence can occur:
      - PRECHANGE Notification for freq A (from cpuinfo_cur_freq)
      - PRECHANGE Notification for freq B (from target())
      - Freq changed by target() to B
      - POSTCHANGE Notification for freq B
      - POSTCHANGE Notification for freq A
      
      We can see from the above that the last POSTCHANGE Notification happens for freq
      A but the hardware is set to run at freq B.
      
      Where would we break then? In adjust_jiffies() in cpufreq.c and
      cpufreq_callback() in arch/arm/kernel/smp.c (which also adjusts the
      jiffies); all the loops_per_jiffy calculations would get messed up.
      
      Scenario 2:
      -----------
      The governor calls __cpufreq_driver_target() to change the frequency. At the
      same time, if we change scaling_{min|max}_freq from sysfs, it will end up
      calling the governor's CPUFREQ_GOV_LIMITS notification, which will also call
      __cpufreq_driver_target(). And hence we end up issuing concurrent calls to
      ->target().
      
      Typically, platforms have the following logic in their ->target() routines:
      (e.g. cpufreq-cpu0, omap, exynos, etc.)
      
      A. If new freq is more than old: Increase voltage
      B. Change freq
      C. If new freq is less than old: decrease voltage
      
      Now, if the two concurrent calls to ->target() are X and Y, where X is trying to
      increase the freq and Y is trying to decrease it, we get the following race
      condition:
      
      X.A: voltage gets increased for larger freq
      Y.A: nothing happens
      Y.B: freq gets decreased
      Y.C: voltage gets decreased
      X.B: freq gets increased
      X.C: nothing happens
      
      Thus we can end up setting a freq which is not supported by the voltage we have
      set. That will probably make the clock to the CPU unstable and the system might
      not work properly anymore.
      
      This patch introduces a set of synchronization primitives to serialize frequency
      transitions, which are to be used as shown below:
      
      cpufreq_freq_transition_begin();
      
      //Perform the frequency change
      
      cpufreq_freq_transition_end();
      
      The _begin() call sends the PRECHANGE notification whereas the _end() call sends
      the POSTCHANGE notification. Also, all the necessary synchronization is handled
      within these calls. In particular, even drivers which set the ASYNC_NOTIFICATION
      flag can also use these APIs for performing frequency transitions (i.e., you can
      call _begin() from one task, and call the corresponding _end() from a different
      task).
      
      The actual synchronization underneath is not that complicated:
      
      The key challenge is to allow drivers to begin the transition from one thread
      and end it in a completely different thread (this is to enable drivers that do
      asynchronous POSTCHANGE notification from bottom-halves, to also use the same
      interface).
      
      To achieve this, a 'transition_ongoing' flag, a 'transition_lock' spinlock and a
      wait-queue are added per-policy. The flag and the wait-queue are used in
      conjunction to create an "uninterrupted flow" from _begin() to _end(). The
      spinlock is used to ensure that only one such "flow" is in flight at any given
      time. Put together, this provides us all the necessary synchronization.
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
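
      In a driver, the pair brackets the actual hardware switch, along these lines
      (a sketch; my_set_hardware_freq() is a hypothetical helper):

        struct cpufreq_freqs freqs;
        int ret;

        freqs.old = policy->cur;
        freqs.new = target_freq;

        cpufreq_freq_transition_begin(policy, &freqs);    /* sends PRECHANGE */
        ret = my_set_hardware_freq(target_freq);
        cpufreq_freq_transition_end(policy, &freqs, ret); /* sends POSTCHANGE */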
  26. 20 March 2014 (1 commit)
  27. 19 March 2014 (1 commit)