1. 29 Sep, 2014 1 commit
    • cpufreq: Allow stop CPU callback to be used by all cpufreq drivers · 789ca243
      Preeti U Murthy committed
      Commit 367dc4aa ("cpufreq: Add stop CPU callback to
      cpufreq_driver interface") introduced the stop CPU callback for
      intel_pstate drivers. During the CPU_DOWN_PREPARE stage, this
      callback is invoked so that drivers can take some action on the
      pstate of the cpu before it is taken offline. This callback was
      assumed to be useful only for those drivers which have implemented
      the set_policy CPU callback because they have no other way to take
      action about the cpufreq of a CPU which is being hotplugged out
      except in the exit callback which is called very late in the offline
      process.
      
      The drivers which implement the target/target_index callbacks were
      expected to take care of requirements like the ones that commit
      367dc4aa addresses in the GOV_STOP notification event. But there
      are disadvantages to restricting the usage of the stop CPU callback
      to cpufreq drivers that implement the set_policy callback and that
      want to take explicit action on setting the cpufreq during a
      hotplug operation.
      
      1. GOV_STOP gets called for every CPU offline and drivers would usually
      want to take action when the last cpu in the policy->cpus mask
      is taken offline. As long as there is more than one cpu in the
      policy->cpus mask, cpufreq core itself makes sure that the freq
      for the other cpus in this mask is set according to the maximum load.
      This is sensible and drivers which implement the target_index callback
      would mostly not want to modify that. However the cpufreq core leaves a
      loose end when the cpu in the policy->cpus mask is the last one to go offline;
      it does nothing explicit to the frequency of the core. Drivers may need
      a way to take some action here and stop CPU callback mechanism is the
      best way to do it today.
      
      2. We cannot implement driver specific actions in the GOV_STOP mechanism.
      So we would need another driver callback invoked from there, which is
      unnecessary.
      
      Therefore this patch extends the usage of stop CPU callback to be used
      by all cpufreq drivers as long as they have this callback implemented
      and irrespective of whether they are set_policy/target_index drivers.
      The assumption is that if drivers find the GOV_STOP path to be a suitable
      way of implementing what they want to do with the freq of the cpu
      going offline, they will not implement the stop CPU callback at all.
      Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
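The effect of the change above can be sketched in user-space C. This is only a model under stated assumptions: `struct policy`, `struct driver`, `offline_last_cpu()`, `park_at_min()` and `noop_target_index()` are hypothetical stand-ins, not the kernel's definitions, and the sketch assumes the callback is consulted when the last CPU of a policy goes offline. The point it shows is that the check depends only on whether the driver provides a stop CPU callback, not on whether it is a set_policy or target_index driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct policy {
    unsigned int cpu;
    unsigned int cur_khz;
};

struct driver {
    int  (*setpolicy)(struct policy *p);                          /* set_policy style */
    int  (*target_index)(struct policy *p, unsigned int index);  /* target_index style */
    void (*stop_cpu)(struct policy *p);                           /* optional */
};

/*
 * Model of the offline path for the last CPU of a policy: the callback
 * runs whenever the driver provides it, whatever kind of driver it is.
 * Returns 1 if the callback ran, 0 otherwise.
 */
static int offline_last_cpu(const struct driver *drv, struct policy *p)
{
    assert(drv != NULL && p != NULL);
    if (drv->stop_cpu) {
        drv->stop_cpu(p);
        return 1;
    }
    return 0;
}

/* Example callback: park the CPU at an (assumed) minimum frequency. */
static void park_at_min(struct policy *p)
{
    p->cur_khz = 200000;
}

/* Placeholder target_index implementation for the example driver. */
static int noop_target_index(struct policy *p, unsigned int index)
{
    (void)p;
    (void)index;
    return 0;
}
```

A target_index driver that sets `.stop_cpu = park_at_min` gets its callback invoked; one that leaves it NULL is unaffected and can keep relying on the GOV_STOP path.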
  2. 21 Jul, 2014 3 commits
  3. 17 Jul, 2014 1 commit
    • cpufreq: move policy kobj to policy->cpu at resume · 92c14bd9
      Viresh Kumar committed
      This is only relevant to implementations with multiple clusters, where clusters
      have separate clock lines but all CPUs within a cluster share it.
      
      Consider a dual cluster platform with 2 cores per cluster. During suspend we
      start hot unplugging CPUs in order 1 to 3. When CPU2 is removed, policy->kobj
      would be moved to CPU3 and when CPU3 goes down we wouldn't free policy or its
      kobj as we want to retain permissions/values/etc.
      
      Now on resume, we will get CPU2 before CPU3 and will call __cpufreq_add_dev().
      We will recover the old policy and update policy->cpu from 3 to 2 from
      update_policy_cpu().
      
      But the kobj is still tied to CPU3 and isn't moved to CPU2. We wouldn't create a
      link for CPU2, but would try that for CPU3 while bringing it online. Which will
      report errors as CPU3 already has kobj assigned to it.
      
      This bug got introduced with commit 42f921a6, which overlooked this scenario.
      
      To fix this, let's move the kobj to the new policy->cpu while bringing the first
      CPU of a cluster back. Also do a WARN_ON() if kobject_move() failed, as we would
      reach here only for the first CPU of a non-boot cluster, and we can't recover
      from this situation if kobject_move() fails.
      
      Fixes: 42f921a6 (cpufreq: remove sysfs files for CPUs which failed to come back after resume)
      Cc: 3.13+ <stable@vger.kernel.org> # 3.13+
      Reported-and-tested-by: Bu Yitian <ybu@qti.qualcomm.com>
      Reported-by: Saravana Kannan <skannan@codeaurora.org>
      Reviewed-by: Srivatsa S. Bhat <srivatsa@mit.edu>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  4. 19 Jun, 2014 1 commit
  5. 06 Jun, 2014 1 commit
    • cpufreq: add support for intermediate (stable) frequencies · 1c03a2d0
      Viresh Kumar committed
      Douglas Anderson recently pointed out an interesting problem due to which
      udelay() was expiring earlier than it should.
      
      While transitioning between frequencies, a few platforms may temporarily switch
      to a stable frequency while waiting for the main PLL to stabilize.
      
      For example: When we transition between very low frequencies on exynos, like
      between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz.
      No CPUFREQ notification is sent for that. That means there's a period of time
      when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz
      and 300MHz. And so udelay behaves badly.
      
      To get this fixed in a generic way, introduce another set of callbacks
      get_intermediate() and target_intermediate(), only for drivers with
      target_index() and CPUFREQ_ASYNC_NOTIFICATION unset.
      
      get_intermediate() should return the stable intermediate frequency the platform
      wants to switch to, and target_intermediate() should set the CPU to that
      frequency before jumping to the frequency corresponding to 'index'. The core
      will take care of sending notifications, and the driver doesn't have to handle
      them in target_intermediate() or target_index().
      
      NOTE: ->target_index() should restore to policy->restore_freq in case of
      failures as core would send notifications for that.
      Tested-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Reviewed-by: Doug Anderson <dianders@chromium.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
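The callback sequence described in the commit above can be modelled as follows. This is a user-space sketch, not the kernel code: `struct policy`, `struct driver`, `core_target_index()`, `notify()`, the `demo_*` functions and the three-entry frequency table are assumptions made for illustration. It shows the core sending one notification for the intermediate frequency and one for the final frequency, and falling back to `restore_freq` when `target_index()` fails.

```c
#include <assert.h>

#define MAX_NOTIFICATIONS 8

struct policy {
    unsigned int cur;           /* last frequency announced, in kHz */
    unsigned int restore_freq;  /* frequency to report if target_index() fails */
};

struct driver {
    unsigned int (*get_intermediate)(struct policy *p, unsigned int index);
    int (*target_intermediate)(struct policy *p, unsigned int index);
    int (*target_index)(struct policy *p, unsigned int index);
};

static const unsigned int freq_table[] = { 200000, 300000, 800000 };
static unsigned int notified[MAX_NOTIFICATIONS];
static int n_notified;

/* Stand-in for the core's transition notification: record and apply. */
static void notify(struct policy *p, unsigned int new_khz)
{
    assert(n_notified < MAX_NOTIFICATIONS);
    notified[n_notified++] = new_khz;
    p->cur = new_khz;
}

/* Model of the core path for target_index() drivers. */
static int core_target_index(const struct driver *drv, struct policy *p,
                             unsigned int index)
{
    unsigned int inter = 0;
    int ret;

    p->restore_freq = p->cur;
    if (drv->get_intermediate)
        inter = drv->get_intermediate(p, index);
    if (inter) {
        ret = drv->target_intermediate(p, index);
        if (ret)
            return ret;
        notify(p, inter);       /* core announces the intermediate frequency */
    }
    ret = drv->target_index(p, index);
    if (ret) {
        if (inter)
            notify(p, p->restore_freq); /* driver restored the old frequency */
        return ret;
    }
    notify(p, freq_table[index]);
    return 0;
}

/* Hypothetical driver: uses 800 MHz as its stable intermediate frequency. */
static unsigned int demo_get_intermediate(struct policy *p, unsigned int index)
{
    (void)p;
    return freq_table[index] == 800000 ? 0 : 800000; /* 0: not needed */
}

static int demo_target_intermediate(struct policy *p, unsigned int index)
{
    (void)p; (void)index;
    return 0;
}

static int demo_target_index(struct policy *p, unsigned int index)
{
    (void)p; (void)index;
    return 0;
}

static int failing_target_index(struct policy *p, unsigned int index)
{
    (void)p; (void)index;
    return -1;
}

static const struct driver demo_driver = {
    demo_get_intermediate, demo_target_intermediate, demo_target_index
};
```

With this model, a 200 MHz to 300 MHz switch produces two notifications (800 MHz, then 300 MHz), so anything that tracks the current frequency sees the detour through the stable PLL.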
  6. 29 May, 2014 1 commit
  7. 08 May, 2014 1 commit
  8. 07 May, 2014 1 commit
    • cpufreq: Catch double invocations of cpufreq_freq_transition_begin/end · ca654dc3
      Srivatsa S. Bhat committed
      Some cpufreq drivers were redundantly invoking the _begin() and _end()
      APIs around frequency transitions, and this double invocation (one from
      the cpufreq core and the other from the cpufreq driver) used to result
      in a self-deadlock, leading to system hangs during boot. (The _begin()
      API makes contending callers wait until the previous invocation is
      complete. Hence, the cpufreq driver would end up waiting on itself!).
      
      Now all such drivers have been fixed, but debugging this issue was not
      very straight-forward (even lockdep didn't catch this). So let us add a
      debug infrastructure to the cpufreq core to catch such issues more easily
      in the future.
      
      We add a new field called 'transition_task' to the policy structure, to keep
      track of the task which is performing the frequency transition. Using this
      field, we make note of this task during _begin() and print a warning if we
      find a case where the same task is calling _begin() again, before completing
      the previous frequency transition using the corresponding _end().
      
      We have left out ASYNC_NOTIFICATION drivers from this debug infrastructure
      for 2 reasons:
      
      1. At the moment, we have no way to avoid a particular scenario where this
         debug infrastructure can emit false-positive warnings for such drivers.
         The scenario is depicted below:
      
               Task A                                  Task B
      
          /* 1st freq transition */
          Invoke _begin() {
                  ...
                  ...
          }
      
          Change the frequency
      
          /* 2nd freq transition */
          Invoke _begin() {
                  ... //waiting for B to
                  ... //finish _end() for
                  ... //the 1st transition
                  ...       |                         Got interrupt for successful
                  ...       |                         change of frequency (1st one).
                  ...       |
                  ...       |                         /* 1st freq transition */
                  ...       |                         Invoke _end() {
                  ...       |                                 ...
                  ...       V                         }
                  ...
                  ...
          }
      
         This scenario is actually deadlock-free because, once Task A changes the
         frequency, it is Task B's responsibility to invoke the corresponding
         _end() for the 1st frequency transition. Hence it is perfectly legal for
         Task A to go ahead and attempt another frequency transition in the meantime.
         (Of course it won't be able to proceed until Task B finishes the 1st _end(),
         but this doesn't cause a deadlock or a hang).
      
         The debug infrastructure cannot handle this scenario and will treat it as
         a deadlock and print a warning. To avoid this, we exclude such drivers
         from the purview of this code.
      
      2. Luckily, we don't _need_ this infrastructure for ASYNC_NOTIFICATION drivers
         at all! The cpufreq core does not automatically invoke the _begin() and
         _end() APIs during frequency transitions in such drivers. Thus, the driver
         alone is responsible for invoking _begin()/_end() and hence there shouldn't
         be any conflicts which lead to double invocations. So, we can skip these
         drivers, since the probability that such drivers will hit this problem is
         extremely low, as outlined above.
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
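The 'transition_task' check described in the commit above can be modelled in a few lines. This is a single-threaded sketch under assumptions: tasks are represented by plain integer ids, the warning is a counter instead of a kernel warning, and `transition_begin()` returns false where the real code would sleep. None of the names below are the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

struct policy {
    bool transition_ongoing;
    int  transition_task;       /* id of the task inside a transition, 0 = none */
    bool async_notification;    /* models drivers with ASYNC_NOTIFICATION set */
    int  warnings;              /* stands in for a printed kernel warning */
};

/*
 * Returns true if the caller may start a transition, false if it would
 * have to wait (the real code sleeps on a wait-queue at that point).
 */
static bool transition_begin(struct policy *p, int task)
{
    assert(task != 0);

    /* Same task calling _begin() again before _end(): self-deadlock. */
    if (!p->async_notification && p->transition_ongoing &&
        p->transition_task == task)
        p->warnings++;

    if (p->transition_ongoing)
        return false;

    p->transition_ongoing = true;
    p->transition_task = task;
    return true;
}

static void transition_end(struct policy *p)
{
    p->transition_ongoing = false;
    p->transition_task = 0;
}
```

A second `transition_begin()` from the same task bumps `warnings`, while a contending call from a different task, or any call on an `async_notification` policy, does not.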
  9. 30 Apr, 2014 1 commit
  10. 26 Mar, 2014 4 commits
    • cpufreq: Make cpufreq_notify_transition & cpufreq_notify_post_transition static · 236a9800
      Viresh Kumar committed
      cpufreq_notify_transition() and cpufreq_notify_post_transition() shouldn't be
      called directly by cpufreq drivers anymore and so these should be marked static.
      Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Convert existing drivers to use cpufreq_freq_transition_{begin|end} · 8fec051e
      Viresh Kumar committed
      CPUFreq core has new infrastructure that would guarantee serialized calls to
      target() or target_index() callbacks. These are called
      cpufreq_freq_transition_begin() and cpufreq_freq_transition_end().
      
      This patch converts existing drivers to use these new set of routines.
      Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Make sure frequency transitions are serialized · 12478cf0
      Srivatsa S. Bhat committed
      Whenever we change the frequency of a CPU, we call the PRECHANGE and POSTCHANGE
      notifiers. They must be serialized, i.e. PRECHANGE and POSTCHANGE notifiers
      should strictly alternate, thereby preventing two different sets of PRECHANGE or
      POSTCHANGE notifiers from interleaving arbitrarily.
      
      The following examples illustrate why this is important:
      
      Scenario 1:
      -----------
      A thread reading the value of cpuinfo_cur_freq, will call
      __cpufreq_cpu_get()->cpufreq_out_of_sync()->cpufreq_notify_transition()
      
      The ondemand governor can decide to change the frequency of the CPU at the same
      time and hence it can end up sending the notifications via ->target().
      
      If the notifiers are not serialized, the following sequence can occur:
      - PRECHANGE Notification for freq A (from cpuinfo_cur_freq)
      - PRECHANGE Notification for freq B (from target())
      - Freq changed by target() to B
      - POSTCHANGE Notification for freq B
      - POSTCHANGE Notification for freq A
      
      We can see from the above that the last POSTCHANGE Notification happens for freq
      A but the hardware is set to run at freq B.
      
      Where would this break? In adjust_jiffies() in cpufreq.c and cpufreq_callback()
      in arch/arm/kernel/smp.c (which also adjusts the jiffies). All the
      loops_per_jiffy calculations will get messed up.
      
      Scenario 2:
      -----------
      The governor calls __cpufreq_driver_target() to change the frequency. At the
      same time, if we change scaling_{min|max}_freq from sysfs, it will end up
      calling the governor's CPUFREQ_GOV_LIMITS notification, which will also call
      __cpufreq_driver_target(). And hence we end up issuing concurrent calls to
      ->target().
      
      Typically, platforms have the following logic in their ->target() routines:
      (Eg: cpufreq-cpu0, omap, exynos, etc)
      
      A. If new freq is more than old: Increase voltage
      B. Change freq
      C. If new freq is less than old: decrease voltage
      
      Now, if the two concurrent calls to ->target() are X and Y, where X is trying to
      increase the freq and Y is trying to decrease it, we get the following race
      condition:
      
      X.A: voltage gets increased for larger freq
      Y.A: nothing happens
      Y.B: freq gets decreased
      Y.C: voltage gets decreased
      X.B: freq gets increased
      X.C: nothing happens
      
      Thus we can end up setting a freq which is not supported by the voltage we have
      set. That will probably make the clock to the CPU unstable and the system might
      not work properly anymore.
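The interleaving in Scenario 2 can be replayed deterministically with a small model. The three operating points in `volt_for()` and all names below are hypothetical; the sketch only shows that running steps A, B and C of two calls in the order listed above leaves the hardware at a high frequency with a low voltage.

```c
#include <assert.h>
#include <stdbool.h>

struct hw {
    unsigned int freq_mhz;
    unsigned int volt_mv;
};

/* One ->target() call: the frequency it saw on entry and the one it wants. */
struct call {
    unsigned int old_mhz;
    unsigned int new_mhz;
};

/* Hypothetical operating points: minimum voltage for a given frequency. */
static unsigned int volt_for(unsigned int mhz)
{
    if (mhz >= 1000)
        return 1200;
    if (mhz >= 600)
        return 1000;
    return 900;
}

/* A. If new freq is more than old: increase voltage. */
static void step_a(struct hw *hw, const struct call *c)
{
    if (c->new_mhz > c->old_mhz)
        hw->volt_mv = volt_for(c->new_mhz);
}

/* B. Change freq. */
static void step_b(struct hw *hw, const struct call *c)
{
    hw->freq_mhz = c->new_mhz;
}

/* C. If new freq is less than old: decrease voltage. */
static void step_c(struct hw *hw, const struct call *c)
{
    if (c->new_mhz < c->old_mhz)
        hw->volt_mv = volt_for(c->new_mhz);
}

/* The hardware is safe when the voltage supports the current frequency. */
static bool hw_is_safe(const struct hw *hw)
{
    return hw->volt_mv >= volt_for(hw->freq_mhz);
}
```

Starting from 600 MHz, with X going to 1000 MHz and Y going to 400 MHz, the order X.A, Y.A, Y.B, Y.C, X.B, X.C ends at 1000 MHz with the 400 MHz voltage, so `hw_is_safe()` returns false. Running all of X and then all of Y keeps it true.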
      
      This patch introduces a set of synchronization primitives to serialize frequency
      transitions, which are to be used as shown below:
      
      cpufreq_freq_transition_begin();
      
      //Perform the frequency change
      
      cpufreq_freq_transition_end();
      
      The _begin() call sends the PRECHANGE notification whereas the _end() call sends
      the POSTCHANGE notification. Also, all the necessary synchronization is handled
      within these calls. In particular, even drivers which set the ASYNC_NOTIFICATION
      flag can also use these APIs for performing frequency transitions (ie., you can
      call _begin() from one task, and call the corresponding _end() from a different
      task).
      
      The actual synchronization underneath is not that complicated:
      
      The key challenge is to allow drivers to begin the transition from one thread
      and end it in a completely different thread (this is to enable drivers that do
      asynchronous POSTCHANGE notification from bottom-halves, to also use the same
      interface).
      
      To achieve this, a 'transition_ongoing' flag, a 'transition_lock' spinlock and a
      wait-queue are added per-policy. The flag and the wait-queue are used in
      conjunction to create an "uninterrupted flow" from _begin() to _end(). The
      spinlock is used to ensure that only one such "flow" is in flight at any given
      time. Put together, this provides us all the necessary synchronization.
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
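The flag, lock and wait-queue combination described in the commit above has a close user-space analogue. The sketch below swaps in a pthread mutex and condition variable for the kernel spinlock and wait-queue, so it is an illustration of the idea rather than the patch itself; `struct policy`, `POLICY_INIT` and the two function names are assumptions, and the notification counters stand in for the PRECHANGE and POSTCHANGE notifier calls.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct policy {
    pthread_mutex_t transition_lock;    /* analogue of the spinlock */
    pthread_cond_t  transition_wait;    /* analogue of the wait-queue */
    bool transition_ongoing;
    int  prechange_sent;
    int  postchange_sent;
};

#define POLICY_INIT { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, \
                      false, 0, 0 }

/* Wait until no transition is in flight, then claim the "flow". */
static void freq_transition_begin(struct policy *p)
{
    pthread_mutex_lock(&p->transition_lock);
    while (p->transition_ongoing)
        pthread_cond_wait(&p->transition_wait, &p->transition_lock);
    p->transition_ongoing = true;
    pthread_mutex_unlock(&p->transition_lock);

    p->prechange_sent++;        /* PRECHANGE notification goes out here */
}

/* May be called from a different thread than the matching _begin(). */
static void freq_transition_end(struct policy *p)
{
    assert(p->transition_ongoing);
    p->postchange_sent++;       /* POSTCHANGE notification goes out here */

    pthread_mutex_lock(&p->transition_lock);
    p->transition_ongoing = false;
    pthread_mutex_unlock(&p->transition_lock);
    pthread_cond_broadcast(&p->transition_wait);
}
```

Because ownership is tracked by the flag and not by the lock holder, the thread that calls `freq_transition_end()` does not have to be the one that called `freq_transition_begin()`, which is what asynchronous-notification drivers need.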
    • cpufreq: resume drivers before enabling governors · 0c5aa405
      Viresh Kumar committed
      During suspend, we first stop governors and then suspend cpufreq drivers.
      Resume must be the exact opposite of that, i.e. resume drivers first and then
      start governors.
      
      But the current code in resume enables governors first and then resumes drivers.
      Fix it by changing the code sequence there.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  11. 20 Mar, 2014 3 commits
  12. 19 Mar, 2014 2 commits
  13. 13 Mar, 2014 1 commit
    • cpufreq: Skip current frequency initialization for ->setpolicy drivers · 2ed99e39
      Rafael J. Wysocki committed
      After commit da60ce9f (cpufreq: call cpufreq_driver->get() after
      calling ->init()) __cpufreq_add_dev() sometimes fails for CPUs handled
      by intel_pstate, because that driver may return 0 from its ->get()
      callback if it has not run long enough to collect enough samples on the
      given CPU.  That didn't happen before commit da60ce9f which added
      policy->cur initialization to __cpufreq_add_dev() to help reduce code
      duplication in other cpufreq drivers.
      
      However, the code added by commit da60ce9f need not be executed
      for cpufreq drivers having the ->setpolicy callback defined, because
      the subsequent invocation of cpufreq_set_policy() will use that
      callback to initialize the policy anyway and it doesn't need
      policy->cur to be initialized upfront.  The analogous code in
      cpufreq_update_policy() is also unnecessary for cpufreq drivers
      having ->setpolicy set and may be skipped for them as well.
      
      Since intel_pstate provides ->setpolicy, skipping the upfront
      policy->cur initialization for cpufreq drivers with that callback
      set will cover intel_pstate and the problem it's been having after
      commit da60ce9f will be addressed.
      
      Fixes: da60ce9f (cpufreq: call cpufreq_driver->get() after calling ->init())
      References: https://bugzilla.kernel.org/show_bug.cgi?id=71931
      Reported-and-tested-by: Patrik Lundquist <patrik.lundquist@gmail.com>
      Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
      Cc: 3.13+ <stable@vger.kernel.org> # 3.13+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
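The check introduced by the commit above can be sketched as follows. This is a user-space model with hypothetical names (`init_policy_cur()`, `get_returns_zero()`, `noop_setpolicy()`); it assumes, as the commit message describes, that a zero return from ->get() is treated as a failure for drivers without ->setpolicy.

```c
#include <assert.h>
#include <stddef.h>

struct policy {
    unsigned int cur;   /* current frequency in kHz, 0 = unknown */
};

struct driver {
    unsigned int (*get)(unsigned int cpu);
    int (*setpolicy)(struct policy *p);
};

/*
 * Initialise policy->cur upfront only for drivers without ->setpolicy.
 * Returns 0 on success, -1 if the frequency was needed but unavailable.
 */
static int init_policy_cur(const struct driver *drv, struct policy *p,
                           unsigned int cpu)
{
    assert(drv != NULL && p != NULL);
    if (drv->get && !drv->setpolicy) {
        p->cur = drv->get(cpu);
        if (!p->cur)
            return -1;
    }
    return 0;
}

/* Models a ->get() that has not collected enough samples yet. */
static unsigned int get_returns_zero(unsigned int cpu)
{
    (void)cpu;
    return 0;
}

static int noop_setpolicy(struct policy *p)
{
    (void)p;
    return 0;
}
```

A driver with both `get_returns_zero` and `noop_setpolicy` passes initialisation, while the same ->get() in a driver without ->setpolicy still fails, which matches the behaviour the commit is after.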
  14. 12 Mar, 2014 3 commits
  15. 06 Mar, 2014 6 commits
    • cpufreq: Implement cpufreq_generic_suspend() · e28867ea
      Viresh Kumar committed
      Multiple platforms need to set CPUs to a particular frequency before
      suspending the system, so provide a common infrastructure for them.
      
      Those platforms only need to point their ->suspend callback pointers
      to the generic routine.
      Tested-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [rjw: Changelog]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
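How a platform driver might use such a generic routine can be sketched in user-space C. Everything here is a model: `generic_suspend()` stands in for the routine the commit adds, and the `suspend_freq` field is an assumption about where the per-policy suspend frequency is stored, since the commit message does not name it.

```c
#include <assert.h>
#include <errno.h>

struct policy {
    unsigned int cur;           /* current frequency in kHz */
    unsigned int suspend_freq;  /* assumed field: frequency to use across suspend */
};

struct driver {
    int (*suspend)(struct policy *p);
};

/* Generic routine: move the policy to its suspend frequency. */
static int generic_suspend(struct policy *p)
{
    assert(p != 0);
    if (!p->suspend_freq)
        return -EINVAL;         /* platform did not provide a frequency */
    p->cur = p->suspend_freq;   /* stands in for the real frequency change */
    return 0;
}
```

A platform driver then only needs `.suspend = generic_suspend` in its driver structure and a non-zero suspend frequency in each policy.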
    • cpufreq: suspend governors on system suspend/hibernate · 2f0aea93
      Viresh Kumar committed
      This patch adds cpufreq suspend/resume calls to dpm_{suspend|resume}()
      for handling suspend/resume of cpufreq governors.
      
      Lan Tianyu (Intel) & Jinhyuk Choi (Broadcom) found an issue where the
      tunables configuration for clusters/sockets with non-boot CPUs was
      lost after system suspend/resume, as we were notifying governors with
      CPUFREQ_GOV_POLICY_EXIT on removal of the last CPU for that policy
      which caused the tunables memory to be freed.
      
      This is fixed by preventing any governor operations from being
      carried out between the device suspend and device resume stages of
      system suspend and resume, respectively.
      
      We could have added these callbacks at dpm_{suspend|resume}_noirq()
      level, but there is an additional problem that the majority of I/O
      devices are already suspended at that point and if cpufreq drivers
      want to change the frequency before suspending, then that will not be
      possible on some platforms (which depend on peripherals like i2c,
      regulators, etc).
      Reported-and-tested-by: Lan Tianyu <tianyu.lan@intel.com>
      Reported-by: Jinhyuk Choi <jinchoi@broadcom.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [rjw: Changelog]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: move call to __find_governor() to cpufreq_init_policy() · 6e2c89d1
      viresh kumar committed
      We call __find_governor() during the addition of the first CPU of
      each policy from __cpufreq_add_dev() to find the last governor used
      for this CPU before it was hot-removed.
      
      After that we call cpufreq_parse_governor() in cpufreq_init_policy(),
      either with this governor, or with the default governor. Right after
      that policy->governor is set to NULL.
      
      While that code is not functionally problematic, the structure of it
      is suboptimal, because some of the code required in cpufreq_init_policy()
      is being executed by its caller, __cpufreq_add_dev(). So, it would make
      more sense to get all of it together in a single place to make code more
      readable.
      
      Accordingly, move the code needed for policy initialization to
      cpufreq_init_policy() and initialize policy->governor to NULL at the
      beginning.
      
      In order to clean up the code a bit more, some of the #ifdefs for
      CONFIG_HOTPLUG_CPU are dropped too.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [rjw: Changelog]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Initialize governor for a new policy under policy->rwsem · 4e97b631
      Viresh Kumar committed
      policy->rwsem is used to lock access to all parts of code modifying
      struct cpufreq_policy, but it's not used on a new policy created by
      __cpufreq_add_dev().
      
      Because of that, if cpufreq_update_policy() is called in a tight loop
      on one CPU in parallel with offline/online of another CPU, then the
      following crash can be triggered:
      
      Unable to handle kernel NULL pointer dereference at virtual address 00000020
      pgd = c0003000
      [00000020] *pgd=80000000004003, *pmd=00000000
      Internal error: Oops: 206 [#1] PREEMPT SMP ARM
      
      PC is at __cpufreq_governor+0x10/0x1ac
      LR is at cpufreq_update_policy+0x114/0x150
      
      ---[ end trace f23a8defea6cd706 ]---
      Kernel panic - not syncing: Fatal exception
      CPU0: stopping
      CPU: 0 PID: 7136 Comm: mpdecision Tainted: G      D W    3.10.0-gd727407-00074-g979ede8 #396
      
      [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58)
      [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) from [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c)
      [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) from [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8)
      [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) from [<c0803e7c>] (cpufreq_init_policy+0x30/0x98)
      [<c0803e7c>] (cpufreq_init_policy+0x30/0x98) from [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4)
      [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) from [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84)
      [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) from [<c0afe180>] (notifier_call_chain+0x40/0x68)
      [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02812dc>] (__cpu_notify+0x28/0x44)
      [<c02812dc>] (__cpu_notify+0x28/0x44) from [<c0aeed90>] (_cpu_up+0xf4/0x1dc)
      [<c0aeed90>] (_cpu_up+0xf4/0x1dc) from [<c0aeeed4>] (cpu_up+0x5c/0x78)
      [<c0aeeed4>] (cpu_up+0x5c/0x78) from [<c0aec808>] (store_online+0x44/0x74)
      [<c0aec808>] (store_online+0x44/0x74) from [<c03a40f4>] (sysfs_write_file+0x108/0x14c)
      [<c03a40f4>] (sysfs_write_file+0x108/0x14c) from [<c03517d4>] (vfs_write+0xd0/0x180)
      [<c03517d4>] (vfs_write+0xd0/0x180) from [<c0351ca8>] (SyS_write+0x38/0x68)
      [<c0351ca8>] (SyS_write+0x38/0x68) from [<c0205de0>] (ret_fast_syscall+0x0/0x30)
      
      Fix that by taking locks at appropriate places in __cpufreq_add_dev()
      as well.
      Reported-by: Saravana Kannan <skannan@codeaurora.org>
      Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [rjw: Changelog]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Initialize policy before making it available for others to use · 5a7e56a5
      Viresh Kumar committed
      Policy must be fully initialized before it is made available
      for use by others. Otherwise cpufreq_cpu_get() would be able to grab
      a half initialized policy structure that might not have affected_cpus
      (for example) populated. Then, anybody accessing those fields will get
      a wrong value and that will lead to unpredictable results.
      
      In order to fix this, do all the necessary initialization before we
      make the policy structure available via cpufreq_cpu_get(). That will
      guarantee that any code accessing fields of the policy will get
      correct data from them.
      Reported-by: Saravana Kannan <skannan@codeaurora.org>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [rjw: Changelog]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: use cpufreq_cpu_get() to avoid cpufreq_get() race conditions · 999976e0
      Aaron Plattner committed
      If a module calls cpufreq_get while cpufreq is initializing, it's
      possible for it to be called after cpufreq_driver is set but before
      cpufreq_cpu_data is written during subsys_interface_register.  This
      happens because cpufreq_get doesn't take the cpufreq_driver_lock
      around its use of cpufreq_cpu_data.
      
      Fix this by using cpufreq_cpu_get(cpu) to look up the policy rather
      than reading it out of cpufreq_cpu_data directly.  cpufreq_cpu_get()
      takes the appropriate locks to prevent this race from happening.
      
      Since it's possible for policy to be NULL if the caller passes in an
      invalid CPU number or calls the function before cpufreq is initialized,
      delete the BUG_ON(!policy) and simply return 0.  Don't try to return
      -ENOENT because that's negative and the function returns an unsigned
      integer.
      
      References: https://bbs.archlinux.org/viewtopic.php?id=177934
      Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
      Cc: 3.13+ <stable@vger.kernel.org> # 3.13+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
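The last paragraph of the commit above, returning 0 rather than -ENOENT, is easy to demonstrate. The sketch below is a user-space model with hypothetical names (`cpu_data`, `policy_lookup()`, `get_freq()`, `get_freq_bad()`) and no locking; it only shows what a negative error code turns into when the return type is unsigned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_CPUS 4

struct policy {
    unsigned int cur;   /* current frequency in kHz */
};

/* Per-CPU policy pointers; NULL until the CPU has been set up. */
static struct policy *cpu_data[NR_CPUS];

/* Stand-in for the locked lookup; the real one also takes a reference. */
static struct policy *policy_lookup(unsigned int cpu)
{
    return cpu < NR_CPUS ? cpu_data[cpu] : NULL;
}

/* Returns the frequency in kHz, or 0 if there is no policy for this CPU. */
static unsigned int get_freq(unsigned int cpu)
{
    struct policy *p = policy_lookup(cpu);

    if (!p)
        return 0;
    return p->cur;
}

/* What returning -ENOENT from an unsigned function would look like. */
static unsigned int get_freq_bad(unsigned int cpu)
{
    struct policy *p = policy_lookup(cpu);

    if (!p)
        return (unsigned int)-ENOENT;   /* wraps to a huge "frequency" */
    return p->cur;
}
```

A caller of `get_freq_bad()` would see a value near the top of the unsigned range and could mistake it for a frequency, whereas 0 is an unambiguous "unknown".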
  16. 01 Mar, 2014 1 commit
  17. 27 Feb, 2014 1 commit
  18. 24 Feb, 2014 2 commits
  19. 19 Feb, 2014 1 commit
    • cpufreq: remove sysfs link when a cpu != policy->cpu, is removed · 6964d91d
      viresh kumar committed
      Commit 42f921a6 (cpufreq: remove sysfs files for CPUs which failed to
      come back after resume) tried to do this but missed fixing this piece
      of code.
      
      Currently we are getting this on suspend/resume:
      
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 877 at fs/sysfs/dir.c:52 sysfs_warn_dup+0x68/0x84()
      sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
      Modules linked in: brcmfmac brcmutil
      CPU: 0 PID: 877 Comm: test-rtc-resume Not tainted 3.14.0-rc2-00259-g9398a10c #12
      [<c0015bac>] (unwind_backtrace) from [<c0011850>] (show_stack+0x10/0x14)
      [<c0011850>] (show_stack) from [<c056e018>] (dump_stack+0x80/0xcc)
      [<c056e018>] (dump_stack) from [<c0025e44>] (warn_slowpath_common+0x64/0x88)
      [<c0025e44>] (warn_slowpath_common) from [<c0025efc>] (warn_slowpath_fmt+0x30/0x40)
      [<c0025efc>] (warn_slowpath_fmt) from [<c012776c>] (sysfs_warn_dup+0x68/0x84)
      [<c012776c>] (sysfs_warn_dup) from [<c0127a54>] (sysfs_do_create_link_sd+0xb0/0xb8)
      [<c0127a54>] (sysfs_do_create_link_sd) from [<c038ef64>] (__cpufreq_add_dev.isra.27+0x2a8/0x814)
      [<c038ef64>] (__cpufreq_add_dev.isra.27) from [<c038f548>] (cpufreq_cpu_callback+0x70/0x8c)
      [<c038f548>] (cpufreq_cpu_callback) from [<c0043864>] (notifier_call_chain+0x44/0x84)
      [<c0043864>] (notifier_call_chain) from [<c0025f60>] (__cpu_notify+0x28/0x44)
      [<c0025f60>] (__cpu_notify) from [<c00261e8>] (_cpu_up+0xf0/0x140)
      [<c00261e8>] (_cpu_up) from [<c0569eb8>] (enable_nonboot_cpus+0x68/0xb0)
      [<c0569eb8>] (enable_nonboot_cpus) from [<c006339c>] (suspend_devices_and_enter+0x198/0x2dc)
      [<c006339c>] (suspend_devices_and_enter) from [<c0063654>] (pm_suspend+0x174/0x1e8)
      [<c0063654>] (pm_suspend) from [<c00624e0>] (state_store+0x6c/0xbc)
      [<c00624e0>] (state_store) from [<c01fc200>] (kobj_attr_store+0x14/0x20)
      [<c01fc200>] (kobj_attr_store) from [<c0126e50>] (sysfs_kf_write+0x44/0x48)
      [<c0126e50>] (sysfs_kf_write) from [<c012a274>] (kernfs_fop_write+0xb4/0x14c)
      [<c012a274>] (kernfs_fop_write) from [<c00d4818>] (vfs_write+0xa8/0x180)
      [<c00d4818>] (vfs_write) from [<c00d4bb8>] (SyS_write+0x3c/0x70)
      [<c00d4bb8>] (SyS_write) from [<c000e620>] (ret_fast_syscall+0x0/0x30)
      ---[ end trace 76969904b614c18f ]---
      
      Fix this by removing sysfs link for cpufreq directory when cpu removed
      isn't policy->cpu.
      
      Revamps: 42f921a6 (cpufreq: remove sysfs files for CPUs which failed to come back after resume)
      Reported-and-tested-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  20. 17 Jan, 2014 3 commits
    • cpufreq: Add boost frequency support in core · 6f19efc0
      Lukasz Majewski committed
      This commit adds boost frequency support in cpufreq core (Hardware &
      Software). Some SoCs (like Exynos4 - e.g. 4x12) allow setting frequency
      above its normal operation limits. Such mode shall be only used for a
      short time.
      
      Overclocking (boost) support is essentially provided by platform
      dependent cpufreq driver.
      
      This commit unifies support for SW and HW (Intel) overclocking solutions
      in the core cpufreq driver. Previously the "boost" sysfs attribute was
      defined in the ACPI processor driver code. By default boost is disabled.
      One global attribute is available at: /sys/devices/system/cpu/cpufreq/boost.
      
      It only shows up when cpufreq driver supports overclocking.
      Under the hood frequencies dedicated for boosting are marked with a
      special flag (CPUFREQ_BOOST_FREQ) at driver's frequency table.
      It is the user's concern to enable/disable overclocking with a proper call
      to sysfs.
      
      The cpufreq_boost_trigger_state() function is defined non static on purpose.
      It is used later with thermal subsystem to provide automatic enable/disable
      of the BOOST feature.
      Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
      Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
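The way flagged table entries interact with the boost switch can be sketched as follows. The table layout, the `BOOST_FREQ` constant (a stand-in for CPUFREQ_BOOST_FREQ), the zero-frequency terminator and `max_allowed_khz()` are simplifications assumed for this example, not the kernel's frequency table format.

```c
#include <assert.h>
#include <stdbool.h>

#define BOOST_FREQ (1u << 0)    /* stand-in for CPUFREQ_BOOST_FREQ */
#define TABLE_END  0u           /* a zero frequency terminates the table */

struct freq_entry {
    unsigned int flags;
    unsigned int khz;
};

/* Highest usable frequency: boost entries count only when boost is on. */
static unsigned int max_allowed_khz(const struct freq_entry *table,
                                    bool boost_enabled)
{
    unsigned int max = 0;

    assert(table != 0);
    for (; table->khz != TABLE_END; table++) {
        if ((table->flags & BOOST_FREQ) && !boost_enabled)
            continue;           /* skip overclocked entries */
        if (table->khz > max)
            max = table->khz;
    }
    return max;
}
```

With boost disabled (the default), the flagged entry is ignored and the policy maximum stays at the highest regular frequency; enabling boost through the sysfs attribute would raise the limit to the flagged entry.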
    • cpufreq: introduce cpufreq_generic_get() routine · 652ed95d
      Viresh Kumar committed
      CPUFreq drivers that use the clock framework interface, i.e. clk_get_rate(),
      to get the CPU's clk rate have similar code in most of them.
      
      This patch adds a generic ->get() which will do the same thing for them.
      All those drivers are required to do now is to set .get to cpufreq_generic_get()
      and set their clk pointer in policy->clk during ->init().
      Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Acked-by: Shawn Guo <shawn.guo@linaro.org>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Acked-by: Shawn Guo <shawn.guo@linaro.org>
      Acked-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      652ed95d
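The pattern this commit describes, where a driver stores its clock in the policy during init and a shared getter reads the rate from it, can be modelled in user space as follows. Everything prefixed `demo_` is hypothetical and stands in for the kernel structures and for clk_get_rate(); the Hz-to-kHz conversion is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's clk and policy structures. */
struct demo_clk {
	unsigned long rate_hz;
};

struct demo_policy {
	struct demo_clk *clk; /* set once by the driver's init step */
};

/* Stand-in for clk_get_rate(): report the clock rate in Hz. */
static unsigned long demo_clk_get_rate(const struct demo_clk *clk)
{
	return clk->rate_hz;
}

/* Driver-side init: the only driver-specific step is storing the clock. */
static void demo_init(struct demo_policy *policy, struct demo_clk *clk)
{
	assert(policy != NULL && clk != NULL);
	policy->clk = clk;
}

/* Shared getter: returns the rate in kHz, or 0 if no clock was set. */
static unsigned int demo_generic_get(const struct demo_policy *policy)
{
	if (policy == NULL || policy->clk == NULL)
		return 0;
	return (unsigned int)(demo_clk_get_rate(policy->clk) / 1000);
}
```

The point of the design is that each driver no longer carries its own copy of the getter; it only has to record which clock belongs to the policy.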
    • V
      cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly · fcd7af91
      Viresh Kumar committed
      There are several problems with the way cpufreq stats handles
      cpufreq_unregister_driver() and suspend/resume:
      
       - We must not lose data collected so far when suspend/resume happens,
         and so stats directories must not be removed/allocated during these
         operations, which is what currently happens.
      
       - cpufreq_stat has registered notifiers with both cpufreq and hotplug.
         It adds sysfs stats directory with a cpufreq notifier: CPUFREQ_NOTIFY
         and removes this directory with a notifier from hotplug core.
      
         In case cpufreq_unregister_driver() is called (on rmmod of the cpufreq driver),
         the per-CPU stats directories aren't removed, as the CPUs are still online. The
         only call cpufreq_stats gets is cpufreq_stats_update_policy_cpu() for
         all CPUs except the last of each policy. A pointer to the stat information
         is stored in the entry for the last CPU in the per-cpu cpufreq_stats_table.
         But the policy structure is freed inside the cpufreq core, and so this
         results in a memory leak inside cpufreq stats (as we never free the
         memory for stats).
      
         Now if we insert the module again, cpufreq_register_driver() will be
         called and we will again allocate stats data and attach it to the first
         CPU of every policy. In case we only have a single CPU per policy, we
         will return with an error from cpufreq_stats_create_table() due to this
         code:
      
      	if (per_cpu(cpufreq_stats_table, cpu))
      		return -EBUSY;
      
         And so probably cpufreq stats directory would not show up anymore (as
         it was added inside last policies->kobj which doesn't exist anymore).
         I haven't tested it, though. Also the values in stats files wouldn't
         be refreshed as we are using the earlier stats structure.
      
       - CPUFREQ_NOTIFY is called from cpufreq_set_policy(), which is called in
         scenarios where we don't really want cpufreq_stat_notifier_policy() to get
         called. For example, whenever we change anything related to a policy
         (min/max/current freq, etc.), cpufreq_set_policy() is called and so cpufreq
         stats is notified. There we don't do anything useful other than simply
         returning -EBUSY from cpufreq_stats_create_table(). And so this
         isn't the right notifier for cpufreq stats.
      
       For all of the above reasons, this patch makes the following changes:
       - Add new notifiers CPUFREQ_CREATE_POLICY and CPUFREQ_REMOVE_POLICY,
         which are only called when a policy is created/destroyed. They aren't
         called for suspend/resume paths.
       - Use these notifiers in cpufreq_stat_notifier_policy() to create/destroy
         stats sysfs entries. And so cpufreq_unregister_driver() or suspend/resume
         shouldn't be a problem for cpufreq_stats.
       - Return early from cpufreq_stat_cpu_callback() for the suspend/resume sequence,
         so that we don't free the stats structure.
      Acked-by: Nicolas Pitre <nico@linaro.org>
      Tested-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      fcd7af91
  21. 06 Jan, 2014 2 commits
    • V
      cpufreq: Make sure CPU is running on a freq from freq-table · d3916691
      Viresh Kumar committed
      Sometimes boot loaders set the CPU frequency to a value outside of the frequency
      table known to the cpufreq core. In such cases the CPU might be unstable if it has
      to run at that frequency for a long time, and so it's better to set it to a
      frequency which is specified in the freq-table. An off-table frequency also makes
      cpufreq stats inconsistent, as cpufreq-stats would fail to register because the
      current frequency of the CPU isn't found in the freq-table.
      
      Because we don't want this change to affect boot process badly, we go for the
      next freq which is >= policy->cur ('cur' must be set by now, otherwise we will
      end up setting freq to lowest of the table as 'cur' is initialized to zero).
      
      In case the current frequency doesn't match any frequency from the freq-table, we
      emit warnings so that users can get this fixed in their bootloaders or
      freq-tables.
      Reported-by: Carlos Hernandez <ceh@ti.com>
      Reported-and-tested-by: Nishanth Menon <nm@ti.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      d3916691
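The selection rule described in this commit (go for the next table frequency that is >= the current one) can be sketched as a stand-alone function. `demo_pick_table_freq()` is a hypothetical helper, not the kernel routine; the fallback to the highest table entry when the current frequency lies above the whole table is an assumption of this sketch, and table entries are assumed to be nonzero.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pick the lowest table frequency that is >= cur (all values in kHz).
 * If cur is above every entry, fall back to the highest entry
 * (an assumption made for this sketch).
 */
static unsigned int demo_pick_table_freq(const unsigned int *table, size_t n,
					 unsigned int cur)
{
	unsigned int best = 0;    /* lowest entry >= cur seen so far */
	unsigned int highest = 0; /* highest entry in the table */
	size_t i;

	assert(table != NULL && n > 0);
	for (i = 0; i < n; i++) {
		if (table[i] > highest)
			highest = table[i];
		if (table[i] >= cur && (best == 0 || table[i] < best))
			best = table[i];
	}
	return best ? best : highest;
}
```

Note that with cur equal to zero every entry qualifies, so the function returns the lowest entry of the table, which matches the caveat in the commit message about 'cur' being initialized to zero.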
    • V
      cpufreq: send new set of notification for transition failures · ab1b1c4e
      Viresh Kumar committed
      In the current code, if we fail during a frequency transition, we
      simply send the POSTCHANGE notification with the old frequency. This
      isn't enough.
      
      One of the core users of these notifications is the code responsible
      for keeping loops_per_jiffy aligned with frequency changes. And mostly
      it is written as:
      
      	if ((val == CPUFREQ_PRECHANGE  && freq->old < freq->new) ||
      	    (val == CPUFREQ_POSTCHANGE && freq->old > freq->new)) {
      		update-loops-per-jiffy...
      	}
      
      So, suppose we are changing to a higher frequency and fail during the
      transition; then the following will happen:
      - CPUFREQ_PRECHANGE notification with freq-new > freq-old
      - CPUFREQ_POSTCHANGE notification with freq-new == freq-old
      
      The first one will update loops_per_jiffy and second one will do
      nothing. Even if we send the 2nd notification by exchanging values of
      freq-new and old, some users of these notifications might get
      unstable.
      
      This can be fixed by simply calling cpufreq_notify_post_transition()
      with error code and this routine will take care of sending
      notifications in the correct order.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [rjw: Folded 3 patches into one, rebased unicore2 changes]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      ab1b1c4e
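The problem and one plausible fix can be simulated in user space. The listener below uses the same condition as the loops_per_jiffy snippet quoted in the commit message; everything else (`demo_freqs`, `demo_listener()`, `demo_notify_post_transition()`, `demo_failed_switch()`) is hypothetical. The ordering shown on failure, a POSTCHANGE for the attempted change followed by a PRECHANGE/POSTCHANGE pair with old and new swapped, is an assumption of this sketch rather than a statement of what the kernel routine does.

```c
#include <assert.h>

enum demo_event { DEMO_PRECHANGE, DEMO_POSTCHANGE };

struct demo_freqs {
	unsigned int old_khz;
	unsigned int new_khz;
};

/* Frequency the listener currently believes the CPU runs at. */
static unsigned int demo_lpj_freq;

/* Listener modelled on the loops_per_jiffy condition quoted above. */
static void demo_listener(enum demo_event val, const struct demo_freqs *f)
{
	if ((val == DEMO_PRECHANGE && f->old_khz < f->new_khz) ||
	    (val == DEMO_POSTCHANGE && f->old_khz > f->new_khz))
		demo_lpj_freq = f->new_khz;
}

/* Finish a transition; on failure, replay it in the reverse direction. */
static void demo_notify_post_transition(struct demo_freqs *f, int failed)
{
	unsigned int tmp;

	assert(f);
	demo_listener(DEMO_POSTCHANGE, f);
	if (!failed)
		return;

	/* Swap old and new, then announce the change back. */
	tmp = f->old_khz;
	f->old_khz = f->new_khz;
	f->new_khz = tmp;
	demo_listener(DEMO_PRECHANGE, f);
	demo_listener(DEMO_POSTCHANGE, f);
}

/* Simulate a switch that fails in hardware; return the listener's view. */
static unsigned int demo_failed_switch(unsigned int from, unsigned int to)
{
	struct demo_freqs f = { from, to };

	demo_lpj_freq = from;
	demo_listener(DEMO_PRECHANGE, &f);
	/* ...the hardware switch fails here... */
	demo_notify_post_transition(&f, 1);
	return demo_lpj_freq;
}
```

In this model the listener ends up back at the original frequency whether the failed switch was going up or down, because it always sees a complete, well-formed transition in each direction.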