1. 19 Mar, 2014 1 commit
  2. 13 Mar, 2014 1 commit
    • cpufreq: Skip current frequency initialization for ->setpolicy drivers · 2ed99e39
      Rafael J. Wysocki authored
      After commit da60ce9f (cpufreq: call cpufreq_driver->get() after
      calling ->init()) __cpufreq_add_dev() sometimes fails for CPUs handled
      by intel_pstate, because that driver may return 0 from its ->get()
      callback if it has not run long enough to collect enough samples on the
      given CPU.  That didn't happen before commit da60ce9f which added
      policy->cur initialization to __cpufreq_add_dev() to help reduce code
      duplication in other cpufreq drivers.
      
      However, the code added by commit da60ce9f need not be executed
      for cpufreq drivers having the ->setpolicy callback defined, because
      the subsequent invocation of cpufreq_set_policy() will use that
      callback to initialize the policy anyway and it doesn't need
      policy->cur to be initialized upfront.  The analogous code in
      cpufreq_update_policy() is also unnecessary for cpufreq drivers
      having ->setpolicy set and may be skipped for them as well.
      
      Since intel_pstate provides ->setpolicy, skipping the upfront
      policy->cur initialization for cpufreq drivers with that callback
      set will cover intel_pstate and the problem it's been having after
      commit da60ce9f will be addressed.
      
      Fixes: da60ce9f (cpufreq: call cpufreq_driver->get() after calling ->init())
      References: https://bugzilla.kernel.org/show_bug.cgi?id=71931
      Reported-and-tested-by: Patrik Lundquist <patrik.lundquist@gmail.com>
      Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
      Cc: 3.13+ <stable@vger.kernel.org> # 3.13+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
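The check described in the commit above can be sketched roughly as follows. This is a simplified userspace model, not the kernel code: `struct policy`, `struct driver`, `init_policy_cur()` and the two stub callbacks are hypothetical stand-ins chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct policy {
    unsigned int cur;                          /* current frequency in kHz */
};

struct driver {
    unsigned int (*get)(unsigned int cpu);     /* may return 0 if unknown */
    int (*setpolicy)(struct policy *policy);   /* provided by e.g. intel_pstate */
};

/* Stub ->get() that has not collected enough samples yet. */
unsigned int get_not_ready(unsigned int cpu)
{
    (void)cpu;
    return 0;
}

/* Stub ->setpolicy() that accepts any policy. */
int stub_setpolicy(struct policy *policy)
{
    (void)policy;
    return 0;
}

/* Returns 0 on success, -1 if the current frequency could not be read. */
int init_policy_cur(const struct driver *drv, struct policy *policy,
                    unsigned int cpu)
{
    /* ->setpolicy drivers do not need policy->cur up front: skip ->get(). */
    if (drv->setpolicy || !drv->get)
        return 0;

    policy->cur = drv->get(cpu);
    if (!policy->cur)
        return -1;                             /* ->get() reported nothing */
    return 0;
}
```

With this shape, a driver that sets `setpolicy` never fails on a zero return from `get`, while other drivers still treat zero as an error.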
  3. 12 Mar, 2014 3 commits
  4. 06 Mar, 2014 6 commits
    • cpufreq: Implement cpufreq_generic_suspend() · e28867ea
      Viresh Kumar authored
      Multiple platforms need to set CPUs to a particular frequency before
      suspending the system, so provide a common infrastructure for them.
      
      Those platforms only need to point their ->suspend callback pointers
      to the generic routine.
      Tested-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [rjw: Changelog]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: suspend governors on system suspend/hibernate · 2f0aea93
      Viresh Kumar authored
      This patch adds cpufreq suspend/resume calls to dpm_{suspend|resume}()
      for handling suspend/resume of cpufreq governors.
      
      Lan Tianyu (Intel) & Jinhyuk Choi (Broadcom) found an issue where the
      tunables configuration for clusters/sockets with non-boot CPUs was
      lost after system suspend/resume, as we were notifying governors with
      CPUFREQ_GOV_POLICY_EXIT on removal of the last CPU for that policy
      which caused the tunables memory to be freed.
      
      This is fixed by preventing any governor operations from being
      carried out between the device suspend and device resume stages of
      system suspend and resume, respectively.
      
      We could have added these callbacks at dpm_{suspend|resume}_noirq()
      level, but there is an additional problem that the majority of I/O
      devices is already suspended at that point and if cpufreq drivers
      want to change the frequency before suspending, then that may not be
      possible on some platforms (which depend on peripherals like i2c,
      regulators, etc).
      Reported-and-tested-by: Lan Tianyu <tianyu.lan@intel.com>
      Reported-by: Jinhyuk Choi <jinchoi@broadcom.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [rjw: Changelog]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: move call to __find_governor() to cpufreq_init_policy() · 6e2c89d1
      Viresh Kumar authored
      We call __find_governor() during the addition of the first CPU of
      each policy from __cpufreq_add_dev() to find the last governor used
      for this CPU before it was hot-removed.
      
      After that we call cpufreq_parse_governor() in cpufreq_init_policy(),
      either with this governor, or with the default governor. Right after
      that policy->governor is set to NULL.
      
      While that code is not functionally problematic, the structure of it
      is suboptimal, because some of the code required in cpufreq_init_policy()
      is being executed by its caller, __cpufreq_add_dev(). So, it would make
      more sense to get all of it together in a single place to make code more
      readable.
      
      Accordingly, move the code needed for policy initialization to
      cpufreq_init_policy() and initialize policy->governor to NULL at the
      beginning.
      
      In order to clean up the code a bit more, some of the #ifdefs for
      CONFIG_HOTPLUG_CPU are dropped too.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [rjw: Changelog]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Initialize governor for a new policy under policy->rwsem · 4e97b631
      Viresh Kumar authored
      policy->rwsem is used to lock access to all parts of code modifying
      struct cpufreq_policy, but it's not used on a new policy created by
      __cpufreq_add_dev().
      
      Because of that, if cpufreq_update_policy() is called in a tight loop
      on one CPU in parallel with offline/online of another CPU, then the
      following crash can be triggered:
      
      Unable to handle kernel NULL pointer dereference at virtual address 00000020
      pgd = c0003000
      [00000020] *pgd=80000000004003, *pmd=00000000
      Internal error: Oops: 206 [#1] PREEMPT SMP ARM
      
      PC is at __cpufreq_governor+0x10/0x1ac
      LR is at cpufreq_update_policy+0x114/0x150
      
      ---[ end trace f23a8defea6cd706 ]---
      Kernel panic - not syncing: Fatal exception
      CPU0: stopping
      CPU: 0 PID: 7136 Comm: mpdecision Tainted: G      D W    3.10.0-gd727407-00074-g979ede8 #396
      
      [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58)
      [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) from [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c)
      [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) from [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8)
      [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) from [<c0803e7c>] (cpufreq_init_policy+0x30/0x98)
      [<c0803e7c>] (cpufreq_init_policy+0x30/0x98) from [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4)
      [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) from [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84)
      [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) from [<c0afe180>] (notifier_call_chain+0x40/0x68)
      [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02812dc>] (__cpu_notify+0x28/0x44)
      [<c02812dc>] (__cpu_notify+0x28/0x44) from [<c0aeed90>] (_cpu_up+0xf4/0x1dc)
      [<c0aeed90>] (_cpu_up+0xf4/0x1dc) from [<c0aeeed4>] (cpu_up+0x5c/0x78)
      [<c0aeeed4>] (cpu_up+0x5c/0x78) from [<c0aec808>] (store_online+0x44/0x74)
      [<c0aec808>] (store_online+0x44/0x74) from [<c03a40f4>] (sysfs_write_file+0x108/0x14c)
      [<c03a40f4>] (sysfs_write_file+0x108/0x14c) from [<c03517d4>] (vfs_write+0xd0/0x180)
      [<c03517d4>] (vfs_write+0xd0/0x180) from [<c0351ca8>] (SyS_write+0x38/0x68)
      [<c0351ca8>] (SyS_write+0x38/0x68) from [<c0205de0>] (ret_fast_syscall+0x0/0x30)
      
      Fix that by taking locks at appropriate places in __cpufreq_add_dev()
      as well.
      Reported-by: Saravana Kannan <skannan@codeaurora.org>
      Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [rjw: Changelog]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Initialize policy before making it available for others to use · 5a7e56a5
      Viresh Kumar authored
      Policy must be fully initialized before it is made available
      for use by others. Otherwise cpufreq_cpu_get() would be able to grab
      a half initialized policy structure that might not have affected_cpus
      (for example) populated. Then, anybody accessing those fields will get
      a wrong value and that will lead to unpredictable results.
      
      In order to fix this, do all the necessary initialization before we
      make the policy structure available via cpufreq_cpu_get(). That will
      guarantee that any code accessing fields of the policy will get
      correct data from them.
      Reported-by: Saravana Kannan <skannan@codeaurora.org>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [rjw: Changelog]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: use cpufreq_cpu_get() to avoid cpufreq_get() race conditions · 999976e0
      Aaron Plattner authored
      If a module calls cpufreq_get while cpufreq is initializing, it's
      possible for it to be called after cpufreq_driver is set but before
      cpufreq_cpu_data is written during subsys_interface_register.  This
      happens because cpufreq_get doesn't take the cpufreq_driver_lock
      around its use of cpufreq_cpu_data.
      
      Fix this by using cpufreq_cpu_get(cpu) to look up the policy rather
      than reading it out of cpufreq_cpu_data directly.  cpufreq_cpu_get()
      takes the appropriate locks to prevent this race from happening.
      
      Since it's possible for policy to be NULL if the caller passes in an
      invalid CPU number or calls the function before cpufreq is initialized,
      delete the BUG_ON(!policy) and simply return 0.  Don't try to return
      -ENOENT because that's negative and the function returns an unsigned
      integer.
      
      References: https://bbs.archlinux.org/viewtopic.php?id=177934
      Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
      Cc: 3.13+ <stable@vger.kernel.org> # 3.13+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
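The point about returning 0 instead of -ENOENT can be illustrated with a small userspace sketch. The lookup table, `lookup_policy()`, `get_freq()` and `get_freq_wrong()` are hypothetical names used only for this example; they are not the kernel functions.

```c
#include <errno.h>
#include <stddef.h>

struct policy {
    unsigned int cur;                 /* current frequency in kHz */
};

/* Hypothetical two-CPU system used only for this sketch. */
static struct policy table[2] = { { 1200000 }, { 800000 } };

/* Returns NULL for a CPU number that has no policy. */
struct policy *lookup_policy(unsigned int cpu)
{
    return cpu < 2 ? &table[cpu] : NULL;
}

/* Behaviour described in the commit: a missing policy yields 0. */
unsigned int get_freq(unsigned int cpu)
{
    struct policy *policy = lookup_policy(cpu);

    if (!policy)
        return 0;
    return policy->cur;
}

/* What would happen if a negative errno were returned instead. */
unsigned int get_freq_wrong(unsigned int cpu)
{
    struct policy *policy = lookup_policy(cpu);

    if (!policy)
        return -ENOENT;               /* converted to a huge unsigned value */
    return policy->cur;
}
```

A caller of `get_freq_wrong()` would see an implausibly large frequency rather than an error, which is why 0 is the safer sentinel for an unsigned return type.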
  5. 01 Mar, 2014 1 commit
  6. 27 Feb, 2014 1 commit
  7. 24 Feb, 2014 2 commits
  8. 19 Feb, 2014 1 commit
    • cpufreq: remove sysfs link when a cpu != policy->cpu, is removed · 6964d91d
      Viresh Kumar authored
      Commit 42f921a6 (cpufreq: remove sysfs files for CPUs which failed to
      come back after resume) tried to do this but missed this piece of code
      to fix.
      
      Currently we are getting this on suspend/resume:
      
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 877 at fs/sysfs/dir.c:52 sysfs_warn_dup+0x68/0x84()
      sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
      Modules linked in: brcmfmac brcmutil
      CPU: 0 PID: 877 Comm: test-rtc-resume Not tainted 3.14.0-rc2-00259-g9398a10c #12
      [<c0015bac>] (unwind_backtrace) from [<c0011850>] (show_stack+0x10/0x14)
      [<c0011850>] (show_stack) from [<c056e018>] (dump_stack+0x80/0xcc)
      [<c056e018>] (dump_stack) from [<c0025e44>] (warn_slowpath_common+0x64/0x88)
      [<c0025e44>] (warn_slowpath_common) from [<c0025efc>] (warn_slowpath_fmt+0x30/0x40)
      [<c0025efc>] (warn_slowpath_fmt) from [<c012776c>] (sysfs_warn_dup+0x68/0x84)
      [<c012776c>] (sysfs_warn_dup) from [<c0127a54>] (sysfs_do_create_link_sd+0xb0/0xb8)
      [<c0127a54>] (sysfs_do_create_link_sd) from [<c038ef64>] (__cpufreq_add_dev.isra.27+0x2a8/0x814)
      [<c038ef64>] (__cpufreq_add_dev.isra.27) from [<c038f548>] (cpufreq_cpu_callback+0x70/0x8c)
      [<c038f548>] (cpufreq_cpu_callback) from [<c0043864>] (notifier_call_chain+0x44/0x84)
      [<c0043864>] (notifier_call_chain) from [<c0025f60>] (__cpu_notify+0x28/0x44)
      [<c0025f60>] (__cpu_notify) from [<c00261e8>] (_cpu_up+0xf0/0x140)
      [<c00261e8>] (_cpu_up) from [<c0569eb8>] (enable_nonboot_cpus+0x68/0xb0)
      [<c0569eb8>] (enable_nonboot_cpus) from [<c006339c>] (suspend_devices_and_enter+0x198/0x2dc)
      [<c006339c>] (suspend_devices_and_enter) from [<c0063654>] (pm_suspend+0x174/0x1e8)
      [<c0063654>] (pm_suspend) from [<c00624e0>] (state_store+0x6c/0xbc)
      [<c00624e0>] (state_store) from [<c01fc200>] (kobj_attr_store+0x14/0x20)
      [<c01fc200>] (kobj_attr_store) from [<c0126e50>] (sysfs_kf_write+0x44/0x48)
      [<c0126e50>] (sysfs_kf_write) from [<c012a274>] (kernfs_fop_write+0xb4/0x14c)
      [<c012a274>] (kernfs_fop_write) from [<c00d4818>] (vfs_write+0xa8/0x180)
      [<c00d4818>] (vfs_write) from [<c00d4bb8>] (SyS_write+0x3c/0x70)
      [<c00d4bb8>] (SyS_write) from [<c000e620>] (ret_fast_syscall+0x0/0x30)
      ---[ end trace 76969904b614c18f ]---
      
      Fix this by removing the sysfs link to the cpufreq directory when the
      CPU being removed isn't policy->cpu.
      
      Fixes: 42f921a6 (cpufreq: remove sysfs files for CPUs which failed to come back after resume)
      Reported-and-tested-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  9. 17 Jan, 2014 3 commits
    • cpufreq: Add boost frequency support in core · 6f19efc0
      Lukasz Majewski authored
      This commit adds boost frequency support in cpufreq core (Hardware &
      Software). Some SoCs (like Exynos4 - e.g. 4x12) allow setting frequency
      above its normal operation limits. Such mode shall be only used for a
      short time.
      
      Overclocking (boost) support is essentially provided by platform
      dependent cpufreq driver.
      
      This commit unifies support for SW and HW (Intel) overclocking solutions
      in the core cpufreq driver. Previously the "boost" sysfs attribute was
      defined in the ACPI processor driver code. By default boost is disabled.
      One global attribute is available at: /sys/devices/system/cpu/cpufreq/boost.
      
      It only shows up when cpufreq driver supports overclocking.
      Under the hood frequencies dedicated for boosting are marked with a
      special flag (CPUFREQ_BOOST_FREQ) at driver's frequency table.
      It is the user's concern to enable/disable overclocking with a proper call
      to sysfs.
      
      The cpufreq_boost_trigger_state() function is defined non static on purpose.
      It is used later with thermal subsystem to provide automatic enable/disable
      of the BOOST feature.
      Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
      Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
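The idea of marking boost entries in a frequency table with a flag could look something like the sketch below. `FREQ_FLAG_BOOST`, `struct freq_entry` and `max_allowed_khz()` are hypothetical stand-ins for illustration; the commit itself only names the flag CPUFREQ_BOOST_FREQ and does not show this code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the CPUFREQ_BOOST_FREQ flag. */
#define FREQ_FLAG_BOOST 0x1u

struct freq_entry {
    unsigned int flags;
    unsigned int khz;
};

/*
 * Returns the highest usable frequency in the table. Entries flagged
 * as boost frequencies are skipped unless boost is enabled.
 */
unsigned int max_allowed_khz(const struct freq_entry *table, size_t n,
                             bool boost_enabled)
{
    unsigned int max = 0;

    for (size_t i = 0; i < n; i++) {
        if ((table[i].flags & FREQ_FLAG_BOOST) && !boost_enabled)
            continue;
        if (table[i].khz > max)
            max = table[i].khz;
    }
    return max;
}
```

Toggling the boolean plays the role of writing to the global boost attribute: the table itself does not change, only which entries are considered.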
    • cpufreq: introduce cpufreq_generic_get() routine · 652ed95d
      Viresh Kumar authored
      CPUFreq drivers that use the clock framework interface, i.e. clk_get_rate(),
      to get the CPU's clk rate share very similar code.
      
      This patch adds a generic ->get() which does the same thing for them.
      All those drivers now need to do is set .get to cpufreq_generic_get()
      and set their clk pointer in policy->clk during ->init().
      Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Acked-by: Shawn Guo <shawn.guo@linaro.org>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Acked-by: Shawn Guo <shawn.guo@linaro.org>
      Acked-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
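A generic ->get() of the kind described above might be modelled in userspace as follows. `struct clk`, `clk_get_rate_stub()` and `generic_get()` are hypothetical stand-ins, and the Hz to kHz conversion is an assumption made for this sketch based on cpufreq reporting frequencies in kHz.

```c
#include <stddef.h>

/* Hypothetical stand-in for a clock handle that reports a rate in Hz. */
struct clk {
    unsigned long rate_hz;
};

/* Stub playing the role of clk_get_rate() in this sketch. */
unsigned long clk_get_rate_stub(const struct clk *clk)
{
    return clk->rate_hz;
}

struct policy {
    struct clk *clk;                  /* set by the driver during ->init() */
};

/* Generic ->get(): read the clock rate and report it in kHz. */
unsigned int generic_get(const struct policy *policy)
{
    if (!policy || !policy->clk)
        return 0;                     /* no clock registered for this policy */
    return (unsigned int)(clk_get_rate_stub(policy->clk) / 1000);
}
```

A driver using this pattern only has to store its clock in the policy during initialization; the frequency read-out itself needs no driver-specific code.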
    • cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly · fcd7af91
      Viresh Kumar authored
      There are several problems with the way cpufreq stats handles
      cpufreq_unregister_driver() and suspend/resume.
      
       - We must not lose data collected so far when suspend/resume happens
         and so stats directories must not be removed/allocated during these
         operations, which is done currently.
      
       - cpufreq_stat has registered notifiers with both cpufreq and hotplug.
         It adds sysfs stats directory with a cpufreq notifier: CPUFREQ_NOTIFY
         and removes this directory with a notifier from hotplug core.
      
         In case cpufreq_unregister_driver() is called (on rmmod cpufreq driver),
         stats directories per cpu aren't removed as CPUs are still online. The
         only call cpufreq_stats gets is cpufreq_stats_update_policy_cpu() for
         all CPUs except the last of each policy. And pointer to stat information
         is stored in the entry for last CPU in the per-cpu cpufreq_stats_table.
         But policy structure would be freed inside cpufreq core and so that will
         result in memory leak inside cpufreq stats (as we are never freeing
         memory for stats).
      
         Now if we again insert the module cpufreq_register_driver() will be
         called and we will again allocate stats data and put it on for first
         CPU of every policy.  In case we only have a single CPU per policy, we
         will return with a error from cpufreq_stats_create_table() due to this
         code:
      
      	if (per_cpu(cpufreq_stats_table, cpu))
      		return -EBUSY;
      
         And so probably cpufreq stats directory would not show up anymore (as
         it was added inside last policies->kobj which doesn't exist anymore).
         I haven't tested it, though. Also the values in stats files wouldn't
         be refreshed as we are using the earlier stats structure.
      
       - CPUFREQ_NOTIFY is called from cpufreq_set_policy() which is called for
         scenarios where we don't really want cpufreq_stat_notifier_policy() to get
         called. For example whenever we are changing anything related to a policy:
         min/max/current freq, etc. cpufreq_set_policy() is called and so cpufreq
         stats is notified. There we don't do anything useful other than simply
         returning -EBUSY from cpufreq_stats_create_table(). And so this
         isn't the right notifier for cpufreq stats to use.
      
       For all of the above reasons, this patch makes the following changes:
       - Add new notifiers CPUFREQ_CREATE_POLICY and CPUFREQ_REMOVE_POLICY,
         which are only called when a policy is created/destroyed. They aren't
         called for suspend/resume paths.
       - Use these notifiers in cpufreq_stat_notifier_policy() to create/destroy
         stats sysfs entries. And so cpufreq_unregister_driver() or suspend/resume
         shouldn't be a problem for cpufreq_stats.
       - Return early from cpufreq_stat_cpu_callback() for suspend/resume sequence,
         so that we don't free stats structure.
      Acked-by: Nicolas Pitre <nico@linaro.org>
      Tested-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  10. 06 Jan, 2014 4 commits
    • cpufreq: Make sure CPU is running on a freq from freq-table · d3916691
      Viresh Kumar authored
      Sometimes boot loaders set the CPU frequency to a value outside of the
      frequency table known to the cpufreq core. In such cases the CPU might be
      unstable if it has to run at that frequency for a long time, so it's better
      to set it to a frequency which is specified in the freq-table. An
      out-of-table frequency also makes cpufreq stats inconsistent, as cpufreq-stats
      would fail to register because the current frequency of the CPU isn't found
      in the freq-table.
      
      Because we don't want this change to affect boot process badly, we go for the
      next freq which is >= policy->cur ('cur' must be set by now, otherwise we will
      end up setting freq to lowest of the table as 'cur' is initialized to zero).
      
      In case current frequency doesn't match any frequency from freq-table, we throw
      warnings to user, so that user can get this fixed in their bootloaders or
      freq-tables.
      Reported-by: Carlos Hernandez <ceh@ti.com>
      Reported-and-tested-by: Nishanth Menon <nm@ti.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
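Choosing "the next freq which is >= policy->cur" can be sketched as a small table search. `pick_table_freq()` is a hypothetical helper written for this example; falling back to the highest entry when the current frequency is above every table entry is an assumption of the sketch, not something the commit message states.

```c
#include <stddef.h>

/*
 * Returns the lowest table frequency that is >= cur (all values in kHz).
 * If cur is above every entry, the highest entry is returned instead
 * (an assumption made for this sketch). Returns 0 for an empty table.
 */
unsigned int pick_table_freq(const unsigned int *table, size_t n,
                             unsigned int cur)
{
    unsigned int best = 0;
    unsigned int highest = 0;

    for (size_t i = 0; i < n; i++) {
        if (table[i] > highest)
            highest = table[i];
        if (table[i] >= cur && (best == 0 || table[i] < best))
            best = table[i];
    }
    return best ? best : highest;
}
```

Note that a `cur` of zero selects the lowest table entry, which matches the caveat in the commit message about `cur` needing to be set before this step runs.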
    • cpufreq: send new set of notification for transition failures · ab1b1c4e
      Viresh Kumar authored
      In the current code, if we fail during a frequency transition, we
      simply send the POSTCHANGE notification with the old frequency. This
      isn't enough.
      
      One of the core users of these notifications is the code responsible
      for keeping loops_per_jiffy aligned with frequency changes. And mostly
      it is written as:
      
      	if ((val == CPUFREQ_PRECHANGE  && freq->old < freq->new) ||
      	    (val == CPUFREQ_POSTCHANGE && freq->old > freq->new)) {
      		update-loops-per-jiffy...
      	}
      
      So, suppose we are changing to a higher frequency and fail during the
      transition; then the following will happen:
      - CPUFREQ_PRECHANGE notification with freq-new > freq-old
      - CPUFREQ_POSTCHANGE notification with freq-new == freq-old
      
      The first one will update loops_per_jiffy and second one will do
      nothing. Even if we send the 2nd notification by exchanging values of
      freq-new and old, some users of these notifications might get
      unstable.
      
      This can be fixed by simply calling cpufreq_notify_post_transition()
      with error code and this routine will take care of sending
      notifications in the correct order.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [rjw: Folded 3 patches into one, rebased unicore2 changes]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
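The effect of a failed transition on a listener written like the loops_per_jiffy code above can be simulated in userspace. Everything here is a hypothetical model: `tracked_khz` stands in for frequency-dependent state, and `notify_post_transition()` is a sketch of the ordering the commit describes (POSTCHANGE, then a reversed PRECHANGE/POSTCHANGE pair on failure), not the kernel routine itself.

```c
enum event { PRECHANGE, POSTCHANGE };

struct freqs {
    unsigned int old_khz;
    unsigned int new_khz;
};

/* Stand-in for state such as loops_per_jiffy that follows the frequency. */
static unsigned int tracked_khz;

/* Listener written the way the commit message describes. */
static void listener(enum event ev, const struct freqs *f)
{
    if ((ev == PRECHANGE && f->old_khz < f->new_khz) ||
        (ev == POSTCHANGE && f->old_khz > f->new_khz))
        tracked_khz = f->new_khz;
}

/* Sketch of the helper: POSTCHANGE, then on failure a reversed pair. */
static void notify_post_transition(struct freqs *f, int failed)
{
    listener(POSTCHANGE, f);
    if (!failed)
        return;

    unsigned int tmp = f->old_khz;    /* swap old and new */
    f->old_khz = f->new_khz;
    f->new_khz = tmp;
    listener(PRECHANGE, f);
    listener(POSTCHANGE, f);
}

/* Old behaviour on failure: a single POSTCHANGE with new == old. */
unsigned int failed_switch_old(unsigned int from, unsigned int to)
{
    struct freqs f = { from, to };

    tracked_khz = from;
    listener(PRECHANGE, &f);
    f.new_khz = f.old_khz;
    listener(POSTCHANGE, &f);
    return tracked_khz;
}

/* New behaviour on failure: let the helper send the full sequence. */
unsigned int failed_switch_new(unsigned int from, unsigned int to)
{
    struct freqs f = { from, to };

    tracked_khz = from;
    listener(PRECHANGE, &f);
    notify_post_transition(&f, 1);
    return tracked_khz;
}
```

In this model a failed upward switch leaves the listener's state at the target frequency under the old scheme, while the reversed notification pair brings it back to the frequency the CPU is really running at.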
    • cpufreq: Introduce cpufreq_notify_post_transition() · f7ba3b41
      Viresh Kumar authored
      This introduces a new routine cpufreq_notify_post_transition() which
      can be used to send POSTCHANGE notification for new freq with or
      without both {PRE|POST}CHANGE notifications for last freq. This is
      useful at multiple places, especially for sending transition failure
      notifications.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Fix timer/workqueue corruption by protecting reading governor_enabled · 6f1e4efd
      Jane Li authored
      When a CPU is hot removed we'll cancel all the delayed work items via
      gov_cancel_work(). Sometimes the delayed work function determines that
      it should adjust the delay for all other CPUs that the policy is
      managing. If this scenario occurs, the canceling CPU will cancel its own
      work but queue up the other CPUs works to run.
      
      Commit 3617f2 (cpufreq: Fix timer/workqueue corruption due to double
      queueing) has tried to fix this, but reading governor_enabled is not
      protected by cpufreq_governor_lock. Even though od_dbs_timer() checks
      governor_enabled before gov_queue_work(), this scenario may occur. For
      example:
      
       CPU0                                        CPU1
       ----                                        ----
       cpu_down()
        ...                                        <work runs>
        __cpufreq_remove_dev()                     od_dbs_timer()
         __cpufreq_governor()                       policy->governor_enabled
          policy->governor_enabled = false;
          cpufreq_governor_dbs()
           case CPUFREQ_GOV_STOP:
            gov_cancel_work(dbs_data, policy);
             cpu0 work is canceled
              timer is canceled
              cpu1 work is canceled
              <waits for cpu1>
                                                    gov_queue_work(*, *, true);
                                                     cpu0 work queued
                                                     cpu1 work queued
                                                     cpu2 work queued
                                                     ...
              cpu1 work is canceled
              cpu2 work is canceled
              ...
      
      At the end of the GOV_STOP case cpu0 still has a work queued to
      run although the code is expecting all of the works to be
      canceled. __cpufreq_remove_dev() will then proceed to
      re-initialize all the other CPUs works except for the CPU that is
      going down. The CPUFREQ_GOV_START case in cpufreq_governor_dbs()
      will trample over the queued work and debugobjects will spit out
      a warning:
      
      WARNING: at lib/debugobjects.c:260 debug_print_object+0x94/0xbc()
      ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x14
      Modules linked in:
      CPU: 1 PID: 1205 Comm: sh Tainted: G        W    3.10.0 #200
      [<c01144f0>] (unwind_backtrace+0x0/0xf8) from [<c0111d98>] (show_stack+0x10/0x14)
      [<c0111d98>] (show_stack+0x10/0x14) from [<c01272cc>] (warn_slowpath_common+0x4c/0x68)
      [<c01272cc>] (warn_slowpath_common+0x4c/0x68) from [<c012737c>] (warn_slowpath_fmt+0x30/0x40)
      [<c012737c>] (warn_slowpath_fmt+0x30/0x40) from [<c034c640>] (debug_print_object+0x94/0xbc)
      [<c034c640>] (debug_print_object+0x94/0xbc) from [<c034c7f8>] (__debug_object_init+0xc8/0x3c0)
      [<c034c7f8>] (__debug_object_init+0xc8/0x3c0) from [<c01360e0>] (init_timer_key+0x20/0x104)
      [<c01360e0>] (init_timer_key+0x20/0x104) from [<c04872ac>] (cpufreq_governor_dbs+0x1dc/0x68c)
      [<c04872ac>] (cpufreq_governor_dbs+0x1dc/0x68c) from [<c04833a8>] (__cpufreq_governor+0x80/0x1b0)
      [<c04833a8>] (__cpufreq_governor+0x80/0x1b0) from [<c0483704>] (__cpufreq_remove_dev.isra.12+0x22c/0x380)
      [<c0483704>] (__cpufreq_remove_dev.isra.12+0x22c/0x380) from [<c0692f38>] (cpufreq_cpu_callback+0x48/0x5c)
      [<c0692f38>] (cpufreq_cpu_callback+0x48/0x5c) from [<c014fb40>] (notifier_call_chain+0x44/0x84)
      [<c014fb40>] (notifier_call_chain+0x44/0x84) from [<c012ae44>] (__cpu_notify+0x2c/0x48)
      [<c012ae44>] (__cpu_notify+0x2c/0x48) from [<c068dd40>] (_cpu_down+0x80/0x258)
      [<c068dd40>] (_cpu_down+0x80/0x258) from [<c068df40>] (cpu_down+0x28/0x3c)
      [<c068df40>] (cpu_down+0x28/0x3c) from [<c068e4c0>] (store_online+0x30/0x74)
      [<c068e4c0>] (store_online+0x30/0x74) from [<c03a7308>] (dev_attr_store+0x18/0x24)
      [<c03a7308>] (dev_attr_store+0x18/0x24) from [<c0256fe0>] (sysfs_write_file+0x100/0x180)
      [<c0256fe0>] (sysfs_write_file+0x100/0x180) from [<c01fec9c>] (vfs_write+0xbc/0x184)
      [<c01fec9c>] (vfs_write+0xbc/0x184) from [<c01ff034>] (SyS_write+0x40/0x68)
      [<c01ff034>] (SyS_write+0x40/0x68) from [<c010e200>] (ret_fast_syscall+0x0/0x48)
      
      In gov_queue_work(), take cpufreq_governor_lock before reading
      governor_enabled and release it after __gov_queue_work(). This way,
      governor_enabled is guaranteed not to change inside gov_queue_work().
      Signed-off-by: Jane Li <jiel@marvell.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  11. 29 Dec, 2013 2 commits
  12. 22 Dec, 2013 2 commits
    • cpufreq: Use CONFIG_CPU_FREQ_DEFAULT_* to set initial policy for setpolicy drivers · a27a9ab7
      Jason Baron authored
      When configuring a default governor (via CONFIG_CPU_FREQ_DEFAULT_*) with the
      intel_pstate driver, the desired default policy is not properly set. For
      example, setting 'CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE' ends up with the
      'powersave' policy being set.
      
      Fix by configuring the correct default policy if either 'powersave' or
      'performance' is requested. Otherwise, fall back to what the driver originally
      set via its 'init' routine.
      Signed-off-by: Jason Baron <jbaron@akamai.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
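The mapping from a configured default governor name to an initial policy for ->setpolicy drivers could be sketched like this. The enum values and `default_policy()` are hypothetical names for illustration only; the sketch just mirrors the rule stated in the commit message.

```c
#include <string.h>

/* Hypothetical policy kinds for this sketch. */
enum policy_kind {
    POLICY_POWERSAVE = 1,
    POLICY_PERFORMANCE = 2
};

/*
 * Picks the initial policy from the configured default governor name.
 * Any name other than "performance" or "powersave" keeps whatever the
 * driver chose in its init routine.
 */
enum policy_kind default_policy(const char *gov_name,
                                enum policy_kind driver_default)
{
    if (strcmp(gov_name, "performance") == 0)
        return POLICY_PERFORMANCE;
    if (strcmp(gov_name, "powersave") == 0)
        return POLICY_POWERSAVE;
    return driver_default;
}
```

With this rule, a default of "performance" no longer ends up as powersave just because the driver's init routine happened to pick it.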
    • cpufreq: remove sysfs files for CPUs which failed to come back after resume · 42f921a6
      Viresh Kumar authored
      There are cases where cpufreq_add_dev() may fail for some CPUs
      during system resume. With the current code we will still have
      sysfs cpufreq files for those CPUs and struct cpufreq_policy
      would be already freed for them. Hence any operation on those
      sysfs files would result in kernel warnings.
      
      Example of problems resulting from resume errors (from Bjørn Mork):
      
      WARNING: CPU: 0 PID: 6055 at fs/sysfs/file.c:343 sysfs_open_file+0x77/0x212()
      missing sysfs attribute operations for kobject: (null)
      Modules linked in: [stripped as irrelevant]
      CPU: 0 PID: 6055 Comm: grep Tainted: G      D      3.13.0-rc2 #153
      Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
       0000000000000009 ffff8802327ebb78 ffffffff81380b0e 0000000000000006
       ffff8802327ebbc8 ffff8802327ebbb8 ffffffff81038635 0000000000000000
       ffffffff811823c7 ffff88021a19e688 ffff88021a19e688 ffff8802302f9310
      Call Trace:
       [<ffffffff81380b0e>] dump_stack+0x55/0x76
       [<ffffffff81038635>] warn_slowpath_common+0x7c/0x96
       [<ffffffff811823c7>] ? sysfs_open_file+0x77/0x212
       [<ffffffff810386e3>] warn_slowpath_fmt+0x41/0x43
       [<ffffffff81182dec>] ? sysfs_get_active+0x6b/0x82
       [<ffffffff81182382>] ? sysfs_open_file+0x32/0x212
       [<ffffffff811823c7>] sysfs_open_file+0x77/0x212
       [<ffffffff81182350>] ? sysfs_schedule_callback+0x1ac/0x1ac
       [<ffffffff81122562>] do_dentry_open+0x17c/0x257
       [<ffffffff8112267e>] finish_open+0x41/0x4f
       [<ffffffff81130225>] do_last+0x80c/0x9ba
       [<ffffffff8112dbbd>] ? inode_permission+0x40/0x42
       [<ffffffff81130606>] path_openat+0x233/0x4a1
       [<ffffffff81130b7e>] do_filp_open+0x35/0x85
       [<ffffffff8113b787>] ? __alloc_fd+0x172/0x184
       [<ffffffff811232ea>] do_sys_open+0x6b/0xfa
       [<ffffffff811233a7>] SyS_openat+0xf/0x11
       [<ffffffff8138c812>] system_call_fastpath+0x16/0x1b
      
      To fix this, remove those sysfs files or put the associated kobject
      in case of such errors. Also, to make it simple, remove the cpufreq
      sysfs links from all the CPUs (except for the policy->cpu) during
      suspend, as that operation won't result in a loss of sysfs file
      permissions and we can create those links during resume just fine.
      
      Fixes: 5302c3fb ("cpufreq: Perform light-weight init/teardown during suspend/resume")
      Reported-and-tested-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: 3.12+ <stable@vger.kernel.org> # 3.12+
      [rjw: Changelog]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  13. 08 Dec, 2013 2 commits
  14. 03 Dec, 2013 1 commit
    • cpufreq: fix garbage kobjects on errors during suspend/resume · 2167e239
      Bjørn Mork authored
      This is effectively a revert of commit 5302c3fb ("cpufreq: Perform
      light-weight init/teardown during suspend/resume"), which enabled
      suspend/resume optimizations leaving the sysfs files in place.
      
      Errors during suspend/resume are not handled properly, leaving
      dead sysfs attributes in case of failures.  There are a number of
      functions with special code for the "frozen" case, and all these
      need to also have special error handling.
      
      The problem is easy to demonstrate by making cpufreq_driver->init()
      or cpufreq_driver->get() fail during resume.
      
      The code is too complex for a simple fix, with split code paths
      in multiple blocks within a number of functions.  It is therefore
      best to revert the patch enabling this code until the error handling
      is in place.
      
      Examples of problems resulting from resume errors:
      
      WARNING: CPU: 0 PID: 6055 at fs/sysfs/file.c:343 sysfs_open_file+0x77/0x212()
      missing sysfs attribute operations for kobject: (null)
      Modules linked in: [stripped as irrelevant]
      CPU: 0 PID: 6055 Comm: grep Tainted: G      D      3.13.0-rc2 #153
      Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
       0000000000000009 ffff8802327ebb78 ffffffff81380b0e 0000000000000006
       ffff8802327ebbc8 ffff8802327ebbb8 ffffffff81038635 0000000000000000
       ffffffff811823c7 ffff88021a19e688 ffff88021a19e688 ffff8802302f9310
      Call Trace:
       [<ffffffff81380b0e>] dump_stack+0x55/0x76
       [<ffffffff81038635>] warn_slowpath_common+0x7c/0x96
       [<ffffffff811823c7>] ? sysfs_open_file+0x77/0x212
       [<ffffffff810386e3>] warn_slowpath_fmt+0x41/0x43
       [<ffffffff81182dec>] ? sysfs_get_active+0x6b/0x82
       [<ffffffff81182382>] ? sysfs_open_file+0x32/0x212
       [<ffffffff811823c7>] sysfs_open_file+0x77/0x212
       [<ffffffff81182350>] ? sysfs_schedule_callback+0x1ac/0x1ac
       [<ffffffff81122562>] do_dentry_open+0x17c/0x257
       [<ffffffff8112267e>] finish_open+0x41/0x4f
       [<ffffffff81130225>] do_last+0x80c/0x9ba
       [<ffffffff8112dbbd>] ? inode_permission+0x40/0x42
       [<ffffffff81130606>] path_openat+0x233/0x4a1
       [<ffffffff81130b7e>] do_filp_open+0x35/0x85
       [<ffffffff8113b787>] ? __alloc_fd+0x172/0x184
       [<ffffffff811232ea>] do_sys_open+0x6b/0xfa
       [<ffffffff811233a7>] SyS_openat+0xf/0x11
       [<ffffffff8138c812>] system_call_fastpath+0x16/0x1b
      
      The failure to restore cpufreq devices on cancelled hibernation is
      not a new bug. It is caused by the ACPI _PPC call failing unless the
      hibernate is completed. This makes the acpi_cpufreq driver fail its
      init.
      
      Previously, the cpufreq device could be restored by offlining the
      cpu temporarily.  And as a complete hibernation cycle would do this,
      it would be automatically restored most of the time.  But after
      commit 5302c3fb the leftover sysfs attributes will block any
      device add action.  Therefore offlining and onlining CPU 1 will no
      longer restore the cpufreq object, and a complete suspend/resume
      cycle will replace it with garbage.
      
      Fixes: 5302c3fb ("cpufreq: Perform light-weight init/teardown during suspend/resume")
      Cc: 3.12+ <stable@vger.kernel.org> # 3.12+
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  15. 28 Nov, 2013 1 commit
  16. 31 Oct, 2013 1 commit
  17. 26 Oct, 2013 2 commits
  18. 17 Oct, 2013 1 commit
  19. 16 Oct, 2013 5 commits