1. 25 Jul, 2015 1 commit
    • cpufreq: Remove cpufreq_rwsem · 454d3a25
      Sebastian Andrzej Siewior committed
      cpufreq_rwsem was introduced in commit 6eed9404 ("cpufreq: Use
      rwsem for protecting critical sections") in order to replace
      try_module_get() on the cpu-freq driver. That try_module_get() worked
      well until the refcount was so heavily used that module removal became
      more or less impossible.
      
      However, given the various (undocumented) protection mechanisms
      in that code, the cpufreq_rwsem locking sites sprinkled randomly
      around it are superfluous.
      
      The policy, which is acquired in cpufreq_cpu_get() and released in
      cpufreq_cpu_put() is sufficiently protected already.
      
        cpufreq_cpu_get(cpu)
          /* Protects against concurrent driver removal */
          read_lock_irqsave(&cpufreq_driver_lock, flags);
          policy = per_cpu(cpufreq_cpu_data, cpu);
          kobject_get(&policy->kobj);
          read_unlock_irqrestore(&cpufreq_driver_lock, flags);
      
      The reference on the policy serializes versus module unload already:
      
        cpufreq_unregister_driver()
          subsys_interface_unregister()
            __cpufreq_remove_dev_finish()
              per_cpu(cpufreq_cpu_data) = NULL;
              cpufreq_policy_put_kobj()
      
      If there is a reference held on the policy, i.e. obtained prior to the
      unregister call, then cpufreq_policy_put_kobj() will wait until that
      reference is dropped. So once subsys_interface_unregister() returns
      there is no policy pointer in flight and no new reference can be
      obtained. So that rwsem protection is useless.
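
The lifetime rule described above can be modelled in a few lines of userspace C. This is only a single-threaded sketch: `policy_model`, `model_cpu_get()`, `model_cpu_put()` and `model_unregister()` are hypothetical names, the `cpufreq_driver_lock` is left out, and where the kernel waits for the last reference the model simply defers the release.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for struct cpufreq_policy; only the refcount matters. */
struct policy_model {
    int refcount;   /* models the reference count of policy->kobj */
    int released;   /* set once the last reference has been dropped */
};

/* Models the per-CPU cpufreq_cpu_data pointer. */
static struct policy_model *cpu_data[NR_CPUS];

/* Models cpufreq_cpu_get(): look up the pointer and take a reference.
 * The kernel does this under read_lock_irqsave(&cpufreq_driver_lock). */
static struct policy_model *model_cpu_get(int cpu)
{
    struct policy_model *p = cpu_data[cpu];

    if (p)
        p->refcount++;
    return p;
}

/* Models cpufreq_cpu_put(): drop a reference; the last put releases. */
static void model_cpu_put(struct policy_model *p)
{
    assert(p->refcount > 0);
    if (--p->refcount == 0)
        p->released = 1;
}

/* Models the unregister path: unpublish the pointer first, then drop
 * the initial reference that kept the policy alive. */
static void model_unregister(int cpu)
{
    struct policy_model *p = cpu_data[cpu];

    cpu_data[cpu] = NULL;
    if (p)
        model_cpu_put(p);
}
```

In this model, a reference taken before `model_unregister()` keeps the policy alive until it is put, and any lookup after it returns NULL, which is the property the commit message relies on.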
      
      The other usage of cpufreq_rwsem in show()/store() of the sysfs
      interface is redundant as well because sysfs already does the proper
      kobject_get()/put() pairs.
      
      That leaves CPU hotplug versus module removal. The current
      down_write() around the write_lock() in cpufreq_unregister_driver() is
      silly at best as it actually protects nothing.
      
      The trivial solution to this is to prevent hotplug across
      cpufreq_unregister_driver() completely.

      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  2. 21 Jul, 2015 1 commit
  3. 17 Jul, 2015 2 commits
  4. 10 Jul, 2015 2 commits
    • cpufreq: Allow freq_table to be obtained for offline CPUs · 5a31d594
      Viresh Kumar committed
      Users of freq table may want to access it for any CPU from
      policy->related_cpus mask. One such user is cpu-cooling layer. It gets a
      list of 'clip_cpus' (equivalent to policy->related_cpus) during
      registration and tries to get freq_table for the first CPU of this mask.
      
      If the CPU, for which it tries to fetch freq_table, is offline,
      cpufreq_frequency_get_table() fails. This happens because it relies on
      cpufreq_cpu_get_raw() for its functioning which returns policy only for
      online CPUs.
      
      The fix is to access the policy data structure for the given CPU
      directly (which also returns a valid policy for offline CPUs), but the
      policy itself has to be active (meaning that at least one CPU using it
      is online) for the frequency table to be returned.
      
      Because we will be using 'cpufreq_cpu_data' now, which is internal to
      the cpufreq core, move cpufreq_frequency_get_table() to cpufreq.c.
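
The lookup rule described above can be sketched as follows. This is a simplified userspace model, not the kernel code: `policy_model`, its `online_mask` field, `policy_is_active()` and `model_get_table()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

struct freq_entry {
    unsigned int khz;
};

/* Hypothetical, simplified policy shared by a group of related CPUs. */
struct policy_model {
    unsigned int online_mask;   /* bit n set: CPU n uses this policy and is online */
    struct freq_entry *table;   /* frequency table owned by the policy */
};

/* Models per-CPU cpufreq_cpu_data: set for every related CPU, online or not. */
static struct policy_model *cpu_data[NR_CPUS];

/* A policy is active when at least one CPU using it is online. */
static int policy_is_active(const struct policy_model *p)
{
    assert(p != NULL);
    return p->online_mask != 0;
}

/* Return the table for any related CPU, as long as its policy is active. */
static struct freq_entry *model_get_table(int cpu)
{
    struct policy_model *p;

    if (cpu < 0 || cpu >= NR_CPUS)
        return NULL;

    p = cpu_data[cpu];          /* direct access, no online check on 'cpu' */
    if (!p || !policy_is_active(p))
        return NULL;

    return p->table;
}
```

With CPUs 0 and 1 sharing a policy and only CPU 1 online, `model_get_table(0)` still returns the table; once both are offline it returns NULL.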

      Reported-and-tested-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Initialize the governor again while restoring policy · 35afd02e
      Viresh Kumar committed
      When all CPUs of a policy are hot-unplugged, we EXIT the governor but
      don't mark policy->governor as NULL. This was done in order to keep last
      used governor's information intact in sysfs, while the CPUs are offline.
      
      But we also need to clear policy->governor when restoring the policy.
      
      Because policy->governor still points to the last governor while the
      policy is restored, the following sequence of events happens:
       - cpufreq_init_policy() called while restoring policy
       - find_governor() matches last_governor string for present governors and
         returns last used governor's pointer, say ondemand. policy->governor
         already has the same address, unless the governor was removed in
         between.
       - cpufreq_set_policy() is called with the governor of both the old and
         the new policy set to ondemand.
       - Because governors matched, we skip governor initialization and return
         after calling __cpufreq_governor(CPUFREQ_GOV_LIMITS). Because the
         governor wasn't initialized for this policy, it returned -EBUSY.
       - cpufreq_init_policy() exits the policy on this error, but doesn't
         destroy it properly (should be fixed separately).
       - And so we enter a scenario where the policy isn't completely
         initialized but used.
      
      Fix this by setting policy->governor to NULL while restoring the policy.
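
The failure sequence and the fix can be reproduced with a small userspace model. All names here (`policy_model`, `model_set_policy()`, `model_restore()`, the `apply_fix` flag) are hypothetical and only mirror the steps listed above; the real functions do considerably more.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct governor_model {
    const char *name;
};

struct policy_model {
    struct governor_model *governor;  /* models policy->governor */
    int gov_initialized;              /* governor set up for this policy? */
};

/* Models the limits update: it fails with -EBUSY when the governor was
 * never initialized for this policy. */
static int model_gov_limits(const struct policy_model *p)
{
    return p->gov_initialized ? 0 : -EBUSY;
}

/* Models the governor handling in cpufreq_set_policy(). */
static int model_set_policy(struct policy_model *p, struct governor_model *new_gov)
{
    assert(new_gov != NULL);

    if (p->governor == new_gov)
        return model_gov_limits(p);   /* same governor: initialization skipped */

    p->governor = new_gov;
    p->gov_initialized = 1;           /* governor initialized and started */
    return model_gov_limits(p);
}

/* Models restoring a policy whose governor was exited on hot-unplug. */
static int model_restore(struct policy_model *p, struct governor_model *last,
                         int apply_fix)
{
    p->gov_initialized = 0;           /* the governor was EXITed earlier */
    if (apply_fix)
        p->governor = NULL;           /* the fix: forget the stale pointer */
    return model_set_policy(p, last);
}
```

Without the fix the stale pointer makes the two governors compare equal and the restore returns -EBUSY; with `apply_fix` set, the governor is initialized again and the restore succeeds.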

      Reported-and-tested-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org>
      Reported-and-tested-by: "Jon Medhurst (Tixy)" <tixy@linaro.org>
      Reported-and-tested-by: Steven Rostedt <rostedt@goodmis.org>
      Fixes: 18bf3a12 (cpufreq: Mark policy->governor = NULL for inactive policies)
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  5. 11 Jun, 2015 5 commits
    • cpufreq: Remove cpufreq_update_policy() · 37829029
      Viresh Kumar committed
      cpufreq_update_policy() was kept as a separate routine earlier as it was
      handling migration of sysfs directories, which isn't the case anymore.
      It is only updating policy->cpu now and is called by a single caller.
      
      The WARN_ON() isn't really required anymore, as we are just updating the
      cpu now, not moving the sysfs directories.

      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Restart governor as soon as possible · 9591becb
      Viresh Kumar committed
      __cpufreq_remove_dev_finish() is doing two things today:
      - Restarts the governor if some CPUs from concerned policy are still
        online.
      - Frees the policy if all CPUs are offline.
      
      The first task of restarting the governor can be moved to
      __cpufreq_remove_dev_prepare() to restart the governor early. There is
      no race between _prepare() and _finish() as they would be handling
      completely different cases. _finish() will only be required if we are
      going to free the policy and that has nothing to do with restarting the
      governor.

      Original-by: Saravana Kannan <skannan@codeaurora.org>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Call cpufreq_policy_put_kobj() from cpufreq_policy_free() · 3654c5cc
      Viresh Kumar committed
      cpufreq_policy_put_kobj() is actually part of freeing the policy and can
      be called from cpufreq_policy_free() directly instead of a separate
      call.

      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Initialize policy->kobj while allocating policy · 2fc3384d
      Viresh Kumar committed
      policy->kobj is required to be initialized once in the lifetime of a
      policy.  Currently we are initializing it from __cpufreq_add_dev() and
      that doesn't look to be the best place for doing so as we have to do
      this on special cases (like: !recover_policy).
      
      We can initialize it from a more obvious place, cpufreq_policy_alloc(),
      and that will make the code look cleaner, especially the error handling part.
      
      The error handling part of __cpufreq_add_dev() was doing almost the same
      thing while recover_policy is true or false. Fix that as well by always
      calling cpufreq_policy_put_kobj() with an additional parameter to skip
      notification part of it.

      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Stop migrating sysfs files on hotplug · 87549141
      Viresh Kumar committed
      When we hot-unplug a cpu, we remove its sysfs cpufreq directory and if
      the outgoing cpu was the owner of policy->kobj earlier then we migrate
      the sysfs directory to under another online cpu.
      
      This brings a few disadvantages:
      - Code Complexity
      - Slower hotplug/suspend/resume
      - sysfs file permissions are reset after all policy->cpus are offlined
      - CPUFreq stats history lost after all policy->cpus are offlined
      - Special management of sysfs stuff during suspend/resume
      
      To overcome these, this patch modifies the way sysfs directories are
      managed:
      - Select sysfs kobjects owner while initializing policy and don't change
        it during hotplugs. Track it with kobj_cpu created earlier.
      
      - Create symlinks for all related CPUs (can be offline) instead of
        affected CPUs on policy initialization and remove them only when the
        policy is freed.
      
      - Free policy structure only on the removal of cpufreq-driver and not
        during hotplug/suspend/resume, detected by checking 'struct
        subsys_interface *' (Valid only when called from
        subsys_interface_unregister() while unregistering driver).
      
      Apart from this, special care is taken to handle physical hotplug of CPUs
      as we wouldn't remove sysfs links or remove policies on logical
      hotplugs. Physical hotplug happens in the following sequence.
      
      Hot removal:
      - CPU is offlined first, ~ 'echo 0 >
        /sys/devices/system/cpu/cpuX/online'
      - Then its device is removed along with all sysfs files, cpufreq core
        notified with cpufreq_remove_dev() callback from subsys-interface.
      
      Hot addition:
      - First the device along with its sysfs files is added, cpufreq core
        notified with cpufreq_add_dev() callback from subsys-interface.
      - CPU is onlined, ~ 'echo 1 > /sys/devices/system/cpu/cpuX/online'
      
      We call the same routines with both hotplug and subsys callbacks, and we
      sense physical hotplug with cpu_offline() check in subsys callback. We
      can handle most of the stuff with regular hotplug callback paths and
      add/remove cpufreq sysfs links or free policy from subsys callbacks.

      Original-by: Saravana Kannan <skannan@codeaurora.org>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  6. 10 Jun, 2015 1 commit
  7. 23 May, 2015 2 commits
  8. 15 May, 2015 6 commits
  9. 08 May, 2015 5 commits
  10. 03 Apr, 2015 1 commit
    • cpufreq: Schedule work for the first-online CPU on resume · c75de0ac
      Viresh Kumar committed
      All CPUs except the first-online CPU are hotplugged out on suspend and
      the cpufreq core stops managing them.
      
      On resume, we need to call cpufreq_update_policy() for this CPU's policy
      to make sure its frequency is in sync with cpufreq's cached value, as it
      might have got updated by hardware during suspend/resume.
      
      The policies are always added to the top of the policy-list. So, in
      normal circumstances, CPU 0's policy will be the last one in the list.
      And so the code checks for the last policy.
      
      But there are cases where it will fail. Consider quad-core system, with
      policy-per core. If CPU0 is hotplugged out and added back again, the
      last policy will be on CPU1 :(
      
      To fix this in a proper way, always look for the policy of the first
      online CPU. That way we will be sure that we are calling
      cpufreq_update_policy() for the only CPU that wasn't hotplugged out.
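
The quad-core scenario above can be replayed with a small userspace model of the policy list. This is a sketch under stated assumptions: the array-based list, `policy_list_add()`, `policy_list_del()`, `last_listed_cpu()` and `first_online_cpu()` are hypothetical helpers, and one policy per core is assumed as in the example.

```c
#include <assert.h>

#define NR_CPUS 4

/* Models the policy list, one policy per CPU; index 0 is the list head. */
static int policy_list[NR_CPUS];
static int policy_count;

/* New policies are always added at the top (head) of the list. */
static void policy_list_add(int cpu)
{
    if (policy_count >= NR_CPUS)
        return;
    for (int i = policy_count; i > 0; i--)
        policy_list[i] = policy_list[i - 1];
    policy_list[0] = cpu;
    policy_count++;
}

/* Remove the policy of a CPU that is hotplugged out. */
static void policy_list_del(int cpu)
{
    int i = 0;

    while (i < policy_count && policy_list[i] != cpu)
        i++;
    if (i == policy_count)
        return;
    for (; i < policy_count - 1; i++)
        policy_list[i] = policy_list[i + 1];
    policy_count--;
}

/* Old heuristic: assume the last policy in the list belongs to CPU 0. */
static int last_listed_cpu(void)
{
    assert(policy_count > 0);
    return policy_list[policy_count - 1];
}

/* Fixed approach: pick the lowest-numbered online CPU from a bitmask. */
static int first_online_cpu(unsigned int online_mask)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (online_mask & (1u << cpu))
            return cpu;
    return -1;
}
```

After CPUs 0 to 3 come up in order, the last list entry is CPU 0. Removing and re-adding CPU 0 moves its policy to the head, so the last entry becomes CPU 1, while `first_online_cpu()` still reports CPU 0.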
      
      Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
      Fixes: 2f0aea93 ("cpufreq: suspend governors on system suspend/hibernate")
      Reported-by: Saravana Kannan <skannan@codeaurora.org>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Acked-by: Saravana Kannan <skannan@codeaurora.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  11. 04 Feb, 2015 3 commits
  12. 03 Feb, 2015 1 commit
    • cpufreq: Set cpufreq_cpu_data to NULL before putting kobject · 6ffae8c0
      Viresh Kumar committed
      In __cpufreq_remove_dev_finish(), per-cpu 'cpufreq_cpu_data' needs
      to be cleared before calling kobject_put(&policy->kobj) and under
      cpufreq_driver_lock. Otherwise, if someone else calls cpufreq_cpu_get()
      in parallel with it, they can obtain a non-NULL policy from that after
      kobject_put(&policy->kobj) was executed.
      
      Consider this case:
      
      Thread A                                Thread B
      cpufreq_cpu_get()
        acquire cpufreq_driver_lock
        read-per-cpu cpufreq_cpu_data
                                              kobject_put(&policy->kobj);
        kobject_get(&policy->kobj);
                                              ...
                                              per_cpu(&cpufreq_cpu_data, cpu) = NULL
      
      And this will result in a warning like this one:
      
       ------------[ cut here ]------------
       WARNING: CPU: 0 PID: 4 at include/linux/kref.h:47
       kobject_get+0x41/0x50()
       Modules linked in: acpi_cpufreq(+) nfsd auth_rpcgss nfs_acl
       lockd grace sunrpc xfs libcrc32c sd_mod ixgbe igb mdio ahci hwmon
       ...
       Call Trace:
        [<ffffffff81661b14>] dump_stack+0x46/0x58
        [<ffffffff81072b61>] warn_slowpath_common+0x81/0xa0
        [<ffffffff81072c7a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff812e16d1>] kobject_get+0x41/0x50
        [<ffffffff815262a5>] cpufreq_cpu_get+0x75/0xc0
        [<ffffffff81527c3e>] cpufreq_update_policy+0x2e/0x1f0
        [<ffffffff810b8cb2>] ? up+0x32/0x50
        [<ffffffff81381aa9>] ? acpi_ns_get_node+0xcb/0xf2
        [<ffffffff81381efd>] ? acpi_evaluate_object+0x22c/0x252
        [<ffffffff813824f6>] ? acpi_get_handle+0x95/0xc0
        [<ffffffff81360967>] ? acpi_has_method+0x25/0x40
        [<ffffffff81391e08>] acpi_processor_ppc_has_changed+0x77/0x82
        [<ffffffff81089566>] ? move_linked_works+0x66/0x90
        [<ffffffff8138e8ed>] acpi_processor_notify+0x58/0xe7
        [<ffffffff8137410c>] acpi_ev_notify_dispatch+0x44/0x5c
        [<ffffffff8135f293>] acpi_os_execute_deferred+0x15/0x22
        [<ffffffff8108c910>] process_one_work+0x160/0x410
        [<ffffffff8108d05b>] worker_thread+0x11b/0x520
        [<ffffffff8108cf40>] ? rescuer_thread+0x380/0x380
        [<ffffffff81092421>] kthread+0xe1/0x100
        [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0
        [<ffffffff81669ebc>] ret_from_fork+0x7c/0xb0
        [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0
       ---[ end trace 89e66eb9795efdf7 ]---
      
      The actual code flow is as follows:
      
       Thread A: Workqueue: kacpi_notify
      
       acpi_processor_notify()
         acpi_processor_ppc_has_changed()
               cpufreq_update_policy()
                 cpufreq_cpu_get()
                   kobject_get()
      
       Thread B: xenbus_thread()
      
       xenbus_thread()
         msg->u.watch.handle->callback()
           handle_vcpu_hotplug_event()
             vcpu_hotplug()
               cpu_down()
                 __cpu_notify(CPU_POST_DEAD..)
                   cpufreq_cpu_callback()
                     __cpufreq_remove_dev_finish()
                       cpufreq_policy_put_kobj()
                         kobject_put()
      
      cpufreq_cpu_get() gets the policy from per-cpu variable cpufreq_cpu_data
      under cpufreq_driver_lock, and once it gets a valid policy it expects it
      to not be freed until cpufreq_cpu_put() is called.
      
      But the race happens when another thread puts the kobject first and clears
      cpufreq_cpu_data only later. The first thread then gets a valid policy
      pointer, but before it does kobject_get() on it, the second one has already
      done kobject_put().
      
      Fix this by setting cpufreq_cpu_data to NULL before putting the kobject and that
      too under locks.
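
The effect of the ordering can be shown with a deterministic userspace model in which the racing lookup is called by hand inside the removal path. Everything here is hypothetical (`kobj_model`, `published`, `racing_get()` and both removal variants); the locking is left out and only the order of the two steps is modelled.

```c
#include <assert.h>
#include <stddef.h>

struct kobj_model {
    int refcount;   /* models the reference count of policy->kobj */
};

static struct kobj_model *published;  /* models per-CPU cpufreq_cpu_data */
static int stale_get_seen;            /* set when a get hits refcount 0 */

/* Thread A's step: look up the pointer and take a reference. A get on an
 * object whose count already reached zero is what the trace above warns
 * about. */
static void racing_get(void)
{
    struct kobj_model *k = published;

    if (!k)
        return;
    if (k->refcount == 0)
        stale_get_seen = 1;
    k->refcount++;
}

/* Old ordering: put first, clear the pointer later. The racing lookup
 * runs in the window between the two steps. */
static void remove_put_then_clear(struct kobj_model *k)
{
    assert(k->refcount > 0);
    k->refcount--;
    racing_get();
    published = NULL;
}

/* Fixed ordering: clear the pointer first, then put. The same racing
 * lookup now finds NULL and never touches the object. */
static void remove_clear_then_put(struct kobj_model *k)
{
    assert(k->refcount > 0);
    published = NULL;
    racing_get();
    k->refcount--;
}
```

In the old ordering the lookup still finds the pointer after the count has dropped to zero; in the fixed ordering it finds NULL, so the final put is the only remaining access to the object.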

      Reported-by: Ethan Zhao <ethan.zhao@oracle.com>
      Reported-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: 3.12+ <stable@vger.kernel.org> # 3.12+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  13. 24 Jan, 2015 10 commits