提交 · 43c9fad942b5afb9e03801c0721d83160fa5b0dd · openeuler / Kernel

11 6月, 2015 5 次提交

cpufreq: Remove cpufreq_update_policy() · 37829029

由 Viresh Kumar 提交于 6月 08, 2015

cpufreq_update_policy() was kept as a separate routine earlier as it was
handling migration of sysfs directories, which isn't the case anymore.
It is only updating policy->cpu now and is called by a single caller.

The WARN_ON() isn't really required anymore, as we are just updating the
cpu now, not moving the sysfs directories.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

37829029

cpufreq: Restart governor as soon as possible · 9591becb

由 Viresh Kumar 提交于 6月 10, 2015

__cpufreq_remove_dev_finish() is doing two things today:
- Restarts the governor if some CPUs from concerned policy are still
  online.
- Frees the policy if all CPUs are offline.

The first task of restarting the governor can be moved to
__cpufreq_remove_dev_prepare() to restart the governor early. There is
no race between _prepare() and _finish() as they would be handling
completely different cases. _finish() will only be required if we are
going to free the policy and that has nothing to do with restarting the
governor.
Original-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

9591becb

cpufreq: Call cpufreq_policy_put_kobj() from cpufreq_policy_free() · 3654c5cc

由 Viresh Kumar 提交于 6月 08, 2015

cpufreq_policy_put_kobj() is actually part of freeing the policy and can
be called from cpufreq_policy_free() directly instead of a separate
call.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

3654c5cc

cpufreq: Initialize policy->kobj while allocating policy · 2fc3384d

由 Viresh Kumar 提交于 6月 08, 2015

policy->kobj is required to be initialized once in the lifetime of a
policy. Currently we are initializing it from __cpufreq_add_dev() and
that doesn't look to be the best place for doing so as we have to do
this on special cases (like: !recover_policy).

We can initialize it from a more obvious place cpufreq_policy_alloc()
and that will make code look cleaner, specially the error handling part.

The error handling part of __cpufreq_add_dev() was doing almost the same
thing while recover_policy is true or false. Fix that as well by always
calling cpufreq_policy_put_kobj() with an additional parameter to skip
notification part of it.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

2fc3384d

cpufreq: Stop migrating sysfs files on hotplug · 87549141

由 Viresh Kumar 提交于 6月 10, 2015

When we hot-unplug a cpu, we remove its sysfs cpufreq directory and if
the outgoing cpu was the owner of policy->kobj earlier then we migrate
the sysfs directory to under another online cpu.

There are few disadvantages this brings:
- Code Complexity
- Slower hotplug/suspend/resume
- sysfs file permissions are reset after all policy->cpus are offlined
- CPUFreq stats history lost after all policy->cpus are offlined
- Special management of sysfs stuff during suspend/resume

To overcome these, this patch modifies the way sysfs directories are
managed:
- Select sysfs kobjects owner while initializing policy and don't change
  it during hotplugs. Track it with kobj_cpu created earlier.

- Create symlinks for all related CPUs (can be offline) instead of
  affected CPUs on policy initialization and remove them only when the
  policy is freed.

- Free policy structure only on the removal of cpufreq-driver and not
  during hotplug/suspend/resume, detected by checking 'struct
  subsys_interface *' (Valid only when called from
  subsys_interface_unregister() while unregistering driver).

Apart from this, special care is taken to handle physical hoplug of CPUs
as we wouldn't remove sysfs links or remove policies on logical
hotplugs. Physical hotplug happens in the following sequence.

Hot removal:
- CPU is offlined first, ~ 'echo 0 >
  /sys/devices/system/cpu/cpuX/online'
- Then its device is removed along with all sysfs files, cpufreq core
  notified with cpufreq_remove_dev() callback from subsys-interface..

Hot addition:
- First the device along with its sysfs files is added, cpufreq core
  notified with cpufreq_add_dev() callback from subsys-interface..
- CPU is onlined, ~ 'echo 1 > /sys/devices/system/cpu/cpuX/online'

We call the same routines with both hotplug and subsys callbacks, and we
sense physical hotplug with cpu_offline() check in subsys callback. We
can handle most of the stuff with regular hotplug callback paths and
add/remove cpufreq sysfs links or free policy from subsys callbacks.
Original-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

87549141

10 6月, 2015 1 次提交

cpufreq: Don't allow updating inactive policies from sysfs · 11e584cf

由 Viresh Kumar 提交于 6月 10, 2015

Later commits would change the way policies are managed today. Policies
wouldn't be freed on cpu hotplug (currently they aren't freed only for
suspend), and while the CPU is offline, the sysfs cpufreq files would
still be present.

User may accidentally try to update the sysfs files in following
directory: '/sys/devices/system/cpu/cpuX/cpufreq/'. And that would
result in undefined behavior as policy wouldn't be active then.

Apart from updating the store() routine, we also update __cpufreq_get()
which can call cpufreq_out_of_sync(). The later routine tries to update
policy->cur and starts notifying kernel about it.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

11e584cf

23 5月, 2015 2 次提交

cpufreq: Track cpu managing sysfs kobjects separately · 9d16f207

由 Saravana Kannan 提交于 5月 18, 2015

In order to prepare for the next few commits, that will stop migrating
sysfs files on cpu hotplug, this patch starts managing sysfs-cpu
separately.

The behavior is still the same as we are still migrating sysfs files on
hotplug, later commits would change that.
Signed-off-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

9d16f207

cpufreq: Fix for typos in two comments · 58405af6

由 Shailendra Verma 提交于 5月 22, 2015

Signed-off-by: NShailendra Verma <shailendra.capricorn@gmail.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

58405af6

15 5月, 2015 6 次提交

cpufreq: Mark policy->governor = NULL for inactive policies · 18bf3a12

由 Viresh Kumar 提交于 5月 12, 2015

Later commits would change the way policies are managed today. Policies
wouldn't be freed on cpu hotplug (currently they aren't freed on
suspend), and while the CPU is offline, the sysfs cpufreq files would
still be present.

Because we don't mark policy->governor as NULL, it still contains
pointer of the last used governor. And if the governor is removed, while
all the CPUs of a policy are hotplugged out, this pointer wouldn't be
valid anymore. And if we try to read the 'scaling_governor', etc.  from
sysfs, it will result in kernel OOPs.

To prevent this, mark policy->governor as NULL for all inactive policies
while the governor is removed from kernel.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

18bf3a12

cpufreq: Manage governor usage history with 'policy->last_governor' · 4573237b

由 Viresh Kumar 提交于 5月 12, 2015

History of which governor was used last is common to all CPUs within a
policy and maintaining it per-cpu isn't the best approach for sure.

Apart from wasting memory, this also increases the complexity of
managing this data structure as it has to be updated for all CPUs.

To make that somewhat simpler, lets store this information in a new
field 'last_governor' in struct cpufreq_policy and update it on removal
of last cpu of a policy.

As a side-effect it also solves an old problem, consider a system with
two clusters 0 & 1. And there is one policy per cluster.

Cluster 0: CPU0 and 1.
Cluster 1: CPU2 and 3.

 - CPU2 is first brought online, and governor is set to performance
   (default as cpufreq_cpu_governor wasn't set).
 - Governor is changed to ondemand.
 - CPU2 is taken offline and cpufreq_cpu_governor is updated for CPU2.
 - CPU3 is brought online.
 - Because cpufreq_cpu_governor wasn't set for CPU3, the default governor
   performance is picked for CPU3.

This patch fixes the bug as we now have a single variable to update for
policy.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

4573237b

cpufreq: Don't traverse all active policies to find policy for a cpu · 9104bb26

由 Viresh Kumar 提交于 5月 12, 2015

We reach here while adding policy for a CPU and enter into the 'if'
block only if a policy already exists for the CPU.

As cpufreq_cpu_data is set for all policy->related_cpus now, when the
policy is first added, we can use that to find the CPU's policy instead
of traversing the list of all active policies.
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

9104bb26

cpufreq: Get rid of cpufreq_cpu_data_fallback · 3914d379

由 Viresh Kumar 提交于 5月 08, 2015

We can extract the same information from cpufreq_cpu_data as it is also
available for inactive policies now. And so don't need
cpufreq_cpu_data_fallback anymore.

Also add a WARN_ON() for the case where we try to restore from an active
policy.
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

3914d379

cpufreq: Don't clear cpufreq_cpu_data and policy list for inactive policies · 988bed09

由 Viresh Kumar 提交于 5月 08, 2015

Now that we can check policy->cpus to find if policy is active or not,
we don't need to clean cpufreq_cpu_data and delete policy from the list
on light weight tear down of policies (like in suspend).

To make it consistent and clean, set cpufreq_cpu_data for all related
CPUs when the policy is first created and clean it only while it is
freed.

Also update cpufreq_cpu_get_raw() to check if cpu is part of
policy->cpus mask, so that we don't end up getting policies for offline
CPUs.

In order to make sure that no users of 'policy' are using an inactive
policy, use cpufreq_cpu_get_raw() instead of directly accessing
cpufreq_cpu_data.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

988bed09

cpufreq: Create for_each_{in}active_policy() · f963735a

由 Viresh Kumar 提交于 5月 12, 2015

policy->cpus is cleared unconditionally now on hotplug-out of a CPU and
it can be checked to know if a policy is active or not. Create helper
routines to iterate over all active/inactive policies, based on
policy->cpus field.

Replace all instances of for_each_policy() with for_each_active_policy()
to make them iterate only for active policies. (We haven't made changes
yet to keep inactive policies in the same list, but that will be
followed in a later patch).
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

f963735a

08 5月, 2015 5 次提交

cpufreq: Clear policy->cpus even for the last CPU · 303ae723

由 Viresh Kumar 提交于 2月 19, 2015

We clear policy->cpus mask while CPUs are hotplugged out. We do it for all CPUs
except the last CPU of the policy. I don't remember what the rationale behind
that was, but I couldn't think of anything that will break if we remove this
conditional clearing and always clear policy->cpus.

The benefit we get out of it is, we can know if a policy is active or not by
checking if this field is empty or not. That will be used by later commits.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

303ae723

cpufreq: Keep a single path for adding managed CPUs · bb29ae15

由 Viresh Kumar 提交于 2月 19, 2015

There are two cases when we may try to add CPUs we're already handling:
 - On boot, the first cpu has marked all policy->cpus managed and so we
   will find policy for all other policy->cpus later on.
 - When a managed cpu is hotplugged out and later brought back in.

Currently, separate paths and checks take care of the two.  While the
first one is detected by testing cpu against 'policy->cpus', the other
one is detected by testing cpu against 'policy->related_cpus'.

We can handle them both via a single path and there is no need to do
special checking for the first one.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
[ rjw: Changelog, comments ]
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

bb29ae15

cpufreq: Throw warning when we try to get policy for an invalid CPU · 1b947c90

由 Viresh Kumar 提交于 2月 19, 2015

Simply returning here with an error is not enough. It shouldn't be allowed at
all to try calling cpufreq_cpu_get() for an invalid CPU.

Add a WARN here to make it clear that it wouldn't be acceptable at all.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

1b947c90

cpufreq: Merge __cpufreq_add_dev() and cpufreq_add_dev() · 23faf0b7

由 Viresh Kumar 提交于 2月 19, 2015

cpufreq_add_dev() is an unnecessary wrapper over __cpufreq_add_dev(). Merge
them.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

23faf0b7

cpufreq: Add doc style comment about cpufreq_cpu_{get|put}() · 50e9c852

由 Viresh Kumar 提交于 2月 19, 2015

This clearly states what the code inside these routines is doing and how these
must be used.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

50e9c852

03 4月, 2015 1 次提交

cpufreq: Schedule work for the first-online CPU on resume · c75de0ac

由 Viresh Kumar 提交于 4月 02, 2015

All CPUs leaving the first-online CPU are hotplugged out on suspend and
and cpufreq core stops managing them.

On resume, we need to call cpufreq_update_policy() for this CPU's policy
to make sure its frequency is in sync with cpufreq's cached value, as it
might have got updated by hardware during suspend/resume.

The policies are always added to the top of the policy-list. So, in
normal circumstances, CPU 0's policy will be the last one in the list.
And so the code checks for the last policy.

But there are cases where it will fail. Consider quad-core system, with
policy-per core. If CPU0 is hotplugged out and added back again, the
last policy will be on CPU1 :(

To fix this in a proper way, always look for the policy of the first
online CPU. That way we will be sure that we are calling
cpufreq_update_policy() for the only CPU that wasn't hotplugged out.

Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
Fixes: 2f0aea93 ("cpufreq: suspend governors on system suspend/hibernate")
Reported-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

c75de0ac

04 2月, 2015 3 次提交

cpufreq: Create for_each_governor() · f7b27061

由 Viresh Kumar 提交于 1月 27, 2015

To make code more readable and less error prone, lets create a helper macro for
iterating over all available governors.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

f7b27061

cpufreq: Create for_each_policy() · b4f0676f

由 Viresh Kumar 提交于 1月 27, 2015

To make code more readable and less error prone, lets create a helper macro for
iterating over all active policies.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

b4f0676f

cpufreq: Drop cpufreq_disabled() check from cpufreq_cpu_{get|put}() · 1e63eaf0

由 Viresh Kumar 提交于 1月 27, 2015

When cpufreq is disabled, the per-cpu variable would have been set to
NULL. Remove this unnecessary check.

[ Changelog from Saravana Kannan. ]
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NSaravana Kannan <skannan@codeaurora.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

1e63eaf0

03 2月, 2015 1 次提交

cpufreq: Set cpufreq_cpu_data to NULL before putting kobject · 6ffae8c0

由 Viresh Kumar 提交于 1月 31, 2015

In __cpufreq_remove_dev_finish(), per-cpu 'cpufreq_cpu_data' needs
to be cleared before calling kobject_put(&policy->kobj) and under
cpufreq_driver_lock. Otherwise, if someone else calls cpufreq_cpu_get()
in parallel with it, they can obtain a non-NULL policy from that after
kobject_put(&policy->kobj) was executed.

Consider this case:

Thread A				Thread B
cpufreq_cpu_get()
  acquire cpufreq_driver_lock
  read-per-cpu cpufreq_cpu_data
					kobject_put(&policy->kobj);
  kobject_get(&policy->kobj);
					...
					per_cpu(&cpufreq_cpu_data, cpu) = NULL

And this will result in a warning like this one:

 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 4 at include/linux/kref.h:47
 kobject_get+0x41/0x50()
 Modules linked in: acpi_cpufreq(+) nfsd auth_rpcgss nfs_acl
 lockd grace sunrpc xfs libcrc32c sd_mod ixgbe igb mdio ahci hwmon
 ...
 Call Trace:
  [<ffffffff81661b14>] dump_stack+0x46/0x58
  [<ffffffff81072b61>] warn_slowpath_common+0x81/0xa0
  [<ffffffff81072c7a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff812e16d1>] kobject_get+0x41/0x50
  [<ffffffff815262a5>] cpufreq_cpu_get+0x75/0xc0
  [<ffffffff81527c3e>] cpufreq_update_policy+0x2e/0x1f0
  [<ffffffff810b8cb2>] ? up+0x32/0x50
  [<ffffffff81381aa9>] ? acpi_ns_get_node+0xcb/0xf2
  [<ffffffff81381efd>] ? acpi_evaluate_object+0x22c/0x252
  [<ffffffff813824f6>] ? acpi_get_handle+0x95/0xc0
  [<ffffffff81360967>] ? acpi_has_method+0x25/0x40
  [<ffffffff81391e08>] acpi_processor_ppc_has_changed+0x77/0x82
  [<ffffffff81089566>] ? move_linked_works+0x66/0x90
  [<ffffffff8138e8ed>] acpi_processor_notify+0x58/0xe7
  [<ffffffff8137410c>] acpi_ev_notify_dispatch+0x44/0x5c
  [<ffffffff8135f293>] acpi_os_execute_deferred+0x15/0x22
  [<ffffffff8108c910>] process_one_work+0x160/0x410
  [<ffffffff8108d05b>] worker_thread+0x11b/0x520
  [<ffffffff8108cf40>] ? rescuer_thread+0x380/0x380
  [<ffffffff81092421>] kthread+0xe1/0x100
  [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0
  [<ffffffff81669ebc>] ret_from_fork+0x7c/0xb0
  [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0
 ---[ end trace 89e66eb9795efdf7 ]---

The actual code flow is as follows:

 Thread A: Workqueue: kacpi_notify

 acpi_processor_notify()
   acpi_processor_ppc_has_changed()
         cpufreq_update_policy()
           cpufreq_cpu_get()
             kobject_get()

 Thread B: xenbus_thread()

 xenbus_thread()
   msg->u.watch.handle->callback()
     handle_vcpu_hotplug_event()
       vcpu_hotplug()
         cpu_down()
           __cpu_notify(CPU_POST_DEAD..)
             cpufreq_cpu_callback()
               __cpufreq_remove_dev_finish()
                 cpufreq_policy_put_kobj()
                   kobject_put()

cpufreq_cpu_get() gets the policy from per-cpu variable cpufreq_cpu_data
under cpufreq_driver_lock, and once it gets a valid policy it expects it
to not be freed until cpufreq_cpu_put() is called.

But the race happens when another thread puts the kobject first and updates
cpufreq_cpu_data before or later. And so the first thread gets a valid policy
structure and before it does kobject_get() on it, the second one has already
done kobject_put().

Fix this by setting cpufreq_cpu_data to NULL before putting the kobject and that
too under locks.
Reported-by: NEthan Zhao <ethan.zhao@oracle.com>
Reported-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Cc: 3.12+ <stable@vger.kernel.org> # 3.12+
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

6ffae8c0

24 1月, 2015 16 次提交

cpufreq: remove CPUFREQ_UPDATE_POLICY_CPU notifications · d9f35446

由 Viresh Kumar 提交于 1月 06, 2015

CPUFREQ_UPDATE_POLICY_CPU notifications were used only from cpufreq-stats which
doesn't use it anymore. Remove them.

This also decrements values of other notification macros defined after
CPUFREQ_UPDATE_POLICY_CPU by 1 to remove gaps. Hopefully all users are using
macro's instead of direct numbers and so they wouldn't break as macro values are
changed now.
Reviewed-by: NPrarit Bhargava <prarit@redhat.com>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

d9f35446

cpufreq: Remove (now) unused 'last_cpu' from struct cpufreq_policy · 7c418ff0

由 Viresh Kumar 提交于 1月 06, 2015

'last_cpu' was used only from cpufreq-stats and isn't used anymore. Get rid of
it.
Reviewed-by: NPrarit Bhargava <prarit@redhat.com>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

7c418ff0

cpufreq: move some initialization stuff to cpufreq_policy_alloc() · 818c5712