提交 · 9ff4a80b2551d43f92c3d9a7a11e406ce7aa47f3 · openanolis / cloud-kernel

01 10月, 2013 16 次提交

cpufreq: imx6q: use cpufreq_table_validate_and_show() · 9ff4a80b

由 Viresh Kumar 提交于 9月 16, 2013

Lets use cpufreq_table_validate_and_show() instead of calling
cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr().

Cc: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

9ff4a80b

cpufreq: ia64-acpi: use cpufreq_table_validate_and_show() · bbe2c170

由 Viresh Kumar 提交于 9月 16, 2013

Lets use cpufreq_table_validate_and_show() instead of calling
cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr().

Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

bbe2c170

cpufreq: exynos: use cpufreq_table_validate_and_show() · bc574ce9

由 Viresh Kumar 提交于 9月 16, 2013

Lets use cpufreq_table_validate_and_show() instead of calling
cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr().

Cc: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

bc574ce9

cpufreq: elanfreq: use cpufreq_table_validate_and_show() · 55bb85b7

由 Viresh Kumar 提交于 9月 16, 2013

Lets use cpufreq_table_validate_and_show() instead of calling
cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr().
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

55bb85b7

cpufreq: e_powersaver: use cpufreq_table_validate_and_show() · 7813ed7e

由 Viresh Kumar 提交于 9月 16, 2013

7813ed7e

cpufreq: dbx500: use cpufreq_table_validate_and_show() · 025829e4

由 Viresh Kumar 提交于 9月 16, 2013

Lets use cpufreq_table_validate_and_show() instead of calling
cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr().
Acked-by: NLinus Walleij <linus.walleij@linaro.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

025829e4

cpufreq: davinci: use cpufreq_table_validate_and_show() · e873c5b2

由 Viresh Kumar 提交于 9月 16, 2013

Lets use cpufreq_table_validate_and_show() instead of calling
cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr().

Cc: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

e873c5b2

cpufreq: cris: use cpufreq_table_validate_and_show() · ae24b5cd

由 Viresh Kumar 提交于 9月 16, 2013

Lets use cpufreq_table_validate_and_show() instead of calling
cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr().

Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Mikael Starvik <starvik@axis.com>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

ae24b5cd

cpufreq: cpufreq-cpu0: use cpufreq_table_validate_and_show() · 7a906849

由 Viresh Kumar 提交于 9月 16, 2013

Lets use cpufreq_table_validate_and_show() instead of calling
cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr().
Acked-by: NShawn Guo <shawn.guo@linaro.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

7a906849

cpufreq: blackfin: use cpufreq_table_validate_and_show() · e2889e2c

由 Viresh Kumar 提交于 9月 16, 2013

Lets use cpufreq_table_validate_and_show() instead of calling
cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr().

Cc: Steven Miao <realmz6@gmail.com>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

e2889e2c

cpufreq: arm_big_little: use cpufreq_table_validate_and_show() · 39b10ebe

由 Viresh Kumar 提交于 9月 16, 2013

39b10ebe

cpufreq: acpi-cpufreq: use cpufreq_table_validate_and_show() · 776b57be

由 Viresh Kumar 提交于 9月 16, 2013

776b57be

cpufreq: sparc: call cpufreq_frequency_table_get_attr() · 18f130ed

由 Viresh Kumar 提交于 9月 16, 2013

This exposes frequency table of driver to cpufreq core and is required for core
to guess what the index for a target frequency is, when it calls
cpufreq_frequency_table_target(). And so this driver needs to expose it.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

18f130ed

cpufreq: s3cx4xx: call cpufreq_frequency_table_get_attr() · 5c40e052

由 Viresh Kumar 提交于 9月 16, 2013

Cc: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

5c40e052

cpufreq: pxa: call cpufreq_frequency_table_get_attr() · 6a77a1e6

由 Viresh Kumar 提交于 9月 16, 2013

Cc: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

6a77a1e6

cpufreq: Add new helper cpufreq_table_validate_and_show() · 27047a60

由 Viresh Kumar 提交于 9月 16, 2013

Almost every cpufreq driver is required to validate its frequency table with:
cpufreq_frequency_table_cpuinfo() and then expose it to cpufreq core with:
cpufreq_frequency_table_get_attr().

This patch creates another helper routine cpufreq_table_validate_and_show() that
will do both these steps in a single call and will return 0 for success, error
otherwise.

This also fixes potential bugs in cpufreq drivers where people have called
cpufreq_frequency_table_get_attr() before calling
cpufreq_frequency_table_cpuinfo(), as the later may fail.
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

27047a60

25 9月, 2013 3 次提交

cpufreq: exynos5440: Fix potential NULL pointer dereference · 116decb7

由 Sachin Kamat 提交于 9月 18, 2013

If 'dvfs_info' is NULL (due to devm_kzalloc failure) the failure
error message would try to dereference it. Use 'pdev' instead.
Signed-off-by: NSachin Kamat <sachin.kamat@linaro.org>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

116decb7

cpufreq: check cpufreq driver is valid and cpufreq isn't disabled in cpufreq_get() · 26ca8694

由 Viresh Kumar 提交于 9月 20, 2013

cpufreq_get() can be called from external drivers which might not be aware if
cpufreq driver is registered or not. And so we should actually check if cpufreq
driver is registered or not and also if cpufreq is active or disabled, at the
beginning of cpufreq_get().

Otherwise call to lock_policy_rwsem_read() might hit BUG_ON(!policy).
Reported-and-tested-by: NLinus Walleij <linus.walleij@linaro.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

26ca8694

acpi-cpufreq: skip loading acpi_cpufreq after intel_pstate · 8a61e12e

由 Yinghai Lu 提交于 9月 20, 2013

If the hw supports intel_pstate and acpi_cpufreq, intel_pstate will
get loaded first.

acpi_cpufreq_init() will call acpi_cpufreq_early_init()
and that will allocate perf data and init those perf data in ACPI core,
(that will cover all CPUs). But later it will free them as
cpufreq_register_driver(acpi_cpufreq) will fail as intel_pstate is
already registered

Use cpufreq_get_current_driver() to check if we can skip the
acpi_cpufreq loading.
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

8a61e12e

20 9月, 2013 1 次提交

cpufreq: return EEXIST instead of EBUSY for second registering · 4dea5806

由 Yinghai Lu 提交于 9月 18, 2013

On systems that support intel_pstate, acpi_cpufreq fails to load, and
udev keeps trying until trace gets filled up and kernel crashes.

The root cause is driver return ret from cpufreq_register_driver(),
because when some other driver takes over before, it will return
EBUSY and then udev will keep trying ...

cpufreq_register_driver() should return EEXIST instead so that the
system can boot without appending intel_pstate=disable and still use
intel_pstate.
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

4dea5806

19 9月, 2013 2 次提交

cpufreq: imx6q-cpufreq: assign cpu_dev correctly to cpu0 device · b494b48d

由 Sudeep KarkadaNagesha 提交于 9月 10, 2013

Commit cdc58d60 "cpufreq: imx6q-cpufreq:
remove device tree parsing for cpu nodes" assumed the pdev->dev is set to
cpu0 device in the platform code. But it actually points to the virtual
cpufreq-cpu0 platform device which is not present in the device tree.
Most of the information needed by cpufreq is stored in cpu0 DT node.
So cpu_dev must point to cpu0 device.

This patch fixes the wrong assignment to cpu_dev.
Reported-by: NGuennadi Liakhovetski <g.liakhovetski@gmx.de>
Tested-by: NShawn Guo <shawn.guo@linaro.org>
Signed-off-by: NSudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

b494b48d

cpufreq: cpufreq-cpu0: assign cpu_dev correctly to cpu0 device · e1825b25

由 Sudeep KarkadaNagesha 提交于 9月 10, 2013

Commit f837a9b5 "cpufreq: cpufreq-cpu0:
remove device tree parsing for cpu nodes" assumed the pdev->dev is set to
cpu0 device in the platform code. But it actually points to the virtual
cpufreq-cpu0 platform device which is not present in the device tree.
Most of the information needed by cpufreq is stored in cpu0 DT node.
So cpu_dev must point to cpu0 device.

This patch fixes the wrong assignment to cpu_dev.
Reported-and-tested-by: NGuennadi Liakhovetski <g.liakhovetski@gmx.de>
Cc: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: NSudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

e1825b25

18 9月, 2013 2 次提交

cpufreq: unlock correct rwsem while updating policy->cpu · 8efd5765

由 Viresh Kumar 提交于 9月 17, 2013

Current code looks like this:

        WARN_ON(lock_policy_rwsem_write(cpu));
        update_policy_cpu(policy, new_cpu);
        unlock_policy_rwsem_write(cpu);

{lock|unlock}_policy_rwsem_write(cpu) takes/releases policy->cpu's rwsem.
Because cpu is changing with the call to update_policy_cpu(), the
unlock_policy_rwsem_write() will release the incorrect lock.

The right solution would be to release the same lock as was taken earlier. Also
update_policy_cpu() was also called from cpufreq_add_dev() without any locks and
so its better if we move this locking to inside update_policy_cpu().

This patch fixes a regression introduced in 3.12 by commit f9ba680d
(cpufreq: Extract the handover of policy cpu to a helper function).

Reported-and-tested-by: Jon Medhurst<tixy@linaro.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

8efd5765

cpufreq: Clear policy->cpus bits in __cpufreq_remove_dev_finish() · 9c8f1ee4

由 Viresh Kumar 提交于 9月 12, 2013

This broke after a recent change "cedb70af cpufreq: Split __cpufreq_remove_dev()
into two parts" from Srivatsa.

Consider a scenario where we have two CPUs in a policy (0 & 1) and we are
removing CPU 1. On the call to __cpufreq_remove_dev_prepare() we have cleared 1
from policy->cpus and now on a call to __cpufreq_remove_dev_finish() we read
cpumask_weight of policy->cpus, which will come as 1 and this code will behave
as if we are removing the last CPU from policy :)

Fix it by clearing the CPU mask in __cpufreq_remove_dev_finish() instead of
__cpufreq_remove_dev_prepare().
Tested-by: NStephen Warren <swarren@wwwdotorg.org>
Reviewed-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

9c8f1ee4

12 9月, 2013 4 次提交

cpufreq: Acquire the lock in cpufreq_policy_restore() for reading · 44871c9c

由 Lan Tianyu 提交于 9月 11, 2013

In cpufreq_policy_restore() before system suspend policy is read from
percpu's cpufreq_cpu_data_fallback.  It's a read operation rather
than a write one, so take the lock for reading in there.
Signed-off-by: NLan Tianyu <tianyu.lan@intel.com>
Reviewed-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

44871c9c

cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu · cb38ed5c

由 Srivatsa S. Bhat 提交于 9月 12, 2013

If update_policy_cpu() is invoked with the existing policy->cpu itself
as the new-cpu parameter, then a lot of things can go terribly wrong.

In its present form, update_policy_cpu() always assumes that the new-cpu
is different from policy->cpu and invokes other functions to perform their
respective updates. And those functions implement the actual update like
this:

per_cpu(..., new_cpu) = per_cpu(..., last_cpu);
per_cpu(..., last_cpu) = NULL;

Thus, when new_cpu == last_cpu, the final NULL assignment makes the per-cpu
references vanish into thin air! (memory leak). From there, it leads to more
problems: cpufreq_stats_create_table() now doesn't find the per-cpu reference
and hence tries to create a new sysfs-group; but sysfs already had created
the group earlier, so it complains that it cannot create a duplicate filename.
In short, the repercussions of a rather innocuous invocation of
update_policy_cpu() can turn out to be pretty nasty.

Ideally update_policy_cpu() should handle this situation (new == last)
gracefully, and not lead to such severe problems. So fix it by adding an
appropriate check.
Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: NStephen Warren <swarren@nvidia.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

cb38ed5c

cpufreq: Restructure if/else block to avoid unintended behavior · 61173f25

由 Srivatsa S. Bhat 提交于 9月 12, 2013

In __cpufreq_remove_dev_prepare(), the code which decides whether to remove
the sysfs link or nominate a new policy cpu, is governed by an if/else block
with a rather complex set of conditionals. Worse, they harbor a subtlety
which leads to certain unintended behavior.

The code looks like this:

        if (cpu != policy->cpu && !frozen) {
                sysfs_remove_link(&dev->kobj, "cpufreq");
        } else if (cpus > 1) {
		new_cpu = cpufreq_nominate_new_policy_cpu(...);
		...
		update_policy_cpu(..., new_cpu);
	}

The original intention was:
If the CPU going offline is not policy->cpu, just remove the link.
On the other hand, if the CPU going offline is the policy->cpu itself,
handover the policy->cpu job to some other surviving CPU in that policy.

But because the 'if' condition also includes the 'frozen' check, now there
are *two* possibilities by which we can enter the 'else' block:

1. cpu == policy->cpu (intended)
2. cpu != policy->cpu && frozen (unintended)

Due to the second (unintended) scenario, we end up spuriously nominating
a CPU as the policy->cpu, even when the existing policy->cpu is alive and
well. This can cause problems further down the line, especially when we end
up nominating the same policy->cpu as the new one (ie., old == new),
because it totally confuses update_policy_cpu().

To avoid this mess, restructure the if/else block to only do what was
originally intended, and thus prevent any unwelcome surprises.
Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: NStephen Warren <swarren@nvidia.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

61173f25

cpufreq: Fix crash in cpufreq-stats during suspend/resume · 0d66b91e

由 Srivatsa S. Bhat 提交于 9月 12, 2013

Stephen Warren reported that the cpufreq-stats code hits a NULL pointer
dereference during the second attempt to suspend a system. He also
pin-pointed the problem to commit 5302c3fb "cpufreq: Perform light-weight
init/teardown during suspend/resume".

That commit actually ensured that the cpufreq-stats table and the
cpufreq-stats sysfs entries are *not* torn down (ie., not freed) during
suspend/resume, which makes it all the more surprising. However, it turns
out that the root-cause is not that we access an already freed memory, but
that the reference to the allocated memory gets moved around and we lose
track of that during resume, leading to the reported crash in a subsequent
suspend attempt.

In the suspend path, during CPU offline, the value of policy->cpu is updated
by choosing one of the surviving CPUs in that policy, as long as there is
atleast one CPU in that policy. And cpufreq_stats_update_policy_cpu() is
invoked to update the reference to the stats structure by assigning it to
the new CPU. However, in the resume path, during CPU online, we end up
assigning a fresh CPU as the policy->cpu, without letting cpufreq-stats
know about this. Thus the reference to the stats structure remains
(incorrectly) associated with the old CPU. So, in a subsequent suspend attempt,
during CPU offline, we end up accessing an incorrect location to get the
stats structure, which eventually leads to the NULL pointer dereference.

Fix this by letting cpufreq-stats know about the update of the policy->cpu
during CPU online in the resume path. (Also, move the update_policy_cpu()
function higher up in the file, so that __cpufreq_add_dev() can invoke
it).
Reported-and-tested-by: NStephen Warren <swarren@nvidia.com>
Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

0d66b91e

11 9月, 2013 1 次提交

intel_pstate: Add Haswell CPU models · 6cdcdb79

由 Nell Hardcastle 提交于 6月 30, 2013

Enable the intel_pstate driver for Haswell CPUs. One missing Ivy Bridge
model (0x3E) is also included. Models referenced from
tools/power/x86/turbostat/turbostat.c:has_nehalem_turbo_ratio_limit
Signed-off-by: NNell Hardcastle <nell@spicious.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NDirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

6cdcdb79

10 9月, 2013 9 次提交

Revert "cpufreq: make sure frequency transitions are serialized" · 798282a8

由 Rafael J. Wysocki 提交于 9月 10, 2013

Commit 7c30ed53 (cpufreq: make sure frequency transitions are
serialized) attempted to serialize frequency transitions by
adding checks to the CPUFREQ_PRECHANGE and CPUFREQ_POSTCHANGE
notifications.  However, it assumed that the notifications will
always originate from the driver's .target() callback, but they
also can be triggered by cpufreq_out_of_sync() and that leads to
warnings like this on some systems:

 WARNING: CPU: 0 PID: 14543 at drivers/cpufreq/cpufreq.c:317
 __cpufreq_notify_transition+0x238/0x260()
 In middle of another frequency transition

accompanied by a call trace similar to this one:

 [<ffffffff81720daa>] dump_stack+0x46/0x58
 [<ffffffff8106534c>] warn_slowpath_common+0x8c/0xc0
 [<ffffffff815b8560>] ? acpi_cpufreq_target+0x320/0x320
 [<ffffffff81065436>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff815b1ec8>] __cpufreq_notify_transition+0x238/0x260
 [<ffffffff815b33be>] cpufreq_notify_transition+0x3e/0x70
 [<ffffffff815b345d>] cpufreq_out_of_sync+0x6d/0xb0
 [<ffffffff815b370c>] cpufreq_update_policy+0x10c/0x160
 [<ffffffff815b3760>] ? cpufreq_update_policy+0x160/0x160
 [<ffffffff81413813>] cpufreq_set_cur_state+0x8c/0xb5
 [<ffffffff814138df>] processor_set_cur_state+0xa3/0xcf
 [<ffffffff8158e13c>] thermal_cdev_update+0x9c/0xb0
 [<ffffffff8159046a>] step_wise_throttle+0x5a/0x90
 [<ffffffff8158e21f>] handle_thermal_trip+0x4f/0x140
 [<ffffffff8158e377>] thermal_zone_device_update+0x57/0xa0
 [<ffffffff81415b36>] acpi_thermal_check+0x2e/0x30
 [<ffffffff81415ca0>] acpi_thermal_notify+0x40/0xdc
 [<ffffffff813e7dbd>] acpi_device_notify+0x19/0x1b
 [<ffffffff813f8241>] acpi_ev_notify_dispatch+0x41/0x5c
 [<ffffffff813e3fbe>] acpi_os_execute_deferred+0x25/0x32
 [<ffffffff81081060>] process_one_work+0x170/0x4a0
 [<ffffffff81082121>] worker_thread+0x121/0x390
 [<ffffffff81082000>] ? manage_workers.isra.20+0x170/0x170
 [<ffffffff81088fe0>] kthread+0xc0/0xd0
 [<ffffffff81088f20>] ? flush_kthread_worker+0xb0/0xb0
 [<ffffffff8173582c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81088f20>] ? flush_kthread_worker+0xb0/0xb0

For this reason, revert commit 7c30ed53 along with the fix 266c13d7
(cpufreq: Fix serialization of frequency transitions) on top of it
and we will revisit the serialization problem later.
Reported-by: NAlessandro Bono <alessandro.bono@gmail.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

798282a8

cpufreq: Use signed type for 'ret' variable, to store negative error values · 5136fa56

由 Srivatsa S. Bhat 提交于 9月 07, 2013

There are places where the variable 'ret' is declared as unsigned int
and then used to store negative return values such as -EINVAL. Fix them
by declaring the variable as a signed quantity.
Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

5136fa56

cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes · 56d07db2

由 Srivatsa S. Bhat 提交于 9月 07, 2013

Commit "cpufreq: serialize calls to __cpufreq_governor()" had been a temporary
and partial solution to the race condition between writing to a cpufreq sysfs
file and taking a CPU offline. Now that we have a proper and complete solution
to that problem, remove the temporary fix.
Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

56d07db2

cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug · 4f750c93

由 Srivatsa S. Bhat 提交于 9月 07, 2013

The functions that are used to write to cpufreq sysfs files (such as
store_scaling_max_freq()) are not hotplug safe. They can race with CPU
hotplug tasks and lead to problems such as trying to acquire an already
destroyed timer-mutex etc.

Eg:

    __cpufreq_remove_dev()
     __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
       policy->governor->governor(policy, CPUFREQ_GOV_STOP);
        cpufreq_governor_dbs()
         case CPUFREQ_GOV_STOP:
          mutex_destroy(&cpu_cdbs->timer_mutex)
          cpu_cdbs->cur_policy = NULL;
      <PREEMPT>
    store()
     __cpufreq_set_policy()
      __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
        policy->governor->governor(policy, CPUFREQ_GOV_LIMITS);
         case CPUFREQ_GOV_LIMITS:
          mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex)
           if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL

So use get_online_cpus()/put_online_cpus() in the store_*() functions, to
synchronize with CPU hotplug. However, there is an additional point to note
here: some parts of the CPU teardown in the cpufreq subsystem are done in
the CPU_POST_DEAD stage, with cpu_hotplug.lock *released*. So, using the
get/put_online_cpus() functions alone is insufficient; we should also ensure
that we don't race with those latter steps in the hotplug sequence. We can
easily achieve this by checking if the CPU is online before proceeding with
the store, since the CPU would have been marked offline by the time the
CPU_POST_DEAD notifiers are executed.
Reported-by: NStephen Boyd <sboyd@codeaurora.org>
Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

4f750c93

cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock · 1aee40ac

由 Srivatsa S. Bhat 提交于 9月 07, 2013

__cpufreq_remove_dev_finish() handles the kobject cleanup for a CPU going
offline. But because we destroy the kobject towards the end of the CPU offline
phase, there are certain race windows where a task can try to write to a
cpufreq sysfs file (eg: using store_scaling_max_freq()) while we are taking
that CPU offline, and this can bump up the kobject refcount, which in turn might
hinder the CPU offline task from running to completion. (It can also cause
other more serious problems such as trying to acquire a destroyed timer-mutex
etc., depending on the exact stage of the cleanup at which the task managed to
take a new refcount).

To fix the race window, we will need to synchronize those store_*() call-sites
with CPU hotplug, using get_online_cpus()/put_online_cpus(). However, that
in turn can cause a total deadlock because it can end up waiting for the
CPU offline task to complete, with incremented refcount!

Write to sysfs                            CPU offline task
--------------                            ----------------
kobj_refcnt++

                                          Acquire cpu_hotplug.lock

get_online_cpus();

					  Wait for kobj_refcnt to drop to zero

                     **DEADLOCK**

A simple way to avoid this problem is to perform the kobject cleanup in the
CPU offline path, with the cpu_hotplug.lock *released*. That is, we can
perform the wait-for-kobj-refcnt-to-drop as well as the subsequent cleanup
in the CPU_POST_DEAD stage of CPU offline, which is run with cpu_hotplug.lock
released. Doing this helps us avoid deadlocks due to holding kobject refcounts
and waiting on each other on the cpu_hotplug.lock.

(Note: We can't move all of the cpufreq CPU offline steps to the
CPU_POST_DEAD stage, because certain things such as stopping the governors
have to be done before the outgoing CPU is marked offline. So retain those
parts in the CPU_DOWN_PREPARE stage itself).
Reported-by: NStephen Boyd <sboyd@codeaurora.org>
Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

1aee40ac

cpufreq: Split __cpufreq_remove_dev() into two parts · cedb70af

由 Srivatsa S. Bhat 提交于 9月 07, 2013

During CPU offline, the cpufreq core invokes __cpufreq_remove_dev()
to perform work such as stopping the cpufreq governor, clearing the
CPU from the policy structure etc, and finally cleaning up the
kobject.

There are certain subtle issues related to the kobject cleanup, and
it would be much easier to deal with them if we separate that part
from the rest of the cleanup-work in the CPU offline phase. So split
the __cpufreq_remove_dev() function into 2 parts: one that handles
the kobject cleanup, and the other that handles the rest of the work.
Reported-by: NStephen Boyd <sboyd@codeaurora.org>
Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

cedb70af

cpufreq: Fix wrong time unit conversion · a857c0b9

由 Andreas Schwab 提交于 9月 07, 2013

The time spent by a CPU under a given frequency is stored in jiffies unit
in the cpu var cpufreq_stats_table->time_in_state[i], i being the index of
the frequency.

This is what is displayed in the following file on the right column:

     cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
     2301000 19835820
     2300000 3172
     [...]

Now cpufreq converts this jiffies unit delta to clock_t before returning it
to the user as in the above file. And that conversion is achieved using the API
cputime64_to_clock_t().

Although it accidentally works on traditional tick based cputime accounting, where
cputime_t maps directly to jiffies, it doesn't work with other types of cputime
accounting such as CONFIG_VIRT_CPU_ACCOUNTING_* where cputime_t can map to nsecs
or any granularity preffered by the architecture.

For example we get a buggy zero delta on full dyntick configurations:

     cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
     2301000 0
     2300000 0
     [...]

Fix this with using the proper jiffies_64_t to clock_t conversion.
Reported-and-tested-by: NCarsten Emde <C.Emde@osadl.org>
Signed-off-by: NAndreas Schwab <schwab@linux-m68k.org>
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

a857c0b9

cpufreq: serialize calls to __cpufreq_governor() · 19c76303

由 Viresh Kumar 提交于 8月 31, 2013

We can't take a big lock around __cpufreq_governor() as this causes
recursive locking for some cases. But calls to this routine must be
serialized for every policy. Otherwise we can see some unpredictable
events.

For example, consider following scenario:

__cpufreq_remove_dev()
 __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
   policy->governor->governor(policy, CPUFREQ_GOV_STOP);
    cpufreq_governor_dbs()
     case CPUFREQ_GOV_STOP:
      mutex_destroy(&cpu_cdbs->timer_mutex)
      cpu_cdbs->cur_policy = NULL;
  <PREEMPT>
store()
 __cpufreq_set_policy()
  __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
    policy->governor->governor(policy, CPUFREQ_GOV_LIMITS);
     case CPUFREQ_GOV_LIMITS:
      mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex)
       if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL

And so store() will eventually result in a crash if cur_policy is
NULL at this point.

Introduce an additional variable which would guarantee serialization
here.
Reported-by: NStephen Boyd <sboyd@codeaurora.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

19c76303

cpufreq: don't allow governor limits to be changed when it is disabled · f73d3933

由 Viresh Kumar 提交于 8月 31, 2013

__cpufreq_governor() returns with -EBUSY when governor is already
stopped and we try to stop it again, but when it is stopped we must
not allow calls to CPUFREQ_GOV_LIMITS event as well.

This patch adds this check in __cpufreq_governor().
Reported-by: NStephen Boyd <sboyd@codeaurora.org>
Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

f73d3933

30 8月, 2013 1 次提交

cpufreq: Don't use smp_processor_id() in preemptible context · 69320783

由 Stephen Boyd 提交于 8月 28, 2013

Workqueues are preemptible even if works are queued on them with
queue_work_on(). Let's use raw_smp_processor_id() here to silence
the warning.

BUG: using smp_processor_id() in preemptible [00000000] code: kworker/3:2/674
caller is gov_queue_work+0x28/0xb0
CPU: 0 PID: 674 Comm: kworker/3:2 Tainted: G        W    3.10.0 #30
Workqueue: events od_dbs_timer
[<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
[<c0109dec>] (show_stack+0x10/0x14) from [<c03885a4>] (debug_smp_processor_id+0xbc/0xf0)
[<c03885a4>] (debug_smp_processor_id+0xbc/0xf0) from [<c0635864>] (gov_queue_work+0x28/0xb0)
[<c0635864>] (gov_queue_work+0x28/0xb0) from [<c0635618>] (od_dbs_timer+0x108/0x134)
[<c0635618>] (od_dbs_timer+0x108/0x134) from [<c01aa8f8>] (process_one_work+0x25c/0x444)
[<c01aa8f8>] (process_one_work+0x25c/0x444) from [<c01aaf88>] (worker_thread+0x200/0x344)
[<c01aaf88>] (worker_thread+0x200/0x344) from [<c01b03bc>] (kthread+0xa0/0xb0)
[<c01b03bc>] (kthread+0xa0/0xb0) from [<c01061b8>] (ret_from_fork+0x14/0x3c)
Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

69320783

29 8月, 2013 1 次提交

cpufreq: governor: Fix typos in comments · c4afc410

由 Stratos Karafotis 提交于 8月 26, 2013

 - 'Governer' should be 'Governor'.
 - 'S' is used for Siemens (electrical conductance) in SI units,
   so use small 's' for seconds.
Signed-off-by: NStratos Karafotis <stratosk@semaphore.gr>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

c4afc410

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功