Commits · c0f7f5b6c69107ca92909512533e70258ee19188 · openanolis / cloud-kernel

27 Apr, 2018 · 1 commit

cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer interrupt · c0f7f5b6

Authored by Shilpasri G Bhat on Apr 25, 2018

gpstate_timer_handler() uses synchronous smp_call to set the pstate
on the requested core. This causes the below hard lockup:

  smp_call_function_single+0x110/0x180 (unreliable)
  smp_call_function_any+0x180/0x250
  gpstate_timer_handler+0x1e8/0x580
  call_timer_fn+0x50/0x1c0
  expire_timers+0x138/0x1f0
  run_timer_softirq+0x1e8/0x270
  __do_softirq+0x158/0x3e4
  irq_exit+0xe8/0x120
  timer_interrupt+0x9c/0xe0
  decrementer_common+0x114/0x120
  -- interrupt: 901 at doorbell_global_ipi+0x34/0x50
  LR = arch_send_call_function_ipi_mask+0x120/0x130
  arch_send_call_function_ipi_mask+0x4c/0x130
  smp_call_function_many+0x340/0x450
  pmdp_invalidate+0x98/0xe0
  change_huge_pmd+0xe0/0x270
  change_protection_range+0xb88/0xe40
  mprotect_fixup+0x140/0x340
  SyS_mprotect+0x1b4/0x350
  system_call+0x58/0x6c

One way to avoid this is removing the smp-call. We can ensure that the
timer always runs on one of the policy-cpus. If the timer gets
migrated to a cpu outside the policy then re-queue it back on the
policy->cpus. This way we can get rid of the smp-call which was being
used to set the pstate on the policy->cpus.
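
The re-queue idea described above can be sketched in plain C. This is a minimal userspace model, not the driver's code: the 64-bit mask type, the helper names and the choice of the lowest-numbered policy CPU as the re-queue target are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is in the policy. */
typedef uint64_t cpu_mask_t;

static bool cpu_in_mask(unsigned int cpu, cpu_mask_t mask)
{
    return cpu < 64 && ((mask >> cpu) & 1u);
}

/* Lowest-numbered CPU in the mask; the mask must not be empty. */
static unsigned int first_cpu(cpu_mask_t mask)
{
    unsigned int cpu = 0;

    assert(mask != 0);
    while (!((mask >> cpu) & 1u))
        cpu++;
    return cpu;
}

enum timer_action { SET_PSTATE_LOCALLY, REQUEUE_ON_POLICY_CPU };

struct timer_decision {
    enum timer_action action;
    unsigned int target_cpu;
};

/*
 * Decide what the timer handler does on the CPU it happens to run on:
 * act locally when that CPU belongs to the policy, otherwise re-queue
 * the timer on a policy CPU. No cross-CPU call is needed in either case.
 */
static struct timer_decision gpstate_timer_decide(unsigned int this_cpu,
                                                  cpu_mask_t policy_cpus)
{
    struct timer_decision d;

    if (cpu_in_mask(this_cpu, policy_cpus)) {
        d.action = SET_PSTATE_LOCALLY;
        d.target_cpu = this_cpu;
    } else {
        d.action = REQUEUE_ON_POLICY_CPU;
        d.target_cpu = first_cpu(policy_cpus);
    }
    return d;
}
```

With a policy mask of 0xF0 (CPUs 4 to 7), a handler running on CPU 5 acts locally, while one running on CPU 1 re-queues the timer on CPU 4.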

Fixes: 7bc54b65 ("timers, cpufreq/powernv: Initialize the gpstate timer as pinned")
Cc: stable@vger.kernel.org # v4.8+
Reported-by: Nicholas Piggin <npiggin@gmail.com>
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

19 Mar, 2018 · 1 commit

cpufreq: powernv: Don't validate the frequency table twice · bf14721c

Authored by Viresh Kumar on Mar 05, 2018

The cpufreq core is already validating the CPU frequency table after
calling the ->init() callback of the cpufreq drivers and the drivers
don't need to do the same anymore. Though they need to set the
policy->freq_table field directly from the ->init() callback now.

Stop validating the frequency table from the powernv driver.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

12 Jan, 2018 · 1 commit

cpufreq: powernv: Dont assume distinct pstate values for nominal and pmin · 3fa4680b

Authored by Shilpasri G Bhat on Jan 12, 2018

Some OpenPOWER boxes can have the same pstate values for the nominal and
pmin pstates. On these boxes the current code does not initialize the
'powernv_pstate_info.min' variable, which results in erroneous CPU
frequency reporting. This patch fixes this problem.
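
One plausible way such a field can stay uninitialized is an else-if chain that lets each pstate id match only one role. The sketch below contrasts that with independent checks. It is an assumption about the shape of the problem, not a copy of the driver; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Frequency-table indices of interest; -1 means "never assigned". */
struct pstate_idx_info {
    int min;
    int max;
    int nominal;
};

/*
 * Chained form: an id can match at most one branch, so when the nominal
 * and pmin ids are equal the last branch is never reached.
 */
static void record_chained(const int *ids, size_t n, int pmin, int pnominal,
                           int pmax, struct pstate_idx_info *info)
{
    assert(ids != NULL && info != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ids[i] == pmax)
            info->max = (int)i;
        else if (ids[i] == pnominal)
            info->nominal = (int)i;
        else if (ids[i] == pmin)
            info->min = (int)i;
    }
}

/* Independent checks: one table entry may fill several roles at once. */
static void record_independent(const int *ids, size_t n, int pmin,
                               int pnominal, int pmax,
                               struct pstate_idx_info *info)
{
    assert(ids != NULL && info != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ids[i] == pmax)
            info->max = (int)i;
        if (ids[i] == pnominal)
            info->nominal = (int)i;
        if (ids[i] == pmin)
            info->min = (int)i;
    }
}
```

With ids {0, -1, -2} and nominal equal to pmin (-2), the chained form leaves min at -1, while the independent form sets both nominal and min to index 2.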

Fixes: 09ca4c9b (cpufreq: powernv: Replacing pstate_id with frequency table index)
Reported-by: Alvin Wang <wangat@tw.ibm.com>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.8+ <stable@vger.kernel.org> # 4.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

05 Jan, 2018 · 3 commits

powernv-cpufreq: Treat pstates as opaque 8-bit values · 967b87fd

Authored by Gautham R. Shenoy on Dec 13, 2017

On POWER8 and POWER9, the PMSR and the PMCR registers define pstates
to be 8-bit wide values. The device-tree exports pstates as 32-bit
wide values of which the lower byte is the actual pstate.

The current implementation in the kernel treats pstates as integer
type, since it used to use the sign of the pstate for performing some
boundary-checks. This is no longer required after the patch
"powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous
pstates".

So, in this patch, we modify the powernv-cpufreq driver to uniformly
treat pstates as opaque 8-bit values obtained from the device-tree or
the PMCR. This simplifies the extract_pstate() helper function since
we no longer need to worry about sign extension.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates · 332f0a01

Authored by Gautham R. Shenoy on Dec 13, 2017

The code in powernv-cpufreq makes the following two assumptions which
are not guaranteed by the device-tree bindings:

    1) Pstate ids are contiguous: This is used in pstate_to_idx() to
       obtain the reverse map from a pstate to its corresponding
       entry in the cpufreq frequency table.

    2) Every Pstate should always lie between the max and the min
       pstates that are explicitly reported in the device tree: This
       is used to determine whether a pstate reported by the PMSR is
       out of bounds.

Both these assumptions are unwarranted and can change on future
platforms.

In this patch, we maintain the reverse map from a pstate to its index
in the cpufreq frequency table and use this in pstate_to_idx(). This
does away with assumption (1) mentioned above, and will work with
non-contiguous pstate ids. If no entry exists for a particular
pstate, then such a pstate is treated as being out of bounds. This
gets rid of assumption (2).
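
A reverse map of this kind can be sketched as a 256-entry lookup table keyed by the low byte of the pstate. This is only an illustration under that assumption; the table layout and the names revmap_build() and IDX_INVALID are hypothetical, and the driver may use a different data structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_PSTATE_VALUES 256   /* every possible 8-bit pstate value */
#define IDX_INVALID      (-1)  /* marks "no frequency-table entry" */

/* pstate value -> frequency-table index, or IDX_INVALID. */
static int16_t pstate_revmap[NR_PSTATE_VALUES];

/* Build the reverse map from the list of pstates in frequency-table order. */
static void revmap_build(const uint8_t *pstates, size_t n)
{
    assert(pstates != NULL && n <= INT16_MAX);
    for (size_t i = 0; i < NR_PSTATE_VALUES; i++)
        pstate_revmap[i] = IDX_INVALID;
    for (size_t i = 0; i < n; i++)
        pstate_revmap[pstates[i]] = (int16_t)i;
}

/*
 * Constant-time lookup. A pstate without an entry is reported as out of
 * bounds, with no assumption that ids are contiguous or lie between a
 * min and a max.
 */
static int pstate_to_idx(uint8_t pstate)
{
    return pstate_revmap[pstate];
}
```

For a table built from the non-contiguous ids {0x00, 0x05, 0x82}, looking up 0x82 yields index 2 and looking up 0x03 yields IDX_INVALID.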

On all the existing platforms, where the pstates are 8-bit long
values, the new implementation of pstate_to_idx() takes constant time.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

powernv-cpufreq: Add helper to extract pstate from PMSR · ee1f4a7d

Authored by Gautham R. Shenoy on Dec 13, 2017

On the POWERNV platform, the fields for pstates in the Power Management
Status Register (PMSR) and the Power Management Control Register
(PMCR) are 8 bits wide. On POWER8 the pstates are negatively numbered
while on POWER9 they are positively numbered.

The device-tree exports pstates as 32-bit entries. The device-tree
implementation sign-extends the 8-bit pstate values to obtain the
corresponding 32-bit entry.

E.g.: On POWER8, a pstate value 0x82 [-126] is represented in the
device-tree as 0xffffff82 while on POWER9, the same value 0x82 [130]
is represented in the device-tree as 0x00000082.

The powernv-cpufreq driver implementation represents pstates using the
integer type. In multiple places in the driver, the code interprets
the pstates extracted from the PMSR as a signed byte and assigns it to
an integer variable to get the sign extension.

On POWER9 platforms which have more than 128 pstates, this results
in the driver performing incorrect sign extension, and thereby
treating a legitimate pstate (say 130) as an invalid pstate (since it
is interpreted as -126).
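
The two readings of the same raw byte can be shown with a short C sketch. The function names are hypothetical and this is not the driver's helper; it only reproduces the arithmetic of the 0x82 example.

```c
#include <assert.h>
#include <stdint.h>

/* Read the raw 8-bit field as a signed byte (negative pstate numbering). */
static int pstate_as_signed(uint8_t raw)
{
    return raw >= 0x80 ? (int)raw - 256 : (int)raw;
}

/* Read the raw 8-bit field as an unsigned byte (positive pstate numbering). */
static int pstate_as_unsigned(uint8_t raw)
{
    return (int)raw;
}

/* The 32-bit pattern produced by sign-extending the signed reading. */
static uint32_t pstate_sign_extended(uint8_t raw)
{
    return (uint32_t)pstate_as_signed(raw);
}
```

For the raw value 0x82, the signed reading is -126 (0xffffff82 as a 32-bit pattern) and the unsigned reading is 130, which is why applying the signed reading on a platform with positive numbering turns a valid pstate into an apparently invalid one.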

This patch fixes the issue by implementing a helper function to
extract Pstates from PMSR register, and correctly sign-extend it to be
consistent with the values provided by the device-tree.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

05 Oct, 2017 · 1 commit

timer: Remove init_timer_pinned_deferrable() in favor of timer_setup() · 1d1fe902

Authored by Kees Cook on Oct 04, 2017

This refactors the only user of init_timer_pinned_deferrable() to use the
new timer_setup() and from_timer(). Adds a pointer back to the policy,
and drops the definition of init_timer_pinned_deferrable().

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mips@linux-mips.org
Cc: Petr Mladek <pmladek@suse.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Kalle Valo <kvalo@qca.qualcomm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: linux1394-devel@lists.sourceforge.net
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: linux-s390@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ursula Braun <ubraun@linux.vnet.ibm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Harish Patil <harish.patil@cavium.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Manish Chopra <manish.chopra@cavium.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-pm@vger.kernel.org
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Mark Gross <mark.gross@intel.com>
Cc: linux-watchdog@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Michael Reed <mdr@sgi.com>
Cc: netdev@vger.kernel.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Link: https://lkml.kernel.org/r/1507159627-127660-3-git-send-email-keescook@chromium.org

04 Feb, 2017 · 1 commit

cpufreq: powernv: Add boost files to export ultra-turbo frequencies · b12f7a2b

Authored by Shilpasri G Bhat on Jan 03, 2017

In P8+, Workload Optimized Frequency (WOF) provides the capability to
boost the cpu frequency based on the utilization of the other cpus
running in the chip. The On-Chip-Controller (OCC) firmware will control
the achievability of these frequencies depending on the power headroom
available in the chip. Currently the ultra-turbo frequencies provided
by this feature are exported along with the turbo and sub-turbo
frequencies as scaling_available_frequencies. This patch will export
the ultra-turbo frequencies separately as scaling_boost_frequencies in
WOF-enabled systems. This patch will also add the boost sysfs file, which
can be used to disable/enable ultra-turbo frequencies.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

17 Nov, 2016 · 1 commit

cpufreq: powernv: Disable preemption while checking CPU throttling state · 8a10c06a

Authored by Denis Kirjanov on Nov 08, 2016

With preemption turned on we can read incorrect throttling state
while being switched to CPU on a different chip.

 BUG: using smp_processor_id() in preemptible [00000000] code: cat/7343
 caller is .powernv_cpufreq_throttle_check+0x2c/0x710
 CPU: 13 PID: 7343 Comm: cat Not tainted 4.8.0-rc5-dirty #1
 Call Trace:
 [c0000007d25b75b0] [c000000000971378] .dump_stack+0xe4/0x150 (unreliable)
 [c0000007d25b7640] [c0000000005162e4] .check_preemption_disabled+0x134/0x150
 [c0000007d25b76e0] [c0000000007b63ac] .powernv_cpufreq_throttle_check+0x2c/0x710
 [c0000007d25b7790] [c0000000007b6d18] .powernv_cpufreq_target_index+0x288/0x360
 [c0000007d25b7870] [c0000000007acee4] .__cpufreq_driver_target+0x394/0x8c0
 [c0000007d25b7920] [c0000000007b22ac] .cpufreq_set+0x7c/0xd0
 [c0000007d25b79b0] [c0000000007adf50] .store_scaling_setspeed+0x80/0xc0
 [c0000007d25b7a40] [c0000000007ae270] .store+0xa0/0x100
 [c0000007d25b7ae0] [c0000000003566e8] .sysfs_kf_write+0x88/0xb0
 [c0000007d25b7b70] [c0000000003553b8] .kernfs_fop_write+0x178/0x260
 [c0000007d25b7c10] [c0000000002ac3cc] .__vfs_write+0x3c/0x1c0
 [c0000007d25b7cf0] [c0000000002ad584] .vfs_write+0xc4/0x230
 [c0000007d25b7d90] [c0000000002aeef8] .SyS_write+0x58/0x100
 [c0000007d25b7e30] [c00000000000bfec] system_call+0x38/0xfc

Fixes: 09a972d1 (cpufreq: powernv: Report cpu frequency throttling)
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

15 Nov, 2016 · 1 commit

cpufreq: powernv: Fix uninitialized lpstate_idx in gpstates_timer_handler() · c9a81e68

Authored by Akshay Adiga on Nov 14, 2016

lpstate_idx remains uninitialized in the case when elapsed_time
is greater than MAX_RAMP_DOWN_TIME. At the end of rampdown the
global pstate should be equal to the local pstate.

Fixes: 20b15b76 (cpufreq: powernv: Use PMCR to verify global and local pstate)
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

11 Nov, 2016 · 2 commits

cpufreq: powernv: Use PMCR to verify global and local pstate · 20b15b76

Authored by Akshay Adiga on Nov 08, 2016

As fast_switch() may get called with interrupts disabled, we cannot
hold a mutex to update the global_pstate_info. So currently, fast_switch()
does not update the global_pstate_info and it will end up with stale data
whenever pstate is updated through fast_switch().

As the gpstate_timer can fire after fast_switch() has updated the pstates,
the timer handler cannot rely on the cached values of local and global
pstate and needs to read them from the PMCR.

Only gpstate_timer_handler() is affected by the stale cached pstate data
because either the fast_switch() or the target_index() routine will be
called for a given governor, but gpstate_timer can fire after the governor
has changed to schedutil.

Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpufreq: powernv: Adding fast_switch for schedutil · 60c9efb8

Authored by Akshay Adiga on Nov 08, 2016

Add fast_switch, which performs a lightweight operation to set the desired
pstate. Both global and local pstates are set to the same desired pstate.

Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

06 Aug, 2016 · 1 commit

cpufreq: powernv: Fix crash in gpstate_timer_handler() · 8e859467

Authored by Akshay Adiga on Aug 04, 2016

Commit 09ca4c9b (cpufreq: powernv: Replacing pstate_id with
frequency table index) changes calc_global_pstate() to use
cpufreq_table index instead of pstate_id.

But in gpstate_timer_handler(), pstate_id was being passed instead
of cpufreq_table index, which caused index_to_pstate() to access
out of bound indices, leading to this crash.

Adding sanity check for index and pstate, to ensure only valid pstate
and index values are returned.
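
A bounds-checked index lookup of the kind described can be sketched as follows. The struct, the function name and the frequency values are made up for illustration; the driver's own sanity check may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One frequency-table row: the pstate id and its frequency in kHz. */
struct freq_entry {
    int pstate_id;
    unsigned int khz;
};

/*
 * Convert a frequency-table index to a pstate id, refusing any index
 * outside the table. A pstate id passed by mistake (for example -2)
 * is rejected instead of being used as an array offset.
 */
static bool idx_to_pstate_checked(const struct freq_entry *table,
                                  size_t nr_entries, int idx,
                                  int *pstate_out)
{
    assert(table != NULL && pstate_out != NULL);
    if (idx < 0 || (size_t)idx >= nr_entries)
        return false;
    *pstate_out = table[idx].pstate_id;
    return true;
}
```

With a three-entry table whose pstate ids are 0, -1 and -2, index 2 maps to pstate -2, while passing the pstate id -2 or the index 3 is rejected and the output is left untouched.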

Call Trace:
[c00000078d66b130] [c00000000011d224] __free_irq+0x234/0x360 (unreliable)
[c00000078d66b1c0] [c00000000011d44c] free_irq+0x6c/0xa0
[c00000078d66b1f0] [c00000000006c4f8] opal_event_shutdown+0x88/0xd0
[c00000078d66b230] [c000000000067a4c] opal_shutdown+0x1c/0x90
[c00000078d66b260] [c000000000063a00] pnv_shutdown+0x20/0x40
[c00000078d66b280] [c000000000021538] machine_restart+0x38/0x90
[c0000000078d66b310] [c000000000965ea0] panic+0x284/0x300
[c00000078d66b3a0] [c00000000001f508] die+0x388/0x450
[c00000078d66b430] [c000000000045a50] bad_page_fault+0xd0/0x140
[c00000078d66b4a0] [c000000000008964] handle_page_fault+0x2c/0x30
   interrupt: 300 at gpstate_timer_handler+0x150/0x260
    LR = gpstate_timer_handler+0x130/0x260
[c00000078d66b7f0] [c000000000132b58] call_timer_fn+0x58/0x1c0
[c00000078d66b880] [c000000000132e20] expire_timers+0x130/0x1d0
[c00000078d66b8f0] [c000000000133068] run_timer_softirq+0x1a8/0x230
[c00000078d66b980] [c0000000000b535c] __do_softirq+0x18c/0x400
[c00000078d66ba70] [c0000000000b5828] irq_exit+0xc8/0x100
[c00000078d66ba90] [c00000000001e214] timer_interrupt+0xa4/0xe0
[c00000078d66bac0] [c0000000000027d0] decrementer_common+0x150/0x180
   interrupt: 901 at arch_local_irq_restore+0x74/0x90
  0] [c000000000106b34] call_cpuidle+0x44/0x90
[c00000078d66be50] [c00000000010708c] cpu_startup_entry+0x38c/0x460
[c00000078d66bf20] [c00000000003d930] start_secondary+0x330/0x380
[c00000078d66bf90] [c000000000008e6c] start_secondary_prolog+0x10/0x14

Fixes: 09ca4c9b (cpufreq: powernv: Replacing pstate_id with frequency table index)
Reported-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

12 Jul, 2016 · 1 commit

cpufreq: powernv: Replacing pstate_id with frequency table index · 09ca4c9b

Authored by Akshay Adiga on Jun 30, 2016

Refactoring code to use frequency table index instead of pstate_id.
This abstraction will make the code independent of the pstate values.

- No functional changes
- The highest frequency is at frequency table index 0 and the frequency
  decreases as the index increases.
- Macros pstates_to_idx() and idx_to_pstate() can be used for conversion
  between pstate_id and index.
- powernv_pstate_info now contains frequency table index to min, max and
  nominal frequency (instead of pstate_ids)
- global_pstate_info now stores index values instead of pstate ids.
- Variables renamed as *_idx now store an index instead of a pstate.

Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

07 Jul, 2016 · 2 commits

timers, cpufreq/powernv: Initialize the gpstate timer as pinned · 7bc54b65

Authored by Thomas Gleixner on Jul 04, 2016

Pinned timers must carry the pinned attribute in the timer structure
itself, so convert the code to the new API.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.297014487@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

cpufreq: Reuse new freq-table helpers · 82577360

Authored by Viresh Kumar on Jun 27, 2016

This patch migrates a few users of cpufreq tables to the new helpers
that work on sorted freq-tables.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

09 Jun, 2016 · 2 commits

cpufreq: Return index from cpufreq_frequency_table_target() · d218ed77

Authored by Viresh Kumar on Jun 03, 2016

This routine can't fail unless the frequency table is invalid and
doesn't contain any valid entries.

Make it return the index and WARN() in case it is used for an invalid
table.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpufreq: Drop freq-table param to cpufreq_frequency_table_target() · 7ab4aabb

Authored by Viresh Kumar on Jun 03, 2016

The policy already has this pointer set, use it instead.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

11 May, 2016 · 2 commits

cpufreq: powernv: del_timer_sync when global and local pstate are equal · 0bc10b93

Authored by Akshay Adiga on May 03, 2016

When global and local pstate are equal in a powernv_target_index() call,
we don't queue a timer. But we may already have a timer queued for the
future, which would fire one additional time to no purpose.

Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpufreq: powernv: Move smp_call_function_any() out of irq safe block · 1fd3ff28

Authored by Akshay Adiga on May 03, 2016

Fix a WARN_ON caused by smp_call_function_any() when interrupts are
disabled, because of changes made in the patch ('cpufreq: powernv:
Ramp-down global pstate slower than local-pstate')
https://patchwork.ozlabs.org/patch/612058/

 WARNING: CPU: 0 PID: 4 at kernel/smp.c:291 smp_call_function_single+0x170/0x180

 Call Trace:
 [c0000007f648f9f0] [c0000007f648fa90] 0xc0000007f648fa90 (unreliable)
 [c0000007f648fa30] [c0000000001430e0] smp_call_function_any+0x170/0x1c0
 [c0000007f648fa90] [c0000000007b4b00] powernv_cpufreq_target_index+0xe0/0x250
 [c0000007f648fb00] [c0000000007ac9dc] __cpufreq_driver_target+0x20c/0x3d0
 [c0000007f648fbc0] [c0000000007b1b4c] od_dbs_timer+0xcc/0x260
 [c0000007f648fc10] [c0000000007b3024] dbs_work_handler+0x54/0xa0
 [c0000007f648fc50] [c0000000000c49a8] process_one_work+0x1d8/0x590
 [c0000007f648fce0] [c0000000000c4e08] worker_thread+0xa8/0x660
 [c0000007f648fd80] [c0000000000cca88] kthread+0x108/0x130
 [c0000007f648fe30] [c0000000000095e8] ret_from_kernel_thread+0x5c/0x74

- Calling smp_call_function_any() with interrupts disabled (through
 spin_lock_irqsave) could cause a deadlock, as smp_call_function_any()
 relies on the IPI to complete. This is detected in the
 smp_call_function_any() call and hence the WARN_ON.

- The spinlock (gpstates->lock) is only used to synchronize access to
 global_pstate_info between the timer irq handler and target_index calls,
 and the timer irq handler only try_locks(), so it cannot cause a
 deadlock. Hence the spinlock does not need to be irq safe.

- As smp_call_function_any() is a blocking call and does not access
 global_pstate_info, the critical section can be reduced by moving
 smp_call_function_any() after giving up the lock.
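
The pattern in the last point (update shared state under the lock, snapshot what the blocking call needs, release the lock, then block) can be sketched with a toy lock. Everything here is a hypothetical userspace stand-in: toy_lock is just a flag, and apply_pstates_blocking() takes the place of the cross-CPU call.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: a flag is enough to show where the critical section ends. */
struct toy_lock { bool held; };

static void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
static void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = false; }

/* Shared bookkeeping protected by the lock. */
struct gpstate_state {
    struct toy_lock lock;
    int local_idx;
    int global_idx;
};

/* Snapshot handed to the blocking call once the lock is released. */
struct pstate_request {
    int local_idx;
    int global_idx;
};

static bool blocking_call_saw_lock_held;
static int last_applied_local = -1;

/* Stand-in for the blocking cross-CPU call; it records the lock state. */
static void apply_pstates_blocking(const struct gpstate_state *s,
                                   const struct pstate_request *req)
{
    blocking_call_saw_lock_held = s->lock.held;
    last_applied_local = req->local_idx;
}

static void target_index(struct gpstate_state *s, int new_idx)
{
    struct pstate_request req;

    toy_lock_acquire(&s->lock);
    s->local_idx = new_idx;          /* bookkeeping stays under the lock */
    req.local_idx = s->local_idx;    /* copy what the blocking call needs */
    req.global_idx = s->global_idx;
    toy_lock_release(&s->lock);

    apply_pstates_blocking(s, &req); /* blocking work happens lock-free */
}
```

The blocking call works on the private snapshot, so it never needs the shared structure while it waits.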

Reported-by: Abdul Haleem <abdhalee@linux.vnet.linux.com>
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

28 Apr, 2016 · 2 commits

cpufreq: powernv: Ramp-down global pstate slower than local-pstate · eaa2c3ae

Authored by Akshay Adiga on Apr 19, 2016

The frequency transition latency from pmin to pmax is observed to be on
the order of a few milliseconds, which usually incurs a performance
penalty during sudden frequency ramp-up requests.

This patch set solves this problem by using an entity called "global
pstates". The global pstate is a chip-level entity, so the global entity
(voltage) is managed across the cores. The local pstate is a core-level
entity, so the local entity (frequency) is managed across threads.

This patch brings down the global pstate at a slower rate than the local
pstate. Holding the global pstate higher than the local pstate makes
subsequent ramp-ups faster.

A per policy structure is maintained to keep track of the global and
local pstate changes. The global pstate is brought down using a parabolic
equation. The ramp down time to pmin is set to ~5 seconds. To make sure
that the global pstates are dropped at regular intervals, a timer is
queued for every 2 seconds during ramp-down phase, which eventually brings
the pstate down to local pstate.
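
A parabolic ramp-down of this kind can be sketched as below. The text only states that the curve is parabolic and reaches pmin after about 5 seconds; the 5120 ms window and the shift by 18 bits are assumptions chosen so that the percentage reaches exactly 100 at the end of the window, and the function names are hypothetical. Indices follow the frequency-table convention where a larger index means a lower frequency.

```c
#include <assert.h>

/* Assumed ramp-down window in milliseconds (about 5 seconds). */
#define MAX_RAMP_DOWN_TIME_MS 5120u

/*
 * Quadratic ramp: percent = t^2 / 2^18, so 5120 ms gives exactly 100.
 * The curve is slow at first and steeper towards the end of the window.
 */
static unsigned int ramp_down_percent(unsigned int elapsed_ms)
{
    if (elapsed_ms >= MAX_RAMP_DOWN_TIME_MS)
        return 100;
    return (elapsed_ms * elapsed_ms) >> 18;
}

/*
 * Global index after elapsed_ms: start from the highest index requested
 * recently and move towards min_idx along the parabola, but never drop
 * below the frequency of the current local index.
 */
static int calc_global_idx(unsigned int elapsed_ms, int highest_idx,
                           int local_idx, int min_idx)
{
    int diff;

    assert(highest_idx <= local_idx && local_idx <= min_idx);
    diff = (int)ramp_down_percent(elapsed_ms) * (min_idx - highest_idx) / 100;
    if (highest_idx + diff >= local_idx)
        return local_idx;
    return highest_idx + diff;
}
```

With highest_idx 0, local_idx 30 and min_idx 40, the global index is 0 at the start, 6 after one 2-second timer tick, 24 after two ticks, and it settles at the local index 30 once the window has elapsed.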

Iozone results show fairly consistent performance boost.
YCSB on redis shows improved Max latencies in most cases.

Iozone write/rewrite tests were run with file sizes of 200704 KB and 401408 KB
and different record sizes. The following table shows I/O operations/sec
with and without the patch.

Iozone results (in ops/sec, mean over 3 iterations)
----------------------------------------------------------------------
file size-                      with            without         %
recordsize-IOtype               patch           patch           change
----------------------------------------------------------------------
200704-1-SeqWrite               1616532         1615425         0.06
200704-1-Rewrite                2423195         2303130         5.21
200704-2-SeqWrite               1628577         1602620         1.61
200704-2-Rewrite                2428264         2312154         5.02
200704-4-SeqWrite               1617605         1617182         0.02
200704-4-Rewrite                2430524         2351238         3.37
200704-8-SeqWrite               1629478         1600436         1.81
200704-8-Rewrite                2415308         2298136         5.09
200704-16-SeqWrite              1619632         1618250         0.08
200704-16-Rewrite               2396650         2352591         1.87
200704-32-SeqWrite              1632544         1598083         2.15
200704-32-Rewrite               2425119         2329743         4.09
200704-64-SeqWrite              1617812         1617235         0.03
200704-64-Rewrite               2402021         2321080         3.48
200704-128-SeqWrite             1631998         1600256         1.98
200704-128-Rewrite              2422389         2304954         5.09
200704-256-SeqWrite             1617065         1616962         0.00
200704-256-Rewrite              2432539         2301980         5.67
200704-512-SeqWrite             1632599         1598656         2.12
200704-512-Rewrite              2429270         2323676         4.54
200704-1024-SeqWrite            1618758         1616156         0.16
200704-1024-Rewrite             2431631         2315889         4.99
401408-1-SeqWrite               1631479         1608132         1.45
401408-1-Rewrite                2501550         2459409         1.71
401408-2-SeqWrite               1617095         1626069         -0.55
401408-2-Rewrite                2507557         2443621         2.61
401408-4-SeqWrite               1629601         1611869         1.10
401408-4-Rewrite                2505909         2462098         1.77
401408-8-SeqWrite               1617110         1626968         -0.60
401408-8-Rewrite                2512244         2456827         2.25
401408-16-SeqWrite              1632609         1609603         1.42
401408-16-Rewrite               2500792         2451405         2.01
401408-32-SeqWrite              1619294         1628167         -0.54
401408-32-Rewrite               2510115         2451292         2.39
401408-64-SeqWrite              1632709         1603746         1.80
401408-64-Rewrite               2506692         2433186         3.02
401408-128-SeqWrite             1619284         1627461         -0.50
401408-128-Rewrite              2518698         2453361         2.66
401408-256-SeqWrite             1634022         1610681         1.44
401408-256-Rewrite              2509987         2446328         2.60
401408-512-SeqWrite             1617524         1628016         -0.64
401408-512-Rewrite              2504409         2442899         2.51
401408-1024-SeqWrite            1629812         1611566         1.13
401408-1024-Rewrite             2507620         2442968         2.64

Tested with a YCSB workload (50% update + 50% read) over redis for 1 million
records and 1 million operations. Each test was carried out with a target
number of operations per second and with persistence disabled.

Max-latency (in us)( mean over 5 iterations )
---------------------------------------------------------------
op/s    Operation       with patch      without patch   %change
---------------------------------------------------------------
15000   Read            61480.6         50261.4         22.32
15000   cleanup         215.2           293.6           -26.70
15000   update          25666.2         25163.8         2.00

25000   Read            32626.2         89525.4         -63.56
25000   cleanup         292.2           263.0           11.10
25000   update          32293.4         90255.0         -64.22

35000   Read            34783.0         33119.0         5.02
35000   cleanup         321.2           395.8           -18.8
35000   update          36047.0         38747.8         -6.97

40000   Read            38562.2         42357.4         -8.96
40000   cleanup         371.8           384.6           -3.33
40000   update          27861.4         41547.8         -32.94

45000   Read            42271.0         88120.6         -52.03
45000   cleanup         263.6           383.0           -31.17
45000   update          29755.8         81359.0         -63.43

(test without target op/s)
47659   Read            83061.4         136440.6        -39.12
47659   cleanup         195.8           193.8           1.03
47659   update          73429.4         124971.8        -41.24

Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpufreq: powernv: Remove flag use-case of policy->driver_data · 2920e9ce

Authored by Shilpasri G Bhat on Apr 19, 2016

Commit 1b028984 ("cpufreq: powernv: Add sysfs attributes to show
throttle stats") used policy->driver_data as a flag for one-time creation
of throttle sysfs files. Instead, use 'kernfs_find_and_get()' to
check if the attribute already exists. This is required as
policy->driver_data is used for other purposes in a later patch.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

23 Mar, 2016 · 1 commit

cpufreq: powernv: Add sysfs attributes to show throttle stats · 1b028984

Authored by Shilpasri G Bhat on Mar 22, 2016

Create sysfs attributes to export throttle information in
/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats directory. The
newly added sysfs files are as follows:

 1)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/turbo_stat
 2)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/sub-turbo_stat
 3)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/unthrottle
 4)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/powercap
 5)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/overtemp
 6)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/supply_fault
 7)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/overcurrent
 8)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/occ_reset

A detailed explanation of each attribute is added to
Documentation/ABI/testing/sysfs-devices-system-cpu

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

22 Mar, 2016 · 1 commit

cpufreq: powernv: Define per_cpu chip pointer to optimize hot-path · 3e5963bc

Authored by Michael Neuling on Mar 21, 2016

Commit 96c4726f "cpufreq: powernv: Remove cpu_to_chip_id() from
hot-path" introduced a 'core_to_chip_map' array to cache the chip-ids
of all cores.

Replace this with a per-CPU variable that stores the pointer to the
chip-array. This removes the linear lookup and provides a neater and
simpler solution.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

27 Feb, 2016 · 1 commit

cpufreq: powernv: Fix bugs in powernv_cpufreq_{init/exit} · c5e29ea7

Authored by Shilpasri G Bhat on Feb 26, 2016

Unregister the notifiers if cpufreq_driver_register() fails in
powernv_cpufreq_init(). Re-arrange the unregistration and cleanup routines
in powernv_cpufreq_exit() to free all the resources after the driver
has unregistered.
Signed-off-by: NShilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

c5e29ea7
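
The unwind pattern that c5e29ea7 describes can be sketched with user-space stand-ins. Every function and variable below is hypothetical and only tracks state; the real driver calls cpufreq_driver_register() and the kernel notifier APIs, and its exact teardown order is assumed here from the commit text (driver first, resources afterwards).

```c
#include <errno.h>

/* Hypothetical state-tracking stand-ins so the sketch runs in user space. */
int notifiers_registered;
int driver_registered;
int fail_driver_register;   /* set to 1 to simulate a registration failure */

void register_all_notifiers(void)   { notifiers_registered = 1; }
void unregister_all_notifiers(void) { notifiers_registered = 0; }

int fake_driver_register(void)
{
    if (fail_driver_register)
        return -EINVAL;
    driver_registered = 1;
    return 0;
}

void fake_driver_unregister(void) { driver_registered = 0; }

/* Init: if the driver fails to register, undo the notifier registration. */
int sketch_init(void)
{
    int rc;

    register_all_notifiers();
    rc = fake_driver_register();
    if (rc)
        unregister_all_notifiers();
    return rc;
}

/* Exit: unregister the driver first, then release everything else. */
void sketch_exit(void)
{
    fake_driver_unregister();
    unregister_all_notifiers();
}
```

With fail_driver_register set, sketch_init() returns an error and leaves no notifier registered, which is the leak the commit closes.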

05 Feb, 2016 · 4 commits

cpufreq: powernv: Replace pr_info with trace print for throttle event · c89f2682

Committed by Shilpasri G Bhat on Feb 03, 2016

Currently we use a printk message to report the throttle event, which
can flood the console if the cpu is throttled frequently. Replace the
printk with a tracepoint to report the throttle event. Also reduce
events such as throttling below nominal frequency and OCC_RESET to
pr_warn/pr_warn_once, as pointed out by MFG, so that they are not marked
as critical messages. This patch adds 'throttle_reason' to struct chip
to store the throttle reason.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

c89f2682

cpufreq: powernv: Remove cpu_to_chip_id() from hot-path · 96c4726f

Committed by Shilpasri G Bhat on Feb 03, 2016

cpu_to_chip_id() walks the device tree to find the chip id, taking a
contended device tree lock. This adds unnecessary overhead in a hot
path. So instead of calling cpu_to_chip_id() every time, cache the chip
ids for all cores in the array 'core_to_chip_map' and use it in the
hot path.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

96c4726f

cpufreq: powernv: Hot-plug safe the kworker thread · 6d167a44

Committed by Shilpasri G Bhat on Feb 03, 2016

In the kworker_thread powernv_cpufreq_work_fn(), we can end up
sending an IPI to a cpu going offline. This is a rare corner case
which is fixed using {get/put}_online_cpus(). Along with this fix,
this patch adds changes to do oneshot cpumask_{clear/and} operation.

Suggested-by: Shreyas B Prabhu <shreyas@linux.vnet.ibm.com>
Suggested-by: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

6d167a44

cpufreq: powernv: Free 'chips' on module exit · 86622cb8

Committed by Shilpasri G Bhat on Feb 03, 2016

This will free the dynamically allocated memory of 'chips' on
module exit.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

86622cb8

17 Dec, 2015 · 1 commit

powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPAL · e4d54f71

Committed by Stewart Smith on Dec 09, 2015

Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is
just OPALv3, with nobody ever expecting anything on pre-OPALv3 to
be cared about or supported by mainline kernels.

So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL
exclusively.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

e4d54f71

26 Sep, 2015 · 1 commit

cpufreq : powernv: Report Pmax throttling if capped below nominal frequency · d43b1b6f

Committed by Shilpasri G Bhat on Sep 14, 2015

Log a 'critical' message if the max frequency is reduced below nominal
frequency. We already log an 'info' message if the max frequency is
capped below turbo frequency. The CPU should guarantee at least nominal
frequency, but not turbo frequency, in all system configurations and
environments. So report Pmax throttling with higher severity when Pmax
dips below nominal frequency.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

d43b1b6f
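
The severity rule in d43b1b6f amounts to a two-threshold comparison. A minimal sketch, assuming frequencies expressed in kHz and using hypothetical names (the driver itself works with pstate ids and kernel log levels):

```c
enum pmax_log_level { PMAX_LOG_NONE, PMAX_LOG_INFO, PMAX_LOG_CRIT };

/* Pick a log severity for the current Pmax cap (hypothetical helper).
 * All three arguments are frequencies in kHz. */
enum pmax_log_level classify_pmax(unsigned int pmax_khz,
                                  unsigned int nominal_khz,
                                  unsigned int turbo_khz)
{
    if (pmax_khz < nominal_khz)
        return PMAX_LOG_CRIT;   /* below the guaranteed frequency */
    if (pmax_khz < turbo_khz)
        return PMAX_LOG_INFO;   /* turbo lost, nominal still available */
    return PMAX_LOG_NONE;       /* not capped */
}
```

The nominal check comes first so that a cap below nominal is never downgraded to the 'info' case.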

01 Sep, 2015 · 1 commit

cpufreq: powernv: Increase the verbosity of OCC console messages · 309d0631

Committed by Shilpasri G Bhat on Aug 27, 2015

Modify the OCC reset/load/active event message to make it clearer for
the user to understand the event and effect of the event.

Suggested-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

309d0631

28 Jul, 2015 · 5 commits

cpufreq: powernv: Restore cpu frequency to policy->cur on unthrottling · 22794280

Committed by Shilpasri G Bhat on Jul 16, 2015

If frequency is throttled due to OCC reset then cpus will be in Psafe
frequency, so restore the frequency on all cpus to policy->cur when
OCCs are active again. And if frequency is throttled due to Pmax
capping then restore the frequency of all the cpus in the chip on
unthrottling.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

22794280

cpufreq: powernv: Report Psafe only if PMSR.psafe_mode_active bit is set · 3dd3ebe5

Committed by Shilpasri G Bhat on Jul 16, 2015

On a reset cycle of OCC, although the system retires from safe
frequency state the local pstate is not restored to Pmin or last
requested pstate. Now if the cpufreq governor initiates a pstate
change, the local pstate will be in Psafe and we will be reporting a
false positive when we are not throttled.

So in powernv_cpufreq_throttle_check() remove the condition which
checks if local pstate is less than Pmin while checking for Psafe
frequency. If the cpus are forced to Psafe then PMSR.psafe_mode_active
bit will be set. So, when OCCs become active this bit will be cleared.
Let us just rely on this bit for reporting throttling.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

3dd3ebe5
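
The change in 3dd3ebe5 can be pictured as dropping one term from a condition. In the sketch below the bit position of psafe_mode_active is a placeholder, the function names are hypothetical, and the way the old check combined its two terms (a logical OR) is an assumption, since the commit message only says that the "local pstate less than Pmin" condition was removed.

```c
#include <stdbool.h>
#include <stdint.h>

/* PLACEHOLDER bit position for PMSR.psafe_mode_active; not from the commit. */
#define SKETCH_PMSR_PSAFE_ACTIVE (UINT64_C(1) << 30)

/* Before (assumed shape): Psafe was also reported whenever the local
 * pstate was below Pmin, even with the PMSR bit clear. */
bool psafe_reported_before(uint64_t pmsr, int local_pstate, int pmin)
{
    return (pmsr & SKETCH_PMSR_PSAFE_ACTIVE) != 0 || local_pstate < pmin;
}

/* After: rely on the PMSR bit alone. */
bool psafe_reported_after(uint64_t pmsr)
{
    return (pmsr & SKETCH_PMSR_PSAFE_ACTIVE) != 0;
}
```

With the bit clear and a stale local pstate below Pmin, the first function reports throttling and the second does not, which is the false positive the commit describes.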

cpufreq: powernv: Call throttle_check() on receiving OCC_THROTTLE · 735366fc

Committed by Shilpasri G Bhat on Jul 16, 2015

Re-evaluate the chip's throttled state on receiving OCC_THROTTLE
notification by executing *throttle_check() on any one of the cpus on
the chip. This is a sanity check to verify if we were indeed
throttled/unthrottled after receiving OCC_THROTTLE notification.

We cannot call *throttle_check() directly from the notification
handler because we could be handling chip1's notification in chip2. So
initiate an smp_call to execute *throttle_check(). We are irq-disabled
in the notification handler, so use a worker thread to smp_call
throttle_check() on any of the cpus in the chipmask.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

735366fc

cpufreq: powernv: Register for OCC related opal_message notification · cb166fa9

Committed by Shilpasri G Bhat on Jul 16, 2015

OCC is an On-Chip-Controller which takes care of power and thermal
safety of the chip. During runtime due to power failure or
overtemperature the OCC may throttle the frequencies of the CPUs to
remain within the power budget.

We want the cpufreq driver to be aware of such situations to be able
to report the reason to the user. We register to opal_message_notifier
to receive OCC messages from opal.

powernv_cpufreq_throttle_check() reports any frequency throttling and
this patch will report the reason or event that caused throttling. We
can be throttled if OCC is reset or OCC limits Pmax due to power or
thermal reasons. We are also notified of unthrottling after an OCC
reset or if OCC restores Pmax on the chip.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cb166fa9

cpufreq: powernv: Handle throttling due to Pmax capping at chip level · 053819e0

Committed by Shilpasri G Bhat on Jul 16, 2015

The On-Chip-Controller (OCC) can throttle cpu frequency by reducing the
max allowed frequency for that chip if the chip exceeds its power or
temperature limits. As Pmax capping is a chip-level condition, report
this throttling behavior at chip level. Do not set the global
'throttled' on Pmax capping; instead set the per-chip throttled
variable. Report unthrottling if Pmax is restored after throttling.

This patch adds a structure to store chip id and throttled state of
the chip.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

053819e0

02 Apr, 2015 · 1 commit

cpufreq: powernv: Report cpu frequency throttling · 09a972d1

Committed by Shilpasri G Bhat on Apr 01, 2015

The power and thermal safety of the system is taken care of by an
On-Chip-Controller (OCC), which is a real-time subsystem embedded within
the POWER8 processor. OCC continuously monitors the memory and core
temperature, the total system power, and the state of the power supply
and fan.

The cpu frequency can be throttled by OCC for the following reasons:
1) If a processor crosses its power and temperature limit then OCC will
   lower its Pmax to reduce the frequency and voltage.
2) If OCC crashes then the system is forced to Psafe frequency.
3) If OCC fails to recover then the kernel is not allowed to do any
   further frequency changes and the chip will remain in Psafe.

The user can see a drop in performance when frequency is throttled
without being aware of the throttling. So detect and report such a
condition, so the user can check the OCC status to reboot the system or
check for power supply or fan failures.

The current status of the core is read from the Power Management Status
Register (PMSR) to check if any of the throttling conditions has
occurred, and the appropriate throttling message is reported.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

09a972d1

29 Sep, 2014 · 2 commits

cpufreq: powernv: Set the cpus to nominal frequency during reboot/kexec · cf30af76

Committed by Shilpasri G Bhat on Sep 29, 2014

This patch ensures that the cpus kexec/reboot at nominal frequency.
Nominal frequency is the highest cpu frequency on PowerPC at
which the cores can run without getting throttled.

If the host kernel had set the cpus to a low pstate and then it
kexecs/reboots to a cpufreq disabled kernel it would cause the target
kernel to perform poorly. It will also increase the boot up time of
the target kernel. So set the cpus to a high pstate, in this case to
nominal frequency, before rebooting to avoid such scenarios.

The reboot notifier will set the cpus to nominal frequency.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cf30af76

cpufreq: powernv: Set the pstate of the last hotplugged out cpu in policy->cpus to minimum · b120339c

Committed by Preeti U Murthy on Sep 29, 2014

It's possible today that the pstate of a core is held at a high value even
after the entire core is hotplugged out, if a load had just run on the
hotplugged cpu. This is fair, since it is assumed that the pstate does not
matter to a cpu in a deep idle state, which is the expected state of a
hotplugged core on powerpc. However on powerpc, the pstate at a socket level
is held at the maximum of the pstates of each core. Even if the pstates of
the active cores on that socket are low, the socket pstate is held high due
to the pstate of the hotplugged core in the above mentioned scenario. This
can waste a significant amount of power for no benefit.

Besides, since it is a non-active core, nothing can be done from the kernel's
end to set the frequency of the core right. Hence make use of the stop_cpu
callback to explicitly set the pstate of the core to a minimum when the last
cpu of the core gets hotplugged out.

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

b120339c
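
The effect described in b120339c follows from the socket pstate being the maximum over its cores. A small sketch, under the assumption that a larger number means a higher-performance pstate (the real pstate id encoding may differ) and with hypothetical function names:

```c
#include <stddef.h>

/* Socket-level pstate: the maximum over all cores on the socket.
 * ASSUMPTION: larger value = higher performance. Requires ncores >= 1. */
int socket_pstate(const int *core_pstate, size_t ncores)
{
    int max = core_pstate[0];

    for (size_t i = 1; i < ncores; i++)
        if (core_pstate[i] > max)
            max = core_pstate[i];
    return max;
}

/* Stand-in for the stop_cpu callback: drop the departing core to the
 * minimum pstate so it no longer holds the socket high. */
void core_stop(int *core_pstate, size_t core, int pstate_min)
{
    core_pstate[core] = pstate_min;
}
```

For example, with core pstates {10, 2, 3} and core 0 hotplugged out, the socket stays at 10 until core_stop() lowers core 0 to the minimum, after which the socket follows the active cores and drops to 3.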

openanolis / cloud-kernel · synced successfully about 1 year ago