- 31 3月, 2018 2 次提交
-
-
由 Nicholas Piggin 提交于
When waking from a CPU idle instruction (e.g., nap or stop), the sync for ordering the KVM secondary thread state can be avoided if there wakeup is coming from a kernel context rather than KVM context. This improves performance for ping-pong benchmark with the stop0 idle state by 0.46% for 2 threads in the same core, and 1.02% for different cores. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
Implement a new function to invoke stop, power9_offline_stop, which is like power9_idle_stop but used by the cpu hotplug code. Move KVM secondary state manipulation code to the offline case. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Reviewed-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 23 3月, 2018 1 次提交
-
-
由 Paul Mackerras 提交于
POWER9 processors up to and including "Nimbus" v2.2 have hardware bugs relating to transactional memory and thread reconfiguration. One of these bugs has a workaround which is to get the core into SMT4 state temporarily. This workaround is only needed when running bare-metal. This patch provides a function which gets the core into SMT4 mode by preventing threads from going to a stop state, and waking up those which are already in a stop state. Once at least 3 threads are not in a stop state, the core will be in SMT4 and we can continue. To do this, we add a "dont_stop" flag to the paca to tell the thread not to go into a stop state. If this flag is set, power9_idle_stop() just returns immediately with a return value of 0. The pnv_power9_force_smt4_catch() function does the following: 1. Set the dont_stop flag for each thread in the core, except ourselves (in fact we use an atomic_inc() in case more than one thread is calling this function concurrently). 2. See how many threads are awake, indicated by their requested_psscr field in the paca being 0. If this is at least 3, skip to step 5. 3. Send a doorbell interrupt to each thread that was seen as being in a stop state in step 2. 4. Until at least 3 threads are awake, scan the threads to which we sent a doorbell interrupt and check if they are awake now. This relies on the following properties: - Once dont_stop is non-zero, requested_psccr can't go from zero to non-zero, except transiently (and without the thread doing stop). - requested_psscr being zero guarantees that the thread isn't in a state-losing stop state where thread reconfiguration could occur. - Doing stop with a PSSCR value of 0 won't be a state-losing stop and thus won't allow thread reconfiguration. - Once threads_per_core/2 + 1 (i.e. 3) threads are awake, the core must be in SMT4 mode, since SMT modes are powers of 2. This does add a sync to power9_idle_stop(), which is necessary to provide the correct ordering between setting requested_psscr and checking dont_stop. The overhead of the sync should be unnoticeable compared to the latency of going into and out of a stop state. Because some objected to incurring this extra latency on systems where the XER[SO] bug is not relevant, I have put the test in power9_idle_stop inside a feature section. This means that pnv_power9_force_smt4_catch() WILL NOT WORK correctly on systems without the CPU_FTR_P9_TM_XER_SO_BUG feature bit set, and will probably hang the system. In order to cater for uses where the caller has an operation that has to be done while the core is in SMT4, the core continues to be kept in SMT4 after pnv_power9_force_smt4_catch() function returns, until the pnv_power9_force_smt4_release() function is called. It undoes the effect of step 1 above and allows the other threads to go into a stop state. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 15 11月, 2017 1 次提交
-
-
由 Michael Ellerman 提交于
Recently we added a CPU feature for Power9 DD2.0, to capture the fact that some workarounds are required only on Power9 DD1 and DD2.0 but not DD2.1 or later. Then in commit 9d2f510a ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 ERAT workaround on DD2.1") and commit e3646330 "powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on DD2.1") we changed CPU_FTR_SECTIONs to check for DD1 or DD20, eg: BEGIN_FTR_SECTION PPC_INVALIDATE_ERAT END_FTR_SECTION_IFSET(CPU_FTR_POWER9_DD1 | CPU_FTR_POWER9_DD20) Unfortunately although this reads as "if set DD1 or DD2.0", the or is a bitwise or and actually generates a mask of both bits. The code that does the feature patching then checks that the value of the CPU features masked with that mask are equal to the mask. So the end result is we're checking for DD1 and DD20 being set, which never happens. Yes the API is terrible. Removing the ERAT workaround on DD2.0 results in random SEGVs, the system tends to boot, but things randomly die including sometimes dhclient, udev etc. To fix the problem and hopefully avoid it in future, we remove the DD2.0 CPU feature and instead add a DD2.1 (or later) feature. This allows us to easily express that the workarounds are required if DD2.1 is not set. At some point we will drop the DD1 workarounds entirely and some of this can be cleaned up. Fixes: 9d2f510a ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 ERAT workaround on DD2.1") Fixes: e3646330 ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on DD2.1") Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 06 11月, 2017 2 次提交
-
-
由 Nicholas Piggin 提交于
DD2.1 does not have to save MMCR0 for all state-loss idle states, only after deep idle states (like other PMU registers). Reviewed-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
DD2.1 does not have to flush the ERAT after a state-loss idle. Performance testing was done on a DD2.1 using only the stop0 idle state (the shallowest state which supports state loss), using context_switch selftest configured to ping-poing between two threads on the same core and two different cores. Performance improvement for same core is 7.0%, different cores is 14.8%. Reviewed-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 19 10月, 2017 1 次提交
-
-
由 Paul Mackerras 提交于
This reverts commit 94a04bc2. In order to run HPT guests on a radix POWER9 host, we will have to run the host in single-threaded mode, because POWER9 processors do not currently support running some threads of a core in HPT mode while others are in radix mode ("mixed mode"). That means that we will need the same mechanisms that are used on POWER8 to make the secondary threads available to KVM, which were disabled on POWER9 by commit 94a04bc2. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 29 8月, 2017 5 次提交
-
-
由 Nicholas Piggin 提交于
The hardware can execute stop in any context, and KVM does not require real mode because siblings do not share MMU state. This saves a switch to real-mode when going idle. Acked-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Reviewed-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
There are no longer any callers of IDLE_STATE_ENTER_SEQ, all callers use IDLE_STATE_ENTER_SEQ_NORET. So drop the former. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> [mpe: Split out of larger patch, write change log] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
We don't need to use IDLE_STATE_ENTER_SEQ_NORET on Power9. Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> [mpe: Split out of larger patch] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
This macro is only used in idle_book3s.S, move it in there and add a more descriptive comment. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> [mpe: Split out of larger patch and write change log] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
POWER9 CPUs have independent MMU contexts per thread, so KVM does not need to quiesce secondary threads, so the hwthread_req/hwthread_state protocol does not have to be used. So patch it away on POWER9, and patch away the branch from the Linux idle wakeup to kvm_start_guest that is never used. Add a warning and error out of kvmppc_grab_hwthread in case it is ever called on POWER9. This avoids a hwsync in the idle wakeup path on POWER9. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Acked-by: NPaul Mackerras <paulus@ozlabs.org> [mpe: Use WARN(...) instead of WARN_ON()/pr_err(...)] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 04 8月, 2017 1 次提交
-
-
由 Nicholas Piggin 提交于
POWER9 DD2 PMU can stop after a state-loss idle in some conditions. A solution is to set then clear MMCRA[60] after wake from state-loss idle. MMCRA[60] is a non-architected bit, see the user manual for details. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Acked-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com> Reviewed-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 01 8月, 2017 1 次提交
-
-
由 Gautham R. Shenoy 提交于
The stop4 idle state on POWER9 is a deep idle state which loses hypervisor resources, but whose latency is low enough that it can be exposed via cpuidle. Until now, the deep idle states which lose hypervisor resources (eg: winkle) were only exposed via CPU-Hotplug. Hence currently on wakeup from such states, barring a few SPRs which need to be restored to their older value, rest of the SPRS are reinitialized to their values corresponding to that at boot time. When stop4 is used in the context of cpuidle, we want these additional SPRs to be restored to their older value, to ensure that the context on the CPU coming back from idle is same as it was before going idle. In this patch, we define a SPR save area in PACA (since we have used up the volatile register space in the stack) and on POWER9, we restore SPRN_PID, SPRN_LDBAR, SPRN_FSCR, SPRN_HFSCR, SPRN_MMCRA, SPRN_MMCR1, SPRN_MMCR2 to the values they had before entering stop. Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 18 7月, 2017 1 次提交
-
-
由 Nicholas Piggin 提交于
POWER9 DD2 can see spurious PMU interrupts after state-loss idle in some conditions. A solution is to save and reload MMCR0 over state-loss idle. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Acked-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com> Tested-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 28 6月, 2017 1 次提交
-
-
由 Akshay Adiga 提交于
pnv_wakeup_noloss() expects r12 to contain SRR1 value to determine if the wakeup reason is an HMI in CHECK_HMI_INTERRUPT. When we wakeup with ESL=0, SRR1 will not contain the wakeup reason, so there is no point setting r12 to SRR1. However, we don't set r12 at all so r12 contains garbage (likely a kernel pointer), and is still used to check HMI assuming that it contained SRR1. This causes the OPAL msglog to be filled with the following print: HMI: Received HMI interrupt: HMER = 0x0040000000000000 This patch clears r12 after waking up from stop with ESL=EC=0, so that we don't accidentally enter the HMI handler in pnv_wakeup_noloss() if the value of r12[42:45] corresponds to HMI as wakeup reason. Prior to commit 9d292501 ("powerpc/64s/idle: Avoid SRR usage in idle sleep/wake paths") this bug existed, in that we would incorrectly look at SRR1 to check for a HMI when SRR1 didn't contain a wakeup reason. However the SRR1 value would just happen to never have bits 42:45 set. Fixes: 9d292501 ("powerpc/64s/idle: Avoid SRR usage in idle sleep/wake paths") Signed-off-by: NAkshay Adiga <akshay.adiga@linux.vnet.ibm.com> Reviewed-by: NNicholas Piggin <npiggin@gmail.com> [mpe: Change log and comment massaging] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 27 6月, 2017 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
On POWER9 the ERAT may be incorrect on wakeup from some stop states that lose state. This causes random segvs and illegal instructions when these stop states are enabled. This patch invalidates the ERAT on wakeup on POWER9 to prevent this from causing a problem. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: NNicholas Piggin <npiggin@gmail.com> [mpe: Merge comment change with upstream changes] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 19 6月, 2017 3 次提交
-
-
由 Nicholas Piggin 提交于
In a busy system, idle wakeups can be expected from IPIs and device interrupts. Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
Idle code now always runs at the 0xc... effective address whether in real or virtual mode. This means rfid can be ditched, along with a lot of SRR manipulations. In the wakeup path, carry SRR1 around in r12. Use mtmsrd to change MSR states as required. This also balances the return prediction for the idle call, by doing blr rather than rfid to return to the idle caller. On POWER9, 2-process context switch on different cores, with snooze disabled, increases performance by 2%. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> [mpe: Incorporate v2 fixes from Nick] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
This simplifies the asm and fixes irq-off tracing over sleep instructions. Also move powersave_nap check for POWER8 into C code, and move PSSCR register value calculation for POWER9 into C. Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 30 5月, 2017 3 次提交
-
-
由 Gautham R. Shenoy 提交于
On Power9 DD1 due to a hardware bug the Power-Saving Level Status field (PLS) of the PSSCR for a thread waking up from a deep state can under-report if some other thread in the core is in a shallow stop state. The scenario in which this can manifest is as follows: 1) All the threads of the core are in deep stop. 2) One of the threads is woken up. The PLS for this thread will correctly reflect that it is waking up from deep stop. 3) The thread that has woken up now executes a shallow stop. 4) When some other thread in the core is woken, its PLS will reflect the shallow stop state. Thus, the subsequent thread for which the PLS is under-reporting the wakeup state will not restore the hypervisor resources. Hence, on DD1 systems, use the Requested Level (RL) field as a workaround to restore the contents of the hypervisor resources on the wakeup from the stop state. Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Gautham R. Shenoy 提交于
On wakeup from a deep stop state which is supposed to lose the hypervisor state, we don't restore the LPCR to the old value but set it to a "sane" value via cur_cpu_spec->cpu_restore(). The problem is that the "sane" value doesn't include UPRT and the HR bits which are required to run correctly in Radix mode. Fix this on POWER9 onwards by restoring the LPCR value whatever it was before executing the stop instruction. Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Gautham R. Shenoy 提交于
On POWER8, in case of - nap: both timebase and hypervisor state is retained. - fast-sleep: timebase is lost. But the hypervisor state is retained. - winkle: timebase and hypervisor state is lost. Hence, the current code for handling exit from a idle state assumes that if the timebase value is retained, then so is the hypervisor state. Thus, the current code doesn't restore per-core hypervisor state in such cases. But that is no longer the case on POWER9 where we do have stop states in which timebase value is retained, but the hypervisor state is lost. So we have to ensure that the per-core hypervisor state gets restored in such cases. Fix this by ensuring that even in the case when timebase is retained, we explicitly check if we are waking up from a deep stop that loses per-core hypervisor state (indicated by cr4 being eq or gt), and if this is the case, we restore the per-core hypervisor state. Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 16 5月, 2017 1 次提交
-
-
由 Gautham R. Shenoy 提交于
Commit 17ed4c8f ("powerpc/powernv: Recover correct PACA on wakeup from a stop on P9 DD1") promises to set the NAPSTATELOST bit in paca after recovering the correct paca for the thread waking up from stop1 on DD1, so that the GPRs can be correctly restored on the stop exit path. However, it loads the value 1 into r3, but stores the value in r0 into NAPSTATELOST(r13). Fix this by correctly set the NAPSTATELOST bit in paca after recovering the paca on POWER9 DD1. Fixes: 17ed4c8f ("powerpc/powernv: Recover correct PACA on wakeup from a stop on P9 DD1") Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 23 4月, 2017 8 次提交
-
-
由 Nicholas Piggin 提交于
The idle workaround does not need to load PACATOC, and it does not need to be called within a nested function that requires LR to be saved. Load the PACATOC at entry to the idle wakeup. It does not matter which PACA this comes from, so it's okay to call before the workaround. Then apply the workaround to get the right PACA. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
If not all threads were in winkle, full state loss recovery is not necessary and can be avoided. A previous patch removed this optimisation due to some complexity with the implementation. Re-implement it by counting the number of threads in winkle with the per-core idle state. Only restore full state loss if all threads were in winkle. This has a small window of false positives right before threads execute winkle and just after they wake up, when the winkle count does not reflect the true number of threads in winkle. This is not a significant problem in comparison with even the minimum winkle duration. For correctness, a false positive is not a problem (only false negatives would be). Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
When taking the core idle state lock, grab it immediately like a regular lock, rather than adding more tests in there. Holding the lock keeps it stable, so there is no need to do it whole holding the reservation. Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
In preparation for adding more bits to the core idle state word, move the lock bit up, and unlock by flipping the lock bit rather than masking off all but the thread bits. Add branch hints for atomic operations while we're here. Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
The ISA specifies power save wakeup due to a machine check exception can cause a machine check interrupt (rather than the usual system reset interrupt). The machine check handler copes with this by doing low level machine check recovery without restoring full state from idle, then queues up a machine check event for logging, then directly executes the same idle instruction it woke from. This minimises the work done before recovery is performed. The problem is that it requires machine specific instructions and knowledge of the book3s idle code. Currently it only has code to handle POWER8 idle, so POWER9 crashes when trying to execute the P8 idle instructions which don't exist in ISAv3.0B. cpu 0x0: Vector: e40 (Emulation Assist) at [c0000000008f3810] pc: c000000000008380: machine_check_handle_early+0x130/0x2f0 lr: c00000000053a098: stop_loop+0x68/0xd0 sp: c0000000008f3a90 msr: 9000000000081001 current = 0xc0000000008a1080 paca = 0xc00000000ffd0000 softe: 0 irq_happened: 0x01 pid = 0, comm = swapper/0 Instead of going to sleep after recovery, do the usual idle wakeup and state restoration by calling into the normal idle wakeup path. This reuses the normal idle wakeup paths. Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: NMahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
This reduces the number of nops for POWER8. Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
The POWER8 idle code has a neat trick of programming the power on engine to restore a low bit into HSPRG0, so idle wakeup code can test and see if it has been programmed this way and therefore lost all state. Restore time can be reduced if winkle has not been reached. However this messes with our r13 PACA pointer, and requires HSPRG0 to be written to. It also optimizes the slowest and most uncommon case at the expense of another SPR write in the common nap state wakeup. Remove this complexity and assume winkle sleeps always require a state restore. This speedup could be made entirely contained within the winkle idle code by counting per-core winkles and setting a thread bitmap when all have gone to winkle. Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
No functional change. Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 11 4月, 2017 1 次提交
-
-
由 Gautham R. Shenoy 提交于
POWER9 DD1.0 hardware has a bug where the SPRs of a thread waking up from stop 0,1,2 with ESL=1 can endup being misplaced in the core. Thus the HSPRG0 of a thread waking up from can contain the paca pointer of its sibling. This patch implements a context recovery framework within threads of a core, by provisioning space in paca_struct for saving every sibling threads's paca pointers. Basically, we should be able to arrive at the right paca pointer from any of the thread's existing paca pointer. At bootup, during powernv idle-init, we save the paca address of every CPU in each one its siblings paca_struct in the slot corresponding to this CPU's index in the core. On wakeup from a stop, the thread will determine its index in the core from the TIR register and recover its PACA pointer by indexing into the correct slot in the provisioned space in the current PACA. Furthermore, ensure that the NVGPRs are restored from the stack on the way out by setting the NAPSTATELOST in paca. [Changelog written with inputs from svaidy@linux.vnet.ibm.com] Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: NNicholas Piggin <npiggin@gmail.com> [mpe: Call it a bug] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 20 3月, 2017 1 次提交
-
-
由 Nicholas Piggin 提交于
We concluded there may be a window where the idle wakeup code could get to pnv_wakeup_tb_loss() (which clobbers non-volatile GPRs), but the hardware may set SRR1[46:47] to 01b (no state loss) which would result in the wakeup code failing to restore non-volatile GPRs. I was not able to trigger this condition with trivial tests on real hardware or simulator, but the ISA (at least 2.07) seems to allow for it, and Gautham says that it can happen if there is an exception pending when the sleep/winkle instruction is executed. Fixes: 17065671 ("powerpc/kvm: make hypervisor state restore a function") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Acked-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 03 3月, 2017 1 次提交
-
-
由 Gautham R. Shenoy 提交于
Commit 09206b60 ("powernv: Pass PSSCR value and mask to power9_idle_stop") added additional code in power_enter_stop() to distinguish between stop requests whose PSSCR had ESL=EC=1 from those which did not. When ESL=EC=1, we do a forward-jump to a location labelled by "1", which had the code to handle the ESL=EC=1 case. Unfortunately just a couple of instructions before this label, is the macro IDLE_STATE_ENTER_SEQ() which also has a label "1" in its expansion. As a result, the current code can result in directly executing stop instruction for deep stop requests with PSSCR ESL=EC=1, without saving the hypervisor state. Fix this BUG by labeling the location that handles ESL=EC=1 case with a more descriptive label ".Lhandle_esl_ec_set" (local label suggestion a la .Lxx from Anton Blanchard). While at it, rename the label "2" labelling the location of the code handling entry into deep stop states with ".Lhandle_deep_stop". For a good measure, change the label in IDLE_STATE_ENTER_SEQ() macro to an not-so commonly used value in order to avoid similar mishaps in the future. Fixes: 09206b60 ("powernv: Pass PSSCR value and mask to power9_idle_stop") Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 07 2月, 2017 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
All entry points already read the MSR so they can easily do the right thing. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 31 1月, 2017 2 次提交
-
-
由 Gautham R. Shenoy 提交于
The power9_idle_stop method currently takes only the requested stop level as a parameter and picks up the rest of the PSSCR bits from a hand-coded macro. This is not a very flexible design, especially when the firmware has the capability to communicate the psscr value and the mask associated with a particular stop state via device tree. This patch modifies the power9_idle_stop API to take as parameters the PSSCR value and the PSSCR mask corresponding to the stop state that needs to be set. These PSSCR value and mask are respectively obtained by parsing the "ibm,cpu-idle-state-psscr" and "ibm,cpu-idle-state-psscr-mask" fields from the device tree. In addition to this, the patch adds support for handling stop states for which ESL and EC bits in the PSSCR are zero. As per the architecture, a wakeup from these stop states resumes execution from the subsequent instruction as opposed to waking up at the System Vector. The older firmware sets only the Requested Level (RL) field in the psscr and psscr-mask exposed in the device tree. For older firmware where psscr-mask=0xf, this patch will set the default sane values that the set for for remaining PSSCR fields (i.e PSLL, MTL, ESL, EC, and TR). For the new firmware, the patch will validate that the invariants required by the ISA for the psscr values are maintained by the firmware. This skiboot patch that exports fully populated PSSCR values and the mask for all the stop states can be found here: https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html [Optimize the number of instructions before entering STOP with ESL=EC=0, validate the PSSCR values provided by the firimware maintains the invariants required as per the ISA suggested by Balbir Singh] Acked-by: NBalbir Singh <bsingharora@gmail.com> Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Gautham R. Shenoy 提交于
Currently all the low-power idle states are expected to wake up at reset vector 0x100. Which is why the macro IDLE_STATE_ENTER_SEQ that puts the CPU to an idle state and never returns. On ISA v3.0, when the ESL and EC bits in the PSSCR are zero, the CPU is expected to wake up at the next instruction of the idle instruction. This patch adds a new macro named IDLE_STATE_ENTER_SEQ_NORET for the no-return variant and reuses the name IDLE_STATE_ENTER_SEQ for a variant that allows resuming operation at the instruction next to the idle-instruction. Acked-by: NBalbir Singh <bsingharora@gmail.com> Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 24 10月, 2016 2 次提交
-
-
由 Paul Mackerras 提交于
This fixes a race condition where one thread that is entering or leaving a power-saving state can inadvertently ignore the lock bit that was set by another thread, and potentially also clear it. The core_idle_lock_held function is called when the lock bit is seen to be set. It polls the lock bit until it is clear, then does a lwarx to load the word containing the lock bit and thread idle bits so it can be updated. However, it is possible that the value loaded with the lwarx has the lock bit set, even though an immediately preceding lwz loaded a value with the lock bit clear. If this happens then we go ahead and update the word despite the lock bit being set, and when called from pnv_enter_arch207_idle_mode, we will subsequently clear the lock bit. No identifiable misbehaviour has been attributed to this race. This fixes it by checking the lock bit in the value loaded by the lwarx. If it is set then we just go back and keep on polling. Fixes: b32aadc1 ("powerpc/powernv: Fix race in updating core_idle_state") Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Paul Mackerras 提交于
Commit 8117ac6a ("powerpc/powernv: Switch off MMU before entering nap/sleep/rvwinkle mode", 2014-12-10) fixed a race condition where one thread entering a KVM guest could switch the MMU context to the guest while another thread was still in host kernel context with the MMU on. That commit moved the point where a thread entering a power-saving mode set its kvm_hstate.hwthread_state field in its PACA to KVM_HWTHREAD_IN_IDLE from a point where the MMU was on to after the MMU had been switched off. That commit also added a comment explaining that we have to switch to real mode before setting hwthread_state to avoid this race. Nevertheless, commit 4eae2c9a ("powerpc/powernv: Make pnv_powersave_common more generic", 2016-07-08) subsequently moved the setting of hwthread_state back to a point where the MMU is on, thus reintroducing the race, despite the comment saying that this should not be done being included in full in the context lines of the patch that did it. This fixes the race again and adds a bigger and shoutier comment explaining the potential race condition. Fixes: 4eae2c9a ("powerpc/powernv: Make pnv_powersave_common more generic") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Reviewed-by: NShreyas B. Prabhu <shreyasbp@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-