提交 · a67cc594dffd29cfe33fbee40932c9d04197ab2f · openanolis / cloud-kernel

05 4月, 2018 1 次提交

Revert "powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead" · a67cc594

由 Michael Ellerman 提交于 4月 05, 2018

As described in that commit:

When stop is executed with EC=ESL=0, it appears to execute like a
normal instruction (resuming from NIP when woken by interrupt). So
all the save/restore handling can be avoided completely.

This is true, except in the case of an NMI interrupt (sreset or
machine check) interrupting the instruction. In that case, the NMI
gets an "interrupt occurred while the processor was in power-saving
mode" indication. The power-save wakeup code uses that bit to decide
whether to restore some registers (e.g., LR). Because these are no
longer saved, this causes random register corruption.

It may be possible to restore this optimisation by detecting the case
of no register loss on the wakeup side, and avoid restoring in that
case, but that's not a minor fix because the wakeup code itself uses
some registers that would be live (e.g., LR).

Fixes: b9ee31e1 ("powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead")
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a67cc594

04 4月, 2018 2 次提交

powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead · b9ee31e1

由 Nicholas Piggin 提交于 4月 01, 2018

When stop is executed with EC=ESL=0, it appears to execute like a
normal instruction (resuming from NIP when woken by interrupt). So all
the save/restore handling can be avoided completely. In particular NV
GPRs do not have to be saved, and MSR does not have to be switched
back to kernel MSR.

So move the test for EC=ESL=0 sleep states out to power9_idle_stop,
and return directly to the caller after stop in that case.

This improves performance for ping-pong benchmark with the stop0_lite
idle state by 2.54% for 2 threads in the same core, and 2.57% for
different cores. Performance increase with HV_POSSIBLE defined will be
improved further by avoiding the hwsync.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b9ee31e1

powerpc/64s/idle: Consolidate power9_offline_stop()/power9_idle_stop() · d0b791c0

由 Michael Ellerman 提交于 4月 04, 2018

Commit 3d4fbffd ("powerpc/64s/idle: POWER9 implement a separate
idle stop function for hotplug") that added power9_offline_stop() was
written before commit 7672691a ("powerpc/powernv: Provide a way to
force a core into SMT4 mode").

When merging the former I failed to notice that it caused us to skip
the force-SMT4 logic for offline CPUs. The result is that offlined
CPUs will not correctly participate in the force-SMT4 logic, which
presumably will result in badness (not tested).

Reconcile the two commits by making power9_offline_stop() a pre-cursor
to power9_idle_stop(), so that they share the force-SMT4 logic.

This is based on an original commit from Nick, all breakage is my own.

Fixes: 3d4fbffd ("powerpc/64s/idle: POWER9 implement a separate idle stop function for hotplug")
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>

d0b791c0

03 4月, 2018 1 次提交

powerpc/powernv: Fix SMT4 forcing idle code · a2b5e056

由 Nicholas Piggin 提交于 4月 01, 2018

The PSSCR value is not stored to PACA_REQ_PSSCR if the CPU does not
have the XER[SO] bug.

Fix this by storing up-front, outside the workaround code. The initial
test is not required because it is a slow path.

The workaround is made to depend on CONFIG_KVM_BOOK3S_HV_POSSIBLE, to
match pnv_power9_force_smt4_catch() where it is used. Drop the comment
on pnv_power9_force_smt4_catch() as it's no longer true.

Fixes: 7672691a ("powerpc/powernv: Provide a way to force a core into SMT4 mode")
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a2b5e056

31 3月, 2018 2 次提交

powerpc/64s/idle: avoid sync for KVM state when waking from idle · 8c1c7fb0

由 Nicholas Piggin 提交于 11月 18, 2017

When waking from a CPU idle instruction (e.g., nap or stop), the sync
for ordering the KVM secondary thread state can be avoided if there
wakeup is coming from a kernel context rather than KVM context.

This improves performance for ping-pong benchmark with the stop0 idle
state by 0.46% for 2 threads in the same core, and 1.02% for different
cores.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

8c1c7fb0

powerpc/64s/idle: POWER9 implement a separate idle stop function for hotplug · 3d4fbffd

由 Nicholas Piggin 提交于 11月 18, 2017

Implement a new function to invoke stop, power9_offline_stop, which is
like power9_idle_stop but used by the cpu hotplug code.

Move KVM secondary state manipulation code to the offline case.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Reviewed-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

3d4fbffd

23 3月, 2018 1 次提交

powerpc/powernv: Provide a way to force a core into SMT4 mode · 7672691a

由 Paul Mackerras 提交于 3月 21, 2018

POWER9 processors up to and including "Nimbus" v2.2 have hardware
bugs relating to transactional memory and thread reconfiguration.
One of these bugs has a workaround which is to get the core into
SMT4 state temporarily.  This workaround is only needed when
running bare-metal.

This patch provides a function which gets the core into SMT4 mode
by preventing threads from going to a stop state, and waking up
those which are already in a stop state.  Once at least 3 threads
are not in a stop state, the core will be in SMT4 and we can
continue.

To do this, we add a "dont_stop" flag to the paca to tell the
thread not to go into a stop state.  If this flag is set,
power9_idle_stop() just returns immediately with a return value
of 0.  The pnv_power9_force_smt4_catch() function does the following:

1. Set the dont_stop flag for each thread in the core, except
   ourselves (in fact we use an atomic_inc() in case more than
   one thread is calling this function concurrently).
2. See how many threads are awake, indicated by their
   requested_psscr field in the paca being 0.  If this is at
   least 3, skip to step 5.
3. Send a doorbell interrupt to each thread that was seen as
   being in a stop state in step 2.
4. Until at least 3 threads are awake, scan the threads to which
   we sent a doorbell interrupt and check if they are awake now.

This relies on the following properties:

- Once dont_stop is non-zero, requested_psccr can't go from zero to
  non-zero, except transiently (and without the thread doing stop).
- requested_psscr being zero guarantees that the thread isn't in
  a state-losing stop state where thread reconfiguration could occur.
- Doing stop with a PSSCR value of 0 won't be a state-losing stop
  and thus won't allow thread reconfiguration.
- Once threads_per_core/2 + 1 (i.e. 3) threads are awake, the core
  must be in SMT4 mode, since SMT modes are powers of 2.

This does add a sync to power9_idle_stop(), which is necessary to
provide the correct ordering between setting requested_psscr and
checking dont_stop.  The overhead of the sync should be unnoticeable
compared to the latency of going into and out of a stop state.

Because some objected to incurring this extra latency on systems where
the XER[SO] bug is not relevant, I have put the test in
power9_idle_stop inside a feature section.  This means that
pnv_power9_force_smt4_catch() WILL NOT WORK correctly on systems
without the CPU_FTR_P9_TM_XER_SO_BUG feature bit set, and will
probably hang the system.

In order to cater for uses where the caller has an operation that
has to be done while the core is in SMT4, the core continues to be
kept in SMT4 after pnv_power9_force_smt4_catch() function returns,
until the pnv_power9_force_smt4_release() function is called.
It undoes the effect of step 1 above and allows the other threads
to go into a stop state.
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7672691a

15 11月, 2017 1 次提交

powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 feature · 3ffa9d9e

由 Michael Ellerman 提交于 11月 15, 2017

Recently we added a CPU feature for Power9 DD2.0, to capture the fact
that some workarounds are required only on Power9 DD1 and DD2.0 but
not DD2.1 or later.

Then in commit 9d2f510a ("powerpc/64s/idle: avoid POWER9 DD1 and
DD2.0 ERAT workaround on DD2.1") and commit e3646330
"powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on
DD2.1") we changed CPU_FTR_SECTIONs to check for DD1 or DD20, eg:

  BEGIN_FTR_SECTION
          PPC_INVALIDATE_ERAT
  END_FTR_SECTION_IFSET(CPU_FTR_POWER9_DD1 | CPU_FTR_POWER9_DD20)

Unfortunately although this reads as "if set DD1 or DD2.0", the or is
a bitwise or and actually generates a mask of both bits. The code that
does the feature patching then checks that the value of the CPU
features masked with that mask are equal to the mask.

So the end result is we're checking for DD1 and DD20 being set, which
never happens. Yes the API is terrible.

Removing the ERAT workaround on DD2.0 results in random SEGVs, the
system tends to boot, but things randomly die including sometimes
dhclient, udev etc.

To fix the problem and hopefully avoid it in future, we remove the
DD2.0 CPU feature and instead add a DD2.1 (or later) feature. This
allows us to easily express that the workarounds are required if DD2.1
is not set.

At some point we will drop the DD1 workarounds entirely and some of
this can be cleaned up.

Fixes: 9d2f510a ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 ERAT workaround on DD2.1")
Fixes: e3646330 ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on DD2.1")
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

3ffa9d9e

06 11月, 2017 2 次提交

powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on DD2.1 · e3646330

由 Nicholas Piggin 提交于 11月 03, 2017

DD2.1 does not have to save MMCR0 for all state-loss idle states,
only after deep idle states (like other PMU registers).
Reviewed-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

e3646330

powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 ERAT workaround on DD2.1 · 9d2f510a

由 Nicholas Piggin 提交于 11月 03, 2017

DD2.1 does not have to flush the ERAT after a state-loss idle.

Performance testing was done on a DD2.1 using only the stop0 idle state
(the shallowest state which supports state loss), using context_switch
selftest configured to ping-poing between two threads on the same core
and two different cores.

Performance improvement for same core is 7.0%, different cores is 14.8%.
Reviewed-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

9d2f510a

19 10月, 2017 1 次提交

Revert "KVM: PPC: Book3S HV: POWER9 does not require secondary thread management" · 31a4d448

由 Paul Mackerras 提交于 10月 19, 2017

This reverts commit 94a04bc2.

In order to run HPT guests on a radix POWER9 host, we will have to run
the host in single-threaded mode, because POWER9 processors do not
currently support running some threads of a core in HPT mode while
others are in radix mode ("mixed mode").

That means that we will need the same mechanisms that are used on
POWER8 to make the secondary threads available to KVM, which were
disabled on POWER9 by commit 94a04bc2.
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

31a4d448

29 8月, 2017 5 次提交

powerpc/64s: idle POWER9 can execute stop in virtual mode · 72b0d51d

由 Nicholas Piggin 提交于 8月 25, 2017

The hardware can execute stop in any context, and KVM does not
require real mode because siblings do not share MMU state. This
saves a switch to real-mode when going idle.
Acked-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Reviewed-by: NPaul Mackerras <paulus@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

72b0d51d

powerpc/64s: Drop no longer used IDLE_STATE_ENTER_SEQ · 65dbbe81

由 Nicholas Piggin 提交于 8月 29, 2017

There are no longer any callers of IDLE_STATE_ENTER_SEQ, all callers
use IDLE_STATE_ENTER_SEQ_NORET. So drop the former.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch, write change log]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

65dbbe81

powerpc/64s: POWER9 can execute stop without a sync sequence · 56ee5240

由 Nicholas Piggin 提交于 8月 29, 2017

We don't need to use IDLE_STATE_ENTER_SEQ_NORET on Power9.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

56ee5240

powerpc/64s: Move IDLE_STATE_ENTER_SEQ[_NORET] into idle_book3s.S · aafc8a83

由 Nicholas Piggin 提交于 8月 29, 2017

This macro is only used in idle_book3s.S, move it in there and add a
more descriptive comment.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch and write change log]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

aafc8a83

KVM: PPC: Book3S HV: POWER9 does not require secondary thread management · 94a04bc2

由 Nicholas Piggin 提交于 8月 25, 2017

POWER9 CPUs have independent MMU contexts per thread, so KVM does not
need to quiesce secondary threads, so the hwthread_req/hwthread_state
protocol does not have to be used. So patch it away on POWER9, and patch
away the branch from the Linux idle wakeup to kvm_start_guest that is
never used.

Add a warning and error out of kvmppc_grab_hwthread in case it is ever
called on POWER9.

This avoids a hwsync in the idle wakeup path on POWER9.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Acked-by: NPaul Mackerras <paulus@ozlabs.org>
[mpe: Use WARN(...) instead of WARN_ON()/pr_err(...)]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

94a04bc2

04 8月, 2017 1 次提交

powerpc/perf: POWER9 PMU stops after idle workaround · 09539f9b

由 Nicholas Piggin 提交于 7月 20, 2017

POWER9 DD2 PMU can stop after a state-loss idle in some conditions.

A solution is to set then clear MMCRA[60] after wake from state-loss
idle. MMCRA[60] is a non-architected bit, see the user manual for
details.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Acked-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
Reviewed-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

09539f9b

01 8月, 2017 1 次提交

powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle · e1c1cfed

由 Gautham R. Shenoy 提交于 7月 21, 2017

The stop4 idle state on POWER9 is a deep idle state which loses
hypervisor resources, but whose latency is low enough that it can be
exposed via cpuidle.

Until now, the deep idle states which lose hypervisor resources (eg:
winkle) were only exposed via CPU-Hotplug.  Hence currently on wakeup
from such states, barring a few SPRs which need to be restored to
their older value, rest of the SPRS are reinitialized to their values
corresponding to that at boot time.

When stop4 is used in the context of cpuidle, we want these additional
SPRs to be restored to their older value, to ensure that the context
on the CPU coming back from idle is same as it was before going idle.

In this patch, we define a SPR save area in PACA (since we have used
up the volatile register space in the stack) and on POWER9, we restore
SPRN_PID, SPRN_LDBAR, SPRN_FSCR, SPRN_HFSCR, SPRN_MMCRA, SPRN_MMCR1,
SPRN_MMCR2 to the values they had before entering stop.
Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

e1c1cfed

18 7月, 2017 1 次提交

powerpc/perf: Avoid spurious PMU interrupts after idle · 101dd590

由 Nicholas Piggin 提交于 7月 10, 2017

POWER9 DD2 can see spurious PMU interrupts after state-loss idle in
some conditions.

A solution is to save and reload MMCR0 over state-loss idle.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Acked-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
Tested-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

101dd590

28 6月, 2017 1 次提交

powerpc/powernv/idle: Clear r12 on wakeup from stop lite · 4d0d7c02

由 Akshay Adiga 提交于 6月 28, 2017

pnv_wakeup_noloss() expects r12 to contain SRR1 value to determine if the wakeup
reason is an HMI in CHECK_HMI_INTERRUPT.

When we wakeup with ESL=0, SRR1 will not contain the wakeup reason, so there is
no point setting r12 to SRR1.

However, we don't set r12 at all so r12 contains garbage (likely a kernel
pointer), and is still used to check HMI assuming that it contained SRR1. This
causes the OPAL msglog to be filled with the following print:

  HMI: Received HMI interrupt: HMER = 0x0040000000000000

This patch clears r12 after waking up from stop with ESL=EC=0, so that we don't
accidentally enter the HMI handler in pnv_wakeup_noloss() if the value of
r12[42:45] corresponds to HMI as wakeup reason.

Prior to commit 9d292501 ("powerpc/64s/idle: Avoid SRR usage in idle
sleep/wake paths") this bug existed, in that we would incorrectly look at SRR1
to check for a HMI when SRR1 didn't contain a wakeup reason. However the SRR1
value would just happen to never have bits 42:45 set.

Fixes: 9d292501 ("powerpc/64s/idle: Avoid SRR usage in idle sleep/wake paths")
Signed-off-by: NAkshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Change log and comment massaging]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

4d0d7c02

27 6月, 2017 1 次提交

powerpc/64s: Invalidate ERAT on powersave wakeup for POWER9 · ba6d334a

由 Benjamin Herrenschmidt 提交于 6月 24, 2017

On POWER9 the ERAT may be incorrect on wakeup from some stop states
that lose state. This causes random segvs and illegal instructions
when these stop states are enabled.

This patch invalidates the ERAT on wakeup on POWER9 to prevent this
from causing a problem.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Merge comment change with upstream changes]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

ba6d334a

19 6月, 2017 3 次提交

powerpc/64s/idle: Predict HMI wakeup as unlikely · 95acdc07

由 Nicholas Piggin 提交于 6月 13, 2017

In a busy system, idle wakeups can be expected from IPIs and device
interrupts.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

95acdc07

powerpc/64s/idle: Avoid SRR usage in idle sleep/wake paths · 9d292501

由 Nicholas Piggin 提交于 6月 13, 2017

Idle code now always runs at the 0xc... effective address whether
in real or virtual mode. This means rfid can be ditched, along
with a lot of SRR manipulations.

In the wakeup path, carry SRR1 around in r12. Use mtmsrd to change
MSR states as required.

This also balances the return prediction for the idle call, by
doing blr rather than rfid to return to the idle caller.

On POWER9, 2-process context switch on different cores, with snooze
disabled, increases performance by 2%.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Incorporate v2 fixes from Nick]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

9d292501

powerpc/64s/idle: Move soft interrupt mask logic into C code · 2201f994

由 Nicholas Piggin 提交于 6月 13, 2017

This simplifies the asm and fixes irq-off tracing over sleep
instructions.

Also move powersave_nap check for POWER8 into C code, and move
PSSCR register value calculation for POWER9 into C.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

2201f994

30 5月, 2017 3 次提交

powerpc/powernv/idle: Use Requested Level for restoring state on P9 DD1 · 22c6663d

由 Gautham R. Shenoy 提交于 5月 16, 2017

On Power9 DD1 due to a hardware bug the Power-Saving Level Status
field (PLS) of the PSSCR for a thread waking up from a deep state can
under-report if some other thread in the core is in a shallow stop
state. The scenario in which this can manifest is as follows:

   1) All the threads of the core are in deep stop.
   2) One of the threads is woken up. The PLS for this thread will
      correctly reflect that it is waking up from deep stop.
   3) The thread that has woken up now executes a shallow stop.
   4) When some other thread in the core is woken, its PLS will reflect
      the shallow stop state.

Thus, the subsequent thread for which the PLS is under-reporting the
wakeup state will not restore the hypervisor resources.

Hence, on DD1 systems, use the Requested Level (RL) field as a
workaround to restore the contents of the hypervisor resources on the
wakeup from the stop state.
Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

22c6663d

powerpc/powernv/idle: Restore LPCR on wakeup from deep-stop · cb0be7ec

由 Gautham R. Shenoy 提交于 5月 16, 2017

On wakeup from a deep stop state which is supposed to lose the
hypervisor state, we don't restore the LPCR to the old value but set
it to a "sane" value via cur_cpu_spec->cpu_restore().

The problem is that the "sane" value doesn't include UPRT and the HR
bits which are required to run correctly in Radix mode.

Fix this on POWER9 onwards by restoring the LPCR value whatever it was
before executing the stop instruction.
Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

cb0be7ec

powerpc/powernv/idle: Decouple Timebase restore & Per-core SPRs restore · ec486735

由 Gautham R. Shenoy 提交于 5月 16, 2017

On POWER8, in case of
   -  nap: both timebase and hypervisor state is retained.
   -  fast-sleep: timebase is lost. But the hypervisor state is retained.
   -  winkle: timebase and hypervisor state is lost.

Hence, the current code for handling exit from a idle state assumes
that if the timebase value is retained, then so is the hypervisor
state. Thus, the current code doesn't restore per-core hypervisor
state in such cases.

But that is no longer the case on POWER9 where we do have stop states
in which timebase value is retained, but the hypervisor state is
lost. So we have to ensure that the per-core hypervisor state gets
restored in such cases.

Fix this by ensuring that even in the case when timebase is retained,
we explicitly check if we are waking up from a deep stop that loses
per-core hypervisor state (indicated by cr4 being eq or gt), and if
this is the case, we restore the per-core hypervisor state.
Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

ec486735

16 5月, 2017 1 次提交

powerpc/powernv: Set NAPSTATELOST after recovering paca on P9 DD1 · bbb075dd

由 Gautham R. Shenoy 提交于 5月 12, 2017

Commit 17ed4c8f ("powerpc/powernv: Recover correct PACA on wakeup
from a stop on P9 DD1") promises to set the NAPSTATELOST bit in paca
after recovering the correct paca for the thread waking up from stop1
on DD1, so that the GPRs can be correctly restored on the stop exit
path. However, it loads the value 1 into r3, but stores the value in
r0 into NAPSTATELOST(r13).

Fix this by correctly set the NAPSTATELOST bit in paca after
recovering the paca on POWER9 DD1.

Fixes: 17ed4c8f ("powerpc/powernv: Recover correct PACA on wakeup from a stop on P9 DD1")
Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

bbb075dd

23 4月, 2017 8 次提交

powerpc/64s: Simplify POWER9 DD1 idle workaround code · 9cba253d

由 Nicholas Piggin 提交于 4月 19, 2017

The idle workaround does not need to load PACATOC, and it does not
need to be called within a nested function that requires LR to be
saved.

Load the PACATOC at entry to the idle wakeup. It does not matter which
PACA this comes from, so it's okay to call before the workaround. Then
apply the workaround to get the right PACA.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

9cba253d

powerpc/64s: Idle POWER8 avoid full state loss recovery where possible · 0d7720a2

由 Nicholas Piggin 提交于 4月 19, 2017

If not all threads were in winkle, full state loss recovery is not
necessary and can be avoided. A previous patch removed this optimisation
due to some complexity with the implementation. Re-implement it by
counting the number of threads in winkle with the per-core idle state.
Only restore full state loss if all threads were in winkle.

This has a small window of false positives right before threads execute
winkle and just after they wake up, when the winkle count does not
reflect the true number of threads in winkle. This is not a significant
problem in comparison with even the minimum winkle duration. For
correctness, a false positive is not a problem (only false negatives
would be).
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

0d7720a2

powerpc/64s: Idle do not hold reservation longer than required · e420249d

由 Nicholas Piggin 提交于 4月 19, 2017

When taking the core idle state lock, grab it immediately like a regular
lock, rather than adding more tests in there. Holding the lock keeps it
stable, so there is no need to do it whole holding the reservation.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

e420249d

powerpc/64s: Expand core idle state bits · adbcf8d7

由 Nicholas Piggin 提交于 4月 19, 2017

In preparation for adding more bits to the core idle state word, move
the lock bit up, and unlock by flipping the lock bit rather than masking
off all but the thread bits.

Add branch hints for atomic operations while we're here.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

adbcf8d7

powerpc/64s: Fix POWER9 machine check handler from stop state · 1945bc45

由 Nicholas Piggin 提交于 4月 19, 2017

The ISA specifies power save wakeup due to a machine check exception can
cause a machine check interrupt (rather than the usual system reset
interrupt).

The machine check handler copes with this by doing low level machine
check recovery without restoring full state from idle, then queues up a
machine check event for logging, then directly executes the same idle
instruction it woke from. This minimises the work done before recovery
is performed.

The problem is that it requires machine specific instructions and
knowledge of the book3s idle code. Currently it only has code to handle
POWER8 idle, so POWER9 crashes when trying to execute the P8 idle
instructions which don't exist in ISAv3.0B.

cpu 0x0: Vector: e40 (Emulation Assist) at [c0000000008f3810]
    pc: c000000000008380: machine_check_handle_early+0x130/0x2f0
    lr: c00000000053a098: stop_loop+0x68/0xd0
    sp: c0000000008f3a90
   msr: 9000000000081001
  current = 0xc0000000008a1080
  paca    = 0xc00000000ffd0000   softe: 0        irq_happened: 0x01
    pid   = 0, comm = swapper/0

Instead of going to sleep after recovery, do the usual idle wakeup and
state restoration by calling into the normal idle wakeup path. This
reuses the normal idle wakeup paths.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: NMahesh J Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

1945bc45

powerpc/64s: Use alternative feature patching · 10101aa9

由 Nicholas Piggin 提交于 4月 19, 2017

This reduces the number of nops for POWER8.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

10101aa9

powerpc/64s: Stop using bit in HSPRG0 to test winkle · 544686ca

由 Nicholas Piggin 提交于 4月 19, 2017

The POWER8 idle code has a neat trick of programming the power on engine
to restore a low bit into HSPRG0, so idle wakeup code can test and see
if it has been programmed this way and therefore lost all state. Restore
time can be reduced if winkle has not been reached.

However this messes with our r13 PACA pointer, and requires HSPRG0 to be
written to. It also optimizes the slowest and most uncommon case at the
expense of another SPR write in the common nap state wakeup.

Remove this complexity and assume winkle sleeps always require a state
restore. This speedup could be made entirely contained within the winkle
idle code by counting per-core winkles and setting a thread bitmap when
all have gone to winkle.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

544686ca

powerpc/64s: Move remaining system reset idle code into idle_book3s.S · bf0153c1

由 Nicholas Piggin 提交于 4月 19, 2017

No functional change.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

bf0153c1

11 4月, 2017 1 次提交

powerpc/powernv: Recover correct PACA on wakeup from a stop on P9 DD1 · 17ed4c8f

由 Gautham R. Shenoy 提交于 3月 22, 2017

POWER9 DD1.0 hardware has a bug where the SPRs of a thread waking up
from stop 0,1,2 with ESL=1 can endup being misplaced in the core. Thus
the HSPRG0 of a thread waking up from can contain the paca pointer of
its sibling.

This patch implements a context recovery framework within threads of a
core, by provisioning space in paca_struct for saving every sibling
threads's paca pointers. Basically, we should be able to arrive at the
right paca pointer from any of the thread's existing paca pointer.

At bootup, during powernv idle-init, we save the paca address of every
CPU in each one its siblings paca_struct in the slot corresponding to
this CPU's index in the core.

On wakeup from a stop, the thread will determine its index in the core
from the TIR register and recover its PACA pointer by indexing into
the correct slot in the provisioned space in the current PACA.

Furthermore, ensure that the NVGPRs are restored from the stack on the
way out by setting the NAPSTATELOST in paca.

[Changelog written with inputs from svaidy@linux.vnet.ibm.com]
Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Call it a bug]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

17ed4c8f

20 3月, 2017 1 次提交

powerpc/64s: Fix idle wakeup potential to clobber registers · 6d98ce0b

由 Nicholas Piggin 提交于 3月 17, 2017

We concluded there may be a window where the idle wakeup code could get
to pnv_wakeup_tb_loss() (which clobbers non-volatile GPRs), but the
hardware may set SRR1[46:47] to 01b (no state loss) which would result
in the wakeup code failing to restore non-volatile GPRs.

I was not able to trigger this condition with trivial tests on real
hardware or simulator, but the ISA (at least 2.07) seems to allow for
it, and Gautham says that it can happen if there is an exception pending
when the sleep/winkle instruction is executed.

Fixes: 17065671 ("powerpc/kvm: make hypervisor state restore a function")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Acked-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6d98ce0b

03 3月, 2017 1 次提交

powerpc/powernv: Fix bug due to labeling ambiguity in power_enter_stop · 424f8acd

由 Gautham R. Shenoy 提交于 2月 27, 2017

Commit 09206b60 ("powernv: Pass PSSCR value and mask to
power9_idle_stop") added additional code in power_enter_stop() to
distinguish between stop requests whose PSSCR had ESL=EC=1 from those
which did not. When ESL=EC=1, we do a forward-jump to a location
labelled by "1", which had the code to handle the ESL=EC=1 case.

Unfortunately just a couple of instructions before this label, is the
macro IDLE_STATE_ENTER_SEQ() which also has a label "1" in its
expansion.

As a result, the current code can result in directly executing stop
instruction for deep stop requests with PSSCR ESL=EC=1, without saving
the hypervisor state.

Fix this BUG by labeling the location that handles ESL=EC=1 case with
a more descriptive label ".Lhandle_esl_ec_set" (local label suggestion
a la .Lxx from Anton Blanchard).

While at it, rename the label "2" labelling the location of the code
handling entry into deep stop states with ".Lhandle_deep_stop".

For a good measure, change the label in IDLE_STATE_ENTER_SEQ() macro
to an not-so commonly used value in order to avoid similar mishaps in
the future.

Fixes: 09206b60 ("powernv: Pass PSSCR value and mask to power9_idle_stop")
Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

424f8acd

07 2月, 2017 1 次提交

powerpc/powernv: Remove separate entry for OPAL real mode calls · ab9bad0e

由 Benjamin Herrenschmidt 提交于 2月 07, 2017

All entry points already read the MSR so they can easily do
the right thing.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

ab9bad0e

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功