提交 · c1807e3f84668c4c1e838386fdc3f6e63df1156c · openanolis / cloud-kernel

06 11月, 2017 1 次提交

KVM: PPC: Book3S HV: Handle host system reset in guest mode · 6de6638b

由 Nicholas Piggin 提交于 11月 05, 2017

If the host takes a system reset interrupt while a guest is running,
the CPU must exit the guest before processing the host exception
handler.

After this patch, taking a sysrq+x with a CPU running in a guest
gives a trace like this:

   cpu 0x27: Vector: 100 (System Reset) at [c000000fdf5776f0]
       pc: c008000010158b80: kvmppc_run_core+0x16b8/0x1ad0 [kvm_hv]
       lr: c008000010158b80: kvmppc_run_core+0x16b8/0x1ad0 [kvm_hv]
       sp: c000000fdf577850
      msr: 9000000002803033
     current = 0xc000000fdf4b1e00
     paca    = 0xc00000000fd4d680	 softe: 3	 irq_happened: 0x01
       pid   = 6608, comm = qemu-system-ppc
   Linux version 4.14.0-rc7-01489-g47e1893a404a-dirty #26 SMP
   [c000000fdf577a00] c008000010159dd4 kvmppc_vcpu_run_hv+0x3dc/0x12d0 [kvm_hv]
   [c000000fdf577b30] c0080000100a537c kvmppc_vcpu_run+0x44/0x60 [kvm]
   [c000000fdf577b60] c0080000100a1ae0 kvm_arch_vcpu_ioctl_run+0x118/0x310 [kvm]
   [c000000fdf577c00] c008000010093e98 kvm_vcpu_ioctl+0x530/0x7c0 [kvm]
   [c000000fdf577d50] c000000000357bf8 do_vfs_ioctl+0xd8/0x8c0
   [c000000fdf577df0] c000000000358448 SyS_ioctl+0x68/0x100
   [c000000fdf577e30] c00000000000b220 system_call+0x58/0x6c
   --- Exception: c01 (System Call) at 00007fff76868df0
   SP (7fff7069baf0) is in userspace

Fixes: e36d0a2e ("powerpc/powernv: Implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET")
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6de6638b

22 10月, 2017 2 次提交

powerpc: Disable the fast-endian switch syscall by default · 727f1361

由 Michael Ellerman 提交于 10月 09, 2017

Back in 2008 we added support for "fast little-endian switch" in the
syscall path. This added a special case syscall number 0x1ebe, which
is caught very early in the system call exception and switches endian
with as little overhead as possible. See commit 745a14cc
("[POWERPC] Add fast little-endian switch system call") for full
details.

Although it is fast, it's also completely non standard. The "syscall
number" is out of the range of normal syscalls, it can't be traced or
audited, and it's a bit of a wart. To the best of our knowledge it was
only used by one program, now long since discontinued.

So in an effort to shake out any current users, put it behind a config
option, and make it default n. If anyone *is* using it they can
quickly reinstate it with a rebuild, and we can flip it to default y.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

727f1361

M
powerpc/64s: Move the two FAST_ENDIAN macros next to each other · 5c2511bf
由 Michael Ellerman 提交于 10月 09, 2017
```
So we can #ifdef them in the next patch.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
```
5c2511bf

16 10月, 2017 1 次提交

powerpc/mce: Hookup derror (load/store) UE errors · ba41e1e1

由 Balbir Singh 提交于 9月 29, 2017

Extract physical_address for UE errors by walking the page
tables for the mm and address at the NIP, to extract the
instruction. Then use the instruction to find the effective
address via analyse_instr().

We might have page table walking races, but we expect them to
be rare, the physical address extraction is best effort. The idea
is to then hook up this infrastructure to memory failure eventually.
Signed-off-by: NBalbir Singh <bsingharora@gmail.com>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

ba41e1e1

27 9月, 2017 1 次提交

powerpc/64s: Add workaround for P9 vector CI load issue · 5080332c

由 Michael Neuling 提交于 9月 15, 2017

POWER9 DD2.1 and earlier has an issue where some cache inhibited
vector load will return bad data. The workaround is two part, one
firmware/microcode part triggers HMI interrupts when hitting such
loads, the other part is this patch which then emulates the
instructions in Linux.

The affected instructions are limited to lxvd2x, lxvw4x, lxvb16x and
lxvh8x.

When an instruction triggers the HMI, all threads in the core will be
sent to the HMI handler, not just the one running the vector load.

In general, these spurious HMIs are detected by the emulation code and
we just return back to the running process. Unfortunately, if a
spurious interrupt occurs on a vector load that's to normal memory we
have no way to detect that it's spurious (unless we walk the page
tables, which is very expensive). In this case we emulate the load but
we need do so using a vector load itself to ensure 128bit atomicity is
preserved.

Some additional debugfs emulated instruction counters are added also.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Switch CONFIG_PPC_BOOK3S_64 to CONFIG_VSX to unbreak the build]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

5080332c

23 8月, 2017 7 次提交

powerpc/64s: Remove spurious IRQ reason in IRQ replay · 7b76a1f5

由 Nicholas Piggin 提交于 8月 12, 2017

HVI interrupts have always used 0x500, so remove the dead branch.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7b76a1f5

powerpc/64s: Use the HV handler for external IRQ replay in HV mode on POWER9 · e6c1203d

由 Nicholas Piggin 提交于 8月 12, 2017

POWER9 host external interrupts use the h_virt_irq_common handler, so
use that to replay them rather than using the hardware_interrupt_common
handler. Both call do_IRQ, but using the correct handler reduces
i-cache footprint.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

e6c1203d

powerpc/64s: Merge HV and non-HV paths for doorbell IRQ replay · d6f73fc6

由 Nicholas Piggin 提交于 8月 12, 2017

This results in smaller code, and fewer branches. This relies on the
fact that both the 0xe80 and 0xa00 handlers call the same upper level
code, namely doorbell_exception().
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Mention we rely on the implementation of the 0xe80/0xa00 handlers]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

d6f73fc6

powerpc/64s: masked_interrupt() returns to kernel so avoid restoring r13 · c05f0be8

由 Nicholas Piggin 提交于 8月 12, 2017

Places in the kernel where r13 is not the PACA pointer must have
maskable interrupts disabled, so r13 does not have to be restored when
returning from a soft-masked interrupt. We should never have
interrupts soft disabled when we're in user space.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

c05f0be8

powerpc/64s: Optimise clearing of MSR_EE in masked_[H]interrupt() · 6e9a2f6e

由 Nicholas Piggin 提交于 8月 12, 2017

MSR_EE is always enabled in SRR1 for masked interrupts, so we can use
xor to clear it.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6e9a2f6e

powerpc/64s: Avoid a branch in masked_[H]interrupt() · e0c827c0

由 Nicholas Piggin 提交于 8月 12, 2017

Interrupts which do not require EE to be cleared can all be tested
with a single bitwise test.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

e0c827c0

powerpc/64s: Fix replay interrupt return label name · 3e23a12b

由 Michael Ellerman 提交于 8月 22, 2017

In __replay_interrupt() we take the address of a local label so we can
return to it later. However the assembler turns the local label into a
symbol with a name like ".L1^B42" - where "^B" is literally "\002".
This does not make for pleasant stack traces. Fix it by giving the
label a sensible name.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

3e23a12b

10 8月, 2017 1 次提交

powerpc: Fix powerpc-specific watchdog build configuration · 75eb767e

由 Nicholas Piggin 提交于 8月 01, 2017

The powerpc kernel/watchdog.o should be built when HARDLOCKUP_DETECTOR
and HAVE_HARDLOCKUP_DETECTOR_ARCH are both selected. If only the former
is selected, then the generic perf watchdog has been selected.

To simplify this check, introduce a new Kconfig symbol PPC_WATCHDOG that
depends on both. This Kconfig option means the powerpc specific
watchdog is enabled.

Without this patch, Book3E will attempt to build the powerpc watchdog.

Fixes: 2104180a ("powerpc/64s: implement arch-specific hardlockup watchdog")
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

75eb767e

03 8月, 2017 2 次提交

powerpc/mm: Use symbolic constants for filtering SRR1 bits on ISIs · b4c001dc

由 Benjamin Herrenschmidt 提交于 7月 19, 2017

This uses the newly defined constants for this rather than open-coded
numbers. There is a side effect on 64-bit which is to pass through
some of the new P9 bits which we didn't before.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b4c001dc

powerpc/mm: Update bits used to skip hash_page · 398a719d

由 Benjamin Herrenschmidt 提交于 7月 19, 2017

We test a number of bits from DSISR/SRR1 before deciding
to call hash_page(). If any of these is set, we go directly
to do_page_fault() as the bit indicate a fault that needs
to be handled there (no hashing needed).

This updates the current open-coded masks to use the new
DSISR definitions.

This *does* change the masks actually used in two ways:

 - We used to test various bits that were defined as "always 0"
in the architecture and could be repurposed for something
else. From now on, we just ignore such bits.

 - We were missing some new bits defined on P9
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

398a719d

31 7月, 2017 1 次提交

powerpc/64s: Fix stack setup in watchdog soft_nmi_common() · cc491f1d

由 Nicholas Piggin 提交于 7月 29, 2017

The watchdog soft-NMI exception stack setup loads a stack pointer
twice, which is an obvious error. It ends up using the system reset
interrupt (true-NMI) stack, which is also a bug because the watchdog
could be preempted by a system reset interrupt that overwrites the
NMI stack.

Change the soft-NMI to use the "emergency stack". The current kernel
stack is not used, because of the longer-term goal to prevent
asynchronous stack access using soft-disable.

Fixes: 2104180a ("powerpc/64s: implement arch-specific hardlockup watchdog")
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

cc491f1d

18 7月, 2017 1 次提交

powerpc/64s: Fix hypercall entry clobbering r12 input · 76fc0cfc

由 Nicholas Piggin 提交于 7月 18, 2017

A previous optimisation incorrectly assumed the PAPR hcall does
not use r12, and clobbers it upon entry. In fact it is used as
an input. This can result in KVM guests crashing (observed with
PR KVM).

Instead of using r12 to save r13, tihs patch saves r13 in ctr.
This is more costly, but not as slow as using the SPRG.

Fixes: acd7d8ce ("powerpc/64s: Optimize hypercall/syscall entry")
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

76fc0cfc

13 7月, 2017 1 次提交

powerpc/64s: implement arch-specific hardlockup watchdog · 2104180a

由 Nicholas Piggin 提交于 7月 12, 2017

Implement an arch-speicfic watchdog rather than use the perf-based
hardlockup detector.

The new watchdog takes the soft-NMI directly, rather than going through
perf.  Perf interrupts are to be made maskable in future, so that would
prevent the perf detector from working in those regions.

Additionally, implement a SMP based detector where all CPUs watch one
another by pinging a shared cpumask.  This is because powerpc Book3S
does not have a true periodic local NMI, but some platforms do implement
a true NMI IPI.

If a CPU is stuck with interrupts hard disabled, the soft-NMI watchdog
does not work, but the SMP watchdog will.  Even on platforms without a
true NMI IPI to get a good trace from the stuck CPU, other CPUs will
notice the lockup sufficiently to report it and panic.

[npiggin@gmail.com: honor watchdog disable at boot/hotplug]
  Link: http://lkml.kernel.org/r/20170621001346.5bb337c9@roar.ozlabs.ibm.com
[npiggin@gmail.com: fix false positive warning at CPU unplug]
  Link: http://lkml.kernel.org/r/20170630080740.20766-1-npiggin@gmail.com
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170616065715.18390-6-npiggin@gmail.comSigned-off-by: NNicholas Piggin <npiggin@gmail.com>
Reviewed-by: NDon Zickus <dzickus@redhat.com>
Tested-by: Babu Moger <babu.moger@oracle.com>	[sparc]
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2104180a

03 7月, 2017 2 次提交

powerpc/64s: Blacklist functions invoked on a trap · 15770a13

由 Naveen N. Rao 提交于 6月 29, 2017

Blacklist all functions involved while handling a trap. We:
- convert some of the symbols into private symbols, and
- blacklist most functions involved while handling a trap.
Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

15770a13

powerpc/64s: Convert .L__replay_interrupt_return to a local label · 9d6c4523

由 Naveen N. Rao 提交于 6月 29, 2017

Commit b48bbb82 ("powerpc/64s: Don't unbalance the return branch
predictor in __replay_interrupt()") introduced __replay_interrupt_return
symbol with '.L' prefix in hopes of keeping it private. However, due to
the use of LOAD_REG_ADDR(), the assembler kept this symbol visible. Fix
the same by instead using the local label '1'.

Fixes: Commit b48bbb82 ("powerpc/64s: Don't unbalance the return branch
predictor in __replay_interrupt()")
Suggested-by: NNicholas Piggin <npiggin@gmail.com>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

9d6c4523

27 6月, 2017 1 次提交

powerpc/64s: Invalidate ERAT on powersave wakeup for POWER9 · ba6d334a

由 Benjamin Herrenschmidt 提交于 6月 24, 2017

On POWER9 the ERAT may be incorrect on wakeup from some stop states
that lose state. This causes random segvs and illegal instructions
when these stop states are enabled.

This patch invalidates the ERAT on wakeup on POWER9 to prevent this
from causing a problem.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Merge comment change with upstream changes]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

ba6d334a

21 6月, 2017 3 次提交

powerpc/64s: Rename slb_allocate_realmode() to slb_allocate() · fd88b945

由 Michael Ellerman 提交于 6月 19, 2017

As for slb_miss_realmode(), rename slb_allocate_realmode() to avoid
confusion over whether it runs in real or virtual mode - it runs in
both.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>

fd88b945

powerpc/64s: Rename slb_miss_realmode() to slb_miss_common() · 442b6e8e

由 Michael Ellerman 提交于 6月 19, 2017

slb_miss_realmode() doesn't always runs in real mode, which is what the
name implies. So rename it to avoid confusing people.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>

442b6e8e

powerpc/64s: Use BRANCH_TO_COMMON() for slb_miss_realmode · b102063b

由 Michael Ellerman 提交于 6月 19, 2017

All the callers of slb_miss_realmode currently open code the #ifndef
CONFIG_RELOCATABLE check and the branch via CTR in the RELOCATABLE case.
We have a macro to do this, BRANCH_TO_COMMON(), so use it.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>

b102063b

20 6月, 2017 3 次提交

powerpc/64s: Avoid r3 save/restore in SLB miss handler · 4d7cd3b9

由 Nicholas Piggin 提交于 5月 21, 2017

The SLB miss handler uses r3 for the faulting address but r12 is
mostly able to be freed up to save r3 in. It just requires SRR1
be reloaded again on error.

It would be more conventional to use r12 for SRR1 (and use r11 to
save r3), but slb_allocate_realmode clobbers r11 and not r12.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

4d7cd3b9

powerpc/64s: SLB miss already has CTR saved for relocatable kernel · fe5482c0

由 Nicholas Piggin 提交于 5月 21, 2017

The EXCEPTION_PROLOG_1 used by SLB miss already saves CTR when the
kernel is built with CONFIG_RELOCATABLE. So it does not have to be
saved and reloaded when branching to slb_miss_realmode. It can be
restored from the PACA as usual.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

fe5482c0

powerpc/64s: Avoid saving faulting address into EX_DAR in SLB miss · 7c28f048

由 Nicholas Piggin 提交于 5月 21, 2017

The EX_DAR save area is only used in exceptional cases. With r3 no
longer clobbered by slb_allocate_realmode, saving faulting address to
EX_DAR can be deferred to those cases.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7c28f048

19 6月, 2017 4 次提交

powerpc/64s/idle: Avoid SRR usage in idle sleep/wake paths · 9d292501

由 Nicholas Piggin 提交于 6月 13, 2017

Idle code now always runs at the 0xc... effective address whether
in real or virtual mode. This means rfid can be ditched, along
with a lot of SRR manipulations.

In the wakeup path, carry SRR1 around in r12. Use mtmsrd to change
MSR states as required.

This also balances the return prediction for the idle call, by
doing blr rather than rfid to return to the idle caller.

On POWER9, 2-process context switch on different cores, with snooze
disabled, increases performance by 2%.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Incorporate v2 fixes from Nick]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

9d292501

powerpc/64s/idle: Branch to handler with virtual mode offset · b51351e2

由 Nicholas Piggin 提交于 6月 13, 2017

Have the system reset idle wakeup handlers branched to in real mode
with the 0xc... kernel address applied. This allows simplifications of
avoiding rfid when switching to virtual mode in the wakeup handler.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b51351e2

powerpc/64s: Don't unbalance the return branch predictor in __replay_interrupt() · b48bbb82

由 Nicholas Piggin 提交于 6月 13, 2017

The __replay_interrupt() code is branched to with bl, but the caller is
returned to directly with rfid from the interrupt.

Instead, rfid to a stub that returns to the caller with blr, which
should keep the return branch predictor balanced.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b48bbb82

powerpc/64s: msgclr when handling doorbell exceptions from system reset · a9af97aa

由 Nicholas Piggin 提交于 6月 13, 2017

msgsnd doorbell exceptions are cleared when the doorbell interrupt is
taken. However if a doorbell exception causes a system reset interrupt
wake from power saving state, the message is not cleared. Processing
the doorbell from the system reset interrupt requires msgclr to avoid
taking the exception again.

Testing this plus the previous wakup direct patch gives:

original wakeup direct msgclr
Different threads, same core: 315k/s 264k/s 345k/s
Different cores: 235k/s 242k/s 242k/s

Net speedup is +10% for same core, and +3% for different core.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a9af97aa

16 6月, 2017 1 次提交

powerpc/64s: Handle data breakpoints in Radix mode · d89ba535

由 Naveen N. Rao 提交于 6月 14, 2017

On Power9, trying to use data breakpoints throws the splat shown
below. This is because the check for a data breakpoint in DSISR is in
do_hash_page(), which is not called when in Radix mode.

  Unable to handle kernel paging request for data at address 0xc000000000e19218
  Faulting instruction address: 0xc0000000001155e8
  cpu 0x0: Vector: 300 (Data Access) at [c0000000ef1e7b20]
  pc: c0000000001155e8: find_pid_ns+0x48/0xe0
  lr: c000000000116ac4: find_task_by_vpid+0x44/0x90
  sp: c0000000ef1e7da0
  msr: 9000000000009033
  dar: c000000000e19218
  dsisr: 400000

Move the check to handle_page_fault() so as to catch data breakpoints
in both Hash and Radix MMU modes.

We have to change the check in do_hash_page() against 0xa410 to use
0xa450, so as to include the value of (DSISR_DABRMATCH << 16).

There are two sites that call handle_page_fault() when in Radix, both
already pass DSISR in r4.

Fixes: caca285e ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code")
Cc: stable@vger.kernel.org # v4.7+
Reported-by: NShriya R. Kulkarni <shriykul@in.ibm.com>
Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Fix the fall-through case on hash, we need to reload DSISR]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

d89ba535

15 6月, 2017 1 次提交

powerpc/64s: Optimize hypercall/syscall entry · acd7d8ce

由 Nicholas Piggin 提交于 6月 09, 2017

After bc355125 ("powerpc/64: Allow for relocation-on interrupts from
guest to host"), a getppid() system call goes from 307 cycles to 358
cycles (+17%) on POWER8. This is due significantly to the scratch SPR
used by the hypercall check.

It turns out there are a some volatile registers common to both system
call and hypercall (in particular, r12, cr0, ctr), which can be used to
avoid the SPR and some other overheads. This brings getppid to 320 cycles
(+4%).

Testing hcall entry performance by running "sc 1" in guest userspace
before this patch is 854 cycles, afterwards is 826. Also a small win
there.

POWER9 syscall is improved by about the same amount, hcall not tested.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

acd7d8ce

09 5月, 2017 1 次提交

powerpc/64s: Fix unnecessary machine check handler relocation branch · 6102c005

由 Nicholas Piggin 提交于 5月 04, 2017

Similarly to commit 2563a70c ("powerpc/64s: Remove unnecessary relocation
branch from idle handler"), the machine check handler has a BRANCH_TO from
relocated to relocated code, which is unnecessary.

It has also caused build errors with some toolchains:

  arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
  arch/powerpc/kernel/exceptions-64s.S:395: Error: operand out of range
  (0xffffffffffff8280 is not between 0x0000000000000000 and
  0x000000000000ffff)

Fixes: 1945bc45 ("powerpc/64s: Fix POWER9 machine check handler from stop state")
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Reported-and-tested-by : Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6102c005

28 4月, 2017 5 次提交

powerpc/64s: Dedicated system reset interrupt stack · b1ee8a3d

由 Nicholas Piggin 提交于 12月 20, 2016

The system reset interrupt is used for crash/debug situations, so it is
desirable to have as little impact on the normal state of the system as
possible.

Currently it uses the current kernel stack to process the exception.
This stores into the stack which may be involved with the crash. The
stack pointer may be corrupted, or it may have overflowed.

Avoid or minimise these problems by creating a dedicated NMI stack for
the system reset interrupt to use.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b1ee8a3d

powerpc/64s: Disallow system reset vs system reset reentrancy · c4f3b52c

由 Nicholas Piggin 提交于 12月 20, 2016

In preparation for using a dedicated stack for system reset interrupts,
prevent a nested system reset from recovering, in order to simplify
code that is called in crash/debug path. This allows a system reset
interrupt to just use the base stack pointer.

Keep an in_nmi nesting counter similarly to the in_mce counter. Consider
the interrrupt non-recoverable if it is taken inside another system
reset.

Interrupt nesting could be allowed similarly to MCE, but system reset
is a special case that's not for normal operation, so simplicity wins
until there is requirement for nested system reset interrupts.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

c4f3b52c

powerpc/64s: Fix system reset vs general interrupt reentrancy · a3d96f70

由 Nicholas Piggin 提交于 12月 20, 2016

The system reset interrupt can occur when MSR_EE=0, and it currently
uses the PACA_EXGEN save area.

Some PACA_EXGEN interrupts have a window where MSR_RI=1 and MSR_EE=0
when the save area is still in use. A system reset interrupt in this
window can lead to undetected corruption when the save area gets
overwritten.

This patch introduces PACA_EXNMI save area for system reset exceptions,
which closes this corruption window. It's also helpful to retain the
EXGEN state for debugging situations, even if not considering the
recoverability aspect.

This patch also moves the PACA_EXMC area down to a less frequently used
part of the paca with the new save area.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a3d96f70

powerpc/64s: Exception macro for stack frame and initial register save · a4087a4d

由 Nicholas Piggin 提交于 12月 20, 2016

This code is common to a few exceptions, and another user will be added.
This causes a trivial change to generated code:

-     604: std     r9,416(r1)
-     608: mfspr   r11,314
-     60c: std     r11,368(r1)
-     610: mfspr   r12,315
+     604: mfspr   r11,314
+     608: mfspr   r12,315
+     60c: std     r9,416(r1)
+     610: std     r11,368(r1)

machine_check_powernv_early could also use this, but that requires non
trivial changes to generated code, so that's for another patch.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a4087a4d

powerpc/64s: Add exception macro that does not enable RI · 83a980f7

由 Nicholas Piggin 提交于 12月 20, 2016

Subsequent patches will add more non-RI variant exceptions, so
create a macro for it rather than open-code it.

This does not change generated instructions.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

83a980f7

23 4月, 2017 1 次提交

powerpc/64s: Fix POWER9 machine check handler from stop state · 1945bc45

由 Nicholas Piggin 提交于 4月 19, 2017

The ISA specifies power save wakeup due to a machine check exception can
cause a machine check interrupt (rather than the usual system reset
interrupt).

The machine check handler copes with this by doing low level machine
check recovery without restoring full state from idle, then queues up a
machine check event for logging, then directly executes the same idle
instruction it woke from. This minimises the work done before recovery
is performed.

The problem is that it requires machine specific instructions and
knowledge of the book3s idle code. Currently it only has code to handle
POWER8 idle, so POWER9 crashes when trying to execute the P8 idle
instructions which don't exist in ISAv3.0B.

cpu 0x0: Vector: e40 (Emulation Assist) at [c0000000008f3810]
    pc: c000000000008380: machine_check_handle_early+0x130/0x2f0
    lr: c00000000053a098: stop_loop+0x68/0xd0
    sp: c0000000008f3a90
   msr: 9000000000081001
  current = 0xc0000000008a1080
  paca    = 0xc00000000ffd0000   softe: 0        irq_happened: 0x01
    pid   = 0, comm = swapper/0

Instead of going to sleep after recovery, do the usual idle wakeup and
state restoration by calling into the normal idle wakeup path. This
reuses the normal idle wakeup paths.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: NMahesh J Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

1945bc45

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功