提交 · a4e89ffb59235fd11d27107dea3efa4562ac0a12 · openanolis / cloud-kernel

29 8月, 2017 1 次提交

powerpc/e6500: Update machine check for L1D cache err · a4e89ffb

由 Matt Weber 提交于 6月 28, 2017

This patch updates the machine check handler of Linux kernel to
handle the e6500 architecture case. In e6500 core, L1 Data Cache Write
Shadow Mode (DCWS) register is not implemented but L1 data cache always
runs in write shadow mode. So, on L1 data cache parity errors, hardware
will automatically invalidate the data cache but will still log a
machine check interrupt.
Signed-off-by: NRonak Desai <ronak.desai@rockwellcollins.com>
Signed-off-by: NMatthew Weber <matthew.weber@rockwellcollins.com>
Signed-off-by: NScott Wood <oss@buserror.net>

a4e89ffb

10 8月, 2017 10 次提交

powerpc/8xx: Remove SoftwareEmulation() · fbbcc3bb

由 Christophe Leroy 提交于 8月 08, 2017

Since commit aa42c69c ("[POWERPC] Add support for FP emulation
for the e300c2 core"), program_check_exception() can be called for
math emulation. In that case, 'reason' is 0.

On the 8xx, there is a Software Emulation interrupt which is
called for all unimplemented and illegal instructions. This
interrupt calls SoftwareEmulation() which does almost the
same as program_check_exception() called with reason = 0.

The Software Emulation interrupt sets all reason bits to 0,
it is therefore possible to call program_check_exception()
directly from the interrupt handler.
Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

fbbcc3bb

powerpc/8xx: Move 8xx machine check handlers into platforms/8xx · f70b1e8d

由 Christophe Leroy 提交于 8月 08, 2017

In the same spirit as what was done for 4xx and 44x, move
the 8xx machine check into platforms/8xx
Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

f70b1e8d

powerpc/traps: Use SRR1 defines for program check reasons · d30a5a52

由 Michael Ellerman 提交于 8月 08, 2017

Currently we open code the reason codes for program checks. Instead use
the existing SRR1 defines.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

d30a5a52

powerpc/mce: Move 64-bit machine check code into mce.c · ccd3cd36

由 Michael Ellerman 提交于 8月 08, 2017

We already have mce.c which is built for 64bit and contains other parts
of the machine check code, so move these bits in there too.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

ccd3cd36

powerpc/traps: machine_check_generic() is only used on 32-bit · 7f3f819e

由 Michael Ellerman 提交于 8月 08, 2017

Make it clear that the fallback version of machine_check_generic() is
only used on 32-bit configs.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7f3f819e

powerpc/traps: Inline get_mc_reason() · 42bff234

由 Michael Ellerman 提交于 8月 08, 2017

get_mc_reason() no longer provides (if it ever really did) any
meaningful abstraction, so remove it.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

42bff234

powerpc/4xx: Move machine_check_4xx() into platforms/4xx · 0d0935b3

由 Michael Ellerman 提交于 8月 08, 2017

Now that we have 4xx platform directory we can move the 4xx machine
check handler in there. Again we drop get_mc_reason() and replace it
with regs->dsisr directly (which is actually SPRN_ESR).
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

0d0935b3

powerpc/44x: Move 44x machine check handlers into platforms/44x · 5b4e2857

由 Michael Ellerman 提交于 8月 08, 2017

We have several 44x machine check handlers defined in traps.c. It would
be preferable if they were split out with the platforms that use them.
Do that.

In the process, drop get_mc_reason() and instead just open code the
lookup of reason from regs->dsisr. This avoids a pointless layer of
abstraction.

We know to use regs->dsisr because 44x enables BOOKE which enables
PPC_ADV_DEBUG_REGS, and FSL_BOOKE is not enabled on 44x builds.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

5b4e2857

N
powerpc: Add irq accounting for system reset interrupts · ca41ad43
由 Nicholas Piggin 提交于 8月 01, 2017
```
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
```
ca41ad43

powerpc/64s: Fix mce accounting for powernv · f886f0f6

由 Nicholas Piggin 提交于 8月 01, 2017

On 64-bit Book3s, when we're in HV mode, we have already counted the
machine check exception in machine_check_early().
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Use IS_ENABLED() rather than an #ifdef]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

f886f0f6

03 7月, 2017 1 次提交

powerpc/64s: Blacklist functions invoked on a trap · 15770a13

由 Naveen N. Rao 提交于 6月 29, 2017

Blacklist all functions involved while handling a trap. We:
- convert some of the symbols into private symbols, and
- blacklist most functions involved while handling a trap.
Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

15770a13

03 5月, 2017 1 次提交

powerpc/book3s/mce: Move add_taint() later in virtual mode · d93b0ac0

由 Mahesh Salgaonkar 提交于 4月 18, 2017

machine_check_early() gets called in real mode. The very first time when
add_taint() is called, it prints a warning which ends up calling opal
call (that uses OPAL_CALL wrapper) for writing it to console. If we get a
very first machine check while we are in opal we are doomed. OPAL_CALL
overwrites the PACASAVEDMSR in r13 and in this case when we are done with
MCE handling the original opal call will use this new MSR on it's way
back to opal_return. This usually leads to unexpected behaviour or the
kernel to panic. Instead move the add_taint() call later in the virtual
mode where it is safe to call.

This is broken with current FW level. We got lucky so far for not getting
very first MCE hit while in OPAL. But easily reproducible on Mambo.

Fixes: 27ea2c42 ("powerpc: Set the correct kernel taint on machine check errors.")
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

d93b0ac0

28 4月, 2017 2 次提交

powerpc: Mark system reset as an NMI with nmi_enter/exit() · 2b4f3ac5

由 Nicholas Piggin 提交于 12月 20, 2016

System reset is a non-maskable interrupt from Linux's point of view
(occurs under local_irq_disable()), so it should use nmi_enter/exit.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

2b4f3ac5

powerpc/64s: Disallow system reset vs system reset reentrancy · c4f3b52c

由 Nicholas Piggin 提交于 12月 20, 2016

In preparation for using a dedicated stack for system reset interrupts,
prevent a nested system reset from recovering, in order to simplify
code that is called in crash/debug path. This allows a system reset
interrupt to just use the base stack pointer.

Keep an in_nmi nesting counter similarly to the in_mce counter. Consider
the interrrupt non-recoverable if it is taken inside another system
reset.

Interrupt nesting could be allowed similarly to MCE, but system reset
is a special case that's not for normal operation, so simplicity wins
until there is requirement for nested system reset interrupts.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

c4f3b52c

13 4月, 2017 2 次提交

powerpc/64s: Add SCV FSCR bit for ISA v3.0 · 9b7ff0c6

由 Nicholas Piggin 提交于 4月 07, 2017

Add the bit definition and use it in facility_unavailable_exception() so we can
intelligently report the cause if we take a fault for SCV. This doesn't actually
enable SCV.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Drop whitespace changes to the existing entries, flush out change log]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

9b7ff0c6

N
powerpc/64s: Add msgp facility unavailable log string · 794464f4
由 Nicholas Piggin 提交于 4月 07, 2017
```
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
```
794464f4

11 4月, 2017 1 次提交

powerpc: Create asm/debugfs.h and move powerpc_debugfs_root there · 7644d581

由 Michael Ellerman 提交于 2月 10, 2017

powerpc_debugfs_root is the dentry representing the root of the
"powerpc" directory tree in debugfs.

Currently it sits in asm/debug.h, a long with some other things that
have "debug" in the name, but are otherwise unrelated.

Pull it out into a separate header, which also includes linux/debugfs.h,
and convert all the users to include debugfs.h instead of debug.h.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7644d581

02 3月, 2017 1 次提交

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> · b17b0153

由 Ingo Molnar 提交于 2月 08, 2017

We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/debug.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

b17b0153

25 12月, 2016 1 次提交

Replace <asm/uaccess.h> with <linux/uaccess.h> globally · 7c0f6ba6

由 Linus Torvalds 提交于 12月 24, 2016

This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.
Requested-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7c0f6ba6

02 12月, 2016 1 次提交

powerpc Don't print misleading facility name in facility unavailable exception · 93c2ec0f

由 Balbir Singh 提交于 11月 30, 2016

The current facility_strings[] are correct when the trap address is
0xf80 (hypervisor facility unavailable). When the trap address is
0xf60 (facility unavailable) IC (Interruption Cause) a.k.a status in the
code is undefined for values 0 and 1.

Add a check to prevent printing the (misleading) facility name for IC 0
and 1 when we came in via 0xf60. In all cases, print the actual IC
value, to avoid any confusion.

This hasn't been seen on real hardware, on only qemu which was
misreporting an exception.
Signed-off-by: NBalbir Singh <bsingharora@gmail.com>
[mpe: Fix indentation, combine printks(), massage change log]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

93c2ec0f

30 11月, 2016 1 次提交

powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead. · da665885

由 Thiago Jung Bauermann 提交于 11月 29, 2016

Commit 2965faa5 ("kexec: split kexec_load syscall from kexec core
code") introduced CONFIG_KEXEC_CORE so that CONFIG_KEXEC means whether
the kexec_load system call should be compiled-in and CONFIG_KEXEC_FILE
means whether the kexec_file_load system call should be compiled-in.
These options can be set independently from each other.

Since until now powerpc only supported kexec_load, CONFIG_KEXEC and
CONFIG_KEXEC_CORE were synonyms. That is not the case anymore, so we
need to make a distinction. Almost all places where CONFIG_KEXEC was
being used should be using CONFIG_KEXEC_CORE instead, since
kexec_file_load also needs that code compiled in.
Signed-off-by: NThiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

da665885

23 11月, 2016 1 次提交

powerpc/kprobes: Invoke handlers directly · 6cc89bad

由 Naveen N. Rao 提交于 11月 21, 2016

Invoke the kprobe handlers directly rather than through notify_die(), to
reduce path taken for handling kprobes. Similar to commit 6f6343f5
("kprobes/x86: Call exception handlers directly from do_int3/do_debug").

While at it, rename post_kprobe_handler() to kprobe_post_handler() for
more uniform naming.
Reported-by: NMasami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6cc89bad

18 11月, 2016 2 次提交

powerpc: Fix second nested oops hang · 7458e8b2

由 Nicholas Piggin 提交于 11月 08, 2016

When ending an oops, don't clear die_owner unless the nest count
went to zero. This prevents a second nested oops from hanging forever
on the die_lock.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7458e8b2

powerpc: Fix graceful debugger recovery · 6f44b20e

由 Nicholas Piggin 提交于 11月 08, 2016

When exiting xmon with 'x' (exit and recover), oops_begin bails
out immediately, but die then calls __die() and oops_end(), which
cause a lot of bad things to happen.

If the debugger was attached then went to graceful recovery, exit
from die() immediately.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6f44b20e

14 11月, 2016 2 次提交

powerpc: Revert Load Monitor Register Support · 29a969b7

由 Michael Neuling 提交于 10月 31, 2016

Load monitored is no longer supported on POWER9 so let's remove the
code.

This reverts commit bd3ea317 ("powerpc: Load Monitor Register
Support").
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

29a969b7

powerpc: Add support for relative exception tables · 61a92f70

由 Nicholas Piggin 提交于 10月 14, 2016

This halves the exception table size on 64-bit builds, and it allows
build-time sorting of exception tables to work on relocated kernels.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Minor asm fixups and bits to keep the selftests working]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

61a92f70

04 10月, 2016 3 次提交

powerpc: tm: Enable transactional memory (TM) lazily for userspace · 5d176f75

由 Cyril Bur 提交于 9月 14, 2016

Currently the MSR TM bit is always set if the hardware is TM capable.
This adds extra overhead as it means the TM SPRS (TFHAR, TEXASR and
TFAIR) must be swapped for each process regardless of if they use TM.

For processes that don't use TM the TM MSR bit can be turned off
allowing the kernel to avoid the expensive swap of the TM registers.

A TM unavailable exception will occur if a thread does use TM and the
kernel will enable MSR_TM and leave it so for some time afterwards.
Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

5d176f75

powerpc/tm: Add TM Unavailable Exception · 172f7aaa

由 Cyril Bur 提交于 9月 14, 2016

If the kernel disables transactional memory (TM) and userspace still
tries TM related actions (TM instructions or TM SPR accesses) TM aware
hardware will cause the kernel to take a facility unavailable
exception.

Add checks for the exception being caused by illegal TM access in
userspace.
Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
[mpe: Rewrite comment entirely, bugs in it are mine]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

172f7aaa

powerpc: tm: Always use fp_state and vr_state to store live registers · dc310669

由 Cyril Bur 提交于 9月 23, 2016

There is currently an inconsistency as to how the entire CPU register
state is saved and restored when a thread uses transactional memory
(TM).

Using transactional memory results in the CPU having duplicated
(almost) all of its register state. This duplication results in a set
of registers which can be considered 'live', those being currently
modified by the instructions being executed and another set that is
frozen at a point in time.

On context switch, both sets of state have to be saved and (later)
restored. These two states are often called a variety of different
things. Common terms for the state which only exists after the CPU has
entered a transaction (performed a TBEGIN instruction) in hardware are
'transactional' or 'speculative'.

Between a TBEGIN and a TEND or TABORT (or an event that causes the
hardware to abort), regardless of the use of TSUSPEND the
transactional state can be referred to as the live state.

The second state is often to referred to as the 'checkpointed' state
and is a duplication of the live state when the TBEGIN instruction is
executed. This state is kept in the hardware and will be rolled back
to on transaction failure.

Currently all the registers stored in pt_regs are ALWAYS the live
registers, that is, when a thread has transactional registers their
values are stored in pt_regs and the checkpointed state is in
ckpt_regs. A strange opposite is true for fp_state/vr_state. When a
thread is non transactional fp_state/vr_state holds the live
registers. When a thread has initiated a transaction fp_state/vr_state
holds the checkpointed state and transact_fp/transact_vr become the
structure which holds the live state (at this point it is a
transactional state).

This method creates confusion as to where the live state is, in some
circumstances it requires extra work to determine where to put the
live state and prevents the use of common functions designed (probably
before TM) to save the live state.

With this patch pt_regs, fp_state and vr_state all represent the
same thing and the other structures [pending rename] are for
checkpointed state.
Acked-by: NSimon Guo <wei.guo.simon@gmail.com>
Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

dc310669

25 9月, 2016 3 次提交

powerpc/8xx: add dedicated machine check handler · e627f8dc

由 Christophe Leroy 提交于 9月 16, 2016

During a machine check, the 8xx provides indication of
whether the check is due to data or instruction access, so
let's display it.

Lets also move 8xx specific handling into the new handler.
Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: NScott Wood <oss@buserror.net>

e627f8dc

powerpc/8xx: add system_reset_exception · f307939f

由 Christophe Leroy 提交于 9月 05, 2016

When the watchdog is in NMI mode, the system reset interrupt is
generated when the watchdog counter expires.
Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: NScott Wood <oss@buserror.net>

f307939f

C
powerpc32: Use instruction symbolic names in check_io_access() · ddc6cd0d
由 Christophe Leroy 提交于 5月 17, 2016
```
Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: NScott Wood <oss@buserror.net>
```
ddc6cd0d

19 9月, 2016 1 次提交

powerpc: Use kprobe blacklist for exception handlers · 03465f89

由 Nicholas Piggin 提交于 9月 16, 2016

Currently we mark the C implementations of some exception handlers as
__kprobes. This has the effect of putting them in the ".kprobes.text"
section, which separates them from the rest of the text.

Instead we can use the blacklist macros to add the symbols to a
blacklist which kprobes will check. This allows the linker to move
exception handler functions close to callers and avoids trampolines in
larger kernels.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
[mpe: Reword change log a bit]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

03465f89

13 9月, 2016 1 次提交

powerpc/mm: Preserve CFAR value on SLB miss caused by access to bogus address · f0f558b1

由 Paul Mackerras 提交于 9月 02, 2016

Currently, if userspace or the kernel accesses a completely bogus address,
for example with any of bits 46-59 set, we first take an SLB miss interrupt,
install a corresponding SLB entry with VSID 0, retry the instruction, then
take a DSI/ISI interrupt because there is no HPT entry mapping the address.
However, by the time of the second interrupt, the Come-From Address Register
(CFAR) has been overwritten by the rfid instruction at the end of the SLB
miss interrupt handler. Since bogus accesses can often be caused by a
function return after the stack has been overwritten, the CFAR value would
be very useful as it could indicate which function it was whose return had
led to the bogus address.

This patch adds code to create a full exception frame in the SLB miss handler
in the case of a bogus address, rather than inserting an SLB entry with a
zero VSID field. Then we call a new slb_miss_bad_addr() function in C code,
which delivers a signal for a user access or creates an oops for a kernel
access. In the latter case the oops message will show the CFAR value at the
time of the access.

In the case of the radix MMU, a segment miss interrupt indicates an access
outside the ranges mapped by the page tables. Previously this was handled
by the code for an unrecoverable SLB miss (one with MSR[RI] = 0), which is
not really correct. With this patch, we now handle these interrupts with
slb_miss_bad_addr(), which is much more consistent.
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

f0f558b1

22 8月, 2016 1 次提交

powerpc: migrate exception table users off module.h and onto extable.h · 8a39b05f

由 Paul Gortmaker 提交于 8月 16, 2016

These files were only including module.h for exception table
related functions.  We've now separated that content out into its
own file "extable.h" so now move over to that and avoid all the
extra header content in module.h that we don't really need to compile
these files.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

8a39b05f

21 6月, 2016 2 次提交

powerpc: Load Monitor Register Support · bd3ea317

由 Jack Miller 提交于 6月 09, 2016

This enables new registers, LMRR and LMSER, that can trigger an EBB in
userspace code when a monitored load (via the new ldmx instruction)
loads memory from a monitored space. This facility is controlled by a
new FSCR bit, LM.

This patch disables the FSCR LM control bit on task init and enables
that bit when a load monitor facility unavailable exception is taken
for using it. On context switch, this bit is then used to determine
whether the two relevant registers are saved and restored. This is
done lazily for performance reasons.
Signed-off-by: NJack Miller <jack@codezen.org>
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

bd3ea317

powerpc: Improve FSCR init and context switching · b57bd2de

由 Michael Neuling 提交于 6月 09, 2016

This fixes a few issues with FSCR init and switching.

In commit 152d523e ("powerpc: Create context switch helpers
save_sprs() and restore_sprs()") we moved the setting of the FSCR
register from inside an CPU_FTR_ARCH_207S section to inside just a
CPU_FTR_ARCH_DSCR section. Hence we are setting FSCR on POWER6/7 where
the FSCR doesn't exist. This is harmless but we shouldn't do it.

Also, we can simplify the FSCR context switch. We don't need to go
through the calculation involving dscr_inherit. We can just restore
what we saved last time.

We also set an initial value in INIT_THREAD, so that pid 1 which is
cloned from that gets a sane value.

Based on patch by Jack Miller.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b57bd2de

20 6月, 2016 1 次提交

KVM: PPC: Book3S HV: Fix TB corruption in guest exit path on HMI interrupt · fd7bacbc

由 Mahesh Salgaonkar 提交于 5月 15, 2016

When a guest is assigned to a core it converts the host Timebase (TB)
into guest TB by adding guest timebase offset before entering into
guest. During guest exit it restores the guest TB to host TB. This means
under certain conditions (Guest migration) host TB and guest TB can differ.

When we get an HMI for TB related issues the opal HMI handler would
try fixing errors and restore the correct host TB value. With no guest
running, we don't have any issues. But with guest running on the core
we run into TB corruption issues.

If we get an HMI while in the guest, the current HMI handler invokes opal
hmi handler before forcing guest to exit. The guest exit path subtracts
the guest TB offset from the current TB value which may have already
been restored with host value by opal hmi handler. This leads to incorrect
host and guest TB values.

With split-core, things become more complex. With split-core, TB also gets
split and each subcore gets its own TB register. When a hmi handler fixes
a TB error and restores the TB value, it affects all the TB values of
sibling subcores on the same core. On TB errors all the thread in the core
gets HMI. With existing code, the individual threads call opal hmi handle
independently which can easily throw TB out of sync if we have guest
running on subcores. Hence we will need to co-ordinate with all the
threads before making opal hmi handler call followed by TB resync.

This patch introduces a sibling subcore state structure (shared by all
threads in the core) in paca which holds information about whether sibling
subcores are in Guest mode or host mode. An array in_guest[] of size
MAX_SUBCORE_PER_CORE=4 is used to maintain the state of each subcore.
The subcore id is used as index into in_guest[] array. Only primary
thread entering/exiting the guest is responsible to set/unset its
designated array element.

On TB error, we get HMI interrupt on every thread on the core. Upon HMI,
this patch will now force guest to vacate the core/subcore. Primary
thread from each subcore will then turn off its respective bit
from the above bitmap during the guest exit path just after the
guest->host partition switch is complete.

All other threads that have just exited the guest OR were already in host
will wait until all other subcores clears their respective bit.
Once all the subcores turn off their respective bit, all threads will
will make call to opal hmi handler.

It is not necessary that opal hmi handler would resync the TB value for
every HMI interrupts. It would do so only for the HMI caused due to
TB errors. For rest, it would not touch TB value. Hence to make things
simpler, primary thread would call TB resync explicitly once for each
core immediately after opal hmi handler instead of subtracting guest
offset from TB. TB resync call will restore the TB with host value.
Thus we can be sure about the TB state.

One of the primary threads exiting the guest will take up the
responsibility of calling TB resync. It will use one of the top bits
(bit 63) from subcore state flags bitmap to make the decision. The first
primary thread (among the subcores) that is able to set the bit will
have to call the TB resync. Rest all other threads will wait until TB
resync is complete. Once TB resync is complete all threads will then
proceed.
Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

fd7bacbc

16 6月, 2016 1 次提交

powerpc: Introduce asm-prototypes.h · 42f5b4ca

由 Daniel Axtens 提交于 5月 18, 2016

Sparse picked up a number of functions that are implemented in C and
then only referred to in asm code.

This introduces asm-prototypes.h, which provides a place for
prototypes of these functions.

This silences some sparse warnings.
Signed-off-by: NDaniel Axtens <dja@axtens.net>
[mpe: Add include guards, clean up copyright & GPL text]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

42f5b4ca

openanolis / cloud-kernel 大约 2 年 前同步成功

openanolis / cloud-kernel
大约 2 年前同步成功