Commits · 9600f261acaaabd476d7833cec2dd20f2919f1a0 · openeuler / Kernel

01 Apr, 2020 14 commits

powerpc/64s/exception: Move KVM test to common code · 9600f261

Authored by Nicholas Piggin on Feb 26, 2020

This allows more code to be moved out of unrelocated regions. The
system call KVMTEST is changed to be open-coded and remain in the
tramp area to avoid having to move it to entry_64.S. The custom nature
of the system call entry code means the hcall case can be made more
streamlined than regular interrupt handlers.

mpe: Incorporate fix from Nick:

Moving KVM test to the common entry code missed the case of HMI and
MCE, which do not do __GEN_COMMON_ENTRY (because they don't want to
switch to virt mode).

This means a MCE or HMI exception that is taken while KVM is running a
guest context will not be switched out of that context, and KVM won't
be notified. Found by running sigfuz in guest with patched host on
POWER9 DD2.3, which causes some TM related HMI interrupts (which are
expected and supposed to be handled by KVM).

This fix adds a __GEN_REALMODE_COMMON_ENTRY for those handlers to add
the KVM test. This makes them look a little more like other handlers
that all use __GEN_COMMON_ENTRY.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-13-npiggin@gmail.com

9600f261

powerpc/64s/exception: Move soft-mask test to common code · 0eddf327

Authored by Nicholas Piggin on Feb 26, 2020

As well as moving code out of the unrelocated vectors, this allows the
masked handlers to be moved to common code, and allows the soft_nmi
handler to be generated more like a regular handler.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-12-npiggin@gmail.com

0eddf327

powerpc/64s/exception: Move real to virt switch into the common handler · 8729c26e

Authored by Nicholas Piggin on Feb 26, 2020

The real mode interrupt entry points currently use rfid to branch to
the common handler in virtual mode. This is a significant amount of
code, and forces other code (notably the KVM test) to live in the
real mode handler.

In the interest of minimising the amount of code that runs unrelocated
move the switch to virt mode into the common code, and do it with
mtmsrd, which avoids clobbering SRRs (although the post-KVMTEST
performance of real-mode interrupt handlers is not a big concern these
days).

This requires CTR to always be saved (real-mode needs to reach 0xc...)
but that's not a huge impact these days. It could be optimized away in
future.

mpe: Incorporate fix from Nick:

It's possible for interrupts to be replayed when TM is enabled and
suspended, for example rt_sigreturn, where the mtmsrd MSR_KERNEL in
the real-mode entry point to the common handler causes a TM Bad Thing
exception (due to attempting to clear suspended).

The fix for this is to have replay interrupts go to the _virt entry
point and skip the mtmsrd, which matches what happens before this
patch.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-11-npiggin@gmail.com

8729c26e

powerpc/64s/exception: Add ISIDE option · a3cd35be

Authored by Nicholas Piggin on Feb 26, 2020

Rather than using DAR=2 to select the i-side registers, add an
explicit option.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-10-npiggin@gmail.com

a3cd35be

powerpc/64s/exception: Remove old INT_KVM_HANDLER · b177ae2f

Authored by Nicholas Piggin on Feb 26, 2020

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-9-npiggin@gmail.com

b177ae2f

powerpc/64s/exception: Remove old INT_COMMON macro · 6d71759a

Authored by Nicholas Piggin on Feb 26, 2020

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-8-npiggin@gmail.com

6d71759a

powerpc/64s/exception: Remove old INT_ENTRY macro · fc589ee4

Authored by Nicholas Piggin on Feb 26, 2020

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-7-npiggin@gmail.com

fc589ee4

powerpc/64s/exception: Move all interrupt handlers to new style code gen macros · 4f50541f

Authored by Nicholas Piggin on Feb 26, 2020

Aside from label names and BUG line numbers, the generated code change
is an additional HMI KVM handler added for the "late" KVM handler,
because early and late HMI generation is achieved by defining two
different interrupt types.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-6-npiggin@gmail.com

4f50541f

powerpc/64s/exception: Expand EXC_COMMON and EXC_COMMON_ASYNC macros · eb204d86

Authored by Nicholas Piggin on Feb 26, 2020

These don't provide a large amount of code sharing. Removing them
makes code easier to shuffle around. For example, some of the common
instructions will be moved into the common code gen macro.

No generated code change.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-5-npiggin@gmail.com

eb204d86

powerpc/64s/exception: Add GEN_KVM macro that uses INT_DEFINE parameters · d52fd3d3

Authored by Nicholas Piggin on Feb 26, 2020

No generated code change.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-4-npiggin@gmail.com

d52fd3d3

powerpc/64s/exception: Add GEN_COMMON macro that uses INT_DEFINE parameters · 7cb3a1a0

Authored by Nicholas Piggin on Feb 26, 2020

No generated code change.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-3-npiggin@gmail.com

7cb3a1a0

powerpc/64s/exception: Introduce INT_DEFINE parameter block for code generation · a42a239d

Authored by Nicholas Piggin on Feb 26, 2020

The code generation macro arguments are difficult to read, and
defaults can't easily be used.

This introduces a block where parameters can be set for interrupt
handler code generation by the subsequent macros, and adds the first
generation macro for interrupt entry.

One interrupt handler is converted to the new macros to demonstrate
the change; the rest will be converted all at once.

No generated code change.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-2-npiggin@gmail.com

a42a239d

powerpc/64: mark emergency stacks valid to unwind · a2e36683

Authored by Nicholas Piggin on Mar 25, 2020

Before:

  WARNING: CPU: 0 PID: 494 at arch/powerpc/kernel/irq.c:343
  CPU: 0 PID: 494 Comm: a Tainted: G        W
  NIP:  c00000000001ed2c LR: c000000000d13190 CTR: c00000000003f910
  REGS: c0000001fffd3870 TRAP: 0700   Tainted: G        W
  MSR:  8000000000021003 <SF,ME,RI,LE>  CR: 28000488  XER: 00000000
  CFAR: c00000000001ec90 IRQMASK: 0
  GPR00: c000000000aeb12c c0000001fffd3b00 c0000000012ba300 0000000000000000
  GPR04: 0000000000000000 0000000000000000 000000010bd207c8 6b00696e74657272
  GPR08: 0000000000000000 0000000000000000 0000000000000000 efbeadde00000000
  GPR12: 0000000000000000 c0000000014a0000 0000000000000000 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR24: 0000000000000000 0000000000000000 0000000000000000 000000010bd207bc
  GPR28: 0000000000000000 c00000000148a898 0000000000000000 c0000001ffff3f50
  NIP [c00000000001ed2c] arch_local_irq_restore.part.0+0xac/0x100
  LR [c000000000d13190] _raw_spin_unlock_irqrestore+0x50/0xc0
  Call Trace:
  Instruction dump:
  60000000 7d2000a6 71298000 41820068 39200002 7d210164 4bffff9c 60000000
  60000000 7d2000a6 71298000 4c820020 <0fe00000> 4e800020 60000000 60000000

After:

  WARNING: CPU: 0 PID: 499 at arch/powerpc/kernel/irq.c:343
  CPU: 0 PID: 499 Comm: a Not tainted
  NIP:  c00000000001ed2c LR: c000000000d13210 CTR: c00000000003f980
  REGS: c0000001fffd3870 TRAP: 0700   Not tainted
  MSR:  8000000000021003 <SF,ME,RI,LE>  CR: 28000488  XER: 00000000
  CFAR: c00000000001ec90 IRQMASK: 0
  GPR00: c000000000aeb1ac c0000001fffd3b00 c0000000012ba300 0000000000000000
  GPR04: 0000000000000000 0000000000000000 00000001347607c8 6b00696e74657272
  GPR08: 0000000000000000 0000000000000000 0000000000000000 efbeadde00000000
  GPR12: 0000000000000000 c0000000014a0000 0000000000000000 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR24: 0000000000000000 0000000000000000 0000000000000000 00000001347607bc
  GPR28: 0000000000000000 c00000000148a898 0000000000000000 c0000001ffff3f50
  NIP [c00000000001ed2c] arch_local_irq_restore.part.0+0xac/0x100
  LR [c000000000d13210] _raw_spin_unlock_irqrestore+0x50/0xc0
  Call Trace:
  [c0000001fffd3b20] [c000000000aeb1ac] of_find_property+0x6c/0x90
  [c0000001fffd3b70] [c000000000aeb1f0] of_get_property+0x20/0x40
  [c0000001fffd3b90] [c000000000042cdc] rtas_token+0x3c/0x70
  [c0000001fffd3bb0] [c0000000000dc318] fwnmi_release_errinfo+0x28/0x70
  [c0000001fffd3c10] [c0000000000dcd8c] pseries_machine_check_realmode+0x1dc/0x540
  [c0000001fffd3cd0] [c00000000003fe04] machine_check_early+0x54/0x70
  [c0000001fffd3d00] [c000000000008384] machine_check_early_common+0x134/0x1f0
  --- interrupt: 200 at 0x1347607c8
      LR = 0x7fffafbd8328
  Instruction dump:
  60000000 7d2000a6 71298000 41820068 39200002 7d210164 4bffff9c 60000000
  60000000 7d2000a6 71298000 4c820020 <0fe00000> 4e800020 60000000 60000000
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200325104144.158362-1-npiggin@gmail.com

a2e36683

powerpc/64/tm: Don't let userspace set regs->trap via sigreturn · c7def7fb

Authored by Michael Ellerman on Mar 31, 2020

In restore_tm_sigcontexts() we take the trap value directly from the
user sigcontext with no checking:

	err |= __get_user(regs->trap, &sc->gp_regs[PT_TRAP]);

This means we can be in the kernel with an arbitrary regs->trap value.

Although that's not immediately problematic, there is a risk we could
trigger one of the uses of CHECK_FULL_REGS():

	#define CHECK_FULL_REGS(regs)	BUG_ON(regs->trap & 1)

It can also cause us to unnecessarily save non-volatile GPRs again in
save_nvgprs(), which shouldn't be problematic but is still wrong.

It's also possible it could trick the syscall restart machinery, which
relies on regs->trap not being == 0xc00 (see 9a81c16b ("powerpc:
fix double syscall restarts")), though I haven't been able to make
that happen.

Finally it doesn't match the behaviour of the non-TM case, in
restore_sigcontext() which zeroes regs->trap.

So change restore_tm_sigcontexts() to zero regs->trap.

This was discovered while testing Nick's upcoming rewrite of the
syscall entry path. In that series the call to save_nvgprs() prior to
signal handling (do_notify_resume()) is removed, which leaves the
low-bit of regs->trap uncleared which can then trigger the FULL_REGS()
WARNs in setup_tm_sigcontexts().
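
The hazard can be modelled in a few lines of user-space C. This is only a sketch: `struct demo_regs`, `demo_full_regs()` and the two restore helpers are hypothetical stand-ins rather than the kernel's definitions. Only the low-bit test mirrors the `CHECK_FULL_REGS()` macro quoted above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the one pt_regs field that matters here. */
struct demo_regs {
	unsigned long trap;
};

/* Mirrors the low-bit test in CHECK_FULL_REGS(): the check passes only
 * when bit 0 of trap is clear. */
static bool demo_full_regs(const struct demo_regs *regs)
{
	return (regs->trap & 1) == 0;
}

/* Old behaviour (modelled): trust whatever the user sigcontext holds. */
static void demo_restore_unchecked(struct demo_regs *regs,
				   unsigned long user_trap)
{
	regs->trap = user_trap;
}

/* New behaviour (modelled): ignore the user value and zero the field. */
static void demo_restore_zeroed(struct demo_regs *regs,
				unsigned long user_trap)
{
	(void)user_trap;
	regs->trap = 0;
}
```

With the unchecked variant, a user-supplied value with the low bit set (for example 0xc01) makes `demo_full_regs()` return false. With the zeroing variant the user value never reaches `trap`, so the check always passes.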

Fixes: 2b0a576d ("powerpc: Add new transactional memory state to the signal context")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200401023836.3286664-1-mpe@ellerman.id.au

c7def7fb

27 Mar, 2020 3 commits

powerpc/64: Avoid isync in flush_dcache_range() · 233ba546

Authored by Aneesh Kumar K.V on Mar 20, 2020

As per the ISA, an isync is only needed after an instruction cache block
invalidate. Remove it from the dcache invalidate path.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200320103242.229223-1-aneesh.kumar@linux.ibm.com

233ba546

powerpc/boot: Delete unneeded .globl _zimage_start · 968339fa

Authored by Fangrui Song on Mar 25, 2020

.globl sets the symbol binding to STB_GLOBAL while .weak sets the
binding to STB_WEAK. GNU as has let .weak override .globl since
binutils-gdb 5ca547dc2399a0a5d9f20626d4bf5547c3ccfddd (1996). Clang's
integrated assembler lets the last directive win, but it may error in the future.

Since it is a convention that only one binding directive is used, just
delete .globl.

Fixes: ee9d21b3 ("powerpc/boot: Ensure _zimage_start is a weak symbol")
Signed-off-by: Fangrui Song <maskray@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200325164257.170229-1-maskray@google.com

968339fa

powerpc/pseries: Handle UE event for memcpy_mcsafe · efbc4303

Authored by Ganesh Goudar on Mar 27, 2020

memcpy_mcsafe has been implemented for Power machines and is used
by the pmem infrastructure, so that a UE encountered during a memcpy from
pmem devices does not result in a panic; instead, an appropriate error code
is returned. The implementation expects the machine check handler to ignore
the event and set nip to continue execution from the fixup code.

Appropriate changes have already been made to the powernv machine check
handler. Make similar changes to the pseries machine check handler so that it
ignores the event and sets nip to continue execution at the fixup entry if we
hit a UE at an instruction with a fixup entry.

While we are at it, add a common function which searches for the exception
table entry and updates nip with the fixup address, so that any future common
changes that are valid for both platforms can be made in this function.
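
The "search the exception table and redirect nip" step can be sketched in user-space C as follows. Everything here is a simplified, hypothetical model: `struct demo_extable_entry`, `demo_search_extable()` and `demo_fixup_ue()` are illustrative names, and the linear search stands in for whatever lookup the kernel actually performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified exception-table entry: the address of an
 * instruction that may fault, and where to resume if it does. */
struct demo_extable_entry {
	unsigned long insn;
	unsigned long fixup;
};

/* Linear search for an entry covering addr (illustrative only). */
static const struct demo_extable_entry *
demo_search_extable(const struct demo_extable_entry *table, size_t n,
		    unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (table[i].insn == addr)
			return &table[i];
	return NULL;
}

/* Shared helper (modelled): if the faulting address has a fixup, point
 * nip at it and report that the UE can be ignored. */
static bool demo_fixup_ue(unsigned long *nip,
			  const struct demo_extable_entry *table, size_t n)
{
	const struct demo_extable_entry *e;

	e = demo_search_extable(table, n, *nip);
	if (!e)
		return false;	/* no fixup: caller handles the UE as before */
	*nip = e->fixup;
	return true;
}
```

Keeping the lookup and the nip update in one helper means both platform handlers only need to act on the boolean result.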

powernv changes are made by
commit 895e3dce ("powerpc/mce: Handle UE event for memcpy_mcsafe")
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reviewed-by: Santosh S <santosh@fossix.org>
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200326184916.31172-1-ganeshgr@linux.ibm.com

efbc4303

26 Mar, 2020 8 commits

powerpc/smp: Use IS_ENABLED() to avoid #ifdef · c72e8da0

Authored by Michael Ellerman on Mar 13, 2020

We can avoid the #ifdef by using IS_ENABLED() in the existing
condition check.
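
As a rough illustration of the pattern, the sketch below re-creates a simplified version of the preprocessor trick in user-space C. The `DEMO_`/`demo_` macros and the `CONFIG_DEMO_*` options are hypothetical, and this version only handles options defined to 1; it is not the kernel's `IS_ENABLED()` definition.

```c
/* Simplified sketch of an IS_ENABLED()-style macro. An option defined
 * to 1 expands to 1; an undefined option expands to 0. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define demo_take_second_arg(ignored, val, ...) val
#define demo_is_defined_3(arg1_or_junk) \
	demo_take_second_arg(arg1_or_junk 1, 0, 0)
#define demo_is_defined_2(val) demo_is_defined_3(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_ENABLED(option) demo_is_defined_2(option)

#define CONFIG_DEMO_FEATURE 1	/* hypothetical option, enabled */
/* CONFIG_DEMO_OTHER is deliberately left undefined. */

/* The option test sits inside an ordinary C condition, so both branches
 * are always parsed and type-checked, unlike code hidden by #ifdef. */
static int demo_pick_path(int runtime_ok)
{
	if (DEMO_IS_ENABLED(CONFIG_DEMO_FEATURE) && runtime_ok)
		return 1;
	return 0;
}
```

Because the macro expands to a constant, the compiler can still discard the dead branch when the option is off.
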
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200313112020.28235-2-mpe@ellerman.id.au

c72e8da0

powerpc/smp: Drop superfluous NULL check · 4b4d181d

Authored by Michael Ellerman on Mar 13, 2020

We don't need the NULL check of np. The result is the same because the
OF helpers cope with NULL: of_node_to_nid(NULL) == NUMA_NO_NODE (-1).
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200313112020.28235-1-mpe@ellerman.id.au

4b4d181d

powerpc/xmon: Add ASCII dump to d1,d2,d4,d8 commands. · 8ec26c25

Authored by Douglas Miller on Feb 27, 2017

The reason debuggers add an ASCII dump to other types of memory dumps
is to give the user visual reference points in the case that ASCII
strings are adjacent to other structures or elements. For example,
when examining the task_struct structure one can look for the comm[]
string and use it to locate other important elements.

ASCII strings do not have endianness; they exist in memory in the same
order regardless of CPU endianness. ASCII strings are, by definition,
human readable and so should be presented in a human readable format.

For these reasons, the supplemental ASCII dump does not re-order
the strings from memory to match the endianness of the corresponding
16, 32, or 64 bit words. That would make the ASCII dump much less
useful.
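
The difference between the two views can be sketched in user-space C. The helpers below are hypothetical and do not reproduce xmon's actual output format; they only contrast a little-endian word view with an ASCII column kept in memory order.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Format len bytes as 32-bit words the way a little-endian CPU would
 * read them. Returns chars written, or 0 if out is too small. */
static size_t demo_hex_words_le32(const unsigned char *mem, size_t len,
				  char *out, size_t outsz)
{
	size_t n = 0;

	for (size_t i = 0; i + 4 <= len; i += 4) {
		unsigned long w = (unsigned long)mem[i] |
				  ((unsigned long)mem[i + 1] << 8) |
				  ((unsigned long)mem[i + 2] << 16) |
				  ((unsigned long)mem[i + 3] << 24);
		int r = snprintf(out + n, outsz - n, "%08lx ", w);

		if (r < 0 || (size_t)r >= outsz - n)
			return 0;
		n += (size_t)r;
	}
	return n;
}

/* Write an ASCII column for len bytes in memory order, with '.' for
 * non-printable bytes. Returns chars written, or 0 if out is too small. */
static size_t demo_ascii_column(const unsigned char *mem, size_t len,
				char *out, size_t outsz)
{
	size_t n = 0;

	if (outsz < len + 3)
		return 0;
	out[n++] = '|';
	for (size_t i = 0; i < len; i++)
		out[n++] = isprint(mem[i]) ? (char)mem[i] : '.';
	out[n++] = '|';
	out[n] = '\0';
	return n;
}
```

For the eight bytes of "swapper/", the word view reads `70617773 2f726570` while the ASCII column reads `|swapper/|`, which is why re-ordering the ASCII to match the words would make it harder to read.
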
Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1488205694-13337-1-git-send-email-dougmill@linux.vnet.ibm.com

8ec26c25

powerpc/xive: Add a debugfs file to dump internal XIVE state · 930914b7

Authored by Cédric Le Goater on Mar 06, 2020

As does XMON, the debugfs file /sys/kernel/debug/powerpc/xive exposes
the XIVE internal state of the machine CPUs and interrupts. Available
on the PowerNV and sPAPR platforms.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[mpe: Make the debugfs file 0400]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200306150143.5551-5-clg@kaod.org

930914b7

powerpc/xmon: Add source flags to output of XIVE interrupts · 5191e0ba

Authored by Cédric Le Goater on Mar 06, 2020

Some firmwares or hypervisors can advertise different source
characteristics. Track their value under XMON. What we are mostly
interested in is the StoreEOI flag.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200306150143.5551-4-clg@kaod.org

5191e0ba

powerpc/xive: Fix xmon support on the PowerNV platform · 97ef2750

Authored by Cédric Le Goater on Mar 06, 2020

The PowerNV platform has multiple IRQ chips and the xmon command
dumping the state of the XIVE interrupt should only operate on the
XIVE IRQ chip.

Fixes: 5896163f ("powerpc/xmon: Improve output of XIVE interrupts")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200306150143.5551-3-clg@kaod.org

97ef2750

powerpc/xive: Use XIVE_BAD_IRQ instead of zero to catch non configured IPIs · b1a504a6

Authored by Cédric Le Goater on Mar 06, 2020

When a CPU is brought up, an IPI number is allocated and recorded
under the XIVE CPU structure. Invalid IPI numbers are tracked with
interrupt number 0x0.

On the PowerNV platform, the interrupt number space starts at 0x10 and
this works fine. However, on the sPAPR platform, it is possible to
allocate the interrupt number 0x0 and this raises an issue when CPU 0
is unplugged. The XIVE spapr driver tracks allocated interrupt numbers
in a bitmask and it is not correctly updated when interrupt number 0x0
is freed. It stays allocated and it is then impossible to reallocate.

Fix by using the XIVE_BAD_IRQ value instead of zero on both platforms.
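
Why zero is a poor "no IPI" marker when zero is also a valid interrupt number can be shown with a toy allocator. This is a hypothetical user-space model: `DEMO_BAD_IRQ`, `struct demo_irq_pool` and the helpers are invented for illustration, and the exact failure path in the XIVE spapr driver may differ from what is modelled here.

```c
#include <stdint.h>

#define DEMO_NR_IRQS	8u
#define DEMO_BAD_IRQ	UINT32_MAX	/* hypothetical out-of-range sentinel */

/* One bit per interrupt number; a set bit means "allocated". */
struct demo_irq_pool {
	uint32_t bitmap;
};

/* Allocate the lowest free number; 0 is a perfectly valid result. */
static uint32_t demo_irq_alloc(struct demo_irq_pool *p)
{
	for (uint32_t i = 0; i < DEMO_NR_IRQS; i++) {
		if (!(p->bitmap & (1u << i))) {
			p->bitmap |= 1u << i;
			return i;
		}
	}
	return DEMO_BAD_IRQ;
}

/* Buggy release: treats 0 as "nothing allocated", so number 0 is
 * never returned to the pool. */
static void demo_irq_free_zero_means_none(struct demo_irq_pool *p,
					  uint32_t irq)
{
	if (irq == 0)
		return;
	p->bitmap &= ~(1u << irq);
}

/* Fixed release: only the out-of-range sentinel means "nothing
 * allocated", so number 0 is freed like any other. */
static void demo_irq_free(struct demo_irq_pool *p, uint32_t irq)
{
	if (irq == DEMO_BAD_IRQ)
		return;
	p->bitmap &= ~(1u << irq);
}
```

With the buggy release, number 0 stays marked as allocated after it is freed and the next allocation skips it. With the sentinel-based release it becomes available again.
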
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Fixes: eac1e731 ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200306150143.5551-2-clg@kaod.org

b1a504a6

powerpc: Prefer __section and __printf from compiler_attributes.h · a7032637

Authored by Nick Desaulniers on Aug 12, 2019

Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
[mpe: Drop changes to a/p/boot which doesn't use linux headers]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190812215052.71840-10-ndesaulniers@google.com

a7032637

25 Mar, 2020 15 commits

powerpc/prom_init: Remove leftover comment · 7074695a

Authored by Fabiano Rosas on Mar 24, 2020

The if statement that this comment referred to was removed in
commit 11fdb309 ("powerpc/prom_init: Remove support for OPAL v2").
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200324182912.1048906-1-farosas@linux.ibm.com

7074695a

powerpc/kprobes: Ignore traps that happened in real mode · 21f8b2fa

Authored by Christophe Leroy on Feb 18, 2020

When a program check exception happens while MMU translation is
disabled, the following Oops happens in kprobe_handler(), at this
line:

	} else if (*addr != BREAKPOINT_INSTRUCTION) {

  BUG: Unable to handle kernel data access on read at 0x0000e268
  Faulting instruction address: 0xc000ec34
  Oops: Kernel access of bad area, sig: 11 [#1]
  BE PAGE_SIZE=16K PREEMPT CMPC885
  Modules linked in:
  CPU: 0 PID: 429 Comm: cat Not tainted 5.6.0-rc1-s3k-dev-00824-g84195dc6c58a #3267
  NIP:  c000ec34 LR: c000ecd8 CTR: c019cab8
  REGS: ca4d3b58 TRAP: 0300   Not tainted  (5.6.0-rc1-s3k-dev-00824-g84195dc6c58a)
  MSR:  00001032 <ME,IR,DR,RI>  CR: 2a4d3c52  XER: 00000000
  DAR: 0000e268 DSISR: c0000000
  GPR00: c000b09c ca4d3c10 c66d0620 00000000 ca4d3c60 00000000 00009032 00000000
  GPR08: 00020000 00000000 c087de44 c000afe0 c66d0ad0 100d3dd6 fffffff3 00000000
  GPR16: 00000000 00000041 00000000 ca4d3d70 00000000 00000000 0000416d 00000000
  GPR24: 00000004 c53b6128 00000000 0000e268 00000000 c07c0000 c07bb6fc ca4d3c60
  NIP [c000ec34] kprobe_handler+0x128/0x290
  LR [c000ecd8] kprobe_handler+0x1cc/0x290
  Call Trace:
  [ca4d3c30] [c000b09c] program_check_exception+0xbc/0x6fc
  [ca4d3c50] [c000e43c] ret_from_except_full+0x0/0x4
  --- interrupt: 700 at 0xe268
  Instruction dump:
  913e0008 81220000 38600001 3929ffff 91220000 80010024 bb410008 7c0803a6
  38210020 4e800020 38600000 4e800020 <813b0000> 6d2a7fe0 2f8a0008 419e0154
  ---[ end trace 5b9152d4cdadd06d ]---

kprobe is not prepared to handle events in real mode and functions
running in real mode should have been blacklisted, so kprobe_handler()
can safely bail out telling 'this trap is not mine' for any trap that
happened while in real-mode.

If the trap happened with MSR_IR or MSR_DR cleared, return 0
immediately.
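
A minimal user-space sketch of that early exit is shown below. The `DEMO_MSR_*` constants and both functions are hypothetical; the bit values were chosen to agree with the decoded MSR in the Oops above (0x1032 shown as ME, IR, DR, RI) and should be treated as assumptions.

```c
#include <stdbool.h>

/* Assumed bit values, consistent with 0x1032 = ME | IR | DR | RI. */
#define DEMO_MSR_IR	0x20ul	/* instruction address translation on */
#define DEMO_MSR_DR	0x10ul	/* data address translation on */

/* True if either translation bit is clear, i.e. the trap was taken
 * with the MMU (partly) off. */
static bool demo_trap_in_real_mode(unsigned long msr)
{
	return !(msr & DEMO_MSR_IR) || !(msr & DEMO_MSR_DR);
}

/* Modelled handler entry: return 0 ("not mine") before any attempt to
 * dereference the trapping address. */
static int demo_kprobe_handler(unsigned long msr)
{
	if (demo_trap_in_real_mode(msr))
		return 0;
	return 1;	/* placeholder for the normal kprobe lookup */
}
```

Checking the saved MSR first means the handler never reads `*addr` through a virtual mapping for an address that was only valid in real mode.
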
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Fixes: 6cc89bad ("powerpc/kprobes: Invoke handlers directly")
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/424331e2006e7291a1bfe40e7f3fa58825f565e1.1582054578.git.christophe.leroy@c-s.fr

21f8b2fa

powerpc/maple: Fix declaration made after definition · af6cf95c

Authored by Nathan Chancellor on Mar 23, 2020

When building ppc64 defconfig, Clang errors (trimmed for brevity):

  arch/powerpc/platforms/maple/setup.c:365:1: error: attribute declaration
  must precede definition [-Werror,-Wignored-attributes]
  machine_device_initcall(maple, maple_cpc925_edac_setup);
  ^

machine_device_initcall expands to __define_machine_initcall, which in
turn has the macro machine_is used in it, which declares mach_##name
with an __attribute__((weak)). define_machine actually defines
mach_##name, which in this file happens before the declaration, hence
the warning.

To fix this, move define_machine after machine_device_initcall so that
the declaration occurs before the definition, which matches how
machine_device_initcall and define_machine work throughout
arch/powerpc.
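
The ordering rule can be reduced to a two-line example. This is a stand-alone sketch with a hypothetical `demo_mach` variable; it does not reproduce the `machine_is` or `define_machine` macros, and it uses a GCC/Clang attribute.

```c
/* The declaration that carries the weak attribute comes first ... */
extern int demo_mach __attribute__((weak));

/* ... and the definition follows, so the attribute is already known
 * when the compiler sees the definition. */
int demo_mach = 1;

/* Trivial reader so the symbol is actually used. */
static int demo_read_mach(void)
{
	return demo_mach;
}
```

Swapping the first two statements gives the "declaration after definition" order that the diagnostic quoted above complains about.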

While we're here, remove some spaces before tabs.

Fixes: 8f101a05 ("edac: cpc925 MC platform device setup")
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200323222729.15365-1-natechancellor@gmail.com

af6cf95c

powerpc/pseries: Avoid harmless preempt warning · adde8715

Authored by Nicholas Piggin on Mar 21, 2020

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200320152436.1468651-1-npiggin@gmail.com

adde8715

powerpc/eeh: Rework eeh_ops->probe() · e86350f7

Authored by Oliver O'Halloran on Mar 06, 2020

With the EEH early probe now being pseries specific there's no need for
eeh_ops->probe() to take a pci_dn. Instead, we can make it take a pci_dev
and use the probe function to map a pci_dev to an eeh_dev. This allows
the platform to implement its own method for finding (or creating) an
eeh_dev for a given pci_dev, which also removes a use of pci_dn in
generic EEH code.

This patch also renames eeh_device_add_late() to eeh_device_probe(). This
better reflects what it does and removes the last vestiges of the
early/late EEH probe split.
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200306073904.4737-6-oohall@gmail.com

e86350f7

powerpc/eeh: Make early EEH init pseries specific · b6eebb09

Authored by Oliver O'Halloran on Mar 06, 2020

The eeh_ops->probe() function is called from two different contexts:

1. On pseries, where we set EEH_PROBE_MODE_DEVTREE, it's called in
   eeh_add_device_early() which is supposed to run before we create
   a pci_dev.

2. On PowerNV, where we set EEH_PROBE_MODE_DEV, it's called in
   eeh_device_add_late() which is supposed to run *after* the
   pci_dev is created.

The "early" probe is required because PAPR requires that we perform an RTAS
call to enable EEH support on a device before we start interacting with it
via config space or MMIO. This requirement doesn't exist on PowerNV and
shoehorning two completely separate initialisation paths into a common
interface just results in convoluted code everywhere.

Additionally the early probe requires the probe function to take a pci_dn
rather than a pci_dev argument. We'd like to make pci_dn a pseries specific
data structure since there's no real requirement for them on PowerNV. To
help both goals move the early probe into the pseries containment zone
so the platform dependence is more explicit.
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200306073904.4737-5-oohall@gmail.com

b6eebb09

powerpc/eeh: Remove PHB check in probe · 3ff32efb

Authored by Oliver O'Halloran on Mar 06, 2020

This check for a missing PHB has existed in various forms since the
initial PPC64 port was upstreamed in 2002. The idea seems to be that we
need to guard against creating pci-specific data structures for the non-pci
children of a PCI device tree node (e.g. USB devices). However, we only
create pci_dn structures for DT nodes that correspond to PCI devices so
there's not much point in doing this check in the eeh_probe path.
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200306073904.4737-4-oohall@gmail.com

3ff32efb

powerpc/eeh: Do early EEH init only when required · a4b4f61d

Authored by Oliver O'Halloran on Mar 06, 2020

The pci hotplug helper (pci_hp_add_devices()) calls
eeh_add_device_tree_early() to scan the device-tree for new PCI devices and
do the early EEH probe before the device is scanned. This early probe is a
no-op in a lot of cases because:

a) The early init is only required to satisfy a PAPR requirement that EEH
   be configured before we start doing config accesses. On PowerNV it is
   a no-op.

b) It's a no-op for devices that have already had their eeh_dev
   initialised.

There are four callers of pci_hp_add_devices():

1. arch/powerpc/kernel/eeh_driver.c
	Here the hotplug helper is called when re-scanning pci_devs that
	were removed during an EEH recovery pass. The EEH state for each
	removed device (the eeh_dev) is retained across a recovery pass
	so the early init is a no-op in this case.

2. drivers/pci/hotplug/pnv_php.c
	This is also a no-op since the PowerNV hotplug driver is, surprisingly,
	PowerNV specific.

3. drivers/pci/hotplug/rpaphp_core.c
4. drivers/pci/hotplug/rpaphp_pci.c
	In these two cases new devices have been hotplugged and FW has
	provided new DT nodes for each. These are the only two cases where
	we might have new PCI device nodes in the DT, so these are
	the only two cases where the early EEH probe needs to be done.

We can move the calls to eeh_add_device_tree_early() to the locations where
it's needed and remove it from the generic path. This is preparation for
making the early EEH probe pseries specific.
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200306073904.4737-3-oohall@gmail.com

a4b4f61d

powerpc/eeh: Remove eeh_add_device_tree_late() · 2d0953f7

Authored by Oliver O'Halloran on Mar 06, 2020

On pseries and PowerNV pcibios_bus_add_device() calls eeh_add_device_late(),
so there's no need to do a separate tree traversal to bind the eeh_dev and
pci_dev together when setting up the PHB at boot. As a result we can remove
eeh_add_device_tree_late().
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200306073904.4737-2-oohall@gmail.com

2d0953f7

powerpc/eeh: Add sysfs files in late probe · 8645aaa8

Authored by Oliver O'Halloran on Mar 06, 2020

Move creating the EEH specific sysfs files into eeh_add_device_late()
rather than being open-coded all over the place. Calling the function is
generally done immediately after calling eeh_add_device_late() anyway. This
is also a correctness fix since currently the sysfs files will be added
even if the EEH probe happens to fail.

Similarly, on pseries we currently add the sysfs files before calling
eeh_add_device_late(). This is flat-out broken since the sysfs files
require the pci_dev->dev.archdata.edev pointer to be set, and that is done
in eeh_add_device_late().
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200306073904.4737-1-oohall@gmail.com

8645aaa8

powerpc/64: Prevent stack protection in early boot · 7053f80d

Authored by Michael Ellerman on Mar 20, 2020

The previous commit reduced the amount of code that is run before we
set up a paca. However there are still a few remaining functions that
run with no paca, or worse, with an arbitrary value in r13 that will
be used as a paca pointer.

In particular the stack protector canary is stored in the paca, so if
stack protector is activated for any of these functions we will read
the stack canary from wherever r13 points. If r13 happens to point
outside of memory we will get a machine check / checkstop.

For example if we modify initialise_paca() to trigger stack
protection, and then boot in the mambo simulator with r13 poisoned in
skiboot before calling the kernel:

DEBUG: 19952232: (19952232): INSTRUCTION: PC=0xC0000000191FC1E8: [0x3C4C006D]: addis r2,r12,0x6D [fetch]
DEBUG: 19952236: (19952236): INSTRUCTION: PC=0xC00000001807EAD8: [0x7D8802A6]: mflr r12 [fetch]
FATAL ERROR: 19952276: (19952276): Check Stop for 0:0: Machine Check with ME bit of MSR off
DEBUG: 19952276: (19952276): INSTRUCTION: PC=0xC0000000191FCA7C: [0xE90D0CF8]: ld r8,0xCF8(r13) [Instruction Failed]
INFO: 19952276: (19952277): ** Execution stopped: Mambo Error, Machine Check Stop, **
systemsim % bt
pc: 0xC0000000191FCA7C initialise_paca+0x54
lr: 0xC0000000191FC22C early_setup+0x44
stack:0x00000000198CBED0 0x0 +0x0
stack:0x00000000198CBF00 0xC0000000191FC22C early_setup+0x44
stack:0x00000000198CBF90 0x1801C968 +0x1801C968

So annotate the relevant functions to ensure stack protection is never
enabled for them.
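
As a rough illustration of what such an annotation can look like in plain C (this is not the kernel's actual macro): the macro name NO_STACK_PROTECTOR and the function early_init_sketch() below are hypothetical, and the sketch assumes a GCC or Clang toolchain. Compilers that lack the no_stack_protector function attribute simply get no annotation here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro for illustration; the kernel has its own definition. */
#if defined(__has_attribute)
# if __has_attribute(no_stack_protector)
#  define NO_STACK_PROTECTOR __attribute__((no_stack_protector))
# endif
#endif
#ifndef NO_STACK_PROTECTOR
# define NO_STACK_PROTECTOR /* attribute unavailable: no annotation */
#endif

/*
 * A function with a local array is a typical candidate for a canary
 * check when stack protection is enabled. The annotation asks the
 * compiler not to emit that check for this function, so it never has
 * to load a canary from a location that is not yet valid.
 */
NO_STACK_PROTECTOR
int early_init_sketch(const unsigned char *src, size_t len)
{
    unsigned char buf[16];
    int sum = 0;
    size_t i;

    if (len > sizeof(buf))
        len = sizeof(buf);
    for (i = 0; i < len; i++) {
        buf[i] = src[i];
        sum += buf[i];
    }
    return sum;
}

/* Small self-check: the annotation must not change the result. */
void early_init_sketch_check(void)
{
    const unsigned char data[3] = { 1, 2, 3 };

    assert(early_init_sketch(data, 3) == 6);
}
```

The annotation only changes whether a canary check is emitted; the function's behaviour is otherwise the same.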

Fixes: 06ec27ae ("powerpc/64: add stack protector support")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200320032116.1024773-2-mpe@ellerman.id.au

7053f80d

powerpc/64: Setup a paca before parsing device tree etc. · d4a8e986

Committed by Daniel Axtens on Mar 20, 2020

Currently we set up the paca after parsing the device tree for CPU
features. Prior to that, r13 contains random data, which means there
is random data in r13 while we're running the generic dt parsing code.

This random data varies depending on whether we boot through a vmlinux
or a zImage: for the vmlinux case it's usually around zero, but for
zImages we see random values like 912a72603d420015.

This is poor practice, and can also lead to difficult-to-debug
crashes. For example, when kcov is enabled, the kcov instrumentation
attempts to read preempt_count out of the current task, which goes via
the paca. This then crashes in the zImage case.

Similarly stack protector can cause crashes if r13 is bogus, by
reading from the stack canary in the paca.

To resolve this:

 - move the paca setup to before the CPU feature parsing.

 - because we no longer have access to CPU feature flags in paca
 setup, change the HV feature test in the paca setup path to consider
 the actual value of the MSR rather than the CPU feature.
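
The second point can be sketched in user-space C as follows. The helper name msr_says_hv() and the macro MSR_HV_SKETCH are hypothetical; the sketch assumes the raw MSR value has already been read (in the kernel that would be done with something like mfmsr()) and only shows the bit test. MSR[HV] is taken to be bit 60 counting from the least significant bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MSR[HV]: bit 60, counting from the least significant bit. */
#define MSR_HV_SKETCH (UINT64_C(1) << 60)

/*
 * Hypothetical helper: decide whether we are running in hypervisor
 * mode from a raw MSR value, without consulting any CPU feature flags
 * (which are not yet available this early in boot).
 */
bool msr_says_hv(uint64_t msr)
{
    return (msr & MSR_HV_SKETCH) != 0;
}

/* Small self-check of the bit test. */
void msr_says_hv_check(void)
{
    assert(msr_says_hv(MSR_HV_SKETCH));
    assert(!msr_says_hv(0));
}
```

Testing the register directly removes the ordering dependency on device tree parsing, which is what allows the paca setup to move earlier.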

Translations get switched on once we leave early_setup, so I think
we'd already catch any other cases where the paca or task aren't set
up.

Boot tested on a P9 guest and host.

Fixes: fb0b0a73 ("powerpc: Enable kcov")
Fixes: 06ec27ae ("powerpc/64: add stack protector support")
Cc: stable@vger.kernel.org # v4.20+
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Daniel Axtens <dja@axtens.net>
[mpe: Reword comments & change log a bit to mention stack protector]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200320032116.1024773-1-mpe@ellerman.id.au

d4a8e986

powerpc/hash64/devmap: Use H_PAGE_THP_HUGE when setting up huge devmap PTE entries · 36b78402

Committed by Aneesh Kumar K.V on Mar 13, 2020

H_PAGE_THP_HUGE is used to differentiate between THP hugepage and
hugetlb hugepage entries. The difference is in how we handle hash
faults on these addresses. THP addresses enable MPSS in segments. We
want to manage devmap hugepage entries similarly to THP PTE entries.
Hence use H_PAGE_THP_HUGE for devmap huge PTE entries.

With the current code, while handling a hash PTE fault, we do set
is_thp = true when finding devmap huge PTE entries.

The current code also performs the sequence below when setting up huge
devmap entries.

	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
	if (pfn_t_devmap(pfn))
		entry = pmd_mkdevmap(entry);

In that case we would find both H_PAGE_THP_HUGE and _PAGE_DEVMAP set
for huge devmap PTE entries. This results in a false positive error
like the one below.

  kernel BUG at /home/kvaneesh/src/linux/mm/memory.c:4321!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in:
  CPU: 56 PID: 67996 Comm: t_mmap_dio Not tainted 5.6.0-rc4-59640-g371c804dedbc #128
  ....
  NIP [c00000000044c9e4] __follow_pte_pmd+0x264/0x900
  LR [c0000000005d45f8] dax_writeback_one+0x1a8/0x740
  Call Trace:
    str_spec.74809+0x22ffb4/0x2d116c (unreliable)
    dax_writeback_one+0x1a8/0x740
    dax_writeback_mapping_range+0x26c/0x700
    ext4_dax_writepages+0x150/0x5a0
    do_writepages+0x68/0x180
    __filemap_fdatawrite_range+0x138/0x180
    file_write_and_wait_range+0xa4/0x110
    ext4_sync_file+0x370/0x6e0
    vfs_fsync_range+0x70/0xf0
    sys_msync+0x220/0x2e0
    system_call+0x5c/0x68

This is because our pmd_trans_huge check doesn't exclude _PAGE_DEVMAP.

To make this all consistent, update pmd_mkdevmap to set
H_PAGE_THP_HUGE, and make the pmd_trans_huge check exclude
_PAGE_DEVMAP correctly.
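
The intended flag logic can be modelled in user space as below. This is a sketch under stated assumptions, not the kernel code: the SK_* bit positions are made up, the sk_-prefixed helpers are hypothetical stand-ins for pmd_mkhuge(), pmd_mkdevmap(), pmd_devmap() and pmd_trans_huge(), and a PMD is reduced to a plain 64-bit integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; the real values live in the powerpc headers. */
#define SK_PAGE_PTE      (UINT64_C(1) << 0)
#define SK_PAGE_THP_HUGE (UINT64_C(1) << 1)
#define SK_PAGE_DEVMAP   (UINT64_C(1) << 2)

typedef uint64_t sk_pmd_t;

/* Mark an entry as a huge (THP-style) leaf entry. */
sk_pmd_t sk_pmd_mkhuge(sk_pmd_t pmd)
{
    return pmd | SK_PAGE_PTE | SK_PAGE_THP_HUGE;
}

/* Devmap entries carry the THP flag as well as the devmap flag. */
sk_pmd_t sk_pmd_mkdevmap(sk_pmd_t pmd)
{
    return pmd | SK_PAGE_THP_HUGE | SK_PAGE_DEVMAP;
}

bool sk_pmd_devmap(sk_pmd_t pmd)
{
    return (pmd & SK_PAGE_DEVMAP) != 0;
}

/* True only for THP entries that are not devmap entries. */
bool sk_pmd_trans_huge(sk_pmd_t pmd)
{
    const sk_pmd_t mask = SK_PAGE_PTE | SK_PAGE_THP_HUGE | SK_PAGE_DEVMAP;

    return (pmd & mask) == (SK_PAGE_PTE | SK_PAGE_THP_HUGE);
}

/* Self-check mirroring the mkhuge + mkdevmap sequence from the message. */
void sk_pmd_check(void)
{
    sk_pmd_t thp = sk_pmd_mkhuge(0);
    sk_pmd_t dev = sk_pmd_mkdevmap(thp);

    assert(sk_pmd_trans_huge(thp));
    assert(sk_pmd_devmap(dev));
    assert(!sk_pmd_trans_huge(dev));
}
```

Comparing against a mask that includes the devmap bit is what keeps a huge devmap entry from being reported as a transparent huge page, even though both share the THP flag.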

Fixes: ebd31197 ("powerpc/mm: Add devmap support for ppc64")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200313094842.351830-1-aneesh.kumar@linux.ibm.com

36b78402

powerpc/32s: reorder Linux PTE bits to better match Hash PTE bits. · 697ece78

Committed by Christophe Leroy on Mar 10, 2020

Reorder Linux PTE bits to (almost) match Hash PTE bits.

RW Kernel : PP = 00
RO Kernel : PP = 00
RW User   : PP = 01
RO User   : PP = 11

So naturally, we should have
_PAGE_USER = 0x001
_PAGE_RW   = 0x002

Today 0x001 and 0x002 are _PAGE_PRESENT and _PAGE_HASHPTE, which
are both software-only bits.

Switch _PAGE_USER and _PAGE_PRESENT
Switch _PAGE_RW and _PAGE_HASHPTE

This allows a few instructions to be removed.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c4d6c18a7f8d9d3b899bc492f55fbc40ef38896a.1583861325.git.christophe.leroy@c-s.fr

697ece78

powerpc/kasan: Fix kasan_remap_early_shadow_ro() · af92bad6

Committed by Christophe Leroy on Mar 06, 2020

At the moment kasan_remap_early_shadow_ro() does nothing, because
k_end is 0 and, for an unsigned k_cur, k_cur < 0 is always false.

Change the test to k_cur != k_end, as done in
kasan_init_shadow_page_tables().
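
The effect of the two loop conditions can be seen in a small user-space model. This is only a sketch: the function names, the 4096-byte page size and the start address are assumptions chosen for illustration, and 32-bit unsigned arithmetic stands in for the kernel's address type. Both bounds are assumed to be page aligned, otherwise the != form would not terminate.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SIZE UINT32_C(4096)

/* Loop bounded with '<': never runs when k_end has wrapped around to 0. */
unsigned long pages_visited_lt(uint32_t k_start, uint32_t k_end)
{
    unsigned long n = 0;
    uint32_t k_cur;

    for (k_cur = k_start; k_cur < k_end; k_cur += SK_PAGE_SIZE)
        n++;
    return n;
}

/* Loop bounded with '!=': runs until k_cur itself wraps around to k_end. */
unsigned long pages_visited_ne(uint32_t k_start, uint32_t k_end)
{
    unsigned long n = 0;
    uint32_t k_cur;

    for (k_cur = k_start; k_cur != k_end; k_cur += SK_PAGE_SIZE)
        n++;
    return n;
}

/* Self-check: a range ending at the top of a 32-bit address space. */
void pages_visited_check(void)
{
    assert(pages_visited_lt(UINT32_C(0xf8000000), 0) == 0);
    assert(pages_visited_ne(UINT32_C(0xf8000000), 0) == 32768);
}
```

With a range that ends at the very top of the address space, the end address wraps to 0, so only the != form visits the pages.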
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Fixes: cbd18991 ("powerpc/mm: Fix an Oops in kasan_mmu_init()")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4e7b56865e01569058914c991143f5961b5d4719.1583507333.git.christophe.leroy@c-s.fr

af92bad6

openeuler / Kernel: synced successfully nearly 2 years ago
