- 28 Sep 2022, 32 commits
-
-
By Nicholas Piggin
A later change stops the kernel from using r2 and loads it with a poison value. Provide a PACATOC loading abstraction which can hide this detail. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926034057.2360083-5-npiggin@gmail.com
-
By Nicholas Piggin
There is no need to use GOT addressing within the kernel. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926034057.2360083-4-npiggin@gmail.com
-
By Nicholas Piggin
Use helper macros to access global variables, and place them in .data sections rather than in .toc. Putting addresses in the TOC is not required because the kernel is linked with a single TOC. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926034057.2360083-3-npiggin@gmail.com
-
By Nicholas Piggin
Using a 32-bit constant for this marker allows it to be loaded with two ALU instructions, as on 32-bit. This avoids a TOC entry and a TOC load that depends on the r2 value that has just been loaded from the PACA. This changes the value for 32-bit as well, so both have the same value in the low 4 bytes and 64-bit has 0 in the top bytes. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926034057.2360083-2-npiggin@gmail.com
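As a rough illustration (the marker value is the kernel's ASCII "regs" stack-frame constant; the assembly in the comment is paraphrased, not the exact patch):

```c
#define STACK_FRAME_REGS_MARKER	0x72656773	/* ASCII "regs" */

/* The marker can now be materialised with two ALU instructions:
 *	lis	r10, 0x7265		# upper 16 bits
 *	ori	r10, r10, 0x6773	# lower 16 bits
 * instead of a TOC load (ld r10, marker@toc(r2)) that must wait on
 * the r2 value just loaded from the PACA.
 */
```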
-
By Nicholas Piggin
This adds a basic POWER10_CPU option, which builds with -mcpu=power10. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220923033004.536127-1-npiggin@gmail.com
-
By Haren Myneni
When a migration is initiated, the hypervisor changes VAS mappings as part of the pre-migration event. Then the OS gets the migration event, which closes all VAS windows before the migration starts. NX generates continuous faults until the windows are closed, and user space cannot tell that these NX faults are coming from the migration. So, to reduce this time window, close VAS windows first in pseries_migrate_partition(). Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d8efade91dda831c9ed4abb226dab627da594c5f.camel@linux.ibm.com
-
By Nicholas Piggin
irq replay is quite complicated because of softirq processing, which itself enables and disables irqs. Several considerations need to be accounted for due to this, and they are not clearly documented. Refactor the irq replay code a bit to tidy and deduplicate some common functions. Add comments and debug checks. This has a minor functional change: irq tracing enable/disable is done after each replayed interrupt, rather than after a batch. It also resets state to IRQS_ALL_DISABLED after each interrupt, which doesn't matter much because interrupts are hard disabled at this point, but it is more consistent with how interrupt handlers are called. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926054305.2671436-8-npiggin@gmail.com
-
By Nicholas Piggin
BUG/WARN are handled with a program interrupt, which can turn into infinite recursion when there are bugs in interrupt handler entry (which can be aggravated by bugs in other parts of the code). There is one feeble attempt to avoid this recursion, but it misses several cases. Make a tidier macro for this and switch most bugs in the interrupt entry wrapper over to use it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926054305.2671436-7-npiggin@gmail.com
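A sketch of such a recursion-safe check (the macro name follows this series; the body is paraphrased, not the exact kernel code):

```c
/* Only BUG when we are not already handling a kernel-mode program check,
 * since BUG itself raises a program interrupt and could otherwise recurse
 * forever through the interrupt entry wrapper. */
#define INT_SOFT_MASK_BUG_ON(regs, cond)				\
do {									\
	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG) &&		\
	    (user_mode(regs) || TRAP(regs) != INTERRUPT_PROGRAM))	\
		BUG_ON(cond);						\
} while (0)
```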
-
By Nicholas Piggin
Prior changes eliminated cases of masked PACA_IRQ_MUST_HARD_MASK interrupts that re-fire due to MSR[EE] being enabled while they are pending. Add a debug check in the masked interrupt handler to catch if this occurs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926054305.2671436-6-npiggin@gmail.com
-
By Nicholas Piggin
When irqs are soft-disabled, MSR[EE] is volatile and can change from 1 to 0 asynchronously (if a PACA_IRQ_MUST_HARD_MASK interrupt hits). So it cannot be used to check hard IRQ enabled status, except to confirm that IRQs are disabled. The ppc64_runlatch_on/off functions use the MSR this way to decide whether to re-enable MSR[EE] after disabling it, which leads to MSR[EE] being enabled when it shouldn't be (when a PACA_IRQ_MUST_HARD_MASK interrupt disabled it between reading the MSR and clearing EE). This has been tolerated in the kernel previously, and it doesn't seem to cause a problem, but it is unexpected and may trip warnings or cause other problems as we tighten up this state management. Fix this by only re-enabling if PACA_IRQ_HARD_DIS is clear. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926054305.2671436-5-npiggin@gmail.com
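A paraphrased sketch of the fixed runlatch pattern (the flags and helpers named here are real kernel identifiers; the surrounding macro is simplified):

```c
unsigned long msr = mfmsr();

__hard_irq_disable();		/* clear EE while updating the SPR */
__ppc64_runlatch_off();
/* Only restore EE if it was on *and* no masked PACA_IRQ_MUST_HARD_MASK
 * interrupt hard-disabled it between the mfmsr() and here. */
if ((msr & MSR_EE) &&
    !(local_paca->irq_happened & PACA_IRQ_HARD_DIS))
	__hard_irq_enable();
```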
-
By Nicholas Piggin
If a synchronous interrupt (e.g., hash fault) is taken inside an irqs-disabled region which has MSR[EE]=1, and an asynchronous interrupt that is PACA_IRQ_MUST_HARD_MASK (e.g., PMI) is then taken inside the synchronous interrupt handler, the synchronous interrupt will return with MSR[EE]=1 and the asynchronous interrupt fires again. If the asynchronous interrupt is a PMI and the original context does not have PMIs disabled (only Linux IRQs), the asynchronous interrupt will fire despite having the PMI marked soft pending. This can confuse the perf code and cause warnings. This patch changes the interrupt return so that irqs-disabled MSR[EE]=1 contexts will be returned to with MSR[EE]=0 if a PACA_IRQ_MUST_HARD_MASK interrupt has become pending in the meantime. The longer explanation of what happens:

1. local_irq_disable()
2. Hash fault interrupt fires, do_hash_fault handler runs
3. interrupt_enter_prepare() sets IRQS_ALL_DISABLED
4. interrupt_enter_prepare() sets MSR[EE]=1
5. PMU interrupt fires, masked handler runs
6. Masked handler marks PMI pending
7. Masked handler returns with PACA_IRQ_HARD_DIS set, MSR[EE]=0
8. do_hash_fault interrupt return handler runs
9. interrupt_exit_kernel_prepare() clears PACA_IRQ_HARD_DIS
10. interrupt returns with MSR[EE]=1
11. PMU interrupt fires, perf handler runs

Fixes: 4423eb5a ("powerpc/64/interrupt: make normal synchronous interrupts enable MSR[EE] if possible") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926054305.2671436-4-npiggin@gmail.com
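A paraphrased sketch of the exit-time decision this adds (the paca fields and flags are real kernel identifiers; the surrounding logic is simplified):

```c
/* On return from a synchronous interrupt to an irqs-disabled context:
 * if a must-hard-mask interrupt became pending while EE was on, return
 * with EE clear and let the soft-irq replay path run it later. */
if (local_paca->irq_soft_mask != IRQS_ENABLED &&
    (local_paca->irq_happened & PACA_IRQ_MUST_HARD_MASK))
	regs->msr &= ~MSR_EE;
```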
-
By Nicholas Piggin
This prevents interrupts in early boot (e.g., program check) from enabling MSR[EE], potentially causing endian mismatch or other crashes when reporting early boot traps. Fixes: 4423eb5a ("powerpc/64/interrupt: make normal synchronous interrupts enable MSR[EE] if possible") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926054305.2671436-3-npiggin@gmail.com
-
By Nicholas Piggin
Commit 17147677 ("context_tracking: Convert state to atomic_t") added a CONTEXT_IDLE state which can be encountered by interrupts from kernel mode in the idle thread, causing a false positive warning. Fixes: 17147677 ("context_tracking: Convert state to atomic_t") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926054305.2671436-2-npiggin@gmail.com
-
By Nicholas Miehlbradt
KFENCE support was added for ppc32 in commit 90cbac0e ("powerpc: Enable KFENCE for PPC32"). Enable KFENCE on the ppc64 architecture with hash and radix MMUs. It uses the same mechanism as debug pagealloc to protect/unprotect pages. All KFENCE kunit tests pass on both MMUs. KFENCE memory is initially allocated using memblock but is later marked as SLAB allocated. This necessitates the change to __pud_free to ensure that the KFENCE pages are freed appropriately. Based on previous work by Christophe Leroy and Jordan Niethe. Signed-off-by: Nicholas Miehlbradt <nicholas@linux.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926075726.2846-4-nicholas@linux.ibm.com
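The __pud_free consideration, as a hedged sketch (PGT_CACHE, free_reserved_page and friends are real identifiers; the body paraphrases the described change rather than reproducing the diff):

```c
/* Free a PUD table page correctly whether it came from memblock
 * (early boot / KFENCE pool, PageReserved) or from the PGT slab cache. */
static inline void __pud_free(pud_t *pud)
{
	struct page *page = virt_to_page(pud);

	if (PageReserved(page))
		free_reserved_page(page);	/* memblock-allocated */
	else
		kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), pud);
}
```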
-
By Christophe Leroy
If the page is already mapped (when mapping), or already unmapped (when unmapping), bail out. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Nicholas Miehlbradt <nicholas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926075726.2846-3-nicholas@linux.ibm.com
-
By Christophe Leroy
debug_pagealloc_enabled() is always defined and constant folds to 'false' when CONFIG_DEBUG_PAGEALLOC is not enabled. Remove the #ifdefs; the code and associated static variables will be optimised out by the compiler when CONFIG_DEBUG_PAGEALLOC is not defined. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Nicholas Miehlbradt <nicholas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926075726.2846-2-nicholas@linux.ibm.com
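The pattern this enables, as a small sketch with illustrative names (only debug_pagealloc_enabled() is the real kernel API here):

```c
static void setup_debug_pagealloc_ptes(void);	/* hypothetical helper */

/* Without CONFIG_DEBUG_PAGEALLOC, debug_pagealloc_enabled() is a
 * compile-time constant 'false', so the branch below and anything it
 * alone references are discarded by the compiler - no #ifdef needed. */
static void init_linear_map_debug(void)
{
	if (debug_pagealloc_enabled())
		setup_debug_pagealloc_ptes();
}
```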
-
By Nicholas Miehlbradt
There is support for DEBUG_PAGEALLOC on hash but not on radix. Add support on radix. Signed-off-by: Nicholas Miehlbradt <nicholas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926075726.2846-1-nicholas@linux.ibm.com
-
By Nicholas Piggin
Update the 64s GENERIC_CPU option. POWER4 support has been dropped, so make that clear in the option name. The POWER5_CPU option is dropped because it's uncommon, and GENERIC_CPU covers it. -mtune= before power8 is dropped because the minimum gcc version supports power8, and tuning is made consistent between big and little endian. A 970 option is added for PowerPC 970 / G5 because they still have a user base, and -mtune=power8 does not generate good code for the 970. This also updates the ISA versions document to add Power4/Power4+ because I didn't realise Power4+ used 2.01. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921014103.587954-2-npiggin@gmail.com
-
By Nicholas Piggin
Big-endian GENERIC_CPU supports 970, but builds with -mcpu=power5. POWER5 is ISA v2.02 whereas 970 is v2.01 plus Altivec. 2.02 added the popcntb instruction which a compiler might use. Use -mcpu=power4. Fixes: 471d7ff8 ("powerpc/64s: Remove POWER4 support") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921014103.587954-1-npiggin@gmail.com
-
By Nicholas Piggin
We want to move away from using SMT priority updates for cpu_relax, and use a 'wait' instruction, which is similar to x86. As well as being a much better fit for what everybody else uses and tests with, priority nops are stateful, which is nasty (interrupts have to consider they might be taken at a different priority), and they're expensive to execute, similar to a mtSPR, which can affect other threads in the pipe. This has been shown to give results that are less affected by code alignment on benchmarks that cause a lot of spin waiting (e.g., rwsem contention on unixbench filesystem benchmarks) on POWER10. QEMU TCG only supports this instruction correctly since v7.1; versions without the fix may cause hangs when running POWER10 CPUs. Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix checkpatch warnings RE the macros] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220920122259.363092-2-npiggin@gmail.com
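A hedged sketch of the new cpu_relax shape (CPU_FTR_ARCH_31 and PPC_WAIT are real identifiers, the latter being the encoding macro updated in the companion patch below; the WC field value and the fallback body are assumptions based on the commit text):

```c
static inline void cpu_relax(void)
{
	if (cpu_has_feature(CPU_FTR_ARCH_31))
		asm volatile(PPC_WAIT(2, 0));	/* stateless 'wait', no SMT
						   priority side effects */
	else					/* HMT low / HMT medium dance */
		asm volatile("or 1,1,1; or 2,2,2" ::: "memory");
}
```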
-
By Nicholas Piggin
The wait instruction encoding changed between ISA v2.07 and ISA v3.0. In v3.1 the instruction gained a new field. Update the PPC_WAIT macro to the current encoding. Rename the older incompatible one with a _v203 suffix as it was introduced in v2.03 (the WC field was introduced in v2.07 but the kernel only uses WC=0). Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220920122259.363092-1-npiggin@gmail.com
-
By Nicholas Piggin
Setting DEC to maximum at the start of the timer interrupt is not necessary and can be avoided for performance when MSR[EE] is not enabled during the handler, as explained in commit 0faf20a1 ("powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless perf is in use"), where this change was first attempted. The idea is that the timer interrupt runs with MSR[EE]=0, and at the end of the interrupt DEC is programmed to the next timer interval, so there is no need to clear the decrementer exception before then. When the above commit was merged, that was not quite true. The low-res timer subsystem had some cases in the oneshot timer code where, if the tick was to be stopped and no timers were active, the clock device would not get the ->set_state_oneshot_stopped() call, so DEC would not get reprogrammed, and this would hang taking continual timer interrupts. So this was reverted in commit d2b9be1f ("powerpc/time: Always set decrementer in timer_interrupt()"), which was a partial revert of the above commit. Commit 62c1256d ("timers/nohz: Switch to ONESHOT_STOPPED in the low-res handler when the tick is stopped") was later merged to fix this missing case in the timer subsystem, so now the behaviour can be restored. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220909142457.278032-1-npiggin@gmail.com
-
By Pali Rohár
Currently powerpc early debugging contains a lot of platform-specific options, but does not support a standard UART / 16550 serial console. The legacy_serial.c code can later register a UART from the device tree as an early debug console, but that happens relatively late in boot, after the machine description code has finished, so the current code is unsuitable for truly early debugging via UART. Add support for a new early debugging option, CONFIG_PPC_EARLY_DEBUG_16550, which enables a 16550 serial console at the address defined by the new option CONFIG_PPC_EARLY_DEBUG_16550_PHYSADDR, with the register stride defined by CONFIG_PPC_EARLY_DEBUG_16550_STRIDE. With this change it is possible to debug the powerpc machine description code; for example, the early debug console can print the "No suitable machine description found" error, which is raised before the legacy_serial.c code runs. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220822231501.16827-1-pali@kernel.org
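A hedged sketch of what such an early 16550 udbg hook can look like (the register offsets are standard 16550, and in_8/out_8 are real powerpc accessors; the function and variable names here are illustrative, not the kernel's):

```c
#define UART_THR	0	/* transmit holding register */
#define UART_LSR	5	/* line status register */
#define LSR_THRE	0x20	/* transmitter holding register empty */

/* Illustrative: mapped from CONFIG_PPC_EARLY_DEBUG_16550_PHYSADDR. */
static u8 __iomem *early_uart;

static void udbg_16550_early_putc(char c)
{
	unsigned int stride = CONFIG_PPC_EARLY_DEBUG_16550_STRIDE;

	while (!(in_8(early_uart + UART_LSR * stride) & LSR_THRE))
		;	/* spin until the transmitter drains */
	out_8(early_uart + UART_THR * stride, c);
}
```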
-
By Hari Bathini
Since commit e641eb03 ("powerpc: Fix up the kdump base cap to 128M"), memory for the kdump kernel has been reserved at an offset of 128MB. This held up well for a long time before running into boot failures on LPARs with a large number of cores. Commit 7c5ed82b ("powerpc: Set crashkernel offset to mid of RMA region") fixed this boot failure by moving the offset to the middle of the RMA region. This change meant the offset is either 256MB or 512MB on LPARs, as ppc64_rma_size was 512MB or 1024MB owing to commit 103a8542 ("powerpc/book3s64/radix: Fix boot failure with large amount of guest memory"). But ppc64_rma_size can be larger as well with newer firmware. So, limit the crashkernel reservation offset to 512MB to avoid running into boot failures during kdump kernel boot, due to RTAS or other allocation restrictions. Also, while here, use SZ_128M instead of open coding it. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220912065031.57416-1-hbathini@linux.ibm.com
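The resulting reservation policy, as a one-line sketch (crashk_res, ppc64_rma_size and SZ_512M are real kernel symbols; the surrounding code is omitted):

```c
/* Offset at mid-RMA, capped at 512MB, so RTAS and other low-memory
 * allocation restrictions are not tripped on LPARs with a large RMA. */
crashk_res.start = min_t(u64, ppc64_rma_size / 2, SZ_512M);
```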
-
By Rohan McLure
Implement a syscall wrapper as per s390, x86 and arm64. When enabled, it causes handlers to accept parameters from a stack frame rather than from user scratch register state. This allows user registers to be safely cleared in order to reduce caller influence on speculation within the syscall routine. The wrapper is a macro that emits syscall handler symbols that call into the target handler, obtaining its parameters from a struct pt_regs on the stack. As registers are already saved to the stack prior to calling system_call_exception, this function appears to execute more efficiently with the new stack-pointer convention than with parameters passed by registers, avoiding the allocation of a stack frame for this method. On a 32-bit system, we see >20% performance increases on the null_syscall microbenchmark, and on a Power 8 the performance gains amortise the cost of clearing and restoring registers which is implemented at the end of this series, with a final result of ~5.6% performance improvement on null_syscall. Syscalls are wrapped in this fashion on all platforms except for the Cell processor, as this commit does not provide SPU support. This can be quickly fixed in a successive patch, but requires spu_sys_callback to allocate a pt_regs structure to satisfy the wrapped calling convention. Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmai.com> [mpe: Make incompatible with COMPAT to retain clearing of high bits of args] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-22-rmclure@linux.ibm.com
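Roughly, the wrapper macro emits handlers of this shape (a sketch with a hypothetical syscall name; the real emitted symbol names and argument counts vary per syscall):

```c
/* Real handler body, normally generated by SYSCALL_DEFINEx. */
static long __do_sys_mysyscall(unsigned long a, unsigned long b,
			       unsigned long c);

/* Emitted wrapper: takes the saved register frame instead of live user
 * registers, so the entry code may clear user GPRs before calling it. */
long sys_mysyscall(const struct pt_regs *regs)
{
	return __do_sys_mysyscall(regs->gpr[3], regs->gpr[4], regs->gpr[5]);
}
```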
-
By Rohan McLure
Change the system_call_exception arguments to pass a pointer to a stack frame containing caller state, as well as the original r0, which determines the number of the syscall. This has been observed to yield improved performance over passing them by registers, circumventing the need to allocate a stack frame. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Retain clearing of high bits of args for compat tasks] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-21-rmclure@linux.ibm.com
-
By Rohan McLure
Cause syscall handlers to be typed as follows when called indirectly throughout the kernel, to allow for better type checking: typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long); Since both the 32-bit and 64-bit ABIs allow for at least the first six machine-word length parameters to a function to be passed by registers, even handlers which admit fewer than six parameters may be viewed as having the above type. Coercing syscalls to syscall_fn requires a cast to void * to avoid -Wcast-function-type. Fix up comparisons in the VDSO to avoid pointer-integer comparison, and introduce an explicit cast on systems with SPUs. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-19-rmclure@linux.ibm.com
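For example, an indirect call through the table can then be written like this (a sketch; sys_ni_syscall is the kernel's real "not implemented" handler, and the dispatcher name is illustrative):

```c
typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long,
			   unsigned long, unsigned long, unsigned long);

long dispatch_example(unsigned long r3, unsigned long r4, unsigned long r5,
		      unsigned long r6, unsigned long r7, unsigned long r8)
{
	/* The intermediate void * cast silences -Wcast-function-type when
	 * the handler's true prototype has fewer or narrower parameters. */
	syscall_fn fn = (syscall_fn)(void *)sys_ni_syscall;

	return fn(r3, r4, r5, r6, r7, r8);
}
```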
-
By Rohan McLure
The table of syscall handlers and registered compatibility syscall handlers has in the past been produced using assembly, with function references resolved at link time. This moves link-time errors to compile time, by rewriting systbl.S in C, and including the linux/syscalls.h, linux/compat.h and asm/syscalls.h headers for prototypes. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-18-rmclure@linux.ibm.com
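A hedged sketch of the C replacement (the generated header names are assumptions; the __SYSCALL macro scheme matches how other architectures build their tables):

```c
/* systbl.c sketch: build the table from the generated syscall list;
 * a mismatched prototype now fails at compile time, not at link time. */
#define __SYSCALL_WITH_COMPAT(nr, entry, compat) __SYSCALL(nr, entry)
#define __SYSCALL(nr, entry) [nr] = (void *)entry,

const syscall_fn sys_call_table[] = {
#ifdef CONFIG_PPC64
#include <asm/syscall_table_64.h>	/* assumed generated header */
#else
#include <asm/syscall_table_32.h>
#endif
};
```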
-
By Rohan McLure
Forward declare all syscall handler prototypes where a generic prototype is not provided in either linux/syscalls.h or linux/compat.h in asm/syscalls.h. This is required for compile-time type-checking for syscall handlers, which is implemented later in this series. 32-bit compatibility syscall handlers are expressed in terms of types in ppc32.h. Expose this header globally. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Use standard include guard naming for syscalls_32.h] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-17-rmclure@linux.ibm.com
-
By Rohan McLure
Arch-specific implementations of syscall handlers are currently used over generic implementations for the following reasons:

1. Semantics unique to powerpc.
2. Compatibility syscalls require 'argument padding' to comply with the 64-bit argument convention in the ELF32 ABI (an example of this is sketched below).
3. Parameter types or order differ from other architectures.

These syscall handlers were defined prior to this patch series without invoking the SYSCALL_DEFINE or COMPAT_SYSCALL_DEFINE macros, with custom input and output types. We remove every such direct definition in favour of the aforementioned macros. Also update syscall.tbl to refer to the symbol names generated by each of these macros. Since ppc64_personality can be called by both 64-bit and 32-bit binaries through compatibility, we must generate both compat_sys_ and sys_ symbols for this handler. As an aside: a number of architectures including arm and powerpc agree on an alternative argument order and numbering for most of these arch-specific handlers. A future patch series may allow asm/unistd.h to signal through its defines that a generic implementation of these syscall handlers with the correct calling convention be emitted, through the __ARCH_WANT_COMPAT_SYS_... convention. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-16-rmclure@linux.ibm.com
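The padding example referenced in point 2 above, as a sketch (COMPAT_SYSCALL_DEFINE6 and ksys_pread64 are real kernel APIs; merge_64 reassembles a 64-bit value from a 32-bit register pair and is sketched further below; the exact signature may differ from the series):

```c
/* 32-bit ABI: a 64-bit 'pos' argument must sit in an aligned register
 * pair, so the unused reg6 pads the list to keep pos1/pos2 even/odd. */
COMPAT_SYSCALL_DEFINE6(ppc_pread64, unsigned int, fd, char __user *, ubuf,
		       compat_size_t, count, u32, reg6, u32, pos1, u32, pos2)
{
	return ksys_pread64(fd, ubuf, count, merge_64(pos1, pos2));
}
```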
-
By Rohan McLure
Avoid duplication in a future patch that will define the ppc64_personality syscall handler in terms of the SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE macros, by extracting the common body of ppc64_personality into a helper function. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-15-rmclure@linux.ibm.com
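Such a helper could look like this, paraphrasing the historical ppc64_personality body (ksys_personality, PER_LINUX and PER_LINUX32 are real kernel identifiers; the helper name is assumed):

```c
static long do_ppc64_personality(unsigned long personality)
{
	long ret;

	/* A 64-bit kernel reports PER_LINUX32 tasks as PER_LINUX and
	 * translates back on the way in. */
	if (personality(current->personality) == PER_LINUX32 &&
	    personality == PER_LINUX)
		personality = PER_LINUX32;
	ret = ksys_personality(personality);
	if (personality(ret) == PER_LINUX32)
		ret = PER_LINUX;
	return ret;
}
```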
-
By Rohan McLure
Syscall handlers should not be invoked internally by their symbol names, as these symbols are defined by the architecture-defined SYSCALL_DEFINE macro. Move the compatibility syscall definition for mmap2 to syscalls.c, so that all mmap implementations can share a helper function. Remove 'inline' from the static mmap helper. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix compat_sys_mmap2() prototype and offset handling] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-14-rmclure@linux.ibm.com
-
- 26 Sep 2022, 8 commits
-
-
By Rohan McLure
Syscall handlers should not be invoked internally by their symbol names, as these symbols are defined by the architecture-defined SYSCALL_DEFINE macro. Fortunately, in the case of ppc64_personality, its call to sys_personality can be replaced with an invocation of the equivalent ksys_personality inline helper in <linux/syscalls.h>. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-13-rmclure@linux.ibm.com
-
By Rohan McLure
Syscall #82 has been implemented for 32-bit platforms in a unique way on powerpc systems. This hack in effect guesses whether the caller expects new select semantics or old select semantics, based on the first parameter. In new select, this parameter is the length of a user-memory array of file descriptors; in old select, it is a pointer to an arguments structure. The heuristic simply interprets sufficiently large values of the first parameter as calls to old select. As discussed in the thread linked below, the existence of such a hack suggests that whatever powerpc binaries may predate glibc most likely made use of the old select semantics. x86 and arm64 both implement this syscall with old select semantics. Remove the powerpc implementation, and update syscall.tbl to emit references to sys_old_select and compat_sys_old_select for 32-bit binaries, in keeping with how other architectures support syscall #82. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/lkml/13737de5-0eb7-e881-9af0-163b0d29a1a0@csgroup.eu/ Link: https://lore.kernel.org/r/20220921065605.1051927-12-rmclure@linux.ibm.com
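The removed heuristic looked roughly like this (a sketch of the historical ppc_select; the types shown are the real kernel ones):

```c
/* A first argument implausibly large for a file-descriptor count is
 * assumed to be a user pointer to an old-select argument block. */
int ppc_select(int n, fd_set __user *inp, fd_set __user *outp,
	       fd_set __user *exp, struct __kernel_old_timeval __user *tvp)
{
	if ((unsigned long)n >= 4096)
		return sys_old_select((void __user *)(unsigned long)n);
	return sys_select(n, inp, outp, exp, tvp);
}
```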
-
By Rohan McLure
The powerpc fallocate compat syscall handler is identical to the generic implementation provided by commit 59c10c52 ("riscv: compat: syscall: Add compat_sys_call_table implementation"), and as such can be removed in favour of the generic implementation. A future patch series will replace more architecture-defined syscall handlers with generic implementations, dependent on introducing generic implementations that are compatible with powerpc and arm's parameter reorderings. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-11-rmclure@linux.ibm.com
-
By Rohan McLure
32-bit ABIs support passing 64-bit integers by registers via argument translation. Commit 59c10c52 ("riscv: compat: syscall: Add compat_sys_call_table implementation") implements the compat_arg_u64 macro for efficiently defining little-endian compatibility syscalls. Architectures supporting big endianness may benefit from reciprocal argument translation, but are also welcome to implement their own. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Arnd Bergmann <arnd@anrdb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-10-rmclure@linux.ibm.com
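The gist of the compat_arg_u64 idea for the little-endian case, as a hedged sketch (the upstream macros have additional variants and their exact spellings may differ):

```c
/* A 64-bit argument to a compat syscall arrives as two 32-bit register
 * halves; declare them as a pair and glue them back in the handler. */
#define compat_arg_u64(name)	  u32 name##_lo, u32 name##_hi
#define compat_arg_u64_glue(name) (((u64)name##_hi << 32) | name##_lo)
```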
-
By Rohan McLure
As reported[1] by Arnd, the arch-specific fadvise64_64 and fallocate compatibility handlers assume parameters are passed with the 32-bit big-endian ABI. This affects the assignment of odd-even parameter pairs to the high or low words of a 64-bit syscall parameter. Fix the fadvise64_64 and fallocate compat handlers to correctly swap the upper and lower 32 bits conditioned on endianness. A future patch will replace the arch-specific compat fallocate with an asm-generic implementation. This patch is intended for ease of back-porting. [1]: https://lore.kernel.org/all/be29926f-226e-48dc-871a-e29a54e80583@www.fastmail.com/ Fixes: 57f48b4b ("powerpc/compat_sys: swap hi/lo parts of 64-bit syscall args in LE mode") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-9-rmclure@linux.ibm.com
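The endian-conditional reassembly at the heart of the fix, sketched after the shape of powerpc's merge_64 helper (parameter names here are illustrative):

```c
/* merge_64(a, b): reassemble a 64-bit value from the register pair
 * (a, b); which element carries the high word depends on the 32-bit
 * ABI's endianness. */
#ifdef __BIG_ENDIAN__
#define merge_64(a, b)	(((u64)(a) << 32) | (u32)(b))
#else
#define merge_64(a, b)	(((u64)(b) << 32) | (u32)(a))
#endif
```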
-
By Rohan McLure
Interrupt handlers on 64s systems will often need to save register state from the interrupted process to make space for loading special purpose registers or for internal state. Fix a comment documenting a common code-path macro at the beginning of interrupt handlers, where r10 is saved to the PACA to afford space for the value of the CFAR. The comment is currently written as if r10-r12 are saved to the PACA, but in fact only r10 is saved there, with r11-r12 saved much later. The distance in code between these saves has grown over the many revisions of this macro. Fix this by signalling with a comment where r11-r12 are saved to the PACA. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reported-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-8-rmclure@linux.ibm.com
-
By Rohan McLure
The common interrupt handler prologue macro and the bad_stack trampolines include consecutive sequences of register saves, and some register clears. Neaten such instances by expanding use of the SAVE_GPRS macro and employing the ZEROIZE_GPR macro when appropriate. Also simplify an invocation of SAVE_GPRS targeting all non-volatile registers to SAVE_NVGPRS. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reported-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-7-rmclure@linux.ibm.com
-
By Rohan McLure
Restoring the register state of the interrupted thread involves issuing a large number of predictable loads from the kernel stack frame. Use the REST_GPR{,S} macros to clearly signal when this is happening, and bunch restores together at the end of the interrupt handler where the saved value is not consumed earlier in the handler code. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-6-rmclure@linux.ibm.com
-