提交 201729d5 编写于 作者: C Catalin Marinas

Merge branches 'for-next/sme', 'for-next/stacktrace',...

Merge branches 'for-next/sme', 'for-next/stacktrace', 'for-next/fault-in-subpage', 'for-next/misc', 'for-next/ftrace' and 'for-next/crashkernel', remote-tracking branch 'arm64/for-next/perf' into for-next/core

* arm64/for-next/perf:
  perf/arm-cmn: Decode CAL devices properly in debugfs
  perf/arm-cmn: Fix filter_sel lookup
  perf/marvell_cn10k: Fix tad_pmu_event_init() to check pmu type first
  drivers/perf: hisi: Add Support for CPA PMU
  drivers/perf: hisi: Associate PMUs in SICL with CPUs online
  drivers/perf: arm_spe: Expose saturating counter to 16-bit
  perf/arm-cmn: Add CMN-700 support
  perf/arm-cmn: Refactor occupancy filter selector
  perf/arm-cmn: Add CMN-650 support
  dt-bindings: perf: arm-cmn: Add CMN-650 and CMN-700
  perf: check return value of armpmu_request_irq()
  perf: RISC-V: Remove non-kernel-doc ** comments

* for-next/sme: (30 commits)
  : Scalable Matrix Extensions support.
  arm64/sve: Move sve_free() into SVE code section
  arm64/sve: Make kernel FPU protection RT friendly
  arm64/sve: Delay freeing memory in fpsimd_flush_thread()
  arm64/sme: More sensibly define the size for the ZA register set
  arm64/sme: Fix NULL check after kzalloc
  arm64/sme: Add ID_AA64SMFR0_EL1 to __read_sysreg_by_encoding()
  arm64/sme: Provide Kconfig for SME
  KVM: arm64: Handle SME host state when running guests
  KVM: arm64: Trap SME usage in guest
  KVM: arm64: Hide SME system registers from guests
  arm64/sme: Save and restore streaming mode over EFI runtime calls
  arm64/sme: Disable streaming mode and ZA when flushing CPU state
  arm64/sme: Add ptrace support for ZA
  arm64/sme: Implement ptrace support for streaming mode SVE registers
  arm64/sme: Implement ZA signal handling
  arm64/sme: Implement streaming SVE signal handling
  arm64/sme: Disable ZA and streaming mode when handling signals
  arm64/sme: Implement traps and syscall handling for SME
  arm64/sme: Implement ZA context switching
  arm64/sme: Implement streaming SVE context switching
  ...

* for-next/stacktrace:
  : Stacktrace cleanups.
  arm64: stacktrace: align with common naming
  arm64: stacktrace: rename stackframe to unwind_state
  arm64: stacktrace: rename unwinder functions
  arm64: stacktrace: make struct stackframe private to stacktrace.c
  arm64: stacktrace: delete PCS comment
  arm64: stacktrace: remove NULL task check from unwind_frame()

* for-next/fault-in-subpage:
  : btrfs search_ioctl() live-lock fix using fault_in_subpage_writeable().
  btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page faults
  arm64: Add support for user sub-page fault probing
  mm: Add fault_in_subpage_writeable() to probe at sub-page granularity

* for-next/misc:
  : Miscellaneous patches.
  arm64: Kconfig.platforms: Add comments
  arm64: Kconfig: Fix indentation and add comments
  arm64: mm: avoid writable executable mappings in kexec/hibernate code
  arm64: lds: move special code sections out of kernel exec segment
  arm64/hugetlb: Implement arm64 specific huge_ptep_get()
  arm64/hugetlb: Use ptep_get() to get the pte value of a huge page
  arm64: mm: Make arch_faults_on_old_pte() check for migratability
  arm64: mte: Clean up user tag accessors
  arm64/hugetlb: Drop TLB flush from get_clear_flush()
  arm64: Declare non global symbols as static
  arm64: mm: Cleanup useless parameters in zone_sizes_init()
  arm64: fix types in copy_highpage()
  arm64: Set ARCH_NR_GPIO to 2048 for ARCH_APPLE
  arm64: cputype: Avoid overflow using MIDR_IMPLEMENTOR_MASK
  arm64: document the boot requirements for MTE
  arm64/mm: Compute PTRS_PER_[PMD|PUD] independently of PTRS_PER_PTE

* for-next/ftrace:
  : ftrace cleanups.
  arm64/ftrace: Make function graph use ftrace directly
  ftrace: cleanup ftrace_graph_caller enable and disable

* for-next/crashkernel:
  : Support for crashkernel reservations above ZONE_DMA.
  arm64: kdump: Do not allocate crash low memory if not needed
  docs: kdump: Update the crashkernel description for arm64
  of: Support more than one crash kernel regions for kexec -s
  of: fdt: Add memory for devices by DT property "linux,usable-memory-range"
  arm64: kdump: Reimplement crashkernel=X
  arm64: Use insert_resource() to simplify code
  kdump: return -ENOENT if required cmdline option does not exist
......@@ -808,7 +808,7 @@
Documentation/admin-guide/kdump/kdump.rst for an example.
crashkernel=size[KMG],high
[KNL, X86-64] range could be above 4G. Allow kernel
[KNL, X86-64, ARM64] range could be above 4G. Allow kernel
to allocate physical memory region from top, so could
be above 4G if system have more than 4G ram installed.
Otherwise memory region will be allocated below 4G, if
......@@ -821,14 +821,20 @@
that require some amount of low memory, e.g. swiotlb
requires at least 64M+32K low memory, also enough extra
low memory is needed to make sure DMA buffers for 32-bit
devices won't run out. Kernel would try to allocate at
devices won't run out. Kernel would try to allocate
at least 256M below 4G automatically.
This one let user to specify own low range under 4G
This one lets the user specify own low range under 4G
for second kernel instead.
0: to disable low allocation.
It will be ignored when crashkernel=X,high is not used
or memory reserved is below 4G.
[KNL, ARM64] range in low memory.
This one lets the user specify a low range in the
DMA zone for the crash dump kernel.
It will be ignored when crashkernel=X,high is not used
or memory reserved is located in the DMA zones.
cryptomgr.notests
[KNL] Disable crypto self-tests
......
......@@ -350,6 +350,16 @@ Before jumping into the kernel, the following conditions must be met:
- SMCR_EL2.FA64 (bit 31) must be initialised to 0b1.
For CPUs with the Memory Tagging Extension feature (FEAT_MTE2):
- If EL3 is present:
- SCR_EL3.ATA (bit 26) must be initialised to 0b1.
- If the kernel is entered at EL1 and EL2 is present:
- HCR_EL2.ATA (bit 56) must be initialised to 0b1.
The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs. All CPUs must
enter the kernel in the same exception level. Where the values documented
......
......@@ -264,6 +264,39 @@ HWCAP2_MTE3
Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0011, as described
by Documentation/arm64/memory-tagging-extension.rst.
HWCAP2_SME
Functionality implied by ID_AA64PFR1_EL1.SME == 0b0001, as described
by Documentation/arm64/sme.rst.
HWCAP2_SME_I16I64
Functionality implied by ID_AA64SMFR0_EL1.I16I64 == 0b1111.
HWCAP2_SME_F64F64
Functionality implied by ID_AA64SMFR0_EL1.F64F64 == 0b1.
HWCAP2_SME_I8I32
Functionality implied by ID_AA64SMFR0_EL1.I8I32 == 0b1111.
HWCAP2_SME_F16F32
Functionality implied by ID_AA64SMFR0_EL1.F16F32 == 0b1.
HWCAP2_SME_B16F32
Functionality implied by ID_AA64SMFR0_EL1.B16F32 == 0b1.
HWCAP2_SME_F32F32
Functionality implied by ID_AA64SMFR0_EL1.F32F32 == 0b1.
HWCAP2_SME_FA64
Functionality implied by ID_AA64SMFR0_EL1.FA64 == 0b1.
4. Unused AT_HWCAP bits
-----------------------
......
......@@ -21,6 +21,7 @@ ARM64 Architecture
perf
pointer-authentication
silicon-errata
sme
sve
tagged-address-abi
tagged-pointers
......
===================================================
Scalable Matrix Extension support for AArch64 Linux
===================================================
This document outlines briefly the interface provided to userspace by Linux in
order to support use of the ARM Scalable Matrix Extension (SME).
This is an outline of the most important features and issues only and not
intended to be exhaustive. It should be read in conjunction with the SVE
documentation in sve.rst which provides details on the Streaming SVE mode
included in SME.
This document does not aim to describe the SME architecture or programmer's
model. To aid understanding, a minimal description of relevant programmer's
model features for SME is included in Appendix A.
1. General
-----------
* PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA
register state and TPIDR2_EL0 are tracked per thread.
* The presence of SME is reported to userspace via HWCAP2_SME in the aux vector
AT_HWCAP2 entry. Presence of this flag implies the presence of the SME
instructions and registers, and the Linux-specific system interfaces
described in this document. SME is reported in /proc/cpuinfo as "sme".
* Support for the execution of SME instructions in userspace can also be
detected by reading the CPU ID register ID_AA64PFR1_EL1 using an MRS
instruction, and checking that the value of the SME field is nonzero. [3]
It does not guarantee the presence of the system interfaces described in the
following sections: software that needs to verify that those interfaces are
present must check for HWCAP2_SME instead.
* There are a number of optional SME features, presence of these is reported
through AT_HWCAP2 through:
HWCAP2_SME_I16I64
HWCAP2_SME_F64F64
HWCAP2_SME_I8I32
HWCAP2_SME_F16F32
HWCAP2_SME_B16F32
HWCAP2_SME_F32F32
HWCAP2_SME_FA64
This list may be extended over time as the SME architecture evolves.
These extensions are also reported via the CPU ID register ID_AA64SMFR0_EL1,
which userspace can read using an MRS instruction. See elf_hwcaps.txt and
cpu-feature-registers.txt for details.
* Debuggers should restrict themselves to interacting with the target via the
NT_ARM_SVE, NT_ARM_SSVE and NT_ARM_ZA regsets. The recommended way
of detecting support for these regsets is to connect to a target process
first and then attempt a
ptrace(PTRACE_GETREGSET, pid, NT_ARM_<regset>, &iov).
* Whenever ZA register values are exchanged in memory between userspace and
the kernel, the register value is encoded in memory as a series of horizontal
vectors from 0 to VL/8-1 stored in the same endianness invariant format as is
used for SVE vectors.
* On thread creation TPIDR2_EL0 is preserved unless CLONE_SETTLS is specified,
in which case it is set to 0.
2. Vector lengths
------------------
SME defines a second vector length similar to the SVE vector length which is
controls the size of the streaming mode SVE vectors and the ZA matrix array.
The ZA matrix is square with each side having as many bytes as a streaming
mode SVE vector.
3. Sharing of streaming and non-streaming mode SVE state
---------------------------------------------------------
It is implementation defined which if any parts of the SVE state are shared
between streaming and non-streaming modes. When switching between modes
via software interfaces such as ptrace if no register content is provided as
part of switching no state will be assumed to be shared and everything will
be zeroed.
4. System call behaviour
-------------------------
* On syscall PSTATE.ZA is preserved, if PSTATE.ZA==1 then the contents of the
ZA matrix are preserved.
* On syscall PSTATE.SM will be cleared and the SVE registers will be handled
as per the standard SVE ABI.
* Neither the SVE registers nor ZA are used to pass arguments to or receive
results from any syscall.
* On process creation (eg, clone()) the newly created process will have
PSTATE.SM cleared.
* All other SME state of a thread, including the currently configured vector
length, the state of the PR_SME_VL_INHERIT flag, and the deferred vector
length (if any), is preserved across all syscalls, subject to the specific
exceptions for execve() described in section 6.
5. Signal handling
-------------------
* Signal handlers are invoked with streaming mode and ZA disabled.
* A new signal frame record za_context encodes the ZA register contents on
signal delivery. [1]
* The signal frame record for ZA always contains basic metadata, in particular
the thread's vector length (in za_context.vl).
* The ZA matrix may or may not be included in the record, depending on
the value of PSTATE.ZA. The registers are present if and only if:
za_context.head.size >= ZA_SIG_CONTEXT_SIZE(sve_vq_from_vl(za_context.vl))
in which case PSTATE.ZA == 1.
* If matrix data is present, the remainder of the record has a vl-dependent
size and layout. Macros ZA_SIG_* are defined [1] to facilitate access to
them.
* The matrix is stored as a series of horizontal vectors in the same format as
is used for SVE vectors.
* If the ZA context is too big to fit in sigcontext.__reserved[], then extra
space is allocated on the stack, an extra_context record is written in
__reserved[] referencing this space. za_context is then written in the
extra space. Refer to [1] for further details about this mechanism.
5. Signal return
-----------------
When returning from a signal handler:
* If there is no za_context record in the signal frame, or if the record is
present but contains no register data as described in the previous section,
then ZA is disabled.
* If za_context is present in the signal frame and contains matrix data then
PSTATE.ZA is set to 1 and ZA is populated with the specified data.
* The vector length cannot be changed via signal return. If za_context.vl in
the signal frame does not match the current vector length, the signal return
attempt is treated as illegal, resulting in a forced SIGSEGV.
6. prctl extensions
--------------------
Some new prctl() calls are added to allow programs to manage the SME vector
length:
prctl(PR_SME_SET_VL, unsigned long arg)
Sets the vector length of the calling thread and related flags, where
arg == vl | flags. Other threads of the calling process are unaffected.
vl is the desired vector length, where sve_vl_valid(vl) must be true.
flags:
PR_SME_VL_INHERIT
Inherit the current vector length across execve(). Otherwise, the
vector length is reset to the system default at execve(). (See
Section 9.)
PR_SME_SET_VL_ONEXEC
Defer the requested vector length change until the next execve()
performed by this thread.
The effect is equivalent to implicit execution of the following
call immediately after the next execve() (if any) by the thread:
prctl(PR_SME_SET_VL, arg & ~PR_SME_SET_VL_ONEXEC)
This allows launching of a new program with a different vector
length, while avoiding runtime side effects in the caller.
Without PR_SME_SET_VL_ONEXEC, the requested change takes effect
immediately.
Return value: a nonnegative on success, or a negative value on error:
EINVAL: SME not supported, invalid vector length requested, or
invalid flags.
On success:
* Either the calling thread's vector length or the deferred vector length
to be applied at the next execve() by the thread (dependent on whether
PR_SME_SET_VL_ONEXEC is present in arg), is set to the largest value
supported by the system that is less than or equal to vl. If vl ==
SVE_VL_MAX, the value set will be the largest value supported by the
system.
* Any previously outstanding deferred vector length change in the calling
thread is cancelled.
* The returned value describes the resulting configuration, encoded as for
PR_SME_GET_VL. The vector length reported in this value is the new
current vector length for this thread if PR_SME_SET_VL_ONEXEC was not
present in arg; otherwise, the reported vector length is the deferred
vector length that will be applied at the next execve() by the calling
thread.
* Changing the vector length causes all of ZA, P0..P15, FFR and all bits of
Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
unspecified, including both streaming and non-streaming SVE state.
Calling PR_SME_SET_VL with vl equal to the thread's current vector
length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag,
does not constitute a change to the vector length for this purpose.
* Changing the vector length causes PSTATE.ZA and PSTATE.SM to be cleared.
Calling PR_SME_SET_VL with vl equal to the thread's current vector
length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag,
does not constitute a change to the vector length for this purpose.
prctl(PR_SME_GET_VL)
Gets the vector length of the calling thread.
The following flag may be OR-ed into the result:
PR_SME_VL_INHERIT
Vector length will be inherited across execve().
There is no way to determine whether there is an outstanding deferred
vector length change (which would only normally be the case between a
fork() or vfork() and the corresponding execve() in typical use).
To extract the vector length from the result, bitwise and it with
PR_SME_VL_LEN_MASK.
Return value: a nonnegative value on success, or a negative value on error:
EINVAL: SME not supported.
7. ptrace extensions
---------------------
* A new regset NT_ARM_SSVE is defined for access to streaming mode SVE
state via PTRACE_GETREGSET and PTRACE_SETREGSET, this is documented in
sve.rst.
* A new regset NT_ARM_ZA is defined for ZA state for access to ZA state via
PTRACE_GETREGSET and PTRACE_SETREGSET.
Refer to [2] for definitions.
The regset data starts with struct user_za_header, containing:
size
Size of the complete regset, in bytes.
This depends on vl and possibly on other things in the future.
If a call to PTRACE_GETREGSET requests less data than the value of
size, the caller can allocate a larger buffer and retry in order to
read the complete regset.
max_size
Maximum size in bytes that the regset can grow to for the target
thread. The regset won't grow bigger than this even if the target
thread changes its vector length etc.
vl
Target thread's current streaming vector length, in bytes.
max_vl
Maximum possible streaming vector length for the target thread.
flags
Zero or more of the following flags, which have the same
meaning and behaviour as the corresponding PR_SET_VL_* flags:
SME_PT_VL_INHERIT
SME_PT_VL_ONEXEC (SETREGSET only).
* The effects of changing the vector length and/or flags are equivalent to
those documented for PR_SME_SET_VL.
The caller must make a further GETREGSET call if it needs to know what VL is
actually set by SETREGSET, unless is it known in advance that the requested
VL is supported.
* The size and layout of the payload depends on the header fields. The
SME_PT_ZA_*() macros are provided to facilitate access to the data.
* In either case, for SETREGSET it is permissible to omit the payload, in which
case the vector length and flags are changed and PSTATE.ZA is set to 0
(along with any consequences of those changes). If a payload is provided
then PSTATE.ZA will be set to 1.
* For SETREGSET, if the requested VL is not supported, the effect will be the
same as if the payload were omitted, except that an EIO error is reported.
No attempt is made to translate the payload data to the correct layout
for the vector length actually set. It is up to the caller to translate the
payload layout for the actual VL and retry.
* The effect of writing a partial, incomplete payload is unspecified.
8. ELF coredump extensions
---------------------------
* NT_ARM_SSVE notes will be added to each coredump for
each thread of the dumped process. The contents will be equivalent to the
data that would have been read if a PTRACE_GETREGSET of the corresponding
type were executed for each thread when the coredump was generated.
* A NT_ARM_ZA note will be added to each coredump for each thread of the
dumped process. The contents will be equivalent to the data that would have
been read if a PTRACE_GETREGSET of NT_ARM_ZA were executed for each thread
when the coredump was generated.
9. System runtime configuration
--------------------------------
* To mitigate the ABI impact of expansion of the signal frame, a policy
mechanism is provided for administrators, distro maintainers and developers
to set the default vector length for userspace processes:
/proc/sys/abi/sme_default_vector_length
Writing the text representation of an integer to this file sets the system
default vector length to the specified value, unless the value is greater
than the maximum vector length supported by the system in which case the
default vector length is set to that maximum.
The result can be determined by reopening the file and reading its
contents.
At boot, the default vector length is initially set to 32 or the maximum
supported vector length, whichever is smaller and supported. This
determines the initial vector length of the init process (PID 1).
Reading this file returns the current system default vector length.
* At every execve() call, the new vector length of the new process is set to
the system default vector length, unless
* PR_SME_VL_INHERIT (or equivalently SME_PT_VL_INHERIT) is set for the
calling thread, or
* a deferred vector length change is pending, established via the
PR_SME_SET_VL_ONEXEC flag (or SME_PT_VL_ONEXEC).
* Modifying the system default vector length does not affect the vector length
of any existing process or thread that does not make an execve() call.
Appendix A. SME programmer's model (informative)
=================================================
This section provides a minimal description of the additions made by SVE to the
ARMv8-A programmer's model that are relevant to this document.
Note: This section is for information only and not intended to be complete or
to replace any architectural specification.
A.1. Registers
---------------
In A64 state, SME adds the following:
* A new mode, streaming mode, in which a subset of the normal FPSIMD and SVE
features are available. When supported EL0 software may enter and leave
streaming mode at any time.
For best system performance it is strongly encouraged for software to enable
streaming mode only when it is actively being used.
* A new vector length controlling the size of ZA and the Z registers when in
streaming mode, separately to the vector length used for SVE when not in
streaming mode. There is no requirement that either the currently selected
vector length or the set of vector lengths supported for the two modes in
a given system have any relationship. The streaming mode vector length
is referred to as SVL.
* A new ZA matrix register. This is a square matrix of SVLxSVL bits. Most
operations on ZA require that streaming mode be enabled but ZA can be
enabled without streaming mode in order to load, save and retain data.
For best system performance it is strongly encouraged for software to enable
ZA only when it is actively being used.
* Two new 1 bit fields in PSTATE which may be controlled via the SMSTART and
SMSTOP instructions or by access to the SVCR system register:
* PSTATE.ZA, if this is 1 then the ZA matrix is accessible and has valid
data while if it is 0 then ZA can not be accessed. When PSTATE.ZA is
changed from 0 to 1 all bits in ZA are cleared.
* PSTATE.SM, if this is 1 then the PE is in streaming mode. When the value
of PSTATE.SM is changed then it is implementation defined if the subset
of the floating point register bits valid in both modes may be retained.
Any other bits will be cleared.
References
==========
[1] arch/arm64/include/uapi/asm/sigcontext.h
AArch64 Linux signal ABI definitions
[2] arch/arm64/include/uapi/asm/ptrace.h
AArch64 Linux ptrace ABI definitions
[3] Documentation/arm64/cpu-feature-registers.rst
......@@ -7,7 +7,9 @@ Author: Dave Martin <Dave.Martin@arm.com>
Date: 4 August 2017
This document outlines briefly the interface provided to userspace by Linux in
order to support use of the ARM Scalable Vector Extension (SVE).
order to support use of the ARM Scalable Vector Extension (SVE), including
interactions with Streaming SVE mode added by the Scalable Matrix Extension
(SME).
This is an outline of the most important features and issues only and not
intended to be exhaustive.
......@@ -23,6 +25,10 @@ model features for SVE is included in Appendix A.
* SVE registers Z0..Z31, P0..P15 and FFR and the current vector length VL, are
tracked per-thread.
* In streaming mode FFR is not accessible unless HWCAP2_SME_FA64 is present
in the system, when it is not supported and these interfaces are used to
access streaming mode FFR is read and written as zero.
* The presence of SVE is reported to userspace via HWCAP_SVE in the aux vector
AT_HWCAP entry. Presence of this flag implies the presence of the SVE
instructions and registers, and the Linux-specific system interfaces
......@@ -53,10 +59,19 @@ model features for SVE is included in Appendix A.
which userspace can read using an MRS instruction. See elf_hwcaps.txt and
cpu-feature-registers.txt for details.
* On hardware that supports the SME extensions, HWCAP2_SME will also be
reported in the AT_HWCAP2 aux vector entry. Among other things SME adds
streaming mode which provides a subset of the SVE feature set using a
separate SME vector length and the same Z/V registers. See sme.rst
for more details.
* Debuggers should restrict themselves to interacting with the target via the
NT_ARM_SVE regset. The recommended way of detecting support for this regset
is to connect to a target process first and then attempt a
ptrace(PTRACE_GETREGSET, pid, NT_ARM_SVE, &iov).
ptrace(PTRACE_GETREGSET, pid, NT_ARM_SVE, &iov). Note that when SME is
present and streaming SVE mode is in use the FPSIMD subset of registers
will be read via NT_ARM_SVE and NT_ARM_SVE writes will exit streaming mode
in the target.
* Whenever SVE scalable register values (Zn, Pn, FFR) are exchanged in memory
between userspace and the kernel, the register value is encoded in memory in
......@@ -126,6 +141,11 @@ the SVE instruction set architecture.
are only present in fpsimd_context. For convenience, the content of V0..V31
is duplicated between sve_context and fpsimd_context.
* The record contains a flag field which includes a flag SVE_SIG_FLAG_SM which
if set indicates that the thread is in streaming mode and the vector length
and register data (if present) describe the streaming SVE data and vector
length.
* The signal frame record for SVE always contains basic metadata, in particular
the thread's vector length (in sve_context.vl).
......@@ -170,6 +190,11 @@ When returning from a signal handler:
the signal frame does not match the current vector length, the signal return
attempt is treated as illegal, resulting in a forced SIGSEGV.
* It is permitted to enter or leave streaming mode by setting or clearing
the SVE_SIG_FLAG_SM flag but applications should take care to ensure that
when doing so sve_context.vl and any register data are appropriate for the
vector length in the new mode.
6. prctl extensions
--------------------
......@@ -265,8 +290,14 @@ prctl(PR_SVE_GET_VL)
7. ptrace extensions
---------------------
* A new regset NT_ARM_SVE is defined for use with PTRACE_GETREGSET and
PTRACE_SETREGSET.
* New regsets NT_ARM_SVE and NT_ARM_SSVE are defined for use with
PTRACE_GETREGSET and PTRACE_SETREGSET. NT_ARM_SSVE describes the
streaming mode SVE registers and NT_ARM_SVE describes the
non-streaming mode SVE registers.
In this description a register set is referred to as being "live" when
the target is in the appropriate streaming or non-streaming mode and is
using data beyond the subset shared with the FPSIMD Vn registers.
Refer to [2] for definitions.
......@@ -297,7 +328,7 @@ The regset data starts with struct user_sve_header, containing:
flags
either
at most one of
SVE_PT_REGS_FPSIMD
......@@ -331,6 +362,10 @@ The regset data starts with struct user_sve_header, containing:
SVE_PT_VL_ONEXEC (SETREGSET only).
If neither FPSIMD nor SVE flags are provided then no register
payload is available, this is only possible when SME is implemented.
* The effects of changing the vector length and/or flags are equivalent to
those documented for PR_SVE_SET_VL.
......@@ -346,6 +381,13 @@ The regset data starts with struct user_sve_header, containing:
case only the vector length and flags are changed (along with any
consequences of those changes).
* In systems supporting SME when in streaming mode a GETREGSET for
NT_REG_SVE will return only the user_sve_header with no register data,
similarly a GETREGSET for NT_REG_SSVE will not return any register data
when not in streaming mode.
* A GETREGSET for NT_ARM_SSVE will never return SVE_PT_REGS_FPSIMD.
* For SETREGSET, if an SVE_PT_REGS_SVE payload is present and the
requested VL is not supported, the effect will be the same as if the
payload were omitted, except that an EIO error is reported. No
......@@ -355,17 +397,25 @@ The regset data starts with struct user_sve_header, containing:
unspecified. It is up to the caller to translate the payload layout
for the actual VL and retry.
* Where SME is implemented it is not possible to GETREGSET the register
state for normal SVE when in streaming mode, nor the streaming mode
register state when in normal mode, regardless of the implementation defined
behaviour of the hardware for sharing data between the two modes.
* Any SETREGSET of NT_ARM_SVE will exit streaming mode if the target was in
streaming mode and any SETREGSET of NT_ARM_SSVE will enter streaming mode
if the target was not in streaming mode.
* The effect of writing a partial, incomplete payload is unspecified.
8. ELF coredump extensions
---------------------------
* A NT_ARM_SVE note will be added to each coredump for each thread of the
dumped process. The contents will be equivalent to the data that would have
been read if a PTRACE_GETREGSET of NT_ARM_SVE were executed for each thread
when the coredump was generated.
* NT_ARM_SVE and NT_ARM_SSVE notes will be added to each coredump for
each thread of the dumped process. The contents will be equivalent to the
data that would have been read if a PTRACE_GETREGSET of the corresponding
type were executed for each thread when the coredump was generated.
9. System runtime configuration
--------------------------------
......
......@@ -24,6 +24,13 @@ config KEXEC_ELF
config HAVE_IMA_KEXEC
bool
config ARCH_HAS_SUBPAGE_FAULTS
bool
help
Select if the architecture can check permissions at sub-page
granularity (e.g. arm64 MTE). The probe_user_*() functions
must be implemented.
config HOTPLUG_SMT
bool
......
......@@ -253,31 +253,31 @@ config ARM64_CONT_PMD_SHIFT
default 4
config ARCH_MMAP_RND_BITS_MIN
default 14 if ARM64_64K_PAGES
default 16 if ARM64_16K_PAGES
default 18
default 14 if ARM64_64K_PAGES
default 16 if ARM64_16K_PAGES
default 18
# max bits determined by the following formula:
# VA_BITS - PAGE_SHIFT - 3
config ARCH_MMAP_RND_BITS_MAX
default 19 if ARM64_VA_BITS=36
default 24 if ARM64_VA_BITS=39
default 27 if ARM64_VA_BITS=42
default 30 if ARM64_VA_BITS=47
default 29 if ARM64_VA_BITS=48 && ARM64_64K_PAGES
default 31 if ARM64_VA_BITS=48 && ARM64_16K_PAGES
default 33 if ARM64_VA_BITS=48
default 14 if ARM64_64K_PAGES
default 16 if ARM64_16K_PAGES
default 18
default 19 if ARM64_VA_BITS=36
default 24 if ARM64_VA_BITS=39
default 27 if ARM64_VA_BITS=42
default 30 if ARM64_VA_BITS=47
default 29 if ARM64_VA_BITS=48 && ARM64_64K_PAGES
default 31 if ARM64_VA_BITS=48 && ARM64_16K_PAGES
default 33 if ARM64_VA_BITS=48
default 14 if ARM64_64K_PAGES
default 16 if ARM64_16K_PAGES
default 18
config ARCH_MMAP_RND_COMPAT_BITS_MIN
default 7 if ARM64_64K_PAGES
default 9 if ARM64_16K_PAGES
default 11
default 7 if ARM64_64K_PAGES
default 9 if ARM64_16K_PAGES
default 11
config ARCH_MMAP_RND_COMPAT_BITS_MAX
default 16
default 16
config NO_IOPORT_MAP
def_bool y if !PCI
......@@ -304,7 +304,7 @@ config GENERIC_HWEIGHT
def_bool y
config GENERIC_CSUM
def_bool y
def_bool y
config GENERIC_CALIBRATE_DELAY
def_bool y
......@@ -1037,8 +1037,7 @@ config SOCIONEXT_SYNQUACER_PREITS
If unsure, say Y.
endmenu
endmenu # "ARM errata workarounds via the alternatives framework"
choice
prompt "Page size"
......@@ -1566,9 +1565,9 @@ config SETEND_EMULATION
be unexpected results in the applications.
If unsure, say Y
endif
endif # ARMV8_DEPRECATED
endif
endif # COMPAT
menu "ARMv8.1 architectural features"
......@@ -1593,15 +1592,15 @@ config ARM64_PAN
bool "Enable support for Privileged Access Never (PAN)"
default y
help
Privileged Access Never (PAN; part of the ARMv8.1 Extensions)
prevents the kernel or hypervisor from accessing user-space (EL0)
memory directly.
Privileged Access Never (PAN; part of the ARMv8.1 Extensions)
prevents the kernel or hypervisor from accessing user-space (EL0)
memory directly.
Choosing this option will cause any unprotected (not using
copy_to_user et al) memory access to fail with a permission fault.
Choosing this option will cause any unprotected (not using
copy_to_user et al) memory access to fail with a permission fault.
The feature is detected at runtime, and will remain as a 'nop'
instruction if the cpu does not implement the feature.
The feature is detected at runtime, and will remain as a 'nop'
instruction if the cpu does not implement the feature.
config AS_HAS_LDAPR
def_bool $(as-instr,.arch_extension rcpc)
......@@ -1629,15 +1628,15 @@ config ARM64_USE_LSE_ATOMICS
built with binutils >= 2.25 in order for the new instructions
to be used.
endmenu
endmenu # "ARMv8.1 architectural features"
menu "ARMv8.2 architectural features"
config AS_HAS_ARMV8_2
def_bool $(cc-option,-Wa$(comma)-march=armv8.2-a)
def_bool $(cc-option,-Wa$(comma)-march=armv8.2-a)
config AS_HAS_SHA3
def_bool $(as-instr,.arch armv8.2-a+sha3)
def_bool $(as-instr,.arch armv8.2-a+sha3)
config ARM64_PMEM
bool "Enable support for persistent memory"
......@@ -1681,7 +1680,7 @@ config ARM64_CNP
at runtime, and does not affect PEs that do not implement
this feature.
endmenu
endmenu # "ARMv8.2 architectural features"
menu "ARMv8.3 architectural features"
......@@ -1744,7 +1743,7 @@ config AS_HAS_PAC
config AS_HAS_CFI_NEGATE_RA_STATE
def_bool $(as-instr,.cfi_startproc\n.cfi_negate_ra_state\n.cfi_endproc\n)
endmenu
endmenu # "ARMv8.3 architectural features"
menu "ARMv8.4 architectural features"
......@@ -1785,7 +1784,7 @@ config ARM64_TLB_RANGE
The feature introduces new assembly instructions, and they were
support when binutils >= 2.30.
endmenu
endmenu # "ARMv8.4 architectural features"
menu "ARMv8.5 architectural features"
......@@ -1871,6 +1870,7 @@ config ARM64_MTE
depends on AS_HAS_LSE_ATOMICS
# Required for tag checking in the uaccess routines
depends on ARM64_PAN
select ARCH_HAS_SUBPAGE_FAULTS
select ARCH_USES_HIGH_VMA_FLAGS
help
Memory Tagging (part of the ARMv8.5 Extensions) provides
......@@ -1892,7 +1892,7 @@ config ARM64_MTE
Documentation/arm64/memory-tagging-extension.rst.
endmenu
endmenu # "ARMv8.5 architectural features"
menu "ARMv8.7 architectural features"
......@@ -1901,12 +1901,12 @@ config ARM64_EPAN
default y
depends on ARM64_PAN
help
Enhanced Privileged Access Never (EPAN) allows Privileged
Access Never to be used with Execute-only mappings.
Enhanced Privileged Access Never (EPAN) allows Privileged
Access Never to be used with Execute-only mappings.
The feature is detected at runtime, and will remain disabled
if the cpu does not implement the feature.
endmenu
The feature is detected at runtime, and will remain disabled
if the cpu does not implement the feature.
endmenu # "ARMv8.7 architectural features"
config ARM64_SVE
bool "ARM Scalable Vector Extension support"
......@@ -1939,6 +1939,17 @@ config ARM64_SVE
booting the kernel. If unsure and you are not observing these
symptoms, you should assume that it is safe to say Y.
config ARM64_SME
bool "ARM Scalable Matrix Extension support"
default y
depends on ARM64_SVE
help
The Scalable Matrix Extension (SME) is an extension to the AArch64
execution state which utilises a substantial subset of the SVE
instruction set, together with the addition of new architectural
register state capable of holding two dimensional matrix tiles to
enable various matrix operations.
config ARM64_MODULE_PLTS
bool "Use PLTs to allow module memory to spill over into vmalloc area"
depends on MODULES
......@@ -1982,7 +1993,7 @@ config ARM64_DEBUG_PRIORITY_MASKING
the validity of ICC_PMR_EL1 when calling concerned functions.
If unsure, say N
endif
endif # ARM64_PSEUDO_NMI
config RELOCATABLE
bool "Build a relocatable kernel image" if EXPERT
......@@ -2041,7 +2052,19 @@ config STACKPROTECTOR_PER_TASK
def_bool y
depends on STACKPROTECTOR && CC_HAVE_STACKPROTECTOR_SYSREG
endmenu
# The GPIO number here must be sorted by descending number. In case of
# a multiplatform kernel, we just want the highest value required by the
# selected platforms.
config ARCH_NR_GPIO
int
default 2048 if ARCH_APPLE
default 0
help
Maximum number of GPIOs in the system.
If unsure, leave the default value.
endmenu # "Kernel Features"
menu "Boot options"
......@@ -2105,7 +2128,7 @@ config EFI
help
This option provides support for runtime services provided
by UEFI firmware (such as non-volatile variables, realtime
clock, and platform reset). A UEFI stub is also provided to
clock, and platform reset). A UEFI stub is also provided to
allow the kernel to be booted as an EFI application. This
is only useful on systems that have UEFI firmware.
......@@ -2120,7 +2143,7 @@ config DMI
However, even with this option, the resultant kernel should
continue to boot on existing non-UEFI platforms.
endmenu
endmenu # "Boot options"
config SYSVIPC_COMPAT
def_bool y
......@@ -2141,7 +2164,7 @@ config ARCH_HIBERNATION_HEADER
config ARCH_SUSPEND_POSSIBLE
def_bool y
endmenu
endmenu # "Power management options"
menu "CPU Power Management"
......@@ -2149,7 +2172,7 @@ source "drivers/cpuidle/Kconfig"
source "drivers/cpufreq/Kconfig"
endmenu
endmenu # "CPU Power Management"
source "drivers/acpi/Kconfig"
......@@ -2157,4 +2180,4 @@ source "arch/arm64/kvm/Kconfig"
if CRYPTO
source "arch/arm64/crypto/Kconfig"
endif
endif # CRYPTO
......@@ -325,4 +325,4 @@ config ARCH_ZYNQMP
help
This enables support for Xilinx ZynqMP Family
endmenu
endmenu # "Platform selection"
......@@ -58,11 +58,15 @@ struct cpuinfo_arm64 {
u64 reg_id_aa64pfr0;
u64 reg_id_aa64pfr1;
u64 reg_id_aa64zfr0;
u64 reg_id_aa64smfr0;
struct cpuinfo_32bit aarch32;
/* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */
u64 reg_zcr;
/* pseudo-SMCR for recording maximum SMCR_EL1 LEN value: */
u64 reg_smcr;
};
DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);
......
......@@ -622,6 +622,13 @@ static inline bool id_aa64pfr0_sve(u64 pfr0)
return val > 0;
}
static inline bool id_aa64pfr1_sme(u64 pfr1)
{
u32 val = cpuid_feature_extract_unsigned_field(pfr1, ID_AA64PFR1_SME_SHIFT);
return val > 0;
}
static inline bool id_aa64pfr1_mte(u64 pfr1)
{
u32 val = cpuid_feature_extract_unsigned_field(pfr1, ID_AA64PFR1_MTE_SHIFT);
......@@ -759,6 +766,23 @@ static __always_inline bool system_supports_sve(void)
cpus_have_const_cap(ARM64_SVE);
}
static __always_inline bool system_supports_sme(void)
{
return IS_ENABLED(CONFIG_ARM64_SME) &&
cpus_have_const_cap(ARM64_SME);
}
static __always_inline bool system_supports_fa64(void)
{
return IS_ENABLED(CONFIG_ARM64_SME) &&
cpus_have_const_cap(ARM64_SME_FA64);
}
static __always_inline bool system_supports_tpidr2(void)
{
return system_supports_sme();
}
static __always_inline bool system_supports_cnp(void)
{
return IS_ENABLED(CONFIG_ARM64_CNP) &&
......
......@@ -36,7 +36,7 @@
#define MIDR_VARIANT(midr) \
(((midr) & MIDR_VARIANT_MASK) >> MIDR_VARIANT_SHIFT)
#define MIDR_IMPLEMENTOR_SHIFT 24
#define MIDR_IMPLEMENTOR_MASK (0xff << MIDR_IMPLEMENTOR_SHIFT)
#define MIDR_IMPLEMENTOR_MASK (0xffU << MIDR_IMPLEMENTOR_SHIFT)
#define MIDR_IMPLEMENTOR(midr) \
(((midr) & MIDR_IMPLEMENTOR_MASK) >> MIDR_IMPLEMENTOR_SHIFT)
......
......@@ -143,6 +143,50 @@
.Lskip_sve_\@:
.endm
/* SME register access and priority mapping */
.macro __init_el2_nvhe_sme
mrs x1, id_aa64pfr1_el1
ubfx x1, x1, #ID_AA64PFR1_SME_SHIFT, #4
cbz x1, .Lskip_sme_\@
bic x0, x0, #CPTR_EL2_TSM // Also disable SME traps
msr cptr_el2, x0 // Disable copro. traps to EL2
isb
mrs x1, sctlr_el2
orr x1, x1, #SCTLR_ELx_ENTP2 // Disable TPIDR2 traps
msr sctlr_el2, x1
isb
mov x1, #0 // SMCR controls
mrs_s x2, SYS_ID_AA64SMFR0_EL1
ubfx x2, x2, #ID_AA64SMFR0_FA64_SHIFT, #1 // Full FP in SM?
cbz x2, .Lskip_sme_fa64_\@
orr x1, x1, SMCR_ELx_FA64_MASK
.Lskip_sme_fa64_\@:
orr x1, x1, #SMCR_ELx_LEN_MASK // Enable full SME vector
msr_s SYS_SMCR_EL2, x1 // length for EL1.
mrs_s x1, SYS_SMIDR_EL1 // Priority mapping supported?
ubfx x1, x1, #SYS_SMIDR_EL1_SMPS_SHIFT, #1
cbz x1, .Lskip_sme_\@
msr_s SYS_SMPRIMAP_EL2, xzr // Make all priorities equal
mrs x1, id_aa64mmfr1_el1 // HCRX_EL2 present?
ubfx x1, x1, #ID_AA64MMFR1_HCX_SHIFT, #4
cbz x1, .Lskip_sme_\@
mrs_s x1, SYS_HCRX_EL2
orr x1, x1, #HCRX_EL2_SMPME_MASK // Enable priority mapping
msr_s SYS_HCRX_EL2, x1
.Lskip_sme_\@:
.endm
/* Disable any fine grained traps */
.macro __init_el2_fgt
mrs x1, id_aa64mmfr0_el1
......@@ -153,15 +197,26 @@
mrs x1, id_aa64dfr0_el1
ubfx x1, x1, #ID_AA64DFR0_PMSVER_SHIFT, #4
cmp x1, #3
b.lt .Lset_fgt_\@
b.lt .Lset_debug_fgt_\@
/* Disable PMSNEVFR_EL1 read and write traps */
orr x0, x0, #(1 << 62)
.Lset_fgt_\@:
.Lset_debug_fgt_\@:
msr_s SYS_HDFGRTR_EL2, x0
msr_s SYS_HDFGWTR_EL2, x0
msr_s SYS_HFGRTR_EL2, xzr
msr_s SYS_HFGWTR_EL2, xzr
mov x0, xzr
mrs x1, id_aa64pfr1_el1
ubfx x1, x1, #ID_AA64PFR1_SME_SHIFT, #4
cbz x1, .Lset_fgt_\@
/* Disable nVHE traps of TPIDR2 and SMPRI */
orr x0, x0, #HFGxTR_EL2_nSMPRI_EL1_MASK
orr x0, x0, #HFGxTR_EL2_nTPIDR2_EL0_MASK
.Lset_fgt_\@:
msr_s SYS_HFGRTR_EL2, x0
msr_s SYS_HFGWTR_EL2, x0
msr_s SYS_HFGITR_EL2, xzr
mrs x1, id_aa64pfr0_el1 // AMU traps UNDEF without AMU
......@@ -196,6 +251,7 @@
__init_el2_nvhe_idregs
__init_el2_nvhe_cptr
__init_el2_nvhe_sve
__init_el2_nvhe_sme
__init_el2_fgt
__init_el2_nvhe_prepare_eret
.endm
......
......@@ -37,7 +37,8 @@
#define ESR_ELx_EC_ERET (0x1a) /* EL2 only */
/* Unallocated EC: 0x1B */
#define ESR_ELx_EC_FPAC (0x1C) /* EL1 and above */
/* Unallocated EC: 0x1D - 0x1E */
#define ESR_ELx_EC_SME (0x1D)
/* Unallocated EC: 0x1E */
#define ESR_ELx_EC_IMP_DEF (0x1f) /* EL3 only */
#define ESR_ELx_EC_IABT_LOW (0x20)
#define ESR_ELx_EC_IABT_CUR (0x21)
......@@ -75,6 +76,7 @@
#define ESR_ELx_IL_SHIFT (25)
#define ESR_ELx_IL (UL(1) << ESR_ELx_IL_SHIFT)
#define ESR_ELx_ISS_MASK (ESR_ELx_IL - 1)
#define ESR_ELx_ISS(esr) ((esr) & ESR_ELx_ISS_MASK)
/* ISS field definitions shared by different classes */
#define ESR_ELx_WNR_SHIFT (6)
......@@ -327,6 +329,15 @@
#define ESR_ELx_CP15_32_ISS_SYS_CNTFRQ (ESR_ELx_CP15_32_ISS_SYS_VAL(0, 0, 14, 0) |\
ESR_ELx_CP15_32_ISS_DIR_READ)
/*
* ISS values for SME traps
*/
#define ESR_ELx_SME_ISS_SME_DISABLED 0
#define ESR_ELx_SME_ISS_ILL 1
#define ESR_ELx_SME_ISS_SM_DISABLED 2
#define ESR_ELx_SME_ISS_ZA_DISABLED 3
#ifndef __ASSEMBLY__
#include <asm/types.h>
......
......@@ -64,6 +64,7 @@ void do_debug_exception(unsigned long addr_if_watchpoint, unsigned int esr,
struct pt_regs *regs);
void do_fpsimd_acc(unsigned int esr, struct pt_regs *regs);
void do_sve_acc(unsigned int esr, struct pt_regs *regs);
void do_sme_acc(unsigned int esr, struct pt_regs *regs);
void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs);
void do_sysinstr(unsigned int esr, struct pt_regs *regs);
void do_sp_pc_abort(unsigned long addr, unsigned int esr, struct pt_regs *regs);
......
......@@ -32,6 +32,18 @@
#define VFP_STATE_SIZE ((32 * 8) + 4)
#endif
/*
* When we defined the maximum SVE vector length we defined the ABI so
* that the maximum vector length included all the reserved for future
* expansion bits in ZCR rather than those just currently defined by
* the architecture. While SME follows a similar pattern the fact that
* it includes a square matrix means that any allocations that attempt
* to cover the maximum potential vector length (such as happen with
* the regset used for ptrace) end up being extremely large. Define
* the much lower actual limit for use in such situations.
*/
#define SME_VQ_MAX 16
struct task_struct;
extern void fpsimd_save_state(struct user_fpsimd_state *state);
......@@ -46,11 +58,23 @@ extern void fpsimd_restore_current_state(void);
extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state,
void *sve_state, unsigned int sve_vl);
void *sve_state, unsigned int sve_vl,
void *za_state, unsigned int sme_vl,
u64 *svcr);
extern void fpsimd_flush_task_state(struct task_struct *target);
extern void fpsimd_save_and_flush_cpu_state(void);
static inline bool thread_sm_enabled(struct thread_struct *thread)
{
return system_supports_sme() && (thread->svcr & SYS_SVCR_EL0_SM_MASK);
}
static inline bool thread_za_enabled(struct thread_struct *thread)
{
return system_supports_sme() && (thread->svcr & SYS_SVCR_EL0_ZA_MASK);
}
/* Maximum VL that SVE/SME VL-agnostic software can transparently support */
#define VL_ARCH_MAX 0x100
......@@ -62,7 +86,14 @@ static inline size_t sve_ffr_offset(int vl)
static inline void *sve_pffr(struct thread_struct *thread)
{
return (char *)thread->sve_state + sve_ffr_offset(thread_get_sve_vl(thread));
unsigned int vl;
if (system_supports_sme() && thread_sm_enabled(thread))
vl = thread_get_sme_vl(thread);
else
vl = thread_get_sve_vl(thread);
return (char *)thread->sve_state + sve_ffr_offset(vl);
}
extern void sve_save_state(void *state, u32 *pfpsr, int save_ffr);
......@@ -71,11 +102,17 @@ extern void sve_load_state(void const *state, u32 const *pfpsr,
extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
extern unsigned int sve_get_vl(void);
extern void sve_set_vq(unsigned long vq_minus_1);
extern void sme_set_vq(unsigned long vq_minus_1);
extern void za_save_state(void *state);
extern void za_load_state(void const *state);
struct arm64_cpu_capabilities;
extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused);
extern void sme_kernel_enable(const struct arm64_cpu_capabilities *__unused);
extern void fa64_kernel_enable(const struct arm64_cpu_capabilities *__unused);
extern u64 read_zcr_features(void);
extern u64 read_smcr_features(void);
/*
* Helpers to translate bit indices in sve_vq_map to VQ values (and
......@@ -119,6 +156,7 @@ struct vl_info {
extern void sve_alloc(struct task_struct *task);
extern void fpsimd_release_task(struct task_struct *task);
extern void fpsimd_sync_to_sve(struct task_struct *task);
extern void fpsimd_force_sync_to_sve(struct task_struct *task);
extern void sve_sync_to_fpsimd(struct task_struct *task);
extern void sve_sync_from_fpsimd_zeropad(struct task_struct *task);
......@@ -170,6 +208,12 @@ static inline void write_vl(enum vec_type type, u64 val)
tmp = read_sysreg_s(SYS_ZCR_EL1) & ~ZCR_ELx_LEN_MASK;
write_sysreg_s(tmp | val, SYS_ZCR_EL1);
break;
#endif
#ifdef CONFIG_ARM64_SME
case ARM64_VEC_SME:
tmp = read_sysreg_s(SYS_SMCR_EL1) & ~SMCR_ELx_LEN_MASK;
write_sysreg_s(tmp | val, SYS_SMCR_EL1);
break;
#endif
default:
WARN_ON_ONCE(1);
......@@ -208,6 +252,8 @@ static inline bool sve_vq_available(unsigned int vq)
return vq_available(ARM64_VEC_SVE, vq);
}
size_t sve_state_size(struct task_struct const *task);
#else /* ! CONFIG_ARM64_SVE */
static inline void sve_alloc(struct task_struct *task) { }
......@@ -247,8 +293,93 @@ static inline void vec_update_vq_map(enum vec_type t) { }
static inline int vec_verify_vq_map(enum vec_type t) { return 0; }
static inline void sve_setup(void) { }
static inline size_t sve_state_size(struct task_struct const *task)
{
return 0;
}
#endif /* ! CONFIG_ARM64_SVE */
#ifdef CONFIG_ARM64_SME
static inline void sme_user_disable(void)
{
sysreg_clear_set(cpacr_el1, CPACR_EL1_SMEN_EL0EN, 0);
}
static inline void sme_user_enable(void)
{
sysreg_clear_set(cpacr_el1, 0, CPACR_EL1_SMEN_EL0EN);
}
static inline void sme_smstart_sm(void)
{
asm volatile(__msr_s(SYS_SVCR_SMSTART_SM_EL0, "xzr"));
}
static inline void sme_smstop_sm(void)
{
asm volatile(__msr_s(SYS_SVCR_SMSTOP_SM_EL0, "xzr"));
}
static inline void sme_smstop(void)
{
asm volatile(__msr_s(SYS_SVCR_SMSTOP_SMZA_EL0, "xzr"));
}
extern void __init sme_setup(void);
static inline int sme_max_vl(void)
{
return vec_max_vl(ARM64_VEC_SME);
}
static inline int sme_max_virtualisable_vl(void)
{
return vec_max_virtualisable_vl(ARM64_VEC_SME);
}
extern void sme_alloc(struct task_struct *task);
extern unsigned int sme_get_vl(void);
extern int sme_set_current_vl(unsigned long arg);
extern int sme_get_current_vl(void);
/*
* Return how many bytes of memory are required to store the full SME
* specific state (currently just ZA) for task, given task's currently
* configured vector length.
*/
static inline size_t za_state_size(struct task_struct const *task)
{
unsigned int vl = task_get_sme_vl(task);
return ZA_SIG_REGS_SIZE(sve_vq_from_vl(vl));
}
#else
static inline void sme_user_disable(void) { BUILD_BUG(); }
static inline void sme_user_enable(void) { BUILD_BUG(); }
static inline void sme_smstart_sm(void) { }
static inline void sme_smstop_sm(void) { }
static inline void sme_smstop(void) { }
static inline void sme_alloc(struct task_struct *task) { }
static inline void sme_setup(void) { }
static inline unsigned int sme_get_vl(void) { return 0; }
static inline int sme_max_vl(void) { return 0; }
static inline int sme_max_virtualisable_vl(void) { return 0; }
static inline int sme_set_current_vl(unsigned long arg) { return -EINVAL; }
static inline int sme_get_current_vl(void) { return -EINVAL; }
static inline size_t za_state_size(struct task_struct const *task)
{
return 0;
}
#endif /* ! CONFIG_ARM64_SME */
/* For use by EFI runtime services calls only */
extern void __efi_fpsimd_begin(void);
extern void __efi_fpsimd_end(void);
......
......@@ -93,6 +93,12 @@
.endif
.endm
.macro _sme_check_wv v
.if (\v) < 12 || (\v) > 15
.error "Bad vector select register \v."
.endif
.endm
/* SVE instruction encodings for non-SVE-capable assemblers */
/* (pre binutils 2.28, all kernel capable clang versions support SVE) */
......@@ -174,6 +180,54 @@
| (\np)
.endm
/* SME instruction encodings for non-SME-capable assemblers */
/* (pre binutils 2.38/LLVM 13) */
/* RDSVL X\nx, #\imm */
.macro _sme_rdsvl nx, imm
_check_general_reg \nx
_check_num (\imm), -0x20, 0x1f
.inst 0x04bf5800 \
| (\nx) \
| (((\imm) & 0x3f) << 5)
.endm
/*
* STR (vector from ZA array):
* STR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
*/
.macro _sme_str_zav nw, nxbase, offset=0
_sme_check_wv \nw
_check_general_reg \nxbase
_check_num (\offset), -0x100, 0xff
.inst 0xe1200000 \
| (((\nw) & 3) << 13) \
| ((\nxbase) << 5) \
| ((\offset) & 7)
.endm
/*
* LDR (vector to ZA array):
* LDR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
*/
.macro _sme_ldr_zav nw, nxbase, offset=0
_sme_check_wv \nw
_check_general_reg \nxbase
_check_num (\offset), -0x100, 0xff
.inst 0xe1000000 \
| (((\nw) & 3) << 13) \
| ((\nxbase) << 5) \
| ((\offset) & 7)
.endm
/*
* Zero the entire ZA array
* ZERO ZA
*/
.macro zero_za
.inst 0xc00800ff
.endm
.macro __for from:req, to:req
.if (\from) == (\to)
_for__body %\from
......@@ -208,6 +262,17 @@
921:
.endm
/* Update SMCR_EL1.LEN with the new VQ */
.macro sme_load_vq xvqminus1, xtmp, xtmp2
mrs_s \xtmp, SYS_SMCR_EL1
bic \xtmp2, \xtmp, SMCR_ELx_LEN_MASK
orr \xtmp2, \xtmp2, \xvqminus1
cmp \xtmp2, \xtmp
b.eq 921f
msr_s SYS_SMCR_EL1, \xtmp2 //self-synchronising
921:
.endm
/* Preserve the first 128-bits of Znz and zero the rest. */
.macro _sve_flush_z nz
_sve_check_zreg \nz
......@@ -254,3 +319,25 @@
ldr w\nxtmp, [\xpfpsr, #4]
msr fpcr, x\nxtmp
.endm
.macro sme_save_za nxbase, xvl, nw
mov w\nw, #0
423:
_sme_str_zav \nw, \nxbase
add x\nxbase, x\nxbase, \xvl
add x\nw, x\nw, #1
cmp \xvl, x\nw
bne 423b
.endm
.macro sme_load_za nxbase, xvl, nw
mov w\nw, #0
423:
_sme_ldr_zav \nw, \nxbase
add x\nxbase, x\nxbase, \xvl
add x\nw, x\nw, #1
cmp \xvl, x\nw
bne 423b
.endm
......@@ -80,8 +80,15 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
struct dyn_ftrace;
struct ftrace_ops;
struct ftrace_regs;
int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
#define ftrace_init_nop ftrace_init_nop
void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs);
#define ftrace_graph_func ftrace_graph_func
#endif
#define ftrace_return_address(n) return_address(n)
......
......@@ -44,6 +44,8 @@ extern void huge_ptep_clear_flush(struct vm_area_struct *vma,
#define __HAVE_ARCH_HUGE_PTE_CLEAR
extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, unsigned long sz);
#define __HAVE_ARCH_HUGE_PTEP_GET
extern pte_t huge_ptep_get(pte_t *ptep);
extern void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte, unsigned long sz);
#define set_huge_swap_pte_at set_huge_swap_pte_at
......
......@@ -109,6 +109,14 @@
#define KERNEL_HWCAP_AFP __khwcap2_feature(AFP)
#define KERNEL_HWCAP_RPRES __khwcap2_feature(RPRES)
#define KERNEL_HWCAP_MTE3 __khwcap2_feature(MTE3)
#define KERNEL_HWCAP_SME __khwcap2_feature(SME)
#define KERNEL_HWCAP_SME_I16I64 __khwcap2_feature(SME_I16I64)
#define KERNEL_HWCAP_SME_F64F64 __khwcap2_feature(SME_F64F64)
#define KERNEL_HWCAP_SME_I8I32 __khwcap2_feature(SME_I8I32)
#define KERNEL_HWCAP_SME_F16F32 __khwcap2_feature(SME_F16F32)
#define KERNEL_HWCAP_SME_B16F32 __khwcap2_feature(SME_B16F32)
#define KERNEL_HWCAP_SME_F32F32 __khwcap2_feature(SME_F32F32)
#define KERNEL_HWCAP_SME_FA64 __khwcap2_feature(SME_FA64)
/*
* This yields a mask that user programs can use to figure out what
......
......@@ -279,6 +279,7 @@
#define CPTR_EL2_TCPAC (1U << 31)
#define CPTR_EL2_TAM (1 << 30)
#define CPTR_EL2_TTA (1 << 20)
#define CPTR_EL2_TSM (1 << 12)
#define CPTR_EL2_TFP (1 << CPTR_EL2_TFP_SHIFT)
#define CPTR_EL2_TZ (1 << 8)
#define CPTR_NVHE_EL2_RES1 0x000032ff /* known RES1 bits in CPTR_EL2 (nVHE) */
......
......@@ -295,8 +295,11 @@ struct vcpu_reset_state {
struct kvm_vcpu_arch {
struct kvm_cpu_context ctxt;
/* Guest floating point state */
void *sve_state;
unsigned int sve_max_vl;
u64 svcr;
/* Stage 2 paging state used by the hardware on next switch */
struct kvm_s2_mmu *hw_mmu;
......@@ -451,6 +454,7 @@ struct kvm_vcpu_arch {
#define KVM_ARM64_DEBUG_STATE_SAVE_TRBE (1 << 13) /* Save TRBE context if active */
#define KVM_ARM64_FP_FOREIGN_FPSTATE (1 << 14)
#define KVM_ARM64_ON_UNSUPPORTED_CPU (1 << 15) /* Physical CPU not in supported_cpus */
#define KVM_ARM64_HOST_SME_ENABLED (1 << 16) /* SME enabled for EL0 */
#define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
KVM_GUESTDBG_USE_SW_BP | \
......
......@@ -47,6 +47,7 @@ long set_mte_ctrl(struct task_struct *task, unsigned long arg);
long get_mte_ctrl(struct task_struct *task);
int mte_ptrace_copy_tags(struct task_struct *child, long request,
unsigned long addr, unsigned long data);
size_t mte_probe_user_range(const char __user *uaddr, size_t size);
#else /* CONFIG_ARM64_MTE */
......
......@@ -49,7 +49,7 @@
#define PMD_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(2)
#define PMD_SIZE (_AC(1, UL) << PMD_SHIFT)
#define PMD_MASK (~(PMD_SIZE-1))
#define PTRS_PER_PMD PTRS_PER_PTE
#define PTRS_PER_PMD (1 << (PAGE_SHIFT - 3))
#endif
/*
......@@ -59,7 +59,7 @@
#define PUD_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(1)
#define PUD_SIZE (_AC(1, UL) << PUD_SHIFT)
#define PUD_MASK (~(PUD_SIZE-1))
#define PTRS_PER_PUD PTRS_PER_PTE
#define PTRS_PER_PUD (1 << (PAGE_SHIFT - 3))
#endif
/*
......
......@@ -1001,7 +1001,8 @@ static inline void update_mmu_cache(struct vm_area_struct *vma,
*/
static inline bool arch_faults_on_old_pte(void)
{
WARN_ON(preemptible());
/* The register read below requires a stable CPU to make any sense */
cant_migrate();
return !cpu_has_hw_af();
}
......
......@@ -118,6 +118,7 @@ struct debug_info {
enum vec_type {
ARM64_VEC_SVE = 0,
ARM64_VEC_SME,
ARM64_VEC_MAX,
};
......@@ -153,6 +154,7 @@ struct thread_struct {
unsigned int fpsimd_cpu;
void *sve_state; /* SVE registers, if any */
void *za_state; /* ZA register, if any */
unsigned int vl[ARM64_VEC_MAX]; /* vector length */
unsigned int vl_onexec[ARM64_VEC_MAX]; /* vl after next exec */
unsigned long fault_address; /* fault info */
......@@ -168,6 +170,8 @@ struct thread_struct {
u64 mte_ctrl;
#endif
u64 sctlr_user;
u64 svcr;
u64 tpidr2_el0;
};
static inline unsigned int thread_get_vl(struct thread_struct *thread,
......@@ -181,6 +185,19 @@ static inline unsigned int thread_get_sve_vl(struct thread_struct *thread)
return thread_get_vl(thread, ARM64_VEC_SVE);
}
static inline unsigned int thread_get_sme_vl(struct thread_struct *thread)
{
return thread_get_vl(thread, ARM64_VEC_SME);
}
static inline unsigned int thread_get_cur_vl(struct thread_struct *thread)
{
if (system_supports_sme() && (thread->svcr & SYS_SVCR_EL0_SM_MASK))
return thread_get_sme_vl(thread);
else
return thread_get_sve_vl(thread);
}
unsigned int task_get_vl(const struct task_struct *task, enum vec_type type);
void task_set_vl(struct task_struct *task, enum vec_type type,
unsigned long vl);
......@@ -194,6 +211,11 @@ static inline unsigned int task_get_sve_vl(const struct task_struct *task)
return task_get_vl(task, ARM64_VEC_SVE);
}
static inline unsigned int task_get_sme_vl(const struct task_struct *task)
{
return task_get_vl(task, ARM64_VEC_SME);
}
static inline void task_set_sve_vl(struct task_struct *task, unsigned long vl)
{
task_set_vl(task, ARM64_VEC_SVE, vl);
......@@ -354,9 +376,11 @@ extern void __init minsigstksz_setup(void);
*/
#include <asm/fpsimd.h>
/* Userspace interface for PR_SVE_{SET,GET}_VL prctl()s: */
/* Userspace interface for PR_S[MV]E_{SET,GET}_VL prctl()s: */
#define SVE_SET_VL(arg) sve_set_current_vl(arg)
#define SVE_GET_VL() sve_get_current_vl()
#define SME_SET_VL(arg) sme_set_current_vl(arg)
#define SME_GET_VL() sme_get_current_vl()
/* PR_PAC_RESET_KEYS prctl */
#define PAC_RESET_KEYS(tsk, arg) ptrauth_prctl_reset_keys(tsk, arg)
......
......@@ -31,38 +31,6 @@ struct stack_info {
enum stack_type type;
};
/*
* A snapshot of a frame record or fp/lr register values, along with some
* accounting information necessary for robust unwinding.
*
* @fp: The fp value in the frame record (or the real fp)
* @pc: The lr value in the frame record (or the real lr)
*
* @stacks_done: Stacks which have been entirely unwound, for which it is no
* longer valid to unwind to.
*
* @prev_fp: The fp that pointed to this frame record, or a synthetic value
* of 0. This is used to ensure that within a stack, each
* subsequent frame record is at an increasing address.
* @prev_type: The type of stack this frame record was on, or a synthetic
* value of STACK_TYPE_UNKNOWN. This is used to detect a
* transition from one stack to another.
*
* @kr_cur: When KRETPROBES is selected, holds the kretprobe instance
* associated with the most recently encountered replacement lr
* value.
*/
struct stackframe {
unsigned long fp;
unsigned long pc;
DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
unsigned long prev_fp;
enum stack_type prev_type;
#ifdef CONFIG_KRETPROBES
struct llist_node *kr_cur;
#endif
};
extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
const char *loglvl);
......
......@@ -118,6 +118,10 @@
* System registers, organised loosely by encoding but grouped together
* where the architected name contains an index. e.g. ID_MMFR<n>_EL1.
*/
#define SYS_SVCR_SMSTOP_SM_EL0 sys_reg(0, 3, 4, 2, 3)
#define SYS_SVCR_SMSTART_SM_EL0 sys_reg(0, 3, 4, 3, 3)
#define SYS_SVCR_SMSTOP_SMZA_EL0 sys_reg(0, 3, 4, 6, 3)
#define SYS_OSDTRRX_EL1 sys_reg(2, 0, 0, 0, 2)
#define SYS_MDCCINT_EL1 sys_reg(2, 0, 0, 2, 0)
#define SYS_MDSCR_EL1 sys_reg(2, 0, 0, 2, 2)
......@@ -181,6 +185,7 @@
#define SYS_ID_AA64PFR0_EL1 sys_reg(3, 0, 0, 4, 0)
#define SYS_ID_AA64PFR1_EL1 sys_reg(3, 0, 0, 4, 1)
#define SYS_ID_AA64ZFR0_EL1 sys_reg(3, 0, 0, 4, 4)
#define SYS_ID_AA64SMFR0_EL1 sys_reg(3, 0, 0, 4, 5)
#define SYS_ID_AA64DFR0_EL1 sys_reg(3, 0, 0, 5, 0)
#define SYS_ID_AA64DFR1_EL1 sys_reg(3, 0, 0, 5, 1)
......@@ -204,6 +209,8 @@
#define SYS_ZCR_EL1 sys_reg(3, 0, 1, 2, 0)
#define SYS_TRFCR_EL1 sys_reg(3, 0, 1, 2, 1)
#define SYS_SMPRI_EL1 sys_reg(3, 0, 1, 2, 4)
#define SYS_SMCR_EL1 sys_reg(3, 0, 1, 2, 6)
#define SYS_TTBR0_EL1 sys_reg(3, 0, 2, 0, 0)
#define SYS_TTBR1_EL1 sys_reg(3, 0, 2, 0, 1)
......@@ -396,6 +403,8 @@
#define TRBIDR_ALIGN_MASK GENMASK(3, 0)
#define TRBIDR_ALIGN_SHIFT 0
#define SMPRI_EL1_PRIORITY_MASK 0xf
#define SYS_PMINTENSET_EL1 sys_reg(3, 0, 9, 14, 1)
#define SYS_PMINTENCLR_EL1 sys_reg(3, 0, 9, 14, 2)
......@@ -451,8 +460,13 @@
#define SYS_CCSIDR_EL1 sys_reg(3, 1, 0, 0, 0)
#define SYS_CLIDR_EL1 sys_reg(3, 1, 0, 0, 1)
#define SYS_GMID_EL1 sys_reg(3, 1, 0, 0, 4)
#define SYS_SMIDR_EL1 sys_reg(3, 1, 0, 0, 6)
#define SYS_AIDR_EL1 sys_reg(3, 1, 0, 0, 7)
#define SYS_SMIDR_EL1_IMPLEMENTER_SHIFT 24
#define SYS_SMIDR_EL1_SMPS_SHIFT 15
#define SYS_SMIDR_EL1_AFFINITY_SHIFT 0
#define SYS_CSSELR_EL1 sys_reg(3, 2, 0, 0, 0)
#define SYS_CTR_EL0 sys_reg(3, 3, 0, 0, 1)
......@@ -461,6 +475,10 @@
#define SYS_RNDR_EL0 sys_reg(3, 3, 2, 4, 0)
#define SYS_RNDRRS_EL0 sys_reg(3, 3, 2, 4, 1)
#define SYS_SVCR_EL0 sys_reg(3, 3, 4, 2, 2)
#define SYS_SVCR_EL0_ZA_MASK 2
#define SYS_SVCR_EL0_SM_MASK 1
#define SYS_PMCR_EL0 sys_reg(3, 3, 9, 12, 0)
#define SYS_PMCNTENSET_EL0 sys_reg(3, 3, 9, 12, 1)
#define SYS_PMCNTENCLR_EL0 sys_reg(3, 3, 9, 12, 2)
......@@ -477,6 +495,7 @@
#define SYS_TPIDR_EL0 sys_reg(3, 3, 13, 0, 2)
#define SYS_TPIDRRO_EL0 sys_reg(3, 3, 13, 0, 3)
#define SYS_TPIDR2_EL0 sys_reg(3, 3, 13, 0, 5)
#define SYS_SCXTNUM_EL0 sys_reg(3, 3, 13, 0, 7)
......@@ -546,6 +565,9 @@
#define SYS_HFGITR_EL2 sys_reg(3, 4, 1, 1, 6)
#define SYS_ZCR_EL2 sys_reg(3, 4, 1, 2, 0)
#define SYS_TRFCR_EL2 sys_reg(3, 4, 1, 2, 1)
#define SYS_HCRX_EL2 sys_reg(3, 4, 1, 2, 2)
#define SYS_SMPRIMAP_EL2 sys_reg(3, 4, 1, 2, 5)
#define SYS_SMCR_EL2 sys_reg(3, 4, 1, 2, 6)
#define SYS_DACR32_EL2 sys_reg(3, 4, 3, 0, 0)
#define SYS_HDFGRTR_EL2 sys_reg(3, 4, 3, 1, 4)
#define SYS_HDFGWTR_EL2 sys_reg(3, 4, 3, 1, 5)
......@@ -605,6 +627,7 @@
#define SYS_SCTLR_EL12 sys_reg(3, 5, 1, 0, 0)
#define SYS_CPACR_EL12 sys_reg(3, 5, 1, 0, 2)
#define SYS_ZCR_EL12 sys_reg(3, 5, 1, 2, 0)
#define SYS_SMCR_EL12 sys_reg(3, 5, 1, 2, 6)
#define SYS_TTBR0_EL12 sys_reg(3, 5, 2, 0, 0)
#define SYS_TTBR1_EL12 sys_reg(3, 5, 2, 0, 1)
#define SYS_TCR_EL12 sys_reg(3, 5, 2, 0, 2)
......@@ -628,6 +651,7 @@
#define SYS_CNTV_CVAL_EL02 sys_reg(3, 5, 14, 3, 2)
/* Common SCTLR_ELx flags. */
#define SCTLR_ELx_ENTP2 (BIT(60))
#define SCTLR_ELx_DSSBS (BIT(44))
#define SCTLR_ELx_ATA (BIT(43))
......@@ -836,6 +860,7 @@
#define ID_AA64PFR0_ELx_32BIT_64BIT 0x2
/* id_aa64pfr1 */
#define ID_AA64PFR1_SME_SHIFT 24
#define ID_AA64PFR1_MPAMFRAC_SHIFT 16
#define ID_AA64PFR1_RASFRAC_SHIFT 12
#define ID_AA64PFR1_MTE_SHIFT 8
......@@ -846,6 +871,7 @@
#define ID_AA64PFR1_SSBS_PSTATE_ONLY 1
#define ID_AA64PFR1_SSBS_PSTATE_INSNS 2
#define ID_AA64PFR1_BT_BTI 0x1
#define ID_AA64PFR1_SME 1
#define ID_AA64PFR1_MTE_NI 0x0
#define ID_AA64PFR1_MTE_EL0 0x1
......@@ -874,6 +900,23 @@
#define ID_AA64ZFR0_AES_PMULL 0x2
#define ID_AA64ZFR0_SVEVER_SVE2 0x1
/* id_aa64smfr0 */
#define ID_AA64SMFR0_FA64_SHIFT 63
#define ID_AA64SMFR0_I16I64_SHIFT 52
#define ID_AA64SMFR0_F64F64_SHIFT 48
#define ID_AA64SMFR0_I8I32_SHIFT 36
#define ID_AA64SMFR0_F16F32_SHIFT 35
#define ID_AA64SMFR0_B16F32_SHIFT 34
#define ID_AA64SMFR0_F32F32_SHIFT 32
#define ID_AA64SMFR0_FA64 0x1
#define ID_AA64SMFR0_I16I64 0x4
#define ID_AA64SMFR0_F64F64 0x1
#define ID_AA64SMFR0_I8I32 0x4
#define ID_AA64SMFR0_F16F32 0x1
#define ID_AA64SMFR0_B16F32 0x1
#define ID_AA64SMFR0_F32F32 0x1
/* id_aa64mmfr0 */
#define ID_AA64MMFR0_ECV_SHIFT 60
#define ID_AA64MMFR0_FGT_SHIFT 56
......@@ -926,6 +969,7 @@
/* id_aa64mmfr1 */
#define ID_AA64MMFR1_ECBHB_SHIFT 60
#define ID_AA64MMFR1_HCX_SHIFT 40
#define ID_AA64MMFR1_AFP_SHIFT 44
#define ID_AA64MMFR1_ETS_SHIFT 36
#define ID_AA64MMFR1_TWED_SHIFT 32
......@@ -1119,9 +1163,24 @@
#define ZCR_ELx_LEN_SIZE 9
#define ZCR_ELx_LEN_MASK 0x1ff
#define SMCR_ELx_FA64_SHIFT 31
#define SMCR_ELx_FA64_MASK (1 << SMCR_ELx_FA64_SHIFT)
/*
* The SMCR_ELx_LEN_* definitions intentionally include bits [8:4] which
* are reserved by the SME architecture for future expansion of the LEN
* field, with compatible semantics.
*/
#define SMCR_ELx_LEN_SHIFT 0
#define SMCR_ELx_LEN_SIZE 9
#define SMCR_ELx_LEN_MASK 0x1ff
#define CPACR_EL1_FPEN_EL1EN (BIT(20)) /* enable EL1 access */
#define CPACR_EL1_FPEN_EL0EN (BIT(21)) /* enable EL0 access, if EL1EN set */
#define CPACR_EL1_SMEN_EL1EN (BIT(24)) /* enable EL1 access */
#define CPACR_EL1_SMEN_EL0EN (BIT(25)) /* enable EL0 access, if EL1EN set */
#define CPACR_EL1_ZEN_EL1EN (BIT(16)) /* enable EL1 access */
#define CPACR_EL1_ZEN_EL0EN (BIT(17)) /* enable EL0 access, if EL1EN set */
......@@ -1170,6 +1229,8 @@
#define TRFCR_ELx_ExTRE BIT(1)
#define TRFCR_ELx_E0TRE BIT(0)
/* HCRX_EL2 definitions */
#define HCRX_EL2_SMPME_MASK (1 << 5)
/* GIC Hypervisor interface registers */
/* ICH_MISR_EL2 bit definitions */
......@@ -1233,6 +1294,12 @@
#define ICH_VTR_TDS_SHIFT 19
#define ICH_VTR_TDS_MASK (1 << ICH_VTR_TDS_SHIFT)
/* HFG[WR]TR_EL2 bit definitions */
#define HFGxTR_EL2_nTPIDR2_EL0_SHIFT 55
#define HFGxTR_EL2_nTPIDR2_EL0_MASK BIT_MASK(HFGxTR_EL2_nTPIDR2_EL0_SHIFT)
#define HFGxTR_EL2_nSMPRI_EL1_SHIFT 54
#define HFGxTR_EL2_nSMPRI_EL1_MASK BIT_MASK(HFGxTR_EL2_nSMPRI_EL1_SHIFT)
#define ARM64_FEATURE_FIELD_BITS 4
/* Create a mask for the feature bits of the specified feature. */
......
......@@ -82,6 +82,8 @@ int arch_dup_task_struct(struct task_struct *dst,
#define TIF_SVE_VL_INHERIT 24 /* Inherit SVE vl_onexec across exec */
#define TIF_SSBD 25 /* Wants SSB mitigation */
#define TIF_TAGGED_ADDR 26 /* Allow tagged user addresses */
#define TIF_SME 27 /* SME in use */
#define TIF_SME_VL_INHERIT 28 /* Inherit SME vl_onexec across exec */
#define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
......
......@@ -460,4 +460,19 @@ static inline int __copy_from_user_flushcache(void *dst, const void __user *src,
}
#endif
#ifdef CONFIG_ARCH_HAS_SUBPAGE_FAULTS
/*
* Return 0 on success, the number of bytes not probed otherwise.
*/
static inline size_t probe_subpage_writeable(const char __user *uaddr,
size_t size)
{
if (!system_supports_mte())
return 0;
return mte_probe_user_range(uaddr, size);
}
#endif /* CONFIG_ARCH_HAS_SUBPAGE_FAULTS */
#endif /* __ASM_UACCESS_H */
......@@ -79,5 +79,13 @@
#define HWCAP2_AFP (1 << 20)
#define HWCAP2_RPRES (1 << 21)
#define HWCAP2_MTE3 (1 << 22)
#define HWCAP2_SME (1 << 23)
#define HWCAP2_SME_I16I64 (1 << 24)
#define HWCAP2_SME_F64F64 (1 << 25)
#define HWCAP2_SME_I8I32 (1 << 26)
#define HWCAP2_SME_F16F32 (1 << 27)
#define HWCAP2_SME_B16F32 (1 << 28)
#define HWCAP2_SME_F32F32 (1 << 29)
#define HWCAP2_SME_FA64 (1 << 30)
#endif /* _UAPI__ASM_HWCAP_H */
......@@ -109,7 +109,7 @@ struct user_hwdebug_state {
} dbg_regs[16];
};
/* SVE/FP/SIMD state (NT_ARM_SVE) */
/* SVE/FP/SIMD state (NT_ARM_SVE & NT_ARM_SSVE) */
struct user_sve_header {
__u32 size; /* total meaningful regset content in bytes */
......@@ -220,6 +220,7 @@ struct user_sve_header {
(SVE_PT_SVE_PREG_OFFSET(vq, __SVE_NUM_PREGS) - \
SVE_PT_SVE_PREGS_OFFSET(vq))
/* For streaming mode SVE (SSVE) FFR must be read and written as zero */
#define SVE_PT_SVE_FFR_OFFSET(vq) \
(SVE_PT_REGS_OFFSET + __SVE_FFR_OFFSET(vq))
......@@ -240,10 +241,12 @@ struct user_sve_header {
- SVE_PT_SVE_OFFSET + (__SVE_VQ_BYTES - 1)) \
/ __SVE_VQ_BYTES * __SVE_VQ_BYTES)
#define SVE_PT_SIZE(vq, flags) \
(((flags) & SVE_PT_REGS_MASK) == SVE_PT_REGS_SVE ? \
SVE_PT_SVE_OFFSET + SVE_PT_SVE_SIZE(vq, flags) \
: SVE_PT_FPSIMD_OFFSET + SVE_PT_FPSIMD_SIZE(vq, flags))
#define SVE_PT_SIZE(vq, flags) \
(((flags) & SVE_PT_REGS_MASK) == SVE_PT_REGS_SVE ? \
SVE_PT_SVE_OFFSET + SVE_PT_SVE_SIZE(vq, flags) \
: ((((flags) & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD ? \
SVE_PT_FPSIMD_OFFSET + SVE_PT_FPSIMD_SIZE(vq, flags) \
: SVE_PT_REGS_OFFSET)))
/* pointer authentication masks (NT_ARM_PAC_MASK) */
......@@ -265,6 +268,62 @@ struct user_pac_generic_keys {
__uint128_t apgakey;
};
/* ZA state (NT_ARM_ZA) */
struct user_za_header {
__u32 size; /* total meaningful regset content in bytes */
__u32 max_size; /* maxmium possible size for this thread */
__u16 vl; /* current vector length */
__u16 max_vl; /* maximum possible vector length */
__u16 flags;
__u16 __reserved;
};
/*
* Common ZA_PT_* flags:
* These must be kept in sync with prctl interface in <linux/prctl.h>
*/
#define ZA_PT_VL_INHERIT ((1 << 17) /* PR_SME_VL_INHERIT */ >> 16)
#define ZA_PT_VL_ONEXEC ((1 << 18) /* PR_SME_SET_VL_ONEXEC */ >> 16)
/*
* The remainder of the ZA state follows struct user_za_header. The
* total size of the ZA state (including header) depends on the
* metadata in the header: ZA_PT_SIZE(vq, flags) gives the total size
* of the state in bytes, including the header.
*
* Refer to <asm/sigcontext.h> for details of how to pass the correct
* "vq" argument to these macros.
*/
/* Offset from the start of struct user_za_header to the register data */
#define ZA_PT_ZA_OFFSET \
((sizeof(struct user_za_header) + (__SVE_VQ_BYTES - 1)) \
/ __SVE_VQ_BYTES * __SVE_VQ_BYTES)
/*
* The payload starts at offset ZA_PT_ZA_OFFSET, and is of size
* ZA_PT_ZA_SIZE(vq, flags).
*
* The ZA array is stored as a sequence of horizontal vectors ZAV of SVL/8
* bytes each, starting from vector 0.
*
* Additional data might be appended in the future.
*
* The ZA matrix is represented in memory in an endianness-invariant layout
* which differs from the layout used for the FPSIMD V-registers on big-endian
* systems: see sigcontext.h for more explanation.
*/
#define ZA_PT_ZAV_OFFSET(vq, n) \
(ZA_PT_ZA_OFFSET + ((vq * __SVE_VQ_BYTES) * n))
#define ZA_PT_ZA_SIZE(vq) ((vq * __SVE_VQ_BYTES) * (vq * __SVE_VQ_BYTES))
#define ZA_PT_SIZE(vq) \
(ZA_PT_ZA_OFFSET + ZA_PT_ZA_SIZE(vq))
#endif /* __ASSEMBLY__ */
#endif /* _UAPI__ASM_PTRACE_H */
......@@ -132,6 +132,17 @@ struct extra_context {
#define SVE_MAGIC 0x53564501
struct sve_context {
struct _aarch64_ctx head;
__u16 vl;
__u16 flags;
__u16 __reserved[2];
};
#define SVE_SIG_FLAG_SM 0x1 /* Context describes streaming mode */
#define ZA_MAGIC 0x54366345
struct za_context {
struct _aarch64_ctx head;
__u16 vl;
__u16 __reserved[3];
......@@ -186,9 +197,16 @@ struct sve_context {
* sve_context.vl must equal the thread's current vector length when
* doing a sigreturn.
*
* On systems with support for SME the SVE register state may reflect either
* streaming or non-streaming mode. In streaming mode the streaming mode
* vector length will be used and the flag SVE_SIG_FLAG_SM will be set in
* the flags field. It is permitted to enter or leave streaming mode in
* a signal return, applications should take care to ensure that any difference
* in vector length between the two modes is handled, including any resizing
* and movement of context blocks.
*
* Note: for all these macros, the "vq" argument denotes the SVE
* vector length in quadwords (i.e., units of 128 bits).
* Note: for all these macros, the "vq" argument denotes the vector length
* in quadwords (i.e., units of 128 bits).
*
* The correct way to obtain vq is to use sve_vq_from_vl(vl). The
* result is valid if and only if sve_vl_valid(vl) is true. This is
......@@ -249,4 +267,37 @@ struct sve_context {
#define SVE_SIG_CONTEXT_SIZE(vq) \
(SVE_SIG_REGS_OFFSET + SVE_SIG_REGS_SIZE(vq))
/*
* If the ZA register is enabled for the thread at signal delivery then,
* za_context.head.size >= ZA_SIG_CONTEXT_SIZE(sve_vq_from_vl(za_context.vl))
* and the register data may be accessed using the ZA_SIG_*() macros.
*
* If za_context.head.size < ZA_SIG_CONTEXT_SIZE(sve_vq_from_vl(za_context.vl))
* then ZA was not enabled and no register data was included in which case
* ZA register was not enabled for the thread and no register data
* the ZA_SIG_*() macros should not be used except for this check.
*
* The same convention applies when returning from a signal: a caller
* will need to remove or resize the za_context block if it wants to
* enable the ZA register when it was previously non-live or vice-versa.
* This may require the caller to allocate fresh memory and/or move other
* context blocks in the signal frame.
*
* Changing the vector length during signal return is not permitted:
* za_context.vl must equal the thread's current SME vector length when
* doing a sigreturn.
*/
#define ZA_SIG_REGS_OFFSET \
((sizeof(struct za_context) + (__SVE_VQ_BYTES - 1)) \
/ __SVE_VQ_BYTES * __SVE_VQ_BYTES)
#define ZA_SIG_REGS_SIZE(vq) ((vq * __SVE_VQ_BYTES) * (vq * __SVE_VQ_BYTES))
#define ZA_SIG_ZAV_OFFSET(vq, n) (ZA_SIG_REGS_OFFSET + \
(SVE_SIG_ZREG_SIZE(vq) * n))
#define ZA_SIG_CONTEXT_SIZE(vq) \
(ZA_SIG_REGS_OFFSET + ZA_SIG_REGS_SIZE(vq))
#endif /* _UAPI__ASM_SIGCONTEXT_H */
......@@ -215,7 +215,7 @@ static const struct arm64_cpu_capabilities arm64_repeat_tlbi_list[] = {
#endif
#ifdef CONFIG_CAVIUM_ERRATUM_23154
const struct midr_range cavium_erratum_23154_cpus[] = {
static const struct midr_range cavium_erratum_23154_cpus[] = {
MIDR_ALL_VERSIONS(MIDR_THUNDERX),
MIDR_ALL_VERSIONS(MIDR_THUNDERX_81XX),
MIDR_ALL_VERSIONS(MIDR_THUNDERX_83XX),
......
......@@ -261,6 +261,8 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
};
static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_SME_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_MPAMFRAC_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_RASFRAC_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_MTE),
......@@ -293,6 +295,24 @@ static const struct arm64_ftr_bits ftr_id_aa64zfr0[] = {
ARM64_FTR_END,
};
static const struct arm64_ftr_bits ftr_id_aa64smfr0[] = {
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_FA64_SHIFT, 1, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_I16I64_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_F64F64_SHIFT, 1, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_I8I32_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_F16F32_SHIFT, 1, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_B16F32_SHIFT, 1, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_F32F32_SHIFT, 1, 0),
ARM64_FTR_END,
};
static const struct arm64_ftr_bits ftr_id_aa64mmfr0[] = {
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_ECV_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_FGT_SHIFT, 4, 0),
......@@ -561,6 +581,12 @@ static const struct arm64_ftr_bits ftr_zcr[] = {
ARM64_FTR_END,
};
static const struct arm64_ftr_bits ftr_smcr[] = {
ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE,
SMCR_ELx_LEN_SHIFT, SMCR_ELx_LEN_SIZE, 0), /* LEN */
ARM64_FTR_END,
};
/*
* Common ftr bits for a 32bit register with all hidden, strict
* attributes, with 4bit feature fields and a default safe value of
......@@ -645,6 +671,7 @@ static const struct __ftr_reg_entry {
ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64PFR1_EL1, ftr_id_aa64pfr1,
&id_aa64pfr1_override),
ARM64_FTR_REG(SYS_ID_AA64ZFR0_EL1, ftr_id_aa64zfr0),
ARM64_FTR_REG(SYS_ID_AA64SMFR0_EL1, ftr_id_aa64smfr0),
/* Op1 = 0, CRn = 0, CRm = 5 */
ARM64_FTR_REG(SYS_ID_AA64DFR0_EL1, ftr_id_aa64dfr0),
......@@ -666,6 +693,7 @@ static const struct __ftr_reg_entry {
/* Op1 = 0, CRn = 1, CRm = 2 */
ARM64_FTR_REG(SYS_ZCR_EL1, ftr_zcr),
ARM64_FTR_REG(SYS_SMCR_EL1, ftr_smcr),
/* Op1 = 1, CRn = 0, CRm = 0 */
ARM64_FTR_REG(SYS_GMID_EL1, ftr_gmid),
......@@ -960,6 +988,7 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0);
init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1);
init_cpu_ftr_reg(SYS_ID_AA64ZFR0_EL1, info->reg_id_aa64zfr0);
init_cpu_ftr_reg(SYS_ID_AA64SMFR0_EL1, info->reg_id_aa64smfr0);
if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0))
init_32bit_cpu_features(&info->aarch32);
......@@ -969,6 +998,12 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
vec_init_vq_map(ARM64_VEC_SVE);
}
if (id_aa64pfr1_sme(info->reg_id_aa64pfr1)) {
init_cpu_ftr_reg(SYS_SMCR_EL1, info->reg_smcr);
if (IS_ENABLED(CONFIG_ARM64_SME))
vec_init_vq_map(ARM64_VEC_SME);
}
if (id_aa64pfr1_mte(info->reg_id_aa64pfr1))
init_cpu_ftr_reg(SYS_GMID_EL1, info->reg_gmid);
......@@ -1195,6 +1230,9 @@ void update_cpu_features(int cpu,
taint |= check_update_ftr_reg(SYS_ID_AA64ZFR0_EL1, cpu,
info->reg_id_aa64zfr0, boot->reg_id_aa64zfr0);
taint |= check_update_ftr_reg(SYS_ID_AA64SMFR0_EL1, cpu,
info->reg_id_aa64smfr0, boot->reg_id_aa64smfr0);
if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {
taint |= check_update_ftr_reg(SYS_ZCR_EL1, cpu,
info->reg_zcr, boot->reg_zcr);
......@@ -1205,6 +1243,16 @@ void update_cpu_features(int cpu,
vec_update_vq_map(ARM64_VEC_SVE);
}
if (id_aa64pfr1_sme(info->reg_id_aa64pfr1)) {
taint |= check_update_ftr_reg(SYS_SMCR_EL1, cpu,
info->reg_smcr, boot->reg_smcr);
/* Probe vector lengths, unless we already gave up on SME */
if (id_aa64pfr1_sme(read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1)) &&
!system_capabilities_finalized())
vec_update_vq_map(ARM64_VEC_SME);
}
/*
* The kernel uses the LDGM/STGM instructions and the number of tags
* they read/write depends on the GMID_EL1.BS field. Check that the
......@@ -1288,6 +1336,7 @@ u64 __read_sysreg_by_encoding(u32 sys_id)
read_sysreg_case(SYS_ID_AA64PFR0_EL1);
read_sysreg_case(SYS_ID_AA64PFR1_EL1);
read_sysreg_case(SYS_ID_AA64ZFR0_EL1);
read_sysreg_case(SYS_ID_AA64SMFR0_EL1);
read_sysreg_case(SYS_ID_AA64DFR0_EL1);
read_sysreg_case(SYS_ID_AA64DFR1_EL1);
read_sysreg_case(SYS_ID_AA64MMFR0_EL1);
......@@ -2442,6 +2491,33 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.matches = has_cpuid_feature,
.min_field_value = 1,
},
#ifdef CONFIG_ARM64_SME
{
.desc = "Scalable Matrix Extension",
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
.capability = ARM64_SME,
.sys_reg = SYS_ID_AA64PFR1_EL1,
.sign = FTR_UNSIGNED,
.field_pos = ID_AA64PFR1_SME_SHIFT,
.field_width = 4,
.min_field_value = ID_AA64PFR1_SME,
.matches = has_cpuid_feature,
.cpu_enable = sme_kernel_enable,
},
/* FA64 should be sorted after the base SME capability */
{
.desc = "FA64",
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
.capability = ARM64_SME_FA64,
.sys_reg = SYS_ID_AA64SMFR0_EL1,
.sign = FTR_UNSIGNED,
.field_pos = ID_AA64SMFR0_FA64_SHIFT,
.field_width = 1,
.min_field_value = ID_AA64SMFR0_FA64,
.matches = has_cpuid_feature,
.cpu_enable = fa64_kernel_enable,
},
#endif /* CONFIG_ARM64_SME */
{},
};
......@@ -2575,6 +2651,16 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
HWCAP_CAP(SYS_ID_AA64MMFR0_EL1, ID_AA64MMFR0_ECV_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ECV),
HWCAP_CAP(SYS_ID_AA64MMFR1_EL1, ID_AA64MMFR1_AFP_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_AFP),
HWCAP_CAP(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_RPRES_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_RPRES),
#ifdef CONFIG_ARM64_SME
HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_SME_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_SME, CAP_HWCAP, KERNEL_HWCAP_SME),
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_FA64_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_FA64, CAP_HWCAP, KERNEL_HWCAP_SME_FA64),
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_I16I64_SHIFT, 4, FTR_UNSIGNED, ID_AA64SMFR0_I16I64, CAP_HWCAP, KERNEL_HWCAP_SME_I16I64),
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_F64F64_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_F64F64, CAP_HWCAP, KERNEL_HWCAP_SME_F64F64),
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_I8I32_SHIFT, 4, FTR_UNSIGNED, ID_AA64SMFR0_I8I32, CAP_HWCAP, KERNEL_HWCAP_SME_I8I32),
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_F16F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_F16F32, CAP_HWCAP, KERNEL_HWCAP_SME_F16F32),
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_B16F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_B16F32, CAP_HWCAP, KERNEL_HWCAP_SME_B16F32),
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_F32F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_F32F32, CAP_HWCAP, KERNEL_HWCAP_SME_F32F32),
#endif /* CONFIG_ARM64_SME */
{},
};
......@@ -2872,6 +2958,23 @@ static void verify_sve_features(void)
/* Add checks on other ZCR bits here if necessary */
}
static void verify_sme_features(void)
{
u64 safe_smcr = read_sanitised_ftr_reg(SYS_SMCR_EL1);
u64 smcr = read_smcr_features();
unsigned int safe_len = safe_smcr & SMCR_ELx_LEN_MASK;
unsigned int len = smcr & SMCR_ELx_LEN_MASK;
if (len < safe_len || vec_verify_vq_map(ARM64_VEC_SME)) {
pr_crit("CPU%d: SME: vector length support mismatch\n",
smp_processor_id());
cpu_die_early();
}
/* Add checks on other SMCR bits here if necessary */
}
static void verify_hyp_capabilities(void)
{
u64 safe_mmfr1, mmfr0, mmfr1;
......@@ -2924,6 +3027,9 @@ static void verify_local_cpu_capabilities(void)
if (system_supports_sve())
verify_sve_features();
if (system_supports_sme())
verify_sme_features();
if (is_hyp_mode_available())
verify_hyp_capabilities();
}
......@@ -3041,6 +3147,7 @@ void __init setup_cpu_features(void)
pr_info("emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching\n");
sve_setup();
sme_setup();
minsigstksz_setup();
/* Advertise that we have computed the system capabilities */
......
......@@ -98,6 +98,14 @@ static const char *const hwcap_str[] = {
[KERNEL_HWCAP_AFP] = "afp",
[KERNEL_HWCAP_RPRES] = "rpres",
[KERNEL_HWCAP_MTE3] = "mte3",
[KERNEL_HWCAP_SME] = "sme",
[KERNEL_HWCAP_SME_I16I64] = "smei16i64",
[KERNEL_HWCAP_SME_F64F64] = "smef64f64",
[KERNEL_HWCAP_SME_I8I32] = "smei8i32",
[KERNEL_HWCAP_SME_F16F32] = "smef16f32",
[KERNEL_HWCAP_SME_B16F32] = "smeb16f32",
[KERNEL_HWCAP_SME_F32F32] = "smef32f32",
[KERNEL_HWCAP_SME_FA64] = "smefa64",
};
#ifdef CONFIG_COMPAT
......@@ -401,6 +409,7 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
info->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1);
info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1);
info->reg_id_aa64zfr0 = read_cpuid(ID_AA64ZFR0_EL1);
info->reg_id_aa64smfr0 = read_cpuid(ID_AA64SMFR0_EL1);
if (id_aa64pfr1_mte(info->reg_id_aa64pfr1))
info->reg_gmid = read_cpuid(GMID_EL1);
......@@ -412,6 +421,10 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
id_aa64pfr0_sve(info->reg_id_aa64pfr0))
info->reg_zcr = read_zcr_features();
if (IS_ENABLED(CONFIG_ARM64_SME) &&
id_aa64pfr1_sme(info->reg_id_aa64pfr1))
info->reg_smcr = read_smcr_features();
cpuinfo_detect_icache_policy(info);
}
......
......@@ -537,6 +537,14 @@ static void noinstr el0_sve_acc(struct pt_regs *regs, unsigned long esr)
exit_to_user_mode(regs);
}
static void noinstr el0_sme_acc(struct pt_regs *regs, unsigned long esr)
{
enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_sme_acc(esr, regs);
exit_to_user_mode(regs);
}
static void noinstr el0_fpsimd_exc(struct pt_regs *regs, unsigned long esr)
{
enter_from_user_mode(regs);
......@@ -645,6 +653,9 @@ asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
case ESR_ELx_EC_SVE:
el0_sve_acc(regs, esr);
break;
case ESR_ELx_EC_SME:
el0_sme_acc(regs, esr);
break;
case ESR_ELx_EC_FP_EXC64:
el0_fpsimd_exc(regs, esr);
break;
......
......@@ -86,3 +86,39 @@ SYM_FUNC_START(sve_flush_live)
SYM_FUNC_END(sve_flush_live)
#endif /* CONFIG_ARM64_SVE */
#ifdef CONFIG_ARM64_SME
SYM_FUNC_START(sme_get_vl)
_sme_rdsvl 0, 1
ret
SYM_FUNC_END(sme_get_vl)
SYM_FUNC_START(sme_set_vq)
sme_load_vq x0, x1, x2
ret
SYM_FUNC_END(sme_set_vq)
/*
* Save the SME state
*
* x0 - pointer to buffer for state
*/
SYM_FUNC_START(za_save_state)
_sme_rdsvl 1, 1 // x1 = VL/8
sme_save_za 0, x1, 12
ret
SYM_FUNC_END(za_save_state)
/*
* Load the SME state
*
* x0 - pointer to buffer for state
*/
SYM_FUNC_START(za_load_state)
_sme_rdsvl 1, 1 // x1 = VL/8
sme_load_za 0, x1, 12
ret
SYM_FUNC_END(za_load_state)
#endif /* CONFIG_ARM64_SME */
......@@ -97,12 +97,6 @@ SYM_CODE_START(ftrace_common)
SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
bl ftrace_stub
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL) // ftrace_graph_caller();
nop // If enabled, this will be replaced
// "b ftrace_graph_caller"
#endif
/*
* At the callsite x0-x8 and x19-x30 were live. Any C code will have preserved
* x19-x29 per the AAPCS, and we created frame records upon entry, so we need
......@@ -127,17 +121,6 @@ ftrace_common_return:
ret x9
SYM_CODE_END(ftrace_common)
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
SYM_CODE_START(ftrace_graph_caller)
ldr x0, [sp, #S_PC]
sub x0, x0, #AARCH64_INSN_SIZE // ip (callsite's BL insn)
add x1, sp, #S_LR // parent_ip (callsite's LR)
ldr x2, [sp, #PT_REGS_SIZE] // parent fp (callsite's FP)
bl prepare_ftrace_return
b ftrace_common_return
SYM_CODE_END(ftrace_graph_caller)
#endif
#else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
/*
......
此差异已折叠。
......@@ -268,6 +268,22 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
}
#ifdef CONFIG_DYNAMIC_FTRACE
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs)
{
/*
* When DYNAMIC_FTRACE_WITH_REGS is selected, `fregs` can never be NULL
* and arch_ftrace_get_regs(fregs) will always give a non-NULL pt_regs
* in which we can safely modify the LR.
*/
struct pt_regs *regs = arch_ftrace_get_regs(fregs);
unsigned long *parent = (unsigned long *)&procedure_link_pointer(regs);
prepare_ftrace_return(ip, parent, frame_pointer(regs));
}
#else
/*
* Turn on/off the call to ftrace_graph_caller() in ftrace_caller()
* depending on @enable.
......@@ -297,5 +313,6 @@ int ftrace_disable_ftrace_graph_caller(void)
{
return ftrace_modify_graph_caller(false);
}
#endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
#endif /* CONFIG_DYNAMIC_FTRACE */
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
......@@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
/* in reserved memory? */
addr = __pfn_to_phys(pfn);
if ((addr < crashk_res.start) || (crashk_res.end < addr))
return false;
if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
if (!crashk_low_res.end)
return false;
if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
return false;
}
if (!kexec_crash_image)
return true;
......
......@@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
/* Exclude crashkernel region */
ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
if (ret)
goto out;
if (crashk_low_res.end) {
ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
if (ret)
goto out;
}
if (!ret)
ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
out:
kfree(cmem);
return ret;
}
......
......@@ -15,6 +15,7 @@
#include <linux/swapops.h>
#include <linux/thread_info.h>
#include <linux/types.h>
#include <linux/uaccess.h>
#include <linux/uio.h>
#include <asm/barrier.h>
......@@ -543,3 +544,32 @@ static int register_mte_tcf_preferred_sysctl(void)
return 0;
}
subsys_initcall(register_mte_tcf_preferred_sysctl);
/*
* Return 0 on success, the number of bytes not probed otherwise.
*/
size_t mte_probe_user_range(const char __user *uaddr, size_t size)
{
const char __user *end = uaddr + size;
int err = 0;
char val;
__raw_get_user(val, uaddr, err);
if (err)
return size;
uaddr = PTR_ALIGN(uaddr, MTE_GRANULE_SIZE);
while (uaddr < end) {
/*
* A read is sufficient for mte, the caller should have probed
* for the pte write permission if required.
*/
__raw_get_user(val, uaddr, err);
if (err)
return end - uaddr;
uaddr += MTE_GRANULE_SIZE;
}
(void)val;
return 0;
}
......@@ -250,6 +250,8 @@ void show_regs(struct pt_regs *regs)
static void tls_thread_flush(void)
{
write_sysreg(0, tpidr_el0);
if (system_supports_tpidr2())
write_sysreg_s(0, SYS_TPIDR2_EL0);
if (is_compat_task()) {
current->thread.uw.tp_value = 0;
......@@ -298,16 +300,42 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
/*
* Detach src's sve_state (if any) from dst so that it does not
* get erroneously used or freed prematurely. dst's sve_state
* get erroneously used or freed prematurely. dst's copies
* will be allocated on demand later on if dst uses SVE.
* For consistency, also clear TIF_SVE here: this could be done
* later in copy_process(), but to avoid tripping up future
* maintainers it is best not to leave TIF_SVE and sve_state in
* maintainers it is best not to leave TIF flags and buffers in
* an inconsistent state, even temporarily.
*/
dst->thread.sve_state = NULL;
clear_tsk_thread_flag(dst, TIF_SVE);
/*
* In the unlikely event that we create a new thread with ZA
* enabled we should retain the ZA state so duplicate it here.
* This may be shortly freed if we exec() or if CLONE_SETTLS
* but it's simpler to do it here. To avoid confusing the rest
* of the code ensure that we have a sve_state allocated
* whenever za_state is allocated.
*/
if (thread_za_enabled(&src->thread)) {
dst->thread.sve_state = kzalloc(sve_state_size(src),
GFP_KERNEL);
if (!dst->thread.sve_state)
return -ENOMEM;
dst->thread.za_state = kmemdup(src->thread.za_state,
za_state_size(src),
GFP_KERNEL);
if (!dst->thread.za_state) {
kfree(dst->thread.sve_state);
dst->thread.sve_state = NULL;
return -ENOMEM;
}
} else {
dst->thread.za_state = NULL;
clear_tsk_thread_flag(dst, TIF_SME);
}
/* clear any pending asynchronous tag fault raised by the parent */
clear_tsk_thread_flag(dst, TIF_MTE_ASYNC_FAULT);
......@@ -343,6 +371,8 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
* out-of-sync with the saved value.
*/
*task_user_tls(p) = read_sysreg(tpidr_el0);
if (system_supports_tpidr2())
p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
if (stack_start) {
if (is_compat_thread(task_thread_info(p)))
......@@ -353,10 +383,12 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
/*
* If a TLS pointer was passed to clone, use it for the new
* thread.
* thread. We also reset TPIDR2 if it's in use.
*/
if (clone_flags & CLONE_SETTLS)
if (clone_flags & CLONE_SETTLS) {
p->thread.uw.tp_value = tls;
p->thread.tpidr2_el0 = 0;
}
} else {
/*
* A kthread has no context to ERET to, so ensure any buggy
......@@ -387,6 +419,8 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
void tls_preserve_current_state(void)
{
*task_user_tls(current) = read_sysreg(tpidr_el0);
if (system_supports_tpidr2() && !is_compat_task())
current->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
}
static void tls_thread_switch(struct task_struct *next)
......@@ -399,6 +433,8 @@ static void tls_thread_switch(struct task_struct *next)
write_sysreg(0, tpidrro_el0);
write_sysreg(*task_user_tls(next), tpidr_el0);
if (system_supports_tpidr2())
write_sysreg_s(next->thread.tpidr2_el0, SYS_TPIDR2_EL0);
}
/*
......
......@@ -713,21 +713,51 @@ static int system_call_set(struct task_struct *target,
#ifdef CONFIG_ARM64_SVE
static void sve_init_header_from_task(struct user_sve_header *header,
struct task_struct *target)
struct task_struct *target,
enum vec_type type)
{
unsigned int vq;
bool active;
bool fpsimd_only;
enum vec_type task_type;
memset(header, 0, sizeof(*header));
header->flags = test_tsk_thread_flag(target, TIF_SVE) ?
SVE_PT_REGS_SVE : SVE_PT_REGS_FPSIMD;
if (test_tsk_thread_flag(target, TIF_SVE_VL_INHERIT))
header->flags |= SVE_PT_VL_INHERIT;
/* Check if the requested registers are active for the task */
if (thread_sm_enabled(&target->thread))
task_type = ARM64_VEC_SME;
else
task_type = ARM64_VEC_SVE;
active = (task_type == type);
switch (type) {
case ARM64_VEC_SVE:
if (test_tsk_thread_flag(target, TIF_SVE_VL_INHERIT))
header->flags |= SVE_PT_VL_INHERIT;
fpsimd_only = !test_tsk_thread_flag(target, TIF_SVE);
break;
case ARM64_VEC_SME:
if (test_tsk_thread_flag(target, TIF_SME_VL_INHERIT))
header->flags |= SVE_PT_VL_INHERIT;
fpsimd_only = false;
break;
default:
WARN_ON_ONCE(1);
return;
}
header->vl = task_get_sve_vl(target);
if (active) {
if (fpsimd_only) {
header->flags |= SVE_PT_REGS_FPSIMD;
} else {
header->flags |= SVE_PT_REGS_SVE;
}
}
header->vl = task_get_vl(target, type);
vq = sve_vq_from_vl(header->vl);
header->max_vl = sve_max_vl();
header->max_vl = vec_max_vl(type);
header->size = SVE_PT_SIZE(vq, header->flags);
header->max_size = SVE_PT_SIZE(sve_vq_from_vl(header->max_vl),
SVE_PT_REGS_SVE);
......@@ -738,19 +768,17 @@ static unsigned int sve_size_from_header(struct user_sve_header const *header)
return ALIGN(header->size, SVE_VQ_BYTES);
}
static int sve_get(struct task_struct *target,
const struct user_regset *regset,
struct membuf to)
static int sve_get_common(struct task_struct *target,
const struct user_regset *regset,
struct membuf to,
enum vec_type type)
{
struct user_sve_header header;
unsigned int vq;
unsigned long start, end;
if (!system_supports_sve())
return -EINVAL;
/* Header */
sve_init_header_from_task(&header, target);
sve_init_header_from_task(&header, target, type);
vq = sve_vq_from_vl(header.vl);
membuf_write(&to, &header, sizeof(header));
......@@ -758,49 +786,61 @@ static int sve_get(struct task_struct *target,
if (target == current)
fpsimd_preserve_current_state();
/* Registers: FPSIMD-only case */
BUILD_BUG_ON(SVE_PT_FPSIMD_OFFSET != sizeof(header));
if ((header.flags & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD)
BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header));
switch ((header.flags & SVE_PT_REGS_MASK)) {
case SVE_PT_REGS_FPSIMD:
return __fpr_get(target, regset, to);
/* Otherwise: full SVE case */
case SVE_PT_REGS_SVE:
start = SVE_PT_SVE_OFFSET;
end = SVE_PT_SVE_FFR_OFFSET(vq) + SVE_PT_SVE_FFR_SIZE(vq);
membuf_write(&to, target->thread.sve_state, end - start);
BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header));
start = SVE_PT_SVE_OFFSET;
end = SVE_PT_SVE_FFR_OFFSET(vq) + SVE_PT_SVE_FFR_SIZE(vq);
membuf_write(&to, target->thread.sve_state, end - start);
start = end;
end = SVE_PT_SVE_FPSR_OFFSET(vq);
membuf_zero(&to, end - start);
start = end;
end = SVE_PT_SVE_FPSR_OFFSET(vq);
membuf_zero(&to, end - start);
/*
* Copy fpsr, and fpcr which must follow contiguously in
* struct fpsimd_state:
*/
start = end;
end = SVE_PT_SVE_FPCR_OFFSET(vq) + SVE_PT_SVE_FPCR_SIZE;
membuf_write(&to, &target->thread.uw.fpsimd_state.fpsr,
end - start);
/*
* Copy fpsr, and fpcr which must follow contiguously in
* struct fpsimd_state:
*/
start = end;
end = SVE_PT_SVE_FPCR_OFFSET(vq) + SVE_PT_SVE_FPCR_SIZE;
membuf_write(&to, &target->thread.uw.fpsimd_state.fpsr, end - start);
start = end;
end = sve_size_from_header(&header);
return membuf_zero(&to, end - start);
start = end;
end = sve_size_from_header(&header);
return membuf_zero(&to, end - start);
default:
return 0;
}
}
static int sve_set(struct task_struct *target,
static int sve_get(struct task_struct *target,
const struct user_regset *regset,
unsigned int pos, unsigned int count,
const void *kbuf, const void __user *ubuf)
struct membuf to)
{
if (!system_supports_sve())
return -EINVAL;
return sve_get_common(target, regset, to, ARM64_VEC_SVE);
}
static int sve_set_common(struct task_struct *target,
const struct user_regset *regset,
unsigned int pos, unsigned int count,
const void *kbuf, const void __user *ubuf,
enum vec_type type)
{
int ret;
struct user_sve_header header;
unsigned int vq;
unsigned long start, end;
if (!system_supports_sve())
return -EINVAL;
/* Header */
if (count < sizeof(header))
return -EINVAL;
......@@ -813,13 +853,37 @@ static int sve_set(struct task_struct *target,
* Apart from SVE_PT_REGS_MASK, all SVE_PT_* flags are consumed by
* vec_set_vector_length(), which will also validate them for us:
*/
ret = vec_set_vector_length(target, ARM64_VEC_SVE, header.vl,
ret = vec_set_vector_length(target, type, header.vl,
((unsigned long)header.flags & ~SVE_PT_REGS_MASK) << 16);
if (ret)
goto out;
/* Actual VL set may be less than the user asked for: */
vq = sve_vq_from_vl(task_get_sve_vl(target));
vq = sve_vq_from_vl(task_get_vl(target, type));
/* Enter/exit streaming mode */
if (system_supports_sme()) {
u64 old_svcr = target->thread.svcr;
switch (type) {
case ARM64_VEC_SVE:
target->thread.svcr &= ~SYS_SVCR_EL0_SM_MASK;
break;
case ARM64_VEC_SME:
target->thread.svcr |= SYS_SVCR_EL0_SM_MASK;
break;
default:
WARN_ON_ONCE(1);
return -EINVAL;
}
/*
* If we switched then invalidate any existing SVE
* state and ensure there's storage.
*/
if (target->thread.svcr != old_svcr)
sve_alloc(target);
}
/* Registers: FPSIMD-only case */
......@@ -828,10 +892,15 @@ static int sve_set(struct task_struct *target,
ret = __fpr_set(target, regset, pos, count, kbuf, ubuf,
SVE_PT_FPSIMD_OFFSET);
clear_tsk_thread_flag(target, TIF_SVE);
if (type == ARM64_VEC_SME)
fpsimd_force_sync_to_sve(target);
goto out;
}
/* Otherwise: full SVE case */
/*
* Otherwise: no registers or full SVE case. For backwards
* compatibility reasons we treat empty flags as SVE registers.
*/
/*
* If setting a different VL from the requested VL and there is
......@@ -852,8 +921,9 @@ static int sve_set(struct task_struct *target,
/*
* Ensure target->thread.sve_state is up to date with target's
* FPSIMD regs, so that a short copyin leaves trailing registers
* unmodified.
* FPSIMD regs, so that a short copyin leaves trailing
* registers unmodified. Always enable SVE even if going into
* streaming mode.
*/
fpsimd_sync_to_sve(target);
set_tsk_thread_flag(target, TIF_SVE);
......@@ -889,8 +959,181 @@ static int sve_set(struct task_struct *target,
return ret;
}
static int sve_set(struct task_struct *target,
const struct user_regset *regset,
unsigned int pos, unsigned int count,
const void *kbuf, const void __user *ubuf)
{
if (!system_supports_sve())
return -EINVAL;
return sve_set_common(target, regset, pos, count, kbuf, ubuf,
ARM64_VEC_SVE);
}
#endif /* CONFIG_ARM64_SVE */
#ifdef CONFIG_ARM64_SME
static int ssve_get(struct task_struct *target,
const struct user_regset *regset,
struct membuf to)
{
if (!system_supports_sme())
return -EINVAL;
return sve_get_common(target, regset, to, ARM64_VEC_SME);
}
static int ssve_set(struct task_struct *target,
const struct user_regset *regset,
unsigned int pos, unsigned int count,
const void *kbuf, const void __user *ubuf)
{
if (!system_supports_sme())
return -EINVAL;
return sve_set_common(target, regset, pos, count, kbuf, ubuf,
ARM64_VEC_SME);
}
static int za_get(struct task_struct *target,
const struct user_regset *regset,
struct membuf to)
{
struct user_za_header header;
unsigned int vq;
unsigned long start, end;
if (!system_supports_sme())
return -EINVAL;
/* Header */
memset(&header, 0, sizeof(header));
if (test_tsk_thread_flag(target, TIF_SME_VL_INHERIT))
header.flags |= ZA_PT_VL_INHERIT;
header.vl = task_get_sme_vl(target);
vq = sve_vq_from_vl(header.vl);
header.max_vl = sme_max_vl();
header.max_size = ZA_PT_SIZE(vq);
/* If ZA is not active there is only the header */
if (thread_za_enabled(&target->thread))
header.size = ZA_PT_SIZE(vq);
else
header.size = ZA_PT_ZA_OFFSET;
membuf_write(&to, &header, sizeof(header));
BUILD_BUG_ON(ZA_PT_ZA_OFFSET != sizeof(header));
end = ZA_PT_ZA_OFFSET;
if (target == current)
fpsimd_preserve_current_state();
/* Any register data to include? */
if (thread_za_enabled(&target->thread)) {
start = end;
end = ZA_PT_SIZE(vq);
membuf_write(&to, target->thread.za_state, end - start);
}
/* Zero any trailing padding */
start = end;
end = ALIGN(header.size, SVE_VQ_BYTES);
return membuf_zero(&to, end - start);
}
static int za_set(struct task_struct *target,
const struct user_regset *regset,
unsigned int pos, unsigned int count,
const void *kbuf, const void __user *ubuf)
{
int ret;
struct user_za_header header;
unsigned int vq;
unsigned long start, end;
if (!system_supports_sme())
return -EINVAL;
/* Header */
if (count < sizeof(header))
return -EINVAL;
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &header,
0, sizeof(header));
if (ret)
goto out;
/*
* All current ZA_PT_* flags are consumed by
* vec_set_vector_length(), which will also validate them for
* us:
*/
ret = vec_set_vector_length(target, ARM64_VEC_SME, header.vl,
((unsigned long)header.flags) << 16);
if (ret)
goto out;
/* Actual VL set may be less than the user asked for: */
vq = sve_vq_from_vl(task_get_sme_vl(target));
/* Ensure there is some SVE storage for streaming mode */
if (!target->thread.sve_state) {
sve_alloc(target);
if (!target->thread.sve_state) {
clear_thread_flag(TIF_SME);
ret = -ENOMEM;
goto out;
}
}
/* Allocate/reinit ZA storage */
sme_alloc(target);
if (!target->thread.za_state) {
ret = -ENOMEM;
clear_tsk_thread_flag(target, TIF_SME);
goto out;
}
/* If there is no data then disable ZA */
if (!count) {
target->thread.svcr &= ~SYS_SVCR_EL0_ZA_MASK;
goto out;
}
/*
* If setting a different VL from the requested VL and there is
* register data, the data layout will be wrong: don't even
* try to set the registers in this case.
*/
if (vq != sve_vq_from_vl(header.vl)) {
ret = -EIO;
goto out;
}
BUILD_BUG_ON(ZA_PT_ZA_OFFSET != sizeof(header));
start = ZA_PT_ZA_OFFSET;
end = ZA_PT_SIZE(vq);
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
target->thread.za_state,
start, end);
if (ret)
goto out;
/* Mark ZA as active and let userspace use it */
set_tsk_thread_flag(target, TIF_SME);
target->thread.svcr |= SYS_SVCR_EL0_ZA_MASK;
out:
fpsimd_flush_task_state(target);
return ret;
}
#endif /* CONFIG_ARM64_SME */
#ifdef CONFIG_ARM64_PTR_AUTH
static int pac_mask_get(struct task_struct *target,
const struct user_regset *regset,
......@@ -1108,6 +1351,10 @@ enum aarch64_regset {
#ifdef CONFIG_ARM64_SVE
REGSET_SVE,
#endif
#ifdef CONFIG_ARM64_SVE
REGSET_SSVE,
REGSET_ZA,
#endif
#ifdef CONFIG_ARM64_PTR_AUTH
REGSET_PAC_MASK,
REGSET_PAC_ENABLED_KEYS,
......@@ -1188,6 +1435,33 @@ static const struct user_regset aarch64_regsets[] = {
.set = sve_set,
},
#endif
#ifdef CONFIG_ARM64_SME
[REGSET_SSVE] = { /* Streaming mode SVE */
.core_note_type = NT_ARM_SSVE,
.n = DIV_ROUND_UP(SVE_PT_SIZE(SME_VQ_MAX, SVE_PT_REGS_SVE),
SVE_VQ_BYTES),
.size = SVE_VQ_BYTES,
.align = SVE_VQ_BYTES,
.regset_get = ssve_get,
.set = ssve_set,
},
[REGSET_ZA] = { /* SME ZA */
.core_note_type = NT_ARM_ZA,
/*
* ZA is a single register but it's variably sized and
* the ptrace core requires that the size of any data
* be an exact multiple of the configured register
* size so report as though we had SVE_VQ_BYTES
* registers. These values aren't exposed to
* userspace.
*/
.n = DIV_ROUND_UP(ZA_PT_SIZE(SME_VQ_MAX), SVE_VQ_BYTES),
.size = SVE_VQ_BYTES,
.align = SVE_VQ_BYTES,
.regset_get = za_get,
.set = za_set,
},
#endif
#ifdef CONFIG_ARM64_PTR_AUTH
[REGSET_PAC_MASK] = {
.core_note_type = NT_ARM_PAC_MASK,
......
......@@ -225,6 +225,8 @@ static void __init request_standard_resources(void)
kernel_code.end = __pa_symbol(__init_begin - 1);
kernel_data.start = __pa_symbol(_sdata);
kernel_data.end = __pa_symbol(_end - 1);
insert_resource(&iomem_resource, &kernel_code);
insert_resource(&iomem_resource, &kernel_data);
num_standard_resources = memblock.memory.cnt;
res_size = num_standard_resources * sizeof(*standard_resources);
......@@ -246,20 +248,7 @@ static void __init request_standard_resources(void)
res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;
}
request_resource(&iomem_resource, res);
if (kernel_code.start >= res->start &&
kernel_code.end <= res->end)
request_resource(res, &kernel_code);
if (kernel_data.start >= res->start &&
kernel_data.end <= res->end)
request_resource(res, &kernel_data);
#ifdef CONFIG_KEXEC_CORE
/* Userspace will find "Crash kernel" region in /proc/iomem. */
if (crashk_res.end && crashk_res.start >= res->start &&
crashk_res.end <= res->end)
request_resource(res, &crashk_res);
#endif
insert_resource(&iomem_resource, res);
}
}
......
......@@ -56,6 +56,7 @@ struct rt_sigframe_user_layout {
unsigned long fpsimd_offset;
unsigned long esr_offset;
unsigned long sve_offset;
unsigned long za_offset;
unsigned long extra_offset;
unsigned long end_offset;
};
......@@ -218,6 +219,7 @@ static int restore_fpsimd_context(struct fpsimd_context __user *ctx)
struct user_ctxs {
struct fpsimd_context __user *fpsimd;
struct sve_context __user *sve;
struct za_context __user *za;
};
#ifdef CONFIG_ARM64_SVE
......@@ -226,11 +228,17 @@ static int preserve_sve_context(struct sve_context __user *ctx)
{
int err = 0;
u16 reserved[ARRAY_SIZE(ctx->__reserved)];
u16 flags = 0;
unsigned int vl = task_get_sve_vl(current);
unsigned int vq = 0;
if (test_thread_flag(TIF_SVE))
if (thread_sm_enabled(&current->thread)) {
vl = task_get_sme_vl(current);
vq = sve_vq_from_vl(vl);
flags |= SVE_SIG_FLAG_SM;
} else if (test_thread_flag(TIF_SVE)) {
vq = sve_vq_from_vl(vl);
}
memset(reserved, 0, sizeof(reserved));
......@@ -238,6 +246,7 @@ static int preserve_sve_context(struct sve_context __user *ctx)
__put_user_error(round_up(SVE_SIG_CONTEXT_SIZE(vq), 16),
&ctx->head.size, err);
__put_user_error(vl, &ctx->vl, err);
__put_user_error(flags, &ctx->flags, err);
BUILD_BUG_ON(sizeof(ctx->__reserved) != sizeof(reserved));
err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved));
......@@ -258,18 +267,28 @@ static int preserve_sve_context(struct sve_context __user *ctx)
static int restore_sve_fpsimd_context(struct user_ctxs *user)
{
int err;
unsigned int vq;
unsigned int vl, vq;
struct user_fpsimd_state fpsimd;
struct sve_context sve;
if (__copy_from_user(&sve, user->sve, sizeof(sve)))
return -EFAULT;
if (sve.vl != task_get_sve_vl(current))
if (sve.flags & SVE_SIG_FLAG_SM) {
if (!system_supports_sme())
return -EINVAL;
vl = task_get_sme_vl(current);
} else {
vl = task_get_sve_vl(current);
}
if (sve.vl != vl)
return -EINVAL;
if (sve.head.size <= sizeof(*user->sve)) {
clear_thread_flag(TIF_SVE);
current->thread.svcr &= ~SYS_SVCR_EL0_SM_MASK;
goto fpsimd_only;
}
......@@ -301,7 +320,10 @@ static int restore_sve_fpsimd_context(struct user_ctxs *user)
if (err)
return -EFAULT;
set_thread_flag(TIF_SVE);
if (sve.flags & SVE_SIG_FLAG_SM)
current->thread.svcr |= SYS_SVCR_EL0_SM_MASK;
else
set_thread_flag(TIF_SVE);
fpsimd_only:
/* copy the FP and status/control registers */
......@@ -326,6 +348,101 @@ extern int restore_sve_fpsimd_context(struct user_ctxs *user);
#endif /* ! CONFIG_ARM64_SVE */
#ifdef CONFIG_ARM64_SME
static int preserve_za_context(struct za_context __user *ctx)
{
int err = 0;
u16 reserved[ARRAY_SIZE(ctx->__reserved)];
unsigned int vl = task_get_sme_vl(current);
unsigned int vq;
if (thread_za_enabled(&current->thread))
vq = sve_vq_from_vl(vl);
else
vq = 0;
memset(reserved, 0, sizeof(reserved));
__put_user_error(ZA_MAGIC, &ctx->head.magic, err);
__put_user_error(round_up(ZA_SIG_CONTEXT_SIZE(vq), 16),
&ctx->head.size, err);
__put_user_error(vl, &ctx->vl, err);
BUILD_BUG_ON(sizeof(ctx->__reserved) != sizeof(reserved));
err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved));
if (vq) {
/*
* This assumes that the ZA state has already been saved to
* the task struct by calling the function
* fpsimd_signal_preserve_current_state().
*/
err |= __copy_to_user((char __user *)ctx + ZA_SIG_REGS_OFFSET,
current->thread.za_state,
ZA_SIG_REGS_SIZE(vq));
}
return err ? -EFAULT : 0;
}
static int restore_za_context(struct user_ctxs __user *user)
{
int err;
unsigned int vq;
struct za_context za;
if (__copy_from_user(&za, user->za, sizeof(za)))
return -EFAULT;
if (za.vl != task_get_sme_vl(current))
return -EINVAL;
if (za.head.size <= sizeof(*user->za)) {
current->thread.svcr &= ~SYS_SVCR_EL0_ZA_MASK;
return 0;
}
vq = sve_vq_from_vl(za.vl);
if (za.head.size < ZA_SIG_CONTEXT_SIZE(vq))
return -EINVAL;
/*
* Careful: we are about __copy_from_user() directly into
* thread.za_state with preemption enabled, so protection is
* needed to prevent a racing context switch from writing stale
* registers back over the new data.
*/
fpsimd_flush_task_state(current);
/* From now, fpsimd_thread_switch() won't touch thread.sve_state */
sme_alloc(current);
if (!current->thread.za_state) {
current->thread.svcr &= ~SYS_SVCR_EL0_ZA_MASK;
clear_thread_flag(TIF_SME);
return -ENOMEM;
}
err = __copy_from_user(current->thread.za_state,
(char __user const *)user->za +
ZA_SIG_REGS_OFFSET,
ZA_SIG_REGS_SIZE(vq));
if (err)
return -EFAULT;
set_thread_flag(TIF_SME);
current->thread.svcr |= SYS_SVCR_EL0_ZA_MASK;
return 0;
}
#else /* ! CONFIG_ARM64_SME */
/* Turn any non-optimised out attempts to use these into a link error: */
extern int preserve_za_context(void __user *ctx);
extern int restore_za_context(struct user_ctxs *user);
#endif /* ! CONFIG_ARM64_SME */
static int parse_user_sigframe(struct user_ctxs *user,
struct rt_sigframe __user *sf)
......@@ -340,6 +457,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
user->fpsimd = NULL;
user->sve = NULL;
user->za = NULL;
if (!IS_ALIGNED((unsigned long)base, 16))
goto invalid;
......@@ -393,7 +511,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
break;
case SVE_MAGIC:
if (!system_supports_sve())
if (!system_supports_sve() && !system_supports_sme())
goto invalid;
if (user->sve)
......@@ -405,6 +523,19 @@ static int parse_user_sigframe(struct user_ctxs *user,
user->sve = (struct sve_context __user *)head;
break;
case ZA_MAGIC:
if (!system_supports_sme())
goto invalid;
if (user->za)
goto invalid;
if (size < sizeof(*user->za))
goto invalid;
user->za = (struct za_context __user *)head;
break;
case EXTRA_MAGIC:
if (have_extra_context)
goto invalid;
......@@ -528,6 +659,9 @@ static int restore_sigframe(struct pt_regs *regs,
}
}
if (err == 0 && system_supports_sme() && user.za)
err = restore_za_context(&user);
return err;
}
......@@ -594,11 +728,12 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
if (system_supports_sve()) {
unsigned int vq = 0;
if (add_all || test_thread_flag(TIF_SVE)) {
int vl = sve_max_vl();
if (add_all || test_thread_flag(TIF_SVE) ||
thread_sm_enabled(&current->thread)) {
int vl = max(sve_max_vl(), sme_max_vl());
if (!add_all)
vl = task_get_sve_vl(current);
vl = thread_get_cur_vl(&current->thread);
vq = sve_vq_from_vl(vl);
}
......@@ -609,6 +744,24 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
return err;
}
if (system_supports_sme()) {
unsigned int vl;
unsigned int vq = 0;
if (add_all)
vl = sme_max_vl();
else
vl = task_get_sme_vl(current);
if (thread_za_enabled(&current->thread))
vq = sve_vq_from_vl(vl);
err = sigframe_alloc(user, &user->za_offset,
ZA_SIG_CONTEXT_SIZE(vq));
if (err)
return err;
}
return sigframe_alloc_end(user);
}
......@@ -649,13 +802,21 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
}
/* Scalable Vector Extension state, if present */
if (system_supports_sve() && err == 0 && user->sve_offset) {
/* Scalable Vector Extension state (including streaming), if present */
if ((system_supports_sve() || system_supports_sme()) &&
err == 0 && user->sve_offset) {
struct sve_context __user *sve_ctx =
apply_user_offset(user, user->sve_offset);
err |= preserve_sve_context(sve_ctx);
}
/* ZA state if present */
if (system_supports_sme() && err == 0 && user->za_offset) {
struct za_context __user *za_ctx =
apply_user_offset(user, user->za_offset);
err |= preserve_za_context(za_ctx);
}
if (err == 0 && user->extra_offset) {
char __user *sfp = (char __user *)user->sigframe;
char __user *userp =
......@@ -759,6 +920,13 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
/* TCO (Tag Check Override) always cleared for signal handlers */
regs->pstate &= ~PSR_TCO_BIT;
/* Signal handlers are invoked with ZA and streaming mode disabled */
if (system_supports_sme()) {
current->thread.svcr &= ~(SYS_SVCR_EL0_ZA_MASK |
SYS_SVCR_EL0_SM_MASK);
sme_smstop();
}
if (ka->sa.sa_flags & SA_RESTORER)
sigtramp = ka->sa.sa_restorer;
else
......
......@@ -19,43 +19,60 @@
#include <asm/stacktrace.h>
/*
* AArch64 PCS assigns the frame pointer to x29.
* A snapshot of a frame record or fp/lr register values, along with some
* accounting information necessary for robust unwinding.
*
* A simple function prologue looks like this:
* sub sp, sp, #0x10
* stp x29, x30, [sp]
* mov x29, sp
* @fp: The fp value in the frame record (or the real fp)
* @pc: The lr value in the frame record (or the real lr)
*
* A simple function epilogue looks like this:
* mov sp, x29
* ldp x29, x30, [sp]
* add sp, sp, #0x10
* @stacks_done: Stacks which have been entirely unwound, for which it is no
* longer valid to unwind to.
*
* @prev_fp: The fp that pointed to this frame record, or a synthetic value
* of 0. This is used to ensure that within a stack, each
* subsequent frame record is at an increasing address.
* @prev_type: The type of stack this frame record was on, or a synthetic
* value of STACK_TYPE_UNKNOWN. This is used to detect a
* transition from one stack to another.
*
* @kr_cur: When KRETPROBES is selected, holds the kretprobe instance
* associated with the most recently encountered replacement lr
* value.
*/
struct unwind_state {
unsigned long fp;
unsigned long pc;
DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
unsigned long prev_fp;
enum stack_type prev_type;
#ifdef CONFIG_KRETPROBES
struct llist_node *kr_cur;
#endif
};
static notrace void start_backtrace(struct stackframe *frame, unsigned long fp,
unsigned long pc)
static notrace void unwind_init(struct unwind_state *state, unsigned long fp,
unsigned long pc)
{
frame->fp = fp;
frame->pc = pc;
state->fp = fp;
state->pc = pc;
#ifdef CONFIG_KRETPROBES
frame->kr_cur = NULL;
state->kr_cur = NULL;
#endif
/*
* Prime the first unwind.
*
* In unwind_frame() we'll check that the FP points to a valid stack,
* In unwind_next() we'll check that the FP points to a valid stack,
* which can't be STACK_TYPE_UNKNOWN, and the first unwind will be
* treated as a transition to whichever stack that happens to be. The
* prev_fp value won't be used, but we set it to 0 such that it is
* definitely not an accessible stack address.
*/
bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
frame->prev_fp = 0;
frame->prev_type = STACK_TYPE_UNKNOWN;
bitmap_zero(state->stacks_done, __NR_STACK_TYPES);
state->prev_fp = 0;
state->prev_type = STACK_TYPE_UNKNOWN;
}
NOKPROBE_SYMBOL(start_backtrace);
NOKPROBE_SYMBOL(unwind_init);
/*
* Unwind from one frame record (A) to the next frame record (B).
......@@ -64,15 +81,12 @@ NOKPROBE_SYMBOL(start_backtrace);
* records (e.g. a cycle), determined based on the location and fp value of A
* and the location (but not the fp value) of B.
*/
static int notrace unwind_frame(struct task_struct *tsk,
struct stackframe *frame)
static int notrace unwind_next(struct task_struct *tsk,
struct unwind_state *state)
{
unsigned long fp = frame->fp;
unsigned long fp = state->fp;
struct stack_info info;
if (!tsk)
tsk = current;
/* Final frame; nothing to unwind */
if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
return -ENOENT;
......@@ -83,7 +97,7 @@ static int notrace unwind_frame(struct task_struct *tsk,
if (!on_accessible_stack(tsk, fp, 16, &info))
return -EINVAL;
if (test_bit(info.type, frame->stacks_done))
if (test_bit(info.type, state->stacks_done))
return -EINVAL;
/*
......@@ -99,27 +113,27 @@ static int notrace unwind_frame(struct task_struct *tsk,
* stack to another, it's never valid to unwind back to that first
* stack.
*/
if (info.type == frame->prev_type) {
if (fp <= frame->prev_fp)
if (info.type == state->prev_type) {
if (fp <= state->prev_fp)
return -EINVAL;
} else {
set_bit(frame->prev_type, frame->stacks_done);
set_bit(state->prev_type, state->stacks_done);
}
/*
* Record this frame record's values and location. The prev_fp and
* prev_type are only meaningful to the next unwind_frame() invocation.
* prev_type are only meaningful to the next unwind_next() invocation.
*/
frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
frame->prev_fp = fp;
frame->prev_type = info.type;
state->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
state->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
state->prev_fp = fp;
state->prev_type = info.type;
frame->pc = ptrauth_strip_insn_pac(frame->pc);
state->pc = ptrauth_strip_insn_pac(state->pc);
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
if (tsk->ret_stack &&
(frame->pc == (unsigned long)return_to_handler)) {
(state->pc == (unsigned long)return_to_handler)) {
unsigned long orig_pc;
/*
* This is a case where function graph tracer has
......@@ -127,37 +141,37 @@ static int notrace unwind_frame(struct task_struct *tsk,
* to hook a function return.
* So replace it to an original value.
*/
orig_pc = ftrace_graph_ret_addr(tsk, NULL, frame->pc,
(void *)frame->fp);
if (WARN_ON_ONCE(frame->pc == orig_pc))
orig_pc = ftrace_graph_ret_addr(tsk, NULL, state->pc,
(void *)state->fp);
if (WARN_ON_ONCE(state->pc == orig_pc))
return -EINVAL;
frame->pc = orig_pc;
state->pc = orig_pc;
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
#ifdef CONFIG_KRETPROBES
if (is_kretprobe_trampoline(frame->pc))
frame->pc = kretprobe_find_ret_addr(tsk, (void *)frame->fp, &frame->kr_cur);
if (is_kretprobe_trampoline(state->pc))
state->pc = kretprobe_find_ret_addr(tsk, (void *)state->fp, &state->kr_cur);
#endif
return 0;
}
NOKPROBE_SYMBOL(unwind_frame);
NOKPROBE_SYMBOL(unwind_next);
static void notrace walk_stackframe(struct task_struct *tsk,
struct stackframe *frame,
bool (*fn)(void *, unsigned long), void *data)
static void notrace unwind(struct task_struct *tsk,
struct unwind_state *state,
stack_trace_consume_fn consume_entry, void *cookie)
{
while (1) {
int ret;
if (!fn(data, frame->pc))
if (!consume_entry(cookie, state->pc))
break;
ret = unwind_frame(tsk, frame);
ret = unwind_next(tsk, state);
if (ret < 0)
break;
}
}
NOKPROBE_SYMBOL(walk_stackframe);
NOKPROBE_SYMBOL(unwind);
static bool dump_backtrace_entry(void *arg, unsigned long where)
{
......@@ -196,17 +210,17 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
void *cookie, struct task_struct *task,
struct pt_regs *regs)
{
struct stackframe frame;
struct unwind_state state;
if (regs)
start_backtrace(&frame, regs->regs[29], regs->pc);
unwind_init(&state, regs->regs[29], regs->pc);
else if (task == current)
start_backtrace(&frame,
unwind_init(&state,
(unsigned long)__builtin_frame_address(1),
(unsigned long)__builtin_return_address(0));
else
start_backtrace(&frame, thread_saved_fp(task),
unwind_init(&state, thread_saved_fp(task),
thread_saved_pc(task));
walk_stackframe(task, &frame, consume_entry, cookie);
unwind(task, &state, consume_entry, cookie);
}
......@@ -158,11 +158,36 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
syscall_trace_exit(regs);
}
static inline void sve_user_discard(void)
/*
* As per the ABI exit SME streaming mode and clear the SVE state not
* shared with FPSIMD on syscall entry.
*/
static inline void fp_user_discard(void)
{
/*
* If SME is active then exit streaming mode. If ZA is active
* then flush the SVE registers but leave userspace access to
* both SVE and SME enabled, otherwise disable SME for the
* task and fall through to disabling SVE too. This means
* that after a syscall we never have any streaming mode
* register state to track, if this changes the KVM code will
* need updating.
*/
if (system_supports_sme() && test_thread_flag(TIF_SME)) {
u64 svcr = read_sysreg_s(SYS_SVCR_EL0);
if (svcr & SYS_SVCR_EL0_SM_MASK)
sme_smstop_sm();
}
if (!system_supports_sve())
return;
/*
* If SME is not active then disable SVE, the registers will
* be cleared when userspace next attempts to access them and
* we do not need to track the SVE register state until then.
*/
clear_thread_flag(TIF_SVE);
/*
......@@ -177,7 +202,7 @@ static inline void sve_user_discard(void)
void do_el0_svc(struct pt_regs *regs)
{
sve_user_discard();
fp_user_discard();
el0_svc_common(regs, regs->regs[8], __NR_syscalls, sys_call_table);
}
......
......@@ -821,6 +821,7 @@ static const char *esr_class_str[] = {
[ESR_ELx_EC_SVE] = "SVE",
[ESR_ELx_EC_ERET] = "ERET/ERETAA/ERETAB",
[ESR_ELx_EC_FPAC] = "FPAC",
[ESR_ELx_EC_SME] = "SME",
[ESR_ELx_EC_IMP_DEF] = "EL3 IMP DEF",
[ESR_ELx_EC_IABT_LOW] = "IABT (lower EL)",
[ESR_ELx_EC_IABT_CUR] = "IABT (current EL)",
......
......@@ -93,7 +93,6 @@ jiffies = jiffies_64;
#ifdef CONFIG_HIBERNATION
#define HIBERNATE_TEXT \
. = ALIGN(SZ_4K); \
__hibernate_exit_text_start = .; \
*(.hibernate_exit.text) \
__hibernate_exit_text_end = .;
......@@ -103,7 +102,6 @@ jiffies = jiffies_64;
#ifdef CONFIG_KEXEC_CORE
#define KEXEC_TEXT \
. = ALIGN(SZ_4K); \
__relocate_new_kernel_start = .; \
*(.kexec_relocate.text) \
__relocate_new_kernel_end = .;
......@@ -170,9 +168,6 @@ SECTIONS
KPROBES_TEXT
HYPERVISOR_TEXT
IDMAP_TEXT
HIBERNATE_TEXT
KEXEC_TEXT
TRAMP_TEXT
*(.gnu.warning)
. = ALIGN(16);
*(.got) /* Global offset table */
......@@ -194,6 +189,14 @@ SECTIONS
HYPERVISOR_DATA_SECTIONS
/* code sections that are never executed via the kernel mapping */
.rodata.text : {
TRAMP_TEXT
HIBERNATE_TEXT
KEXEC_TEXT
. = ALIGN(PAGE_SIZE);
}
idmap_pg_dir = .;
. += IDMAP_DIR_SIZE;
idmap_pg_end = .;
......@@ -337,8 +340,8 @@ ASSERT(__hyp_idmap_text_end - __hyp_idmap_text_start <= PAGE_SIZE,
ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
"ID map text too big or misaligned")
#ifdef CONFIG_HIBERNATION
ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
<= SZ_4K, "Hibernate exit text too big or misaligned")
ASSERT(__hibernate_exit_text_end - __hibernate_exit_text_start <= SZ_4K,
"Hibernate exit text is bigger than 4 KiB")
#endif
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) <= 3*PAGE_SIZE,
......@@ -362,7 +365,7 @@ ASSERT(swapper_pg_dir - tramp_pg_dir == TRAMP_SWAPPER_OFFSET,
#ifdef CONFIG_KEXEC_CORE
/* kexec relocation code should fit into one KEXEC_CONTROL_PAGE_SIZE */
ASSERT(__relocate_new_kernel_end - (__relocate_new_kernel_start & ~(SZ_4K - 1))
<= SZ_4K, "kexec relocation code is too big or misaligned")
ASSERT(__relocate_new_kernel_end - __relocate_new_kernel_start <= SZ_4K,
"kexec relocation code is bigger than 4 KiB")
ASSERT(KEXEC_CONTROL_PAGE_SIZE >= SZ_4K, "KEXEC_CONTROL_PAGE_SIZE is broken")
#endif
......@@ -82,6 +82,26 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN)
vcpu->arch.flags |= KVM_ARM64_HOST_SVE_ENABLED;
/*
* We don't currently support SME guests but if we leave
* things in streaming mode then when the guest starts running
* FPSIMD or SVE code it may generate SME traps so as a
* special case if we are in streaming mode we force the host
* state to be saved now and exit streaming mode so that we
* don't have to handle any SME traps for valid guest
* operations. Do this for ZA as well for now for simplicity.
*/
if (system_supports_sme()) {
if (read_sysreg(cpacr_el1) & CPACR_EL1_SMEN_EL0EN)
vcpu->arch.flags |= KVM_ARM64_HOST_SME_ENABLED;
if (read_sysreg_s(SYS_SVCR_EL0) &
(SYS_SVCR_EL0_SM_MASK | SYS_SVCR_EL0_ZA_MASK)) {
vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
fpsimd_save_and_flush_cpu_state();
}
}
}
/*
......@@ -109,9 +129,14 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
WARN_ON_ONCE(!irqs_disabled());
if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
/*
* Currently we do not support SME guests so SVCR is
* always 0 and we just need a variable to point to.
*/
fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.fp_regs,
vcpu->arch.sve_state,
vcpu->arch.sve_max_vl);
vcpu->arch.sve_max_vl,
NULL, 0, &vcpu->arch.svcr);
clear_thread_flag(TIF_FOREIGN_FPSTATE);
update_thread_flag(TIF_SVE, vcpu_has_sve(vcpu));
......@@ -130,6 +155,22 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
local_irq_save(flags);
/*
* If we have VHE then the Hyp code will reset CPACR_EL1 to
* CPACR_EL1_DEFAULT and we need to reenable SME.
*/
if (has_vhe() && system_supports_sme()) {
/* Also restore EL0 state seen on entry */
if (vcpu->arch.flags & KVM_ARM64_HOST_SME_ENABLED)
sysreg_clear_set(CPACR_EL1, 0,
CPACR_EL1_SMEN_EL0EN |
CPACR_EL1_SMEN_EL1EN);
else
sysreg_clear_set(CPACR_EL1,
CPACR_EL1_SMEN_EL0EN,
CPACR_EL1_SMEN_EL1EN);
}
if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
if (vcpu_has_sve(vcpu)) {
__vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
......
......@@ -47,10 +47,24 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
val |= CPTR_EL2_TFP | CPTR_EL2_TZ;
__activate_traps_fpsimd32(vcpu);
}
if (cpus_have_final_cap(ARM64_SME))
val |= CPTR_EL2_TSM;
write_sysreg(val, cptr_el2);
write_sysreg(__this_cpu_read(kvm_hyp_vector), vbar_el2);
if (cpus_have_final_cap(ARM64_SME)) {
val = read_sysreg_s(SYS_HFGRTR_EL2);
val &= ~(HFGxTR_EL2_nTPIDR2_EL0_MASK |
HFGxTR_EL2_nSMPRI_EL1_MASK);
write_sysreg_s(val, SYS_HFGRTR_EL2);
val = read_sysreg_s(SYS_HFGWTR_EL2);
val &= ~(HFGxTR_EL2_nTPIDR2_EL0_MASK |
HFGxTR_EL2_nSMPRI_EL1_MASK);
write_sysreg_s(val, SYS_HFGWTR_EL2);
}
if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt;
......@@ -94,9 +108,25 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)
write_sysreg(this_cpu_ptr(&kvm_init_params)->hcr_el2, hcr_el2);
if (cpus_have_final_cap(ARM64_SME)) {
u64 val;
val = read_sysreg_s(SYS_HFGRTR_EL2);
val |= HFGxTR_EL2_nTPIDR2_EL0_MASK |
HFGxTR_EL2_nSMPRI_EL1_MASK;
write_sysreg_s(val, SYS_HFGRTR_EL2);
val = read_sysreg_s(SYS_HFGWTR_EL2);
val |= HFGxTR_EL2_nTPIDR2_EL0_MASK |
HFGxTR_EL2_nSMPRI_EL1_MASK;
write_sysreg_s(val, SYS_HFGWTR_EL2);
}
cptr = CPTR_EL2_DEFAULT;
if (vcpu_has_sve(vcpu) && (vcpu->arch.flags & KVM_ARM64_FP_ENABLED))
cptr |= CPTR_EL2_TZ;
if (cpus_have_final_cap(ARM64_SME))
cptr &= ~CPTR_EL2_TSM;
write_sysreg(cptr, cptr_el2);
write_sysreg(__kvm_hyp_host_vector, vbar_el2);
......
......@@ -41,7 +41,8 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
val = read_sysreg(cpacr_el1);
val |= CPACR_EL1_TTA;
val &= ~(CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN);
val &= ~(CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN |
CPACR_EL1_SMEN_EL0EN | CPACR_EL1_SMEN_EL1EN);
/*
* With VHE (HCR.E2H == 1), accesses to CPACR_EL1 are routed to
......@@ -62,6 +63,10 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
__activate_traps_fpsimd32(vcpu);
}
if (cpus_have_final_cap(ARM64_SME))
write_sysreg(read_sysreg(sctlr_el2) & ~SCTLR_ELx_ENTP2,
sctlr_el2);
write_sysreg(val, cpacr_el1);
write_sysreg(__this_cpu_read(kvm_hyp_vector), vbar_el1);
......@@ -83,6 +88,10 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)
*/
asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
if (cpus_have_final_cap(ARM64_SME))
write_sysreg(read_sysreg(sctlr_el2) | SCTLR_ELx_ENTP2,
sctlr_el2);
write_sysreg(CPACR_EL1_DEFAULT, cpacr_el1);
if (!arm64_kernel_unmapped_at_el0())
......
......@@ -1132,6 +1132,8 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
case SYS_ID_AA64PFR1_EL1:
if (!kvm_has_mte(vcpu->kvm))
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_SME);
break;
case SYS_ID_AA64ISAR1_EL1:
if (!vcpu_has_ptrauth(vcpu))
......@@ -1553,7 +1555,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
ID_UNALLOCATED(4,2),
ID_UNALLOCATED(4,3),
ID_SANITISED(ID_AA64ZFR0_EL1),
ID_UNALLOCATED(4,5),
ID_HIDDEN(ID_AA64SMFR0_EL1),
ID_UNALLOCATED(4,6),
ID_UNALLOCATED(4,7),
......@@ -1596,6 +1598,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_ZCR_EL1), NULL, reset_val, ZCR_EL1, 0, .visibility = sve_visibility },
{ SYS_DESC(SYS_TRFCR_EL1), undef_access },
{ SYS_DESC(SYS_SMPRI_EL1), undef_access },
{ SYS_DESC(SYS_SMCR_EL1), undef_access },
{ SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 },
{ SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
{ SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 },
......@@ -1678,8 +1682,10 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_CCSIDR_EL1), access_ccsidr },
{ SYS_DESC(SYS_CLIDR_EL1), access_clidr },
{ SYS_DESC(SYS_SMIDR_EL1), undef_access },
{ SYS_DESC(SYS_CSSELR_EL1), access_csselr, reset_unknown, CSSELR_EL1 },
{ SYS_DESC(SYS_CTR_EL0), access_ctr },
{ SYS_DESC(SYS_SVCR_EL0), undef_access },
{ PMU_SYS_REG(SYS_PMCR_EL0), .access = access_pmcr,
.reset = reset_pmcr, .reg = PMCR_EL0 },
......@@ -1719,6 +1725,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
{ SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
{ SYS_DESC(SYS_TPIDR2_EL0), undef_access },
{ SYS_DESC(SYS_SCXTNUM_EL0), undef_access },
......
......@@ -93,7 +93,7 @@ SYM_FUNC_START(mte_copy_tags_from_user)
mov x3, x1
cbz x2, 2f
1:
user_ldst 2f, ldtrb, w4, x1, 0
USER(2f, ldtrb w4, [x1])
lsl x4, x4, #MTE_TAG_SHIFT
stg x4, [x0], #MTE_GRANULE_SIZE
add x1, x1, #1
......@@ -120,7 +120,7 @@ SYM_FUNC_START(mte_copy_tags_to_user)
1:
ldg x4, [x1]
ubfx x4, x4, #MTE_TAG_SHIFT, #MTE_TAG_SIZE
user_ldst 2f, sttrb, w4, x0, 0
USER(2f, sttrb w4, [x0])
add x0, x0, #1
add x1, x1, #MTE_GRANULE_SIZE
subs x2, x2, #1
......
......@@ -16,8 +16,8 @@
void copy_highpage(struct page *to, struct page *from)
{
struct page *kto = page_address(to);
struct page *kfrom = page_address(from);
void *kto = page_address(to);
void *kfrom = page_address(from);
copy_page(kto, kfrom);
......
......@@ -158,6 +158,28 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
return contig_ptes;
}
pte_t huge_ptep_get(pte_t *ptep)
{
int ncontig, i;
size_t pgsize;
pte_t orig_pte = ptep_get(ptep);
if (!pte_present(orig_pte) || !pte_cont(orig_pte))
return orig_pte;
ncontig = num_contig_ptes(page_size(pte_page(orig_pte)), &pgsize);
for (i = 0; i < ncontig; i++, ptep++) {
pte_t pte = ptep_get(ptep);
if (pte_dirty(pte))
orig_pte = pte_mkdirty(orig_pte);
if (pte_young(pte))
orig_pte = pte_mkyoung(orig_pte);
}
return orig_pte;
}
/*
* Changing some bits of contiguous entries requires us to follow a
* Break-Before-Make approach, breaking the whole contiguous set
......@@ -166,15 +188,14 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
*
* This helper performs the break step.
*/
static pte_t get_clear_flush(struct mm_struct *mm,
static pte_t get_clear_contig(struct mm_struct *mm,
unsigned long addr,
pte_t *ptep,
unsigned long pgsize,
unsigned long ncontig)
{
pte_t orig_pte = huge_ptep_get(ptep);
bool valid = pte_valid(orig_pte);
unsigned long i, saddr = addr;
pte_t orig_pte = ptep_get(ptep);
unsigned long i;
for (i = 0; i < ncontig; i++, addr += pgsize, ptep++) {
pte_t pte = ptep_get_and_clear(mm, addr, ptep);
......@@ -190,11 +211,6 @@ static pte_t get_clear_flush(struct mm_struct *mm,
if (pte_young(pte))
orig_pte = pte_mkyoung(orig_pte);
}
if (valid) {
struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
flush_tlb_range(&vma, saddr, addr);
}
return orig_pte;
}
......@@ -385,14 +401,14 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
{
int ncontig;
size_t pgsize;
pte_t orig_pte = huge_ptep_get(ptep);
pte_t orig_pte = ptep_get(ptep);
if (!pte_cont(orig_pte))
return ptep_get_and_clear(mm, addr, ptep);
ncontig = find_num_contig(mm, addr, ptep, &pgsize);
return get_clear_flush(mm, addr, ptep, pgsize, ncontig);
return get_clear_contig(mm, addr, ptep, pgsize, ncontig);
}
/*
......@@ -408,11 +424,11 @@ static int __cont_access_flags_changed(pte_t *ptep, pte_t pte, int ncontig)
{
int i;
if (pte_write(pte) != pte_write(huge_ptep_get(ptep)))
if (pte_write(pte) != pte_write(ptep_get(ptep)))
return 1;
for (i = 0; i < ncontig; i++) {
pte_t orig_pte = huge_ptep_get(ptep + i);
pte_t orig_pte = ptep_get(ptep + i);
if (pte_dirty(pte) != pte_dirty(orig_pte))
return 1;
......@@ -443,7 +459,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
if (!__cont_access_flags_changed(ptep, pte, ncontig))
return 0;
orig_pte = get_clear_flush(vma->vm_mm, addr, ptep, pgsize, ncontig);
orig_pte = get_clear_contig(vma->vm_mm, addr, ptep, pgsize, ncontig);
/* Make sure we don't lose the dirty or young state */
if (pte_dirty(orig_pte))
......@@ -476,7 +492,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
ncontig = find_num_contig(mm, addr, ptep, &pgsize);
dpfn = pgsize >> PAGE_SHIFT;
pte = get_clear_flush(mm, addr, ptep, pgsize, ncontig);
pte = get_clear_contig(mm, addr, ptep, pgsize, ncontig);
pte = pte_wrprotect(pte);
hugeprot = pte_pgprot(pte);
......
此差异已折叠。
......@@ -238,7 +238,7 @@ int trans_pgd_idmap_page(struct trans_pgd_info *info, phys_addr_t *trans_ttbr0,
int this_level, index, level_lsb, level_msb;
dst_addr &= PAGE_MASK;
prev_level_entry = pte_val(pfn_pte(pfn, PAGE_KERNEL_EXEC));
prev_level_entry = pte_val(pfn_pte(pfn, PAGE_KERNEL_ROX));
for (this_level = 3; this_level >= 0; this_level--) {
levels[this_level] = trans_alloc(info);
......
......@@ -43,6 +43,8 @@ KVM_PROTECTED_MODE
MISMATCHED_CACHE_TYPE
MTE
MTE_ASYMM
SME
SME_FA64
SPECTRE_V2
SPECTRE_V3A
SPECTRE_V4
......
......@@ -579,9 +579,7 @@ void arch_ftrace_trampoline_free(struct ftrace_ops *ops)
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
#ifdef CONFIG_DYNAMIC_FTRACE
#ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
#if defined(CONFIG_DYNAMIC_FTRACE) && !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS)
extern void ftrace_graph_call(void);
static const char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
{
......@@ -610,18 +608,7 @@ int ftrace_disable_ftrace_graph_caller(void)
return ftrace_mod_jmp(ip, &ftrace_stub);
}
#else /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
int ftrace_enable_ftrace_graph_caller(void)
{
return 0;
}
int ftrace_disable_ftrace_graph_caller(void)
{
return 0;
}
#endif /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
#endif /* !CONFIG_DYNAMIC_FTRACE */
#endif /* CONFIG_DYNAMIC_FTRACE && !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
/*
* Hook the return address and push it in the stack of return addrs
......
......@@ -973,16 +973,24 @@ static void __init early_init_dt_check_for_elfcorehdr(unsigned long node)
static unsigned long chosen_node_offset = -FDT_ERR_NOTFOUND;
/*
* The main usage of linux,usable-memory-range is for crash dump kernel.
* Originally, the number of usable-memory regions is one. Now there may
* be two regions, low region and high region.
* To make compatibility with existing user-space and older kdump, the low
* region is always the last range of linux,usable-memory-range if exist.
*/
#define MAX_USABLE_RANGES 2
/**
* early_init_dt_check_for_usable_mem_range - Decode usable memory range
* location from flat tree
*/
void __init early_init_dt_check_for_usable_mem_range(void)
{
const __be32 *prop;
int len;
phys_addr_t cap_mem_addr;
phys_addr_t cap_mem_size;
struct memblock_region rgn[MAX_USABLE_RANGES] = {0};
const __be32 *prop, *endp;
int len, i;
unsigned long node = chosen_node_offset;
if ((long)node < 0)
......@@ -991,16 +999,21 @@ void __init early_init_dt_check_for_usable_mem_range(void)
pr_debug("Looking for usable-memory-range property... ");
prop = of_get_flat_dt_prop(node, "linux,usable-memory-range", &len);
if (!prop || (len < (dt_root_addr_cells + dt_root_size_cells)))
if (!prop || (len % (dt_root_addr_cells + dt_root_size_cells)))
return;
cap_mem_addr = dt_mem_next_cell(dt_root_addr_cells, &prop);
cap_mem_size = dt_mem_next_cell(dt_root_size_cells, &prop);
endp = prop + (len / sizeof(__be32));
for (i = 0; i < MAX_USABLE_RANGES && prop < endp; i++) {
rgn[i].base = dt_mem_next_cell(dt_root_addr_cells, &prop);
rgn[i].size = dt_mem_next_cell(dt_root_size_cells, &prop);
pr_debug("cap_mem_start=%pa cap_mem_size=%pa\n", &cap_mem_addr,
&cap_mem_size);
pr_debug("cap_mem_regions[%d]: base=%pa, size=%pa\n",
i, &rgn[i].base, &rgn[i].size);
}
memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
memblock_cap_memory_range(rgn[0].base, rgn[0].size);
for (i = 1; i < MAX_USABLE_RANGES && rgn[i].size; i++)
memblock_add(rgn[i].base, rgn[i].size);
}
#ifdef CONFIG_SERIAL_EARLYCON
......
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册