Commits · e377ab82311af95c99648c6424a6b888a0ccb102 · openeuler / Kernel

26 May, 2021 1 commit

arm64/mm: Remove [PUD|PMD]_TABLE_BIT from [pud|pmd]_bad() · e377ab82

Authored by Anshuman Khandual on May 10, 2021

Semantics wise, [pud|pmd]_bad() have always implied that a given [PUD|PMD]
entry does not have a pointer to the next level page table. This had been
made clear in the commit a1c76574 ("arm64: mm: use *_sect to check for
section maps"). Hence explicitly check for a table entry rather than just
testing a single bit. This basically redefines [pud|pmd]_bad() in terms of
[pud|pmd]_table() making the semantics clear.
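
The difference between the two checks can be sketched in user space. This is only a model: the constant values are assumed from the arm64 descriptor format (bits [1:0] equal to 0b11 for a table, 0b01 for a section), and the function names pmd_bad_old(), pmd_bad_new() and pmd_bad_differs() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low two bits of an arm64 PMD/PUD descriptor (assumed values, see lead-in). */
#define PMD_TYPE_MASK  UINT64_C(3)
#define PMD_TYPE_TABLE UINT64_C(3)  /* valid entry pointing at a next-level table */
#define PMD_TYPE_SECT  UINT64_C(1)  /* valid block (section) mapping */
#define PMD_TABLE_BIT  UINT64_C(2)  /* bit 1 taken on its own */

/* A table entry always has the table bit set; the reverse does not hold. */
static_assert((PMD_TYPE_TABLE & PMD_TABLE_BIT) == PMD_TABLE_BIT,
              "table type must include the table bit");

/* Old style: only bit 1 is inspected. */
static inline bool pmd_bad_old(uint64_t pmd)
{
	return !(pmd & PMD_TABLE_BIT);
}

/* New style: the whole type field must say "table". */
static inline bool pmd_table(uint64_t pmd)
{
	return (pmd & PMD_TYPE_MASK) == PMD_TYPE_TABLE;
}

static inline bool pmd_bad_new(uint64_t pmd)
{
	return !pmd_table(pmd);
}

/* True when the two definitions disagree for a raw entry value. */
static inline bool pmd_bad_differs(uint64_t pmd)
{
	return pmd_bad_old(pmd) != pmd_bad_new(pmd);
}
```

In this model the two checks only disagree for an entry with bit 1 set and the valid bit clear, which the single-bit test would accept.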

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1620644871-26280-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>

e377ab82

17 May, 2021 1 commit

quota: Disable quotactl_path syscall · 5b9fedb3

Authored by Jan Kara on May 17, 2021

In commit fa8b9007 ("quota: wire up quotactl_path") we have wired up
new quotactl_path syscall. However some people in LWN discussion have
objected that the path based syscall is missing dirfd and flags argument
which is mostly standard for contemporary path based syscalls. Indeed
they have a point and after a discussion with Christian Brauner and
Sascha Hauer I've decided to disable the syscall for now and update its
API. Since there is no userspace currently using that syscall and it
hasn't been released in any major release, we should be fine.

CC: Christian Brauner <christian.brauner@ubuntu.com>
CC: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/lkml/20210512153621.n5u43jsytbik4yze@wittgenstein
Signed-off-by: Jan Kara <jack@suse.cz>

5b9fedb3

10 May, 2021 1 commit

arm64: Generate cpucaps.h · 0c6c2d36

Authored by Mark Brown on Apr 28, 2021

The arm64 code allocates an internal constant to every CPU feature it can
detect, distinct from the public hwcap numbers we use to expose some
features to userspace. Currently this is maintained manually, which is an
irritating source of conflicts when working on new features. To avoid this,
replace the header with a simple text file listing the names we've assigned
and sort it to minimise conflicts.

As part of doing this we also do the Kbuild hookup required to hook up
an arch tools directory and to generate header files in there.

This will result in a renumbering and reordering of the existing constants,
since they are all internal only the values should not be important. The
reordering will impact the order in which some steps in enumeration handle
features but the algorithm is not intended to depend on this and I haven't
seen any issues when testing. Due to the UAO cpucap having been removed in
the past we end up with ARM64_NCAPS being 1 smaller than it was before.
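
As a rough illustration of the idea (not the build script itself), a header can be produced from a sorted list of names by numbering them in order and emitting the count last. The function gen_cpucaps() and the plain "#define NAME value" layout are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "#define ARM64_<name> <index>" for each name, followed by
 * "#define ARM64_NCAPS <count>". Returns the number of bytes written,
 * or -1 if the output buffer is too small.
 */
static inline int gen_cpucaps(const char *const *names, size_t n,
                              char *out, size_t outsz)
{
	size_t used = 0;

	assert(out != NULL && outsz > 0);
	out[0] = '\0';
	for (size_t i = 0; i <= n; i++) {
		int w;

		if (i < n)
			w = snprintf(out + used, outsz - used,
			             "#define ARM64_%s %zu\n", names[i], i);
		else
			w = snprintf(out + used, outsz - used,
			             "#define ARM64_NCAPS %zu\n", n);
		if (w < 0 || (size_t)w >= outsz - used)
			return -1;
		used += (size_t)w;
	}
	return (int)used;
}
```

Because the values come from the position in the list, inserting a name renumbers everything after it, which is why the constants have to stay internal.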

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210428121231.11219-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

0c6c2d36

06 May, 2021 1 commit

arm64: entry: always set GIC_PRIO_PSR_I_SET during entry · 4d6a38da

Authored by Mark Rutland on Apr 28, 2021

Zenghui reports that booting a kernel with "irqchip.gicv3_pseudo_nmi=1"
on the command line hits a warning during kernel entry, due to the way
we manipulate the PMR.

Early in the entry sequence, we call lockdep_hardirqs_off() to inform
lockdep that interrupts have been masked (as the HW sets DAIF when
entering an exception). Architecturally PMR_EL1 is not affected by
exception entry, and we don't set GIC_PRIO_PSR_I_SET in the PMR early in
the exception entry sequence, so early in exception entry the PMR can
indicate that interrupts are unmasked even though they are masked by
DAIF.

If DEBUG_LOCKDEP is selected, lockdep_hardirqs_off() will check that
interrupts are masked, before we set GIC_PRIO_PSR_I_SET in any of the
exception entry paths, and hence lockdep_hardirqs_off() will WARN() that
something is amiss.

We can avoid this by consistently setting GIC_PRIO_PSR_I_SET during
exception entry so that kernel code sees a consistent environment. We
must also update local_daif_inherit() to undo this, as it currently only
touches DAIF. For other paths, local_daif_restore() will update both
DAIF and the PMR. With this done, we can remove the existing special
cases which set this later in the entry code.

We always use (GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET) for consistency with
local_daif_save(), as this will warn if it ever encounters
(GIC_PRIO_IRQOFF | GIC_PRIO_PSR_I_SET), and never sets this itself. This
matches the gic_prio_kentry_setup that we have to retain for
ret_to_user.
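
The PMR arithmetic behind this can be modelled in a few lines. The numeric values below are assumptions recalled from the arm64 headers of that period (GIC_PRIO_IRQON as 0xe0, which matches the pmr_save value in the splat below, and GIC_PRIO_PSR_I_SET as bit 4), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; see the lead-in. */
#define GIC_PRIO_IRQON     0xe0u
#define GIC_PRIO_IRQOFF    (GIC_PRIO_IRQON & ~0x80u)  /* 0x60 */
#define GIC_PRIO_PSR_I_SET (1u << 4)

/* The marker bit must be one that IRQON leaves clear, or it would be invisible. */
static_assert((GIC_PRIO_IRQON & GIC_PRIO_PSR_I_SET) == 0,
              "PSR_I_SET must not overlap GIC_PRIO_IRQON");

/* Model: with pseudo-NMI, anything other than IRQON reads as "IRQs masked". */
static inline bool pmr_irqs_disabled(uint32_t pmr)
{
	return pmr != GIC_PRIO_IRQON;
}

/* Hypothetical helper: the value installed in the PMR on every exception entry. */
static inline uint32_t pmr_on_entry(void)
{
	return GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET;
}

/* Model of the combination that local_daif_save() is said to warn about. */
static inline bool pmr_is_unexpected(uint32_t pmr)
{
	return pmr == (GIC_PRIO_IRQOFF | GIC_PRIO_PSR_I_SET);
}
```

With an untouched PMR of 0xe0 the model reports interrupts as enabled even though DAIF masks them, which is the inconsistency lockdep trips over. After the entry value is installed the PMR-based view agrees with DAIF.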

The original splat from Zenghui's report was:

| DEBUG_LOCKS_WARN_ON(!irqs_disabled())
| WARNING: CPU: 3 PID: 125 at kernel/locking/lockdep.c:4258 lockdep_hardirqs_off+0xd4/0xe8
| Modules linked in:
| CPU: 3 PID: 125 Comm: modprobe Tainted: G        W         5.12.0-rc8+ #463
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO BTYPE=--)
| pc : lockdep_hardirqs_off+0xd4/0xe8
| lr : lockdep_hardirqs_off+0xd4/0xe8
| sp : ffff80002a39bad0
| pmr_save: 000000e0
| x29: ffff80002a39bad0 x28: ffff0000de214bc0
| x27: ffff0000de1c0400 x26: 000000000049b328
| x25: 0000000000406f30 x24: ffff0000de1c00a0
| x23: 0000000020400005 x22: ffff8000105f747c
| x21: 0000000096000044 x20: 0000000000498ef9
| x19: ffff80002a39bc88 x18: ffffffffffffffff
| x17: 0000000000000000 x16: ffff800011c61eb0
| x15: ffff800011700a88 x14: 0720072007200720
| x13: 0720072007200720 x12: 0720072007200720
| x11: 0720072007200720 x10: 0720072007200720
| x9 : ffff80002a39bad0 x8 : ffff80002a39bad0
| x7 : ffff8000119f0800 x6 : c0000000ffff7fff
| x5 : ffff8000119f07a8 x4 : 0000000000000001
| x3 : 9bcdab23f2432800 x2 : ffff800011730538
| x1 : 9bcdab23f2432800 x0 : 0000000000000000
| Call trace:
|  lockdep_hardirqs_off+0xd4/0xe8
|  enter_from_kernel_mode.isra.5+0x7c/0xa8
|  el1_abort+0x24/0x100
|  el1_sync_handler+0x80/0xd0
|  el1_sync+0x6c/0x100
|  __arch_clear_user+0xc/0x90
|  load_elf_binary+0x9fc/0x1450
|  bprm_execve+0x404/0x880
|  kernel_execve+0x180/0x188
|  call_usermodehelper_exec_async+0xdc/0x158
|  ret_from_fork+0x10/0x18

Fixes: 23529049 ("arm64: entry: fix non-NMI user<->kernel transitions")
Fixes: 7cd1ea10 ("arm64: entry: fix non-NMI kernel<->kernel transitions")
Fixes: f0cd5ac1 ("arm64: entry: fix NMI {user, kernel}->kernel transitions")
Fixes: 2a9b3e6a ("arm64: entry: fix EL1 debug transitions")
Link: https://lore.kernel.org/r/f4012761-026f-4e51-3a0c-7524e434e8b3@huawei.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210428111555.50880-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

4d6a38da

01 May, 2021 4 commits

arm64: kasan: allow to init memory when setting tags · d9b6f907

Authored by Andrey Konovalov on Apr 29, 2021

Patch series "kasan: integrate with init_on_alloc/free", v3.

This patch series integrates HW_TAGS KASAN with init_on_alloc/free by
initializing memory via the same arm64 instruction that sets memory tags.

This is expected to improve HW_TAGS KASAN performance when
init_on_alloc/free is enabled.  The exact performance numbers are unknown
as MTE-enabled hardware doesn't exist yet.

This patch (of 5):

This change adds an argument to mte_set_mem_tag_range() that allows
memory initialization to be enabled when setting the allocation tags.  The
implementation uses the stzg instruction instead of stg when this argument
indicates that memory should be initialized.
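
A software model may help picture the new argument. It is only a sketch: real code issues stg or stzg per granule, whereas here a plain byte array stands in for tag storage, and model_set_mem_tag_range() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MTE_GRANULE_SIZE 16u  /* one allocation tag covers 16 bytes */

/*
 * Model only: tags[] holds one byte per granule. Without init the loop
 * behaves like stg (tag written, data untouched); with init it behaves
 * like stzg (tag written and the granule zeroed in the same step).
 */
static inline void model_set_mem_tag_range(uint8_t *mem, uint8_t *tags,
                                           size_t size, uint8_t tag, bool init)
{
	assert(size % MTE_GRANULE_SIZE == 0);
	for (size_t off = 0; off < size; off += MTE_GRANULE_SIZE) {
		tags[off / MTE_GRANULE_SIZE] = tag;
		if (init)
			memset(mem + off, 0, MTE_GRANULE_SIZE);
	}
}
```

The point of merging the two operations is that the memory is walked once instead of twice when init_on_alloc/free is enabled.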

Combining setting allocation tags with memory initialization will improve
HW_TAGS KASAN performance when init_on_alloc/free is enabled.

This change doesn't integrate memory initialization with KASAN; this is
done in subsequent patches in this series.

Link: https://lkml.kernel.org/r/cover.1615296150.git.andreyknvl@google.com
Link: https://lkml.kernel.org/r/d04ae90cc36be3fe246ea8025e5085495681c3d7.1615296150.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d9b6f907

mm/vmalloc: provide fallback arch huge vmap support functions · 6f680e70

Authored by Nicholas Piggin on Apr 29, 2021

If an architecture doesn't support a particular page table level as a huge
vmap page size then allow it to skip defining the support query function.
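
One common way to provide such fallbacks is the "define the name to itself" pattern, sketched below. The arch part, the unsigned long stand-in for the protection argument, the shift values and vmap_shift_for() are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT 12  /* assumed: 4K pages */
#define PMD_SHIFT  21  /* assumed: 2M blocks */
#define PUD_SHIFT  30  /* assumed: 1G blocks */

/* --- hypothetical arch header: only PMD-sized huge vmap is supported --- */
static inline bool arch_vmap_pmd_supported(unsigned long prot)
{
	(void)prot;  /* not consulted in this sketch */
	return true;
}
#define arch_vmap_pmd_supported arch_vmap_pmd_supported

/* --- generic header: fallbacks for whatever the arch left undefined --- */
#ifndef arch_vmap_pud_supported
static inline bool arch_vmap_pud_supported(unsigned long prot)
{
	(void)prot;
	return false;
}
#endif

#ifndef arch_vmap_pmd_supported
static inline bool arch_vmap_pmd_supported(unsigned long prot)
{
	(void)prot;
	return false;
}
#endif

/* Hypothetical caller: largest mapping shift usable for size bytes. */
static inline unsigned int vmap_shift_for(unsigned long long size,
                                          unsigned long prot)
{
	assert(size != 0);
	if (size >= (1ULL << PUD_SHIFT) && arch_vmap_pud_supported(prot))
		return PUD_SHIFT;
	if (size >= (1ULL << PMD_SHIFT) && arch_vmap_pmd_supported(prot))
		return PMD_SHIFT;
	return PAGE_SHIFT;
}
```

Since the fallback is a constant false, a compiler can drop the PUD branch entirely for this imaginary architecture.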

Link: https://lkml.kernel.org/r/20210317062402.533919-11-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ding Tianhong <dingtianhong@huawei.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6f680e70

arm64: inline huge vmap supported functions · 168a6333

Authored by Nicholas Piggin on Apr 29, 2021

This allows unsupported levels to be constant folded away, and so
p4d_free_pud_page can be removed because it's no longer linked to.

Link: https://lkml.kernel.org/r/20210317062402.533919-9-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ding Tianhong <dingtianhong@huawei.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

168a6333

mm: HUGE_VMAP arch support cleanup · bbc180a5

Authored by Nicholas Piggin on Apr 29, 2021

This changes the awkward approach where architectures provide init
functions to determine which levels they can provide large mappings for,
to one where the arch is queried for each call.

This removes code and indirection, and allows constant-folding of dead
code for unsupported levels.

This also adds a prot argument to the arch query.  This is unused
currently but could help with some architectures (e.g., some powerpc
processors can't map uncacheable memory with large pages).

Link: https://lkml.kernel.org/r/20210317062402.533919-7-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bbc180a5

23 Apr, 2021 3 commits

arm64: Force SPARSEMEM_VMEMMAP as the only memory management model · 782276b4

Authored by Catalin Marinas on Apr 20, 2021

Currently arm64 allows a choice of FLATMEM, SPARSEMEM and
SPARSEMEM_VMEMMAP. However, only the latter is tested regularly. FLATMEM
does not seem to boot in certain configurations (guest under KVM with
Qemu as a VMM). Since the reduction of the SECTION_SIZE_BITS to 27 (4K
pages) or 29 (64K pages), there's little argument against the memory
wasted by the mem_map array with SPARSEMEM.

Make SPARSEMEM_VMEMMAP the only available option, non-selectable, and
remove the corresponding #ifdefs under arch/arm64/.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/20210420093559.23168-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

782276b4

xen/arm: introduce XENFEAT_direct_mapped and XENFEAT_not_direct_mapped · f5079a9a

Authored by Stefano Stabellini on Mar 19, 2021

Newer Xen versions expose two Xen feature flags to tell us if the domain
is directly mapped or not. Only when a domain is directly mapped does it
make sense to enable swiotlb-xen on ARM.

Introduce a function on ARM to check the new Xen feature flags and also
to deal with the legacy case. Call the function xen_swiotlb_detect.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210319200140.12512-1-sstabellini@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>

f5079a9a

arch: Wire up Landlock syscalls · a49f4f81

Authored by Mickaël Salaün on Apr 22, 2021

Wire up the following system calls for all architectures:
* landlock_create_ruleset(2)
* landlock_add_rule(2)
* landlock_restrict_self(2)

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: James Morris <jmorris@namei.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20210422154123.13086-10-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>

a49f4f81

17 Apr, 2021 4 commits

KVM: Kill off the old hva-based MMU notifier callbacks · b4c5936c

Authored by Sean Christopherson on Apr 01, 2021

Yank out the hva-based MMU notifier APIs now that all architectures that
use the notifiers have moved to the gfn-based APIs.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210402005658.3024832-7-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

b4c5936c

KVM: arm64: Convert to the gfn-based MMU notifier callbacks · cd4c7183

Authored by Sean Christopherson on Apr 01, 2021

Move arm64 to the gfn-based MMU notifier APIs, which do the hva->gfn
lookup in common code.

No meaningful functional change intended, though the exact order of
operations is slightly different since the memslot lookups occur before
calling into arch code.
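
The hva->gfn lookup that moves into common code amounts to offset arithmetic within a memslot. The sketch below uses a trimmed-down, hypothetical struct memslot and assumes 4K pages. It is a user-space model, not the KVM implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumes 4K pages */

/* Hypothetical, trimmed-down memslot: guest pages backed by host VAs. */
struct memslot {
	uint64_t base_gfn;        /* first guest frame number in the slot */
	uint64_t npages;          /* number of pages in the slot */
	uint64_t userspace_addr;  /* host virtual address of the first page */
};

/* Translate one hva at or above the slot start into a gfn. */
static inline uint64_t hva_to_gfn(const struct memslot *s, uint64_t hva)
{
	assert(hva >= s->userspace_addr);
	return s->base_gfn + ((hva - s->userspace_addr) >> PAGE_SHIFT);
}

/*
 * Clamp [hva_start, hva_end) to the slot and convert it to
 * [gfn_start, gfn_end). Returns false if the range misses the slot.
 */
static inline bool hva_range_to_gfn_range(const struct memslot *s,
                                          uint64_t hva_start, uint64_t hva_end,
                                          uint64_t *gfn_start, uint64_t *gfn_end)
{
	uint64_t slot_end = s->userspace_addr + (s->npages << PAGE_SHIFT);
	uint64_t start = hva_start > s->userspace_addr ? hva_start : s->userspace_addr;
	uint64_t end = hva_end < slot_end ? hva_end : slot_end;

	if (start >= end)
		return false;
	*gfn_start = hva_to_gfn(s, start);
	/* Round the end up so a partially covered last page is included. */
	*gfn_end = hva_to_gfn(s, end + (UINT64_C(1) << PAGE_SHIFT) - 1);
	return true;
}
```

Doing this once in common code means the arch callback only ever sees a gfn range, which is the reordering of memslot lookups the message refers to.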

Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210402005658.3024832-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

cd4c7183

KVM: aarch64: implement KVM_CAP_SET_GUEST_DEBUG2 · fa18aca9

Authored by Maxim Levitsky on Apr 01, 2021

Move KVM_GUESTDBG_VALID_MASK to kvm_host.h
and use it to return the value of this capability.
Compile tested only.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210401135451.1004564-5-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

fa18aca9

KVM: Move prototypes for MMU notifier callbacks to generic code · 5f7c292b

Authored by Sean Christopherson on Mar 25, 2021

Move the prototypes for the MMU notifier callbacks out of arch code and
into common code.  There is no benefit to having each arch replicate the
prototypes since any deviation from the invocation in common code will
explode.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210326021957.1424875-9-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

5f7c292b

16 Apr, 2021 1 commit

arm64: alternatives: Move length validation in alternative_{insn, endif} · 22315a22

Authored by Nathan Chancellor on Apr 13, 2021

After commit 2decad92 ("arm64: mte: Ensure TIF_MTE_ASYNC_FAULT is
set atomically"), LLVM's integrated assembler fails to build entry.S:

<instantiation>:5:7: error: expected assembly-time absolute expression
 .org . - (664b-663b) + (662b-661b)
      ^
<instantiation>:6:7: error: expected assembly-time absolute expression
 .org . - (662b-661b) + (664b-663b)
      ^

The root cause is LLVM's assembler has a one-pass design, meaning it
cannot figure out these instruction lengths when the .org directive is
outside of the subsection that they are in, which was changed by the
.arch_extension directive added in the above commit.

Apply the same fix from commit 966a0acc ("arm64/alternatives: move
length validation inside the subsection") to the alternative_endif
macro, shuffling the .org directives so that the length validation
will always happen in the same subsections. alternative_insn has
not shown any issue yet but it appears that it could have the same issue
in the future so just preemptively change it.

Fixes: f7b93d42 ("arm64/alternatives: use subsections for replacement sequences")
Cc: <stable@vger.kernel.org> # 5.8.x
Link: https://github.com/ClangBuiltLinux/linux/issues/1347
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20210414000803.662534-1-nathan@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

22315a22

14 Apr, 2021 4 commits

lib/vdso: Add vdso_data pointer as input to __arch_get_timens_vdso_data() · 808094fc

Authored by Christophe Leroy on Mar 31, 2021

For the same reason as commit e876f0b6 ("lib/vdso: Allow
architectures to provide the vdso data pointer"), powerpc wants to
avoid calculation of relative position to code.

As the timens_vdso_data is next page to vdso_data, provide
vdso_data pointer to __arch_get_timens_vdso_data() in order
to ease the calculation on powerpc in following patches.
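
With the vdso_data pointer passed in, the time namespace page can be reached by plain pointer arithmetic rather than a code-relative calculation. The sketch below assumes 4K pages, uses a placeholder struct vdso_data, and the helper name timens_data_from() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumes 4K pages */

/* Placeholder for the real structure; only its address matters here. */
struct vdso_data {
	uint32_t seq;
};

/*
 * The time namespace data sits on the page directly after the vdso data
 * page, so it can be derived from a pointer the caller already holds.
 */
static inline const struct vdso_data *timens_data_from(const struct vdso_data *vd)
{
	assert(((uintptr_t)vd % PAGE_SIZE) == 0);
	return (const struct vdso_data *)((uintptr_t)vd + PAGE_SIZE);
}
```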

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Acked-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/539c4204b1baa77c55f758904a1ea239abbc7a5c.1617209142.git.christophe.leroy@csgroup.eu

808094fc

arm64: pac: Optimize kernel entry/exit key installation code paths · b90e4839

Authored by Peter Collingbourne on Mar 18, 2021

The kernel does not use any keys besides IA so we don't need to
install IB/DA/DB/GA on kernel exit if we arrange to install them
on task switch instead, which we can expect to happen an order of
magnitude less often.

Furthermore we can avoid installing the user IA in the case where the
user task has IA disabled and just leave the kernel IA installed. This
also lets us avoid needing to install IA on kernel entry.

On an Apple M1 under a hypervisor, the overhead of kernel entry/exit
has been measured to be reduced by 15.6ns in the case where IA is
enabled, and 31.9ns in the case where IA is disabled.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/Ieddf6b580d23c9e0bed45a822dabe72d2ffc9a8e
Link: https://lore.kernel.org/r/2d653d055f38f779937f2b92f8ddd5cf9e4af4f4.1616123271.git.pcc@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

b90e4839

arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) · 20169862

Authored by Peter Collingbourne on Mar 18, 2021

This change introduces a prctl that allows the user program to control
which PAC keys are enabled in a particular task. The main reason
why this is useful is to enable a userspace ABI that uses PAC to
sign and authenticate function pointers and other pointers exposed
outside of the function, while still allowing binaries conforming
to the ABI to interoperate with legacy binaries that do not sign or
authenticate pointers.

The idea is that a dynamic loader or early startup code would issue
this prctl very early after establishing that a process may load legacy
binaries, but before executing any PAC instructions.
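
From user space the call might look roughly like the sketch below. The fallback constant values are assumptions for toolchains whose headers predate the feature, the wrapper names are hypothetical, and on kernels or CPUs without support the calls are expected to fail with an error instead of changing anything.

```c
#include <assert.h>
#include <sys/prctl.h>

/* Fallback definitions for older userspace headers (assumed values). */
#ifndef PR_PAC_SET_ENABLED_KEYS
#define PR_PAC_SET_ENABLED_KEYS 60
#define PR_PAC_GET_ENABLED_KEYS 61
#endif
#ifndef PR_PAC_APIAKEY
#define PR_PAC_APIAKEY (1UL << 0)
#define PR_PAC_APIBKEY (1UL << 1)
#define PR_PAC_APDAKEY (1UL << 2)
#define PR_PAC_APDBKEY (1UL << 3)
#endif

/* Returns the enabled-key bitmask, or -1 with errno set if unsupported. */
static inline long pac_get_enabled_keys(void)
{
	return prctl(PR_PAC_GET_ENABLED_KEYS, 0, 0, 0, 0);
}

/*
 * Change the keys named in affected so that exactly those also present in
 * enabled stay on. Bits outside affected must not appear in enabled.
 */
static inline long pac_set_enabled_keys(unsigned long affected,
                                        unsigned long enabled)
{
	assert((enabled & ~affected) == 0);
	return prctl(PR_PAC_SET_ENABLED_KEYS, affected, enabled, 0, 0);
}

/* Hypothetical loader helper: leave only the IA key enabled. */
static inline long pac_keep_only_ia(void)
{
	const unsigned long all = PR_PAC_APIAKEY | PR_PAC_APIBKEY |
	                          PR_PAC_APDAKEY | PR_PAC_APDBKEY;

	return pac_set_enabled_keys(all, PR_PAC_APIAKEY);
}
```

A loader would call something like pac_keep_only_ia() before any PAC instruction has run, since changing keys later would invalidate pointers that were already signed.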

This change adds a small amount of overhead to kernel entry and exit
due to additional required instruction sequences.

On a DragonBoard 845c (Cortex-A75) with the powersave governor, the
overhead of similar instruction sequences was measured as 4.9ns when
simulating the common case where IA is left enabled, or 43.7ns when
simulating the uncommon case where IA is disabled. These numbers can
be seen as the worst case scenario, since in more realistic scenarios
a better performing governor would be used and a newer chip would be
used that would support PAC unlike Cortex-A75 and would be expected
to be faster than Cortex-A75.

On an Apple M1 under a hypervisor, the overhead of the entry/exit
instruction sequences introduced by this patch was measured as 0.3ns
in the case where IA is left enabled, and 33.0ns in the case where
IA is disabled.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://linux-review.googlesource.com/id/Ibc41a5e6a76b275efbaa126b31119dc197b927a5
Link: https://lore.kernel.org/r/d6609065f8f40397a4124654eb68c9f490b4d477.1616123271.git.pcc@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

20169862

arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere · 2f79d2fc

Authored by Peter Collingbourne on Mar 18, 2021

In an upcoming change we are going to introduce per-task SCTLR_EL1
bits for PAC. Move the existing per-task SCTLR_EL1 field out of the
MTE-specific code so that we will be able to use it from both the
PAC and MTE code paths and make the task switching code more efficient.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/Ic65fac78a7926168fa68f9e8da591c9e04ff7278
Link: https://lore.kernel.org/r/13d725cb8e741950fb9d6e64b2cd9bd54ff7c3f9.1616123271.git.pcc@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

2f79d2fc

12 Apr, 2021 3 commits

arm64: fpsimd: run kernel mode NEON with softirqs disabled · 13150149

Authored by Ard Biesheuvel on Mar 02, 2021

Kernel mode NEON can be used in task or softirq context, but only in
a non-nesting manner, i.e., softirq context is only permitted if the
interrupt was not taken at a point where the kernel was using the NEON
in task context.

This means all users of kernel mode NEON have to be aware of this
limitation, and either need to provide scalar fallbacks that may be much
slower (up to 20x for AES instructions) and potentially less safe, or
use an asynchronous interface that defers processing to a later time
when the NEON is guaranteed to be available.

Given that grabbing and releasing the NEON is cheap, we can relax this
restriction, by increasing the granularity of kernel mode NEON code, and
always disabling softirq processing while the NEON is being used in task
context.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210302090118.30666-4-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

13150149

arm64: assembler: introduce wxN aliases for wN registers · 4c4dcd35

Authored by Ard Biesheuvel on Mar 02, 2021

The AArch64 asm syntax has this slightly tedious property that the names
used in mnemonics to refer to registers depend on whether the opcode in
question targets the entire 64-bits (xN), or only the least significant
8, 16 or 32 bits (wN). When writing parameterized code such as macros,
this can be annoying, as macro arguments don't lend themselves to
indexed lookups, and so generating a reference to wN in a macro that
receives xN as an argument is problematic.

For instance, an upcoming patch that modifies the implementation of the
cond_yield macro to be able to refer to 32-bit registers would need to
modify invocations such as

  cond_yield	3f, x8

to

  cond_yield	3f, 8

so that the second argument can be token pasted after x or w to emit the
correct register reference. Unfortunately, this interferes with the self
documenting nature of the first example, where the second argument is
obviously a register, whereas in the second example, one would need to
go and look at the code to find out what '8' means.

So let's fix this by defining wxN aliases for all xN registers, which
resolve to the 32-bit alias of each respective 64-bit register. This
allows the macro implementation to paste the xN reference after a w to
obtain the correct register name.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210302090118.30666-3-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

4c4dcd35

arm64: assembler: remove conditional NEON yield macros · 27248fe1

Authored by Ard Biesheuvel on Mar 02, 2021

The users of the conditional NEON yield macros have all been switched to
the simplified cond_yield macro, and so the NEON specific ones can be
removed.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210302090118.30666-2-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

27248fe1

11 Apr, 2021 7 commits

kasan, arm64: tests supports for HW_TAGS async mode · e80a76aa

Authored by Andrey Konovalov on Mar 15, 2021

This change adds KASAN-KUnit tests support for the async HW_TAGS mode.

In async mode, tag faults aren't generated synchronously when a
bad access happens, but are instead explicitly checked for by the kernel.

As each KASAN-KUnit test expects a fault to happen before the test is over,
check for faults as a part of the test handler.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20210315132019.33202-10-vincenzo.frascino@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

e80a76aa

arm64: mte: Report async tag faults before suspend · eab0e6e1

Authored by Vincenzo Frascino on Mar 15, 2021

When MTE async mode is enabled TFSR_EL1 contains the accumulative
asynchronous tag check faults for EL1 and EL0.

During the suspend/resume operations the firmware might perform some
operations that could change the state of the register resulting in
a spurious tag check fault report.

Report asynchronous tag faults before suspend and clear the TFSR_EL1
register after resume to prevent this from happening.

Cc: Will Deacon <will@kernel.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20210315132019.33202-9-vincenzo.frascino@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

eab0e6e1

arm64: mte: Enable async tag check fault · 65812c69

Authored by Vincenzo Frascino on Mar 15, 2021

MTE provides a mode that asynchronously updates the TFSR_EL1 register
when a tag check exception is detected.

To take advantage of this mode the kernel has to verify the status of
the register at:
  1. Context switching
  2. Return to user/EL0 (Not required in entry from EL0 since the kernel
  did not run)
  3. Kernel entry from EL1
  4. Kernel exit to EL1

If the register is non-zero a trace is reported.
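
The check performed at each of the four points above can be modelled as "read, and if a fault is latched, clear and report". In the sketch below a plain variable stands in for the system register, the TF1 bit position is an assumption, and model_check_tfsr() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TFSR_TF1 (UINT64_C(1) << 1)  /* assumed bit position of TFSR_EL1.TF1 */

/*
 * Model only: *tfsr stands in for the register, which real code would
 * access with mrs/msr. Returns true when a latched fault was found,
 * in which case the caller would print its report.
 */
static inline bool model_check_tfsr(uint64_t *tfsr)
{
	assert(tfsr != NULL);
	if (!(*tfsr & TFSR_TF1))
		return false;
	/* Clear the latch so that a later fault is not merged into this one. */
	*tfsr &= ~TFSR_TF1;
	return true;
}
```

Because the hardware only accumulates faults, the report can say that a bad access happened since the last check, but not where it happened.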

Add the required features for EL1 detection and reporting.

Note: The ITFSB bit is set in the SCTLR_EL1 register, hence it guarantees that
the indirect writes to TFSR_EL1 are synchronized at exception entry to
EL1. On the context switch path the synchronization is guaranteed by the
dsb() in __switch_to().
The dsb(nsh) in mte_check_tfsr_exit() is provisional pending
confirmation by the architects.

Cc: Will Deacon <will@kernel.org>
Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20210315132019.33202-8-vincenzo.frascino@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

65812c69

arm64: mte: Enable TCO in functions that can read beyond buffer limits · e60beb95

Authored by Vincenzo Frascino on Mar 15, 2021

load_unaligned_zeropad() and __get/put_kernel_nofault() functions can
read past some buffer limits which may include some MTE granule with a
different tag.

When MTE async mode is enabled, if the load operation crosses the boundary
and the next granule has a different tag, the PE sets the TFSR_EL1.TF1 bit
as if an asynchronous tag fault had happened.

Enable Tag Check Override (TCO) in these functions before the load and
disable it afterwards to prevent this from happening.

Note: The same condition can be hit in MTE sync mode but we deal with it
through the exception handling.
In the current implementation, the mte_async_mode flag is set only at boot
time, but in the future kasan might acquire some runtime features that
change the mode dynamically, hence we disable it when sync mode is
selected to be future-proof.

Cc: Will Deacon <will@kernel.org>
Reported-by: Branislav Rankov <Branislav.Rankov@arm.com>
Tested-by: Branislav Rankov <Branislav.Rankov@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20210315132019.33202-6-vincenzo.frascino@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

e60beb95

arm64: mte: Drop arch_enable_tagging() · c137c614

Authored by Vincenzo Frascino on Mar 15, 2021

arch_enable_tagging() was left in memory.h after the introduction of
async mode to not break the bisectability of the KASAN KUNIT tests.

Remove the function now that KASAN has been fully converted.

Cc: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20210315132019.33202-4-vincenzo.frascino@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

c137c614

arm64: mte: Add asynchronous mode support · f3b7deef

Committed by Vincenzo Frascino on Mar 15, 2021

MTE provides an asynchronous mode for detecting tag exceptions. In
particular, instead of triggering a fault, the arm64 core updates a
register which is checked by the kernel after the asynchronous tag
check fault has occurred.

Add support for MTE asynchronous mode.

The exception handling mechanism will be added with a future patch.

Note: KASAN HW activates async mode via kasan.mode kernel parameter.
The default mode is set to synchronous.
The code that verifies the status of TFSR_EL1 will be added with a
future patch.

Cc: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20210315132019.33202-2-vincenzo.frascino@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

f3b7deef
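
The difference between the synchronous and asynchronous modes described in this commit can be sketched as a small userspace model. All `model_` names are hypothetical and the single boolean stands in for the status register the commit message mentions; this is an illustration of the idea, not the kernel's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

enum model_mte_mode { MODEL_MTE_SYNC, MODEL_MTE_ASYNC };

/* Hypothetical model of one CPU; not a kernel structure. */
struct model_cpu {
	enum model_mte_mode mode;
	bool tfsr_tf1; /* models the sticky fault-status bit set in async mode */
};

/*
 * Model a tag-checked access. Returns true when the access raises a
 * synchronous fault. In async mode a mismatch only sets the status bit and
 * the access appears to succeed.
 */
static bool model_checked_access(struct model_cpu *cpu, uint8_t ptr_tag,
				 uint8_t mem_tag)
{
	if (ptr_tag == mem_tag)
		return false;
	if (cpu->mode == MODEL_MTE_SYNC)
		return true;
	cpu->tfsr_tf1 = true;
	return false;
}

/* Model the later check: report whether a fault was recorded and clear it. */
static bool model_check_async_fault(struct model_cpu *cpu)
{
	bool pending = cpu->tfsr_tf1;

	cpu->tfsr_tf1 = false;
	return pending;
}
```

In sync mode the mismatch is visible at the faulting access; in async mode it is only visible once `model_check_async_fault()` runs, which is why the report cannot point at the exact instruction.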

KVM: arm64: Don't print warning when trapping SPE registers · 13611bc8

Committed by Alexandru Elisei on Apr 09, 2021

KVM sets up MDCR_EL2 to trap accesses to the SPE buffer and sampling
control registers and it relies on the fact that KVM injects an undefined
exception for unknown registers. This mechanism of injecting undefined
exceptions also prints a warning message for the host kernel; for example,
when a guest tries to access PMSIDR_EL1:

[    2.691830] kvm [142]: Unsupported guest sys_reg access at: 80009e78 [800003c5]
[    2.691830]  { Op0( 3), Op1( 0), CRn( 9), CRm( 9), Op2( 7), func_read },

This is unnecessary, because KVM has explicitly configured trapping of
those registers and is well aware of their existence. Prevent the warning
by adding the SPE registers to the list of registers that KVM emulates.
The access function will inject the undefined exception.
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210409152154.198566-2-alexandru.elisei@arm.com

13611bc8
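
The change in this commit can be pictured with a table-lookup model: a register that appears in the table of emulated registers is handled by its access function, while an unknown one triggers a host warning before the undefined exception is injected. The sketch below is a userspace illustration only; all `model_` names and counters are hypothetical, and the one table entry reuses the PMSIDR_EL1 encoding shown in the log above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; none of these are KVM's real structures. */
struct model_vcpu {
	int undef_injected;   /* undefined exceptions injected into the guest */
	int warnings_printed; /* warnings emitted on the host */
};

struct model_sys_reg {
	uint8_t op0, op1, crn, crm, op2;
	void (*access)(struct model_vcpu *vcpu);
};

/* Access handler for deliberately trapped registers: inject UNDEF quietly. */
static void model_undef_access(struct model_vcpu *vcpu)
{
	vcpu->undef_injected++;
}

/* Table of emulated registers; the entry is the PMSIDR_EL1 encoding. */
static const struct model_sys_reg model_known_regs[] = {
	{ 3, 0, 9, 9, 7, model_undef_access },
};

static void model_handle_sys_reg(struct model_vcpu *vcpu, uint8_t op0,
				 uint8_t op1, uint8_t crn, uint8_t crm,
				 uint8_t op2)
{
	size_t n = sizeof(model_known_regs) / sizeof(model_known_regs[0]);

	for (size_t i = 0; i < n; i++) {
		const struct model_sys_reg *r = &model_known_regs[i];

		if (r->op0 == op0 && r->op1 == op1 && r->crn == crn &&
		    r->crm == crm && r->op2 == op2) {
			r->access(vcpu);
			return;
		}
	}
	/* Unknown register: warn on the host, then inject UNDEF. */
	vcpu->warnings_printed++;
	vcpu->undef_injected++;
}
```

Either way the guest sees an undefined exception; the table entry only removes the host-side warning.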

09 Apr, 2021 8 commits

arm64: add __nocfi to functions that jump to a physical address · cbdac841

Committed by Sami Tolvanen on Apr 08, 2021

Disable CFI checking for functions that switch to linear mapping and
make an indirect call to a physical address, since the compiler only
understands virtual addresses and the CFI check for such indirect calls
would always fail.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210408182843.1754385-15-samitolvanen@google.com

cbdac841

arm64: use function_nocfi with __pa_symbol · bde33977

Committed by Sami Tolvanen on Apr 08, 2021

With CONFIG_CFI_CLANG, the compiler replaces function address
references with the address of the function's CFI jump table
entry. This means that __pa_symbol(function) returns the physical
address of the jump table entry, which can lead to address space
confusion as the jump table points to the function's virtual
address. Therefore, use the function_nocfi() macro to ensure we are
always taking the address of the actual function instead.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210408182843.1754385-14-samitolvanen@google.com

bde33977

arm64: implement function_nocfi · 4ecfca89

Committed by Sami Tolvanen on Apr 08, 2021

With CONFIG_CFI_CLANG, the compiler replaces function addresses in
instrumented C code with jump table addresses. This change implements
the function_nocfi() macro, which returns the actual function address
instead.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210408182843.1754385-13-samitolvanen@google.com

4ecfca89

arm64: cpufeature: Allow early filtering of feature override · cac642c1

Committed by Marc Zyngier on Apr 08, 2021

Some CPUs are broken enough that some overrides need to be rejected
at the earliest opportunity. In some cases, that's right at cpu
feature override time.

Provide the necessary infrastructure to filter out overrides,
and to report such filtered out overrides to the core cpufeature code.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210408131010.1109027-2-maz@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

cac642c1

arm64: Disable fine grained traps on boot · 31c00d2a

Committed by Mark Brown on Apr 01, 2021

The arm64 FEAT_FGT extension introduces a set of traps to EL2 for accesses
to small sets of registers and instructions from EL1 and EL0. Currently
Linux makes no use of this feature, so ensure that it is not active at boot
by disabling the traps during EL2 setup.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210401180942.35815-3-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

31c00d2a

arm64: mte: Remove unused mte_assign_mem_tag_range() · df652a16

Committed by Vincenzo Frascino on Apr 07, 2021

mte_assign_mem_tag_range() was added in commit 85f49cae
("arm64: mte: add in-kernel MTE helpers") in 5.11 but moved out of
mte.S by commit 2cb34276 ("arm64: kasan: simplify and inline
MTE functions") in 5.12 and renamed to mte_set_mem_tag_range().
2cb34276 did not delete the old function prototypes in mte.h.

Remove the unused prototype from mte.h.

Cc: Will Deacon <will@kernel.org>
Reported-by: Derrick McKee <derrick.mckee@gmail.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20210407133817.23053-1-vincenzo.frascino@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

df652a16

arm64: Add __init section marker to some functions · a7dcf58a

Committed by Jisheng Zhang on Mar 30, 2021

They are not needed after booting, so mark them as __init to move them
to the .init section.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20210330135449.4dcffd7f@xhacker.debian
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

a7dcf58a

arm64/sve: Rework SVE access trap to convert state in registers · cccb78ce

Committed by Mark Brown on Mar 12, 2021

When we enable SVE usage in userspace after taking a SVE access trap we
need to ensure that the portions of the register state that are not
shared with the FPSIMD registers are zeroed. Currently we do this by
forcing the FPSIMD registers to be saved to the task struct and converting
them there. This is wasteful in the common case where the task state is
loaded into the registers and we will immediately return to userspace
since we can initialise the SVE state directly in registers instead of
accessing multiple copies of the register state in memory.

Instead, in that common case, do the conversion in the registers and
update the task metadata so that we can return to userspace without
spilling the register state to memory unless there is some other reason
to do so.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210312190313.24598-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

cccb78ce
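
The conversion this commit describes, keeping the bits shared with the FPSIMD registers and zeroing the rest, can be sketched in userspace for a single vector register. This is a simplified model under stated assumptions: `z_reg_model` and `model_fpsimd_to_sve()` are hypothetical names, the register is represented as a byte array, and the other SVE-only state is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FPSIMD_REG_BYTES 16  /* each FPSIMD V register is 128 bits wide */
#define SVE_VL_MAX_BYTES 256 /* architectural maximum vector length, 2048 bits */

/* Hypothetical model of one SVE Z register; not a kernel type. */
struct z_reg_model {
	uint8_t bytes[SVE_VL_MAX_BYTES];
};

/*
 * Model of converting one register in place: the low 128 bits, which alias
 * the FPSIMD V register, are kept; everything above them up to the current
 * vector length is zeroed.
 */
static void model_fpsimd_to_sve(struct z_reg_model *z, size_t vl_bytes)
{
	assert(vl_bytes >= FPSIMD_REG_BYTES && vl_bytes <= SVE_VL_MAX_BYTES);
	memset(z->bytes + FPSIMD_REG_BYTES, 0, vl_bytes - FPSIMD_REG_BYTES);
}
```

Because the function only writes the upper part of the register, the live FPSIMD contents never need to be copied out and back, which is the saving the commit message points to.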

08 Apr, 2021 2 commits

arm64: Move ICH_ sysreg bits from arm-gic-v3.h to sysreg.h · 8a657f71

Committed by Hector Martin on Mar 01, 2021

These definitions are in arm-gic-v3.h for historical reasons which no
longer apply. Move them to sysreg.h so the AIC driver can use them, as
it needs to peek into vGIC registers to deal with the GIC maintenance
interrupt.
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>

8a657f71

asm-generic/io.h: implement pci_remap_cfgspace using ioremap_np · b10eb2d5

Committed by Hector Martin on Mar 25, 2021

Now that we have ioremap_np(), we can make pci_remap_cfgspace() default
to it, falling back to ioremap() on platforms where it is not available.

Remove the arm64 implementation, since that is now redundant. Future
cleanups should be able to do the same for other arches, and eventually
make the generic pci_remap_cfgspace() unconditional.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>

b10eb2d5
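
The "default to ioremap_np(), fall back to ioremap()" behaviour described in this commit can be sketched with userspace stand-ins. Everything prefixed `model_` is hypothetical, including the assumption that the non-posted mapping helper returns NULL on platforms where it is unavailable; the real helpers return `__iomem` pointers and live in the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel mapping helpers. */
static int model_has_np;         /* nonzero when non-posted mappings exist */
static char model_np_window[16]; /* pretend non-posted mapping */
static char model_window[16];    /* pretend regular mapping */

/* Assumed to return NULL where the platform has no such mapping type. */
static void *model_ioremap_np(uint64_t phys, size_t size)
{
	(void)phys;
	(void)size;
	return model_has_np ? model_np_window : NULL;
}

static void *model_ioremap(uint64_t phys, size_t size)
{
	(void)phys;
	(void)size;
	return model_window;
}

/* Prefer the non-posted mapping; fall back to the regular one. */
static void *model_pci_remap_cfgspace(uint64_t phys, size_t size)
{
	void *p = model_ioremap_np(phys, size);

	return p ? p : model_ioremap(phys, size);
}
```

With this shape, an architecture that supports the non-posted mapping needs no override of its own, which matches the commit's removal of the arm64-specific implementation.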
