提交 · 29227d6ea1572b160e5bea45b3c93a0346444dfa · openeuler / Kernel

19 2月, 2020 2 次提交

arm64: memory: Add missing brackets to untagged_addr() macro · d0022c0e

由 Will Deacon 提交于 2月 19, 2020

Add brackets around the evaluation of the 'addr' parameter to the
untagged_addr() macro so that the cast to 'u64' applies to the result
of the expression.

Cc: <stable@vger.kernel.org>
Fixes: 597399d0 ("arm64: tags: Preserve tags for addresses translated via TTBR1")
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NWill Deacon <will@kernel.org>

d0022c0e

arm64: lse: Fix LSE atomics with LLVM · dd1f6308

由 Vincenzo Frascino 提交于 2月 18, 2020

Commit e0d5896b ("arm64: lse: fix LSE atomics with LLVM's integrated
assembler") broke the build when clang is used in connjunction with the
binutils assembler ("-no-integrated-as"). This happens because
__LSE_PREAMBLE is defined as ".arch armv8-a+lse", which overrides the
version of the CPU architecture passed via the "-march" paramter to gas:

$ aarch64-none-linux-gnu-as -EL -I ./arch/arm64/include
                                -I ./arch/arm64/include/generated
                                -I ./include -I ./include
                                -I ./arch/arm64/include/uapi
                                -I ./arch/arm64/include/generated/uapi
                                -I ./include/uapi -I ./include/generated/uapi
                                -I ./init -I ./init
                                -march=armv8.3-a -o init/do_mounts.o
                                /tmp/do_mounts-d7992a.s
/tmp/do_mounts-d7992a.s: Assembler messages:
/tmp/do_mounts-d7992a.s:1959: Error: selected processor does not support `autiasp'
/tmp/do_mounts-d7992a.s:2021: Error: selected processor does not support `paciasp'
/tmp/do_mounts-d7992a.s:2157: Error: selected processor does not support `autiasp'
/tmp/do_mounts-d7992a.s:2175: Error: selected processor does not support `paciasp'
/tmp/do_mounts-d7992a.s:2494: Error: selected processor does not support `autiasp'

Fix the issue by replacing ".arch armv8-a+lse" with ".arch_extension lse".
Sami confirms that the clang integrated assembler does now support the
'.arch_extension' directive, so this change will be fine even for LTO
builds in future.

Fixes: e0d5896b ("arm64: lse: fix LSE atomics with LLVM's integrated assembler")
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reported-by: NAmit Kachhap <Amit.Kachhap@arm.com>
Tested-by: NSami Tolvanen <samitolvanen@google.com>
Signed-off-by: NVincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

dd1f6308

10 2月, 2020 2 次提交

arm64/spinlock: fix a -Wunused-function warning · 345d52c1

由 Qian Cai 提交于 1月 23, 2020

The commit f5bfdc8e ("locking/osq: Use optimized spinning loop for
arm64") introduced a warning from Clang because vcpu_is_preempted() is
compiled away,

kernel/locking/osq_lock.c:25:19: warning: unused function 'node_cpu'
[-Wunused-function]
static inline int node_cpu(struct optimistic_spin_node *node)
                  ^
1 warning generated.

Fix it by converting vcpu_is_preempted() to a static inline function.

Fixes: f5bfdc8e ("locking/osq: Use optimized spinning loop for arm64")
Acked-by: NWaiman Long <longman@redhat.com>
Signed-off-by: NQian Cai <cai@lca.pw>
Signed-off-by: NWill Deacon <will@kernel.org>

345d52c1

arm64: Drop do_el0_ia_bp_hardening() & do_sp_pc_abort() declarations · 5cb7a111

由 Anshuman Khandual 提交于 1月 27, 2020

There is a redundant do_sp_pc_abort() declaration in exceptions.h which can
be removed. Also do_el0_ia_bp_hardening() as been already been dropped with
the commit bfe29874 ("arm64: entry-common: don't touch daif before
bp-hardening") and hence does not need a declaration any more. This should
not introduce any functional change.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

5cb7a111

08 2月, 2020 1 次提交

irqchip/gic-v3-its: Rename VPENDBASER/VPROPBASER accessors · 5186a6cc

由 Zenghui Yu 提交于 2月 06, 2020

V{PEND,PROP}BASER registers are actually located in VLPI_base frame
of the *redistributor*. Rename their accessors to reflect this fact.

No functional changes.
Signed-off-by: NZenghui Yu <yuzenghui@huawei.com>
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200206075711.1275-7-yuzenghui@huawei.com

5186a6cc

04 2月, 2020 3 次提交

asm-generic: Make dma-contiguous.h a mandatory include/asm header · def3f7ce

由 Michal Simek 提交于 1月 17, 2020

dma-continuguous.h is generic for all architectures except arm32 which has
its own version.

Similar change was done for msi.h by commit a1b39bae
("asm-generic: Make msi.h a mandatory include/asm header")
Suggested-by: NChristoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/linux-arm-kernel/20200117080446.GA8980@lst.de/T/#m92bb56b04161057635d4142e1b3b9b6b0a70122eSigned-off-by: NMichal Simek <michal.simek@xilinx.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: Paul Walmsley <paul.walmsley@sifive.com> # for arch/riscv

def3f7ce

arm64: mm: convert mm/dump.c to use walk_page_range() · 102f45fd

由 Steven Price 提交于 2月 03, 2020

Now walk_page_range() can walk kernel page tables, we can switch the arm64
ptdump code over to using it, simplifying the code.

Link: http://lkml.kernel.org/r/20191218162402.45610-22-steven.price@arm.comSigned-off-by: NSteven Price <steven.price@arm.com>
Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zong Li <zong.li@sifive.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

102f45fd

arm64: mm: add p?d_leaf() definitions · 8aa82df3

由 Steven Price 提交于 2月 03, 2020

walk_page_range() is going to be allowed to walk page tables other than
those of user space.  For this it needs to know when it has reached a
'leaf' entry in the page tables.  This information will be provided by the
p?d_leaf() functions/macros.

For arm64, we already have p?d_sect() macros which we can reuse for
p?d_leaf().

pud_sect() is defined as a dummy function when CONFIG_PGTABLE_LEVELS < 3
or CONFIG_ARM64_64K_PAGES is defined.  However when the kernel is
configured this way then architecturally it isn't allowed to have a large
page at this level, and any code using these page walking macros is
implicitly relying on the page size/number of levels being the same as the
kernel.  So it is safe to reuse this for p?d_leaf() as it is an
architectural restriction.

Link: http://lkml.kernel.org/r/20191218162402.45610-5-steven.price@arm.comSigned-off-by: NSteven Price <steven.price@arm.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Zong Li <zong.li@sifive.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8aa82df3

28 1月, 2020 3 次提交

KVM: Move running VCPU from ARM to common code · 7495e22b

由 Paolo Bonzini 提交于 1月 09, 2020

For ring-based dirty log tracking, it will be more efficient to account
writes during schedule-out or schedule-in to the currently running VCPU.
We would like to do it even if the write doesn't use the current VCPU's
address space, as is the case for cached writes (see commit 4e335d9e,
"Revert "KVM: Support vCPU-based gfn->hva cache"", 2017-05-02).

Therefore, add a mechanism to track the currently-loaded kvm_vcpu struct.
There is already something similar in KVM/ARM; one important difference
is that kvm_arch_vcpu_{load,put} have two callers in virt/kvm/kvm_main.c:
we have to update both the architecture-independent vcpu_{load,put} and
the preempt notifiers.

Another change made in the process is to allow using kvm_get_running_vcpu()
in preemptible code. This is allowed because preempt notifiers ensure
that the value does not change even after the VCPU thread is migrated.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NPeter Xu <peterx@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7495e22b

KVM: Drop kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() · ddd259c9

由 Sean Christopherson 提交于 12月 18, 2019

Remove kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() now that all
arch specific implementations are nops.
Acked-by: NChristoffer Dall <christoffer.dall@arm.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ddd259c9

KVM: arm64: Free sve_state via arm specific hook · 19bcc89e

由 Sean Christopherson 提交于 12月 18, 2019

Add an arm specific hook to free the arm64-only sve_state.  Doing so
eliminates the last functional code from kvm_arch_vcpu_uninit() across
all architectures and paves the way for removing kvm_arch_vcpu_init()
and kvm_arch_vcpu_uninit() entirely.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

19bcc89e

23 1月, 2020 2 次提交

arm64: KVM: Add UAPI notes for swapped registers · 290a6bb0

由 Andrew Jones 提交于 1月 20, 2020

Two UAPI system register IDs do not derive their values from the
ARM system register encodings. This is because their values were
accidentally swapped. As the IDs are API, they cannot be changed.
Add WARNING notes to point them out.
Suggested-by: NMarc Zyngier <maz@kernel.org>
Signed-off-by: NAndrew Jones <drjones@redhat.com>
[maz: turned XXX into WARNING]
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200120130825.28838-1-drjones@redhat.com

290a6bb0

KVM: arm/arm64: Cleanup MMIO handling · 0e20f5e2

由 Marc Zyngier 提交于 12月 13, 2019

Our MMIO handling is a bit odd, in the sense that it uses an
intermediate per-vcpu structure to store the various decoded
information that describe the access.

But the same information is readily available in the HSR/ESR_EL2
field, and we actually use this field to populate the structure.

Let's simplify the whole thing by getting rid of the superfluous
structure and save a (tiny) bit of space in the vcpu structure.

[32bit fix courtesy of Olof Johansson <olof@lixom.net>]
Signed-off-by: NMarc Zyngier <maz@kernel.org>

0e20f5e2

22 1月, 2020 4 次提交

arm64: acpi: fix DAIF manipulation with pNMI · e533dbe9

由 Mark Rutland 提交于 1月 22, 2020

Since commit:

  d44f1b8d ("arm64: KVM/mm: Move SEA handling behind a single 'claim' interface")

... the top-level APEI SEA handler has the shape:

1. current_flags = arch_local_save_flags()
2. local_daif_restore(DAIF_ERRCTX)
3. <GHES handler>
4. local_daif_restore(current_flags)

However, since commit:

  4a503217 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking")

... when pseudo-NMIs (pNMIs) are in use, arch_local_save_flags() will save
the PMR value rather than the DAIF flags.

The combination of these two commits means that the APEI SEA handler will
erroneously attempt to restore the PMR value into DAIF. Fix this by
factoring local_daif_save_flags() out of local_daif_save(), so that we
can consistently save DAIF in step #1, regardless of whether pNMIs are in
use.

Both commits were introduced concurrently in v5.0.

Cc: <stable@vger.kernel.org>
Fixes: 4a503217 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking")
Fixes: d44f1b8d ("arm64: KVM/mm: Move SEA handling behind a single 'claim' interface")
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: NWill Deacon <will@kernel.org>

e533dbe9

irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation · 5e516846

由 Marc Zyngier 提交于 12月 24, 2019

GICv4.1 defines a new VPE table that is potentially shared between
both the ITSs and the redistributors, following complicated affinity
rules.

To make things more confusing, the programming of this table at
the redistributor level is reusing the GICv4.0 GICR_VPROPBASER register
for something completely different.

The code flow is somewhat complexified by the need to respect the
affinities required by the HW, meaning that tables can either be
inherited from a previously discovered ITS or redistributor.
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Reviewed-by: NZenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-6-maz@kernel.org

5e516846

arm64: Use v8.5-RNG entropy for KASLR seed · 2e8e1ea8

由 Mark Brown 提交于 1月 21, 2020

When seeding KALSR on a system where we have architecture level random
number generation make use of that entropy, mixing it in with the seed
passed by the bootloader. Since this is run very early in init before
feature detection is complete we open code rather than use archrandom.h.
Signed-off-by: NMark Brown <broonie@kernel.org>
Reviewed-by: NMark Rutland <mark.rutland@arm.com>
Reviewed-by: NArd Biesheuvel <ardb@kernel.org>
Signed-off-by: NWill Deacon <will@kernel.org>

2e8e1ea8

arm64: Implement archrandom.h for ARMv8.5-RNG · 1a50ec0b

由 Richard Henderson 提交于 1月 21, 2020

Expose the ID_AA64ISAR0.RNDR field to userspace, as the RNG system
registers are always available at EL0.

Implement arch_get_random_seed_long using RNDR.  Given that the
TRNG is likely to be a shared resource between cores, and VMs,
do not explicitly force re-seeding with RNDRRS.  In order to avoid
code complexity and potential issues with hetrogenous systems only
provide values after cpufeature has finalized the system capabilities.
Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
[Modified to only function after cpufeature has finalized the system
capabilities and move all the code into the header -- broonie]
Signed-off-by: NMark Brown <broonie@kernel.org>
Reviewed-by: NMark Rutland <mark.rutland@arm.com>
Reviewed-by: NArd Biesheuvel <ardb@kernel.org>
[will: Advertise HWCAP via /proc/cpuinfo]
Signed-off-by: NWill Deacon <will@kernel.org>

1a50ec0b

20 1月, 2020 4 次提交

KVM: arm/arm64: Correct AArch32 SPSR on exception entry · 1cfbb484

由 Mark Rutland 提交于 1月 08, 2020

Confusingly, there are three SPSR layouts that a kernel may need to deal
with:

(1) An AArch64 SPSR_ELx view of an AArch64 pstate
(2) An AArch64 SPSR_ELx view of an AArch32 pstate
(3) An AArch32 SPSR_* view of an AArch32 pstate

When the KVM AArch32 support code deals with SPSR_{EL2,HYP}, it's either
dealing with #2 or #3 consistently. On arm64 the PSR_AA32_* definitions
match the AArch64 SPSR_ELx view, and on arm the PSR_AA32_* definitions
match the AArch32 SPSR_* view.

However, when we inject an exception into an AArch32 guest, we have to
synthesize the AArch32 SPSR_* that the guest will see. Thus, an AArch64
host needs to synthesize layout #3 from layout #2.

This patch adds a new host_spsr_to_spsr32() helper for this, and makes
use of it in the KVM AArch32 support code. For arm64 we need to shuffle
the DIT bit around, and remove the SS bit, while for arm we can use the
value as-is.

I've open-coded the bit manipulation for now to avoid having to rework
the existing PSR_* definitions into PSR64_AA32_* and PSR32_AA32_*
definitions. I hope to perform a more thorough refactoring in future so
that we can handle pstate view manipulation more consistently across the
kernel tree.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Reviewed-by: NAlexandru Elisei <alexandru.elisei@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200108134324.46500-4-mark.rutland@arm.com

1cfbb484

KVM: arm/arm64: Correct CPSR on exception entry · 3c2483f1

由 Mark Rutland 提交于 1月 08, 2020

When KVM injects an exception into a guest, it generates the CPSR value
from scratch, configuring CPSR.{M,A,I,T,E}, and setting all other
bits to zero.

This isn't correct, as the architecture specifies that some CPSR bits
are (conditionally) cleared or set upon an exception, and others are
unchanged from the original context.

This patch adds logic to match the architectural behaviour. To make this
simple to follow/audit/extend, documentation references are provided,
and bits are configured in order of their layout in SPSR_EL2. This
layout can be seen in the diagram on ARM DDI 0487E.a page C5-426.

Note that this code is used by both arm and arm64, and is intended to
fuction with the SPSR_EL2 and SPSR_HYP layouts.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Reviewed-by: NAlexandru Elisei <alexandru.elisei@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200108134324.46500-3-mark.rutland@arm.com

3c2483f1

KVM: arm64: Correct PSTATE on exception entry · a425372e

由 Mark Rutland 提交于 1月 08, 2020

When KVM injects an exception into a guest, it generates the PSTATE
value from scratch, configuring PSTATE.{M[4:0],DAIF}, and setting all
other bits to zero.

This isn't correct, as the architecture specifies that some PSTATE bits
are (conditionally) cleared or set upon an exception, and others are
unchanged from the original context.

This patch adds logic to match the architectural behaviour. To make this
simple to follow/audit/extend, documentation references are provided,
and bits are configured in order of their layout in SPSR_EL2. This
layout can be seen in the diagram on ARM DDI 0487E.a page C5-429.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Reviewed-by: NAlexandru Elisei <alexandru.elisei@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200108134324.46500-2-mark.rutland@arm.com

a425372e

KVM: arm64: Only sign-extend MMIO up to register width · b6ae256a

由 Christoffer Dall 提交于 12月 12, 2019

On AArch64 you can do a sign-extended load to either a 32-bit or 64-bit
register, and we should only sign extend the register up to the width of
the register as specified in the operation (by using the 32-bit Wn or
64-bit Xn register specifier).

As it turns out, the architecture provides this decoding information in
the SF ("Sixty-Four" -- how cute...) bit.

Let's take advantage of this with the usual 32-bit/64-bit header file
dance and do the right thing on AArch64 hosts.
Signed-off-by: NChristoffer Dall <christoffer.dall@arm.com>
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191212195055.5541-1-christoffer.dall@arm.com

b6ae256a

18 1月, 2020 1 次提交

open: introduce openat2(2) syscall · fddb5d43

由 Aleksa Sarai 提交于 1月 18, 2020

/* Background. */
For a very long time, extending openat(2) with new features has been
incredibly frustrating. This stems from the fact that openat(2) is
possibly the most famous counter-example to the mantra "don't silently
accept garbage from userspace" -- it doesn't check whether unknown flags
are present[1].

This means that (generally) the addition of new flags to openat(2) has
been fraught with backwards-compatibility issues (O_TMPFILE has to be
defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old
kernels gave errors, since it's insecure to silently ignore the
flag[2]). All new security-related flags therefore have a tough road to
being added to openat(2).

Userspace also has a hard time figuring out whether a particular flag is
supported on a particular kernel. While it is now possible with
contemporary kernels (thanks to [3]), older kernels will expose unknown
flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during
openat(2) time matches modern syscall designs and is far more
fool-proof.

In addition, the newly-added path resolution restriction LOOKUP flags
(which we would like to expose to user-space) don't feel related to the
pre-existing O_* flag set -- they affect all components of path lookup.
We'd therefore like to add a new flag argument.

Adding a new syscall allows us to finally fix the flag-ignoring problem,
and we can make it extensible enough so that we will hopefully never
need an openat3(2).

/* Syscall Prototype. */
  /*
   * open_how is an extensible structure (similar in interface to
   * clone3(2) or sched_setattr(2)). The size parameter must be set to
   * sizeof(struct open_how), to allow for future extensions. All future
   * extensions will be appended to open_how, with their zero value
   * acting as a no-op default.
   */
  struct open_how { /* ... */ };

  int openat2(int dfd, const char *pathname,
              struct open_how *how, size_t size);

/* Description. */
The initial version of 'struct open_how' contains the following fields:

  flags
    Used to specify openat(2)-style flags. However, any unknown flag
    bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR)
    will result in -EINVAL. In addition, this field is 64-bits wide to
    allow for more O_ flags than currently permitted with openat(2).

  mode
    The file mode for O_CREAT or O_TMPFILE.

    Must be set to zero if flags does not contain O_CREAT or O_TMPFILE.

  resolve
    Restrict path resolution (in contrast to O_* flags they affect all
    path components). The current set of flags are as follows (at the
    moment, all of the RESOLVE_ flags are implemented as just passing
    the corresponding LOOKUP_ flag).

    RESOLVE_NO_XDEV       => LOOKUP_NO_XDEV
    RESOLVE_NO_SYMLINKS   => LOOKUP_NO_SYMLINKS
    RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS
    RESOLVE_BENEATH       => LOOKUP_BENEATH
    RESOLVE_IN_ROOT       => LOOKUP_IN_ROOT

open_how does not contain an embedded size field, because it is of
little benefit (userspace can figure out the kernel open_how size at
runtime fairly easily without it). It also only contains u64s (even
though ->mode arguably should be a u16) to avoid having padding fields
which are never used in the future.

Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE
is no longer permitted for openat(2). As far as I can tell, this has
always been a bug and appears to not be used by userspace (and I've not
seen any problems on my machines by disallowing it). If it turns out
this breaks something, we can special-case it and only permit it for
openat(2) but not openat2(2).

After input from Florian Weimer, the new open_how and flag definitions
are inside a separate header from uapi/linux/fcntl.h, to avoid problems
that glibc has with importing that header.

/* Testing. */
In a follow-up patch there are over 200 selftests which ensure that this
syscall has the correct semantics and will correctly handle several
attack scenarios.

In addition, I've written a userspace library[4] which provides
convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary
because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care
must be taken when using RESOLVE_IN_ROOT'd file descriptors with other
syscalls). During the development of this patch, I've run numerous
verification tests using libpathrs (showing that the API is reasonably
usable by userspace).

/* Future Work. */
Additional RESOLVE_ flags have been suggested during the review period.
These can be easily implemented separately (such as blocking auto-mount
during resolution).

Furthermore, there are some other proposed changes to the openat(2)
interface (the most obvious example is magic-link hardening[5]) which
would be a good opportunity to add a way for userspace to restrict how
O_PATH file descriptors can be re-opened.

Another possible avenue of future work would be some kind of
CHECK_FIELDS[6] flag which causes the kernel to indicate to userspace
which openat2(2) flags and fields are supported by the current kernel
(to avoid userspace having to go through several guesses to figure it
out).

[1]: https://lwn.net/Articles/588444/
[2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com
[3]: commit 629e014b ("fs: completely ignore unknown open flags")
[4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523
[5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/
[6]: https://youtu.be/ggD-eb3yPVsSuggested-by: NChristian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: NAleksa Sarai <cyphar@cyphar.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

fddb5d43

17 1月, 2020 8 次提交

arm64: entry: cleanup el0 svc handler naming · 7a2c0944

由 Mark Rutland 提交于 1月 16, 2020

For most of the exception entry code, <foo>_handler() is the first C
function called from the entry assembly in entry-common.c, and external
functions handling the bulk of the logic are called do_<foo>().

For consistency, apply this scheme to el0_svc_handler and
el0_svc_compat_handler, renaming them to do_el0_svc and
do_el0_svc_compat respectively.

There should be no functional change as a result of this patch.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Reviewed-by: NAnshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: NWill Deacon <will@kernel.org>

7a2c0944

arm64: assembler: remove smp_dmb macro · ddb953f8

由 Mark Rutland 提交于 1月 16, 2020

These days arm64 kernels are always SMP, and thus smp_dmb is an
overly-long way of writing dmb. Naturally, no-one uses it.

Remove the unused macro.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: NWill Deacon <will@kernel.org>

ddb953f8

arm64: assembler: remove inherit_daif macro · 170b25fa

由 Mark Rutland 提交于 1月 16, 2020

We haven't needed the inherit_daif macro since commit:

  ed3768db ("arm64: entry: convert el1_sync to C")

... which converted all callers to C and the local_daif_inherit
function.

Remove the unused macro.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: NWill Deacon <will@kernel.org>

170b25fa

arm64: Use macros instead of hard-coded constants for MAIR_EL1 · 95b3f74b

由 Catalin Marinas 提交于 12月 11, 2019

Currently, the arm64 __cpu_setup has hard-coded constants for the memory
attributes that go into the MAIR_EL1 register. Define proper macros in
asm/sysreg.h and make use of them in proc.S.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

95b3f74b

arm64: Add KRYO{3,4}XX CPU cores to spectre-v2 safe list · 83b0c36b

由 Sai Prakash Ranjan 提交于 1月 17, 2020

The "silver" KRYO3XX and KRYO4XX CPU cores are not affected by Spectre
variant 2. Add them to spectre_v2 safe list to correct the spurious
ARM_SMCCC_ARCH_WORKAROUND_1 warning and vulnerability status reported
under sysfs.
Reviewed-by: NStephen Boyd <swboyd@chromium.org>
Tested-by: NStephen Boyd <swboyd@chromium.org>
Signed-off-by: NSai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
[will: tweaked commit message to remove stale mention of "gold" cores]
Signed-off-by: NWill Deacon <will@kernel.org>

83b0c36b

locking/osq: Use optimized spinning loop for arm64 · f5bfdc8e

由 Waiman Long 提交于 1月 13, 2020

Arm64 has a more optimized spinning loop (atomic_cond_read_acquire)
using wfe for spinlock that can boost performance of sibling threads
by putting the current cpu to a wait state that is broken only when
the monitored variable changes or an external event happens.

OSQ has a more complicated spinning loop. Besides the lock value, it
also checks for need_resched() and vcpu_is_preempted(). The check for
need_resched() is not a problem as it is only set by the tick interrupt
handler. That will be detected by the spinning cpu right after iret.

The vcpu_is_preempted() check, however, is a problem as changes to the
preempt state of of previous node will not affect the wait state. For
ARM64, vcpu_is_preempted is not currently defined and so is a no-op.
Will has indicated that he is planning to para-virtualize wfe instead
of defining vcpu_is_preempted for PV support. So just add a comment in
arch/arm64/include/asm/spinlock.h to indicate that vcpu_is_preempted()
should not be defined as suggested.

On a 2-socket 56-core 224-thread ARM64 system, a kernel mutex locking
microbenchmark was run for 10s with and without the patch. The
performance numbers before patch were:

Running locktest with mutex [runtime = 10s, load = 1]
Threads = 224, Min/Mean/Max = 316/123,143/2,121,269
Threads = 224, Total Rate = 2,757 kop/s; Percpu Rate = 12 kop/s

After patch, the numbers were:

Running locktest with mutex [runtime = 10s, load = 1]
Threads = 224, Min/Mean/Max = 334/147,836/1,304,787
Threads = 224, Total Rate = 3,311 kop/s; Percpu Rate = 15 kop/s

So there was about 20% performance improvement.
Signed-off-by: NWaiman Long <longman@redhat.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NWill Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200113150735.21956-1-longman@redhat.com

f5bfdc8e

arm64: fix alternatives with LLVM's integrated assembler · c54f90c2

由 Sami Tolvanen 提交于 10月 31, 2019

LLVM's integrated assembler fails with the following error when
building KVM:

  <inline asm>:12:6: error: expected absolute expression
   .if kvm_update_va_mask == 0
       ^
  <inline asm>:21:6: error: expected absolute expression
   .if kvm_update_va_mask == 0
       ^
  <inline asm>:24:2: error: unrecognized instruction mnemonic
          NOT_AN_INSTRUCTION
          ^
  LLVM ERROR: Error parsing inline asm

These errors come from ALTERNATIVE_CB and __ALTERNATIVE_CFG,
which test for the existence of the callback parameter in inline
assembly using the following expression:

  " .if " __stringify(cb) " == 0\n"

This works with GNU as, but isn't supported by LLVM. This change
splits __ALTERNATIVE_CFG and ALTINSTR_ENTRY into separate macros
to fix the LLVM build.

Link: https://github.com/ClangBuiltLinux/linux/issues/472Signed-off-by: NSami Tolvanen <samitolvanen@google.com>
Tested-by: NNick Desaulniers <ndesaulniers@google.com>
Reviewed-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NWill Deacon <will@kernel.org>

c54f90c2

arm64: lse: fix LSE atomics with LLVM's integrated assembler · e0d5896b

由 Sami Tolvanen 提交于 10月 31, 2019

Unlike gcc, clang considers each inline assembly block to be independent
and therefore, when using the integrated assembler for inline assembly,
any preambles that enable features must be repeated in each block.

This change defines __LSE_PREAMBLE and adds it to each inline assembly
block that has LSE instructions, which allows them to be compiled also
with clang's assembler.

Link: https://github.com/ClangBuiltLinux/linux/issues/671Signed-off-by: NSami Tolvanen <samitolvanen@google.com>
Tested-by: NAndrew Murray <andrew.murray@arm.com>
Tested-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NAndrew Murray <andrew.murray@arm.com>
Reviewed-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NNick Desaulniers <ndesaulniers@google.com>
Signed-off-by: NWill Deacon <will@kernel.org>

e0d5896b

16 1月, 2020 4 次提交

arm64: Implement optimised checksum routine · 5777eaed

由 Robin Murphy 提交于 1月 15, 2020

Apparently there exist certain workloads which rely heavily on software
checksumming, for which the generic do_csum() implementation becomes a
significant bottleneck. Therefore let's give arm64 its own optimised
version - for ease of maintenance this foregoes assembly or intrisics,
and is thus not actually arm64-specific, but does rely heavily on C
idioms that translate well to the A64 ISA and the typical load/store
capabilities of most ARMv8 CPU cores.

The resulting increase in checksum throughput scales nicely with buffer
size, tending towards 4x for a small in-order core (Cortex-A53), and up
to 6x or more for an aggressive big core (Ampere eMAG).
Reported-by: NLingyan Huang <huanglingyan2@huawei.com>
Tested-by: NLingyan Huang <huanglingyan2@huawei.com>
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

5777eaed

arm64: Workaround for Cortex-A55 erratum 1530923 · 275fa0ea

由 Steven Price 提交于 12月 16, 2019

Cortex-A55 erratum 1530923 allows TLB entries to be allocated as a
result of a speculative AT instruction. This may happen in the middle of
a guest world switch while the relevant VMSA configuration is in an
inconsistent state, leading to erroneous content being allocated into
TLBs.

The same workaround as is used for Cortex-A76 erratum 1165522
(WORKAROUND_SPECULATIVE_AT_VHE) can be used here. Note that this
mandates the use of VHE on affected parts.
Acked-by: NMarc Zyngier <maz@kernel.org>
Signed-off-by: NSteven Price <steven.price@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

275fa0ea

arm64: Rename WORKAROUND_1319367 to SPECULATIVE_AT_NVHE · db0d46a5

由 Steven Price 提交于 12月 16, 2019

To match SPECULATIVE_AT_VHE let's also have a generic name for the NVHE
variant.
Acked-by: NMarc Zyngier <maz@kernel.org>
Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: NSteven Price <steven.price@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

db0d46a5

arm64: Rename WORKAROUND_1165522 to SPECULATIVE_AT_VHE · e85d68fa

由 Steven Price 提交于 12月 16, 2019

Cortex-A55 is affected by a similar erratum, so rename the existing
workaround for errarum 1165522 so it can be used for both errata.
Acked-by: NMarc Zyngier <maz@kernel.org>
Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: NSteven Price <steven.price@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

e85d68fa

15 1月, 2020 6 次提交

arm64: Use a variable to store non-global mappings decision · 09e3c22a

由 Mark Brown 提交于 12月 09, 2019

Refactor the code which checks to see if we need to use non-global
mappings to use a variable instead of checking with the CPU capabilities
each time, doing the initial check for KPTI early in boot before we
start allocating memory so we still avoid transitioning to non-global
mappings in common cases.

Since this variable always matches our decision about non-global
mappings this means we can also combine arm64_kernel_use_ng_mappings()
and arm64_unmap_kernel_at_el0() into a single function, the variable
simply stores the result and the decision code is elsewhere. We could
just have the users check the variable directly but having a function
makes it clear that these uses are read-only.

The result is that we simplify the code a bit and reduces the amount of
code executed at runtime.
Signed-off-by: NMark Brown <broonie@kernel.org>
Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

09e3c22a

arm64: Don't use KPTI where we have E0PD · 92ac6fd1

由 Mark Brown 提交于 12月 09, 2019

Since E0PD is intended to fulfil the same role as KPTI we don't need to
use KPTI on CPUs where E0PD is available, we can rely on E0PD instead.
Change the check that forces KPTI on when KASLR is enabled to check for
E0PD before doing so, CPUs with E0PD are not expected to be affected by
meltdown so should not need to enable KPTI for other reasons.

Since E0PD is a system capability we will still enable KPTI if any of
the CPUs in the system lacks E0PD, this will rewrite any global mappings
that were established in systems where some but not all CPUs support
E0PD. We may transiently have a mix of global and non-global mappings
while booting since we use the local CPU when deciding if KPTI will be
required prior to completing CPU enumeration but any global mappings
will be converted to non-global ones when KPTI is applied.

KPTI can still be forced on from the command line if required.
Signed-off-by: NMark Brown <broonie@kernel.org>
Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

92ac6fd1

arm64: Factor out checks for KASLR in KPTI code into separate function · c2d92353

由 Mark Brown 提交于 12月 09, 2019

In preparation for integrating E0PD support with KASLR factor out the
checks for interaction between KASLR and KPTI done in boot context into
a new function kaslr_requires_kpti(), in the process clarifying the
distinction between what we do in boot context and what we do at
runtime.
Signed-off-by: NMark Brown <broonie@kernel.org>
Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

c2d92353

arm64: Add initial support for E0PD · 3e6c69a0

由 Mark Brown 提交于 12月 09, 2019

Kernel Page Table Isolation (KPTI) is used to mitigate some speculation
based security issues by ensuring that the kernel is not mapped when
userspace is running but this approach is expensive and is incompatible
with SPE.  E0PD, introduced in the ARMv8.5 extensions, provides an
alternative to this which ensures that accesses from userspace to the
kernel's half of the memory map to always fault with constant time,
preventing timing attacks without requiring constant unmapping and
remapping or preventing legitimate accesses.

Currently this feature will only be enabled if all CPUs in the system
support E0PD, if some CPUs do not support the feature at boot time then
the feature will not be enabled and in the unlikely event that a late
CPU is the first CPU to lack the feature then we will reject that CPU.

This initial patch does not yet integrate with KPTI, this will be dealt
with in followup patches.  Ideally we could ensure that by default we
don't use KPTI on CPUs where E0PD is present.
Signed-off-by: NMark Brown <broonie@kernel.org>
Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
[will: Fixed typo in Kconfig text]
Signed-off-by: NWill Deacon <will@kernel.org>

3e6c69a0

arm64: Move the LSE gas support detection to Kconfig · 395af861

由 Catalin Marinas 提交于 1月 15, 2020

As the Kconfig syntax gained support for $(as-instr) tests, move the LSE
gas support detection from Makefile to the main arm64 Kconfig and remove
the additional CONFIG_AS_LSE definition and check.

Cc: Will Deacon <will@kernel.org>
Reviewed-by: NVladimir Murzin <vladimir.murzin@arm.com>
Tested-by: NVladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

395af861

arm64: Introduce ID_ISAR6 CPU register · 8e3747be

由 Anshuman Khandual 提交于 12月 17, 2019

This adds basic building blocks required for ID_ISAR6 CPU register which
identifies support for various instruction implementation on AArch32 state.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-kernel@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu
Acked-by: NMarc Zyngier <maz@kernel.org>
Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com>
[will: Ensure SPECRES is treated the same as on A64]
Signed-off-by: NWill Deacon <will@kernel.org>

8e3747be

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功