- 25 May 2020, 1 commit
-
-
Committed by David Brazdil
Pull bits of code to the only place where it is used. Remove empty function __cpu_init_stage2(). Remove redundant has_vhe() check since this function is nVHE-only. No functional changes intended.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200515152056.83158-1-dbrazdil@google.com
-
- 16 May 2020, 14 commits
-
-
Committed by Marc Zyngier
KVM_CAP_MAX_VCPUS always returns the maximum possible number of VCPUs, irrespective of the selected interrupt controller. This is pretty misleading for userspace that selects a GICv2 on a GICv3 system that supports v2 compat: it always gets a maximum of 512 VCPUs, even if the effective limit is 8. The 9th VCPU will fail to be created, which is unexpected as far as userspace is concerned. Fortunately, we already have the right information stashed in the kvm structure, and we can return it as requested.

Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20200427141507.284985-1-maz@kernel.org
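For illustration only (not part of the patch), a minimal userspace sketch of querying the now-per-VM limit after the in-kernel GIC has been created; the helper name and fallback value are assumptions:

    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /*
     * Hypothetical helper: ask KVM for the vCPU limit of this specific VM,
     * so the answer reflects the chosen GIC model (e.g. 8 for a GICv2
     * guest) rather than the system-wide maximum.
     */
    int query_max_vcpus(int vm_fd)
    {
        int max = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_MAX_VCPUS);

        if (max <= 0)
            max = 4; /* assumed fallback when the capability is absent */
        printf("this VM supports up to %d vcpus\n", max);
        return max;
    }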
-
Committed by Keqian Zhu
There is already support for enabling dirty log gradually in small chunks for x86 in commit 3c9bd400 ("KVM: x86: enable dirty log gradually in small chunks"). This adds support for arm64. x86 still write-protects all huge pages when DIRTY_LOG_INITIALLY_ALL_SET is enabled. However, for arm64, both huge pages and normal pages can be write-protected gradually by userspace.

On a Huawei Kunpeng 920 2.6GHz platform, I did some tests on 128G Linux VMs with different page sizes. The memory pressure is 127G in each case. The time taken by memory_global_dirty_log_start in QEMU is listed below:

Page Size    Before    After Optimization
4K           650ms     1.8ms
2M           4ms       1.8ms
1G           2ms       1.8ms

Besides the time reduction, the biggest improvement is that we minimize the performance side effect (caused by dissolving huge pages and marking memslots dirty) on the guest after enabling the dirty log.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200413122023.52583-1-zhukeqian1@huawei.com
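As a hedged companion sketch (not part of the arm64 patch itself), this is roughly how userspace opts in to the lazy mode using the generic KVM capability named in the x86 series; the constants are the upstream ones, the helper name is made up:

    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /*
     * Sketch: enable manual dirty-log protection with the "initially all
     * set" behaviour on a VM fd. After this, enabling dirty logging on a
     * memslot no longer write-protects everything up front; pages are
     * write-protected lazily as userspace clears the bitmap in chunks.
     */
    static int enable_lazy_dirty_log(int vm_fd)
    {
        struct kvm_enable_cap cap = {
            .cap = KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2,
            .args[0] = KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
                       KVM_DIRTY_LOG_INITIALLY_SET,
        };

        return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
    }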
-
Committed by Suzuki K Poulose
We support mapping host memory backed by PMD transparent hugepages at stage2 as huge pages. However the checks are now spread across two different places. Let us unify the handling of the THPs to keep the code cleaner (and future proof for PUD THP support). This patch moves transparent_hugepage_adjust() closer to the caller to avoid a forward declaration for fault_supports_stage2_huge_mappings(). Also, since we already handle the case where the host VA and the guest PA may not be aligned, the explicit VM_BUG_ON() is not required.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200507123546.1875-3-yuzenghui@huawei.com
-
Committed by Suzuki K Poulose
If we are checking whether the stage2 can map PAGE_SIZE, we don't have to do the boundary checks as both the host VMA and the guest memslots are page aligned. Bail the case easily. While we're at it, fixup a typo in the comment below.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200507123546.1875-2-yuzenghui@huawei.com
-
Committed by Jiang Yi
Do cond_resched_lock() in stage2_flush_memslot() like what is done in unmap_stage2_range() and other places holding mmu_lock while processing a possibly large range of memory.

Signed-off-by: Jiang Yi <giangyi@amazon.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20200415084229.29992-1-giangyi@amazon.com
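The pattern being applied looks roughly like the sketch below (simplified, not the exact upstream hunk; headers elided); stage2_pgd_addr_end() and cond_resched_lock() are existing kernel helpers:

    /*
     * Simplified sketch: walk a potentially large stage-2 range under
     * mmu_lock, but give the lock up briefly between blocks if a
     * reschedule is due, mirroring what unmap_stage2_range() does.
     */
    static void stage2_flush_range_sketch(struct kvm *kvm,
                                          phys_addr_t addr, phys_addr_t end)
    {
        phys_addr_t next;

        do {
            next = stage2_pgd_addr_end(kvm, addr, end);
            /* ... clean/invalidate the stage-2 entries in [addr, next) ... */
            if (next != end)
                cond_resched_lock(&kvm->mmu_lock);
        } while (addr = next, addr != end);
    }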
-
Committed by Zenghui Yu
stage2_unmap_vm() was introduced to unmap user RAM regions in the stage2 page table to make the caches coherent. E.g., a guest reboot with the stage1 MMU disabled will access memory using non-cacheable attributes. If the RAM and caches are not coherent at this stage, some evicted dirty cache line may go and corrupt guest data in RAM. Since ARMv8.4, the S2FWB feature is mandatory and KVM will take advantage of it to configure the stage2 page table and the attributes of memory access. So we ensure that guests always access memory using cacheable attributes, and thus the caches are always coherent. So on CPUs that support S2FWB, we can safely reset the vcpu without a heavy stage2 unmapping.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200415072835.1164-1-yuzenghui@huawei.com
-
Committed by Fuad Tabba
Fix spelling and typos (e.g., repeated words) in comments.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200401140310.29701-1-tabba@google.com
-
Committed by Marc Zyngier
By the time we start using the has_vhe() helper, we have long discovered whether we are running VHE or not. It thus makes sense to use cpus_have_final_cap() instead of cpus_have_const_cap(), which leads to a small text size reduction.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: David Brazdil <dbrazdil@google.com>
Link: https://lore.kernel.org/r/20200513103828.74580-1-maz@kernel.org
-
Committed by Marc Zyngier
Now that this function isn't constrained by the 32bit PCS, let's simplify it by taking a single 64bit offset instead of two 32bit parameters.

Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Committed by Fuad Tabba
Consolidate references to the CONFIG_KVM configuration item to encompass entire folders rather than per line.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200505154520.194120-5-tabba@google.com
-
Committed by Will Deacon
Changing CONFIG_KVM to be a 'menuconfig' entry in Kconfig means that we can straightforwardly enumerate optional features, such as the virtual PMU device, as dependent options.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200505154520.194120-4-tabba@google.com
-
Committed by Will Deacon
arm64 KVM supports 16k pages since 02e0b760 ("arm64: kvm: Add support for 16K pages"), so update the Kconfig help text accordingly.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200505154520.194120-3-tabba@google.com
-
Committed by Will Deacon
CONFIG_KVM_ARM_HOST is just a proxy for CONFIG_KVM, so remove it in favour of the latter.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200505154520.194120-2-tabba@google.com
-
Committed by Marc Zyngier
Now that the 32bit KVM/arm host is a distant memory, let's move the whole of the KVM/arm64 code into the arm64 tree. As they said in the song: Welcome Home (Sanitarium).

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200513104034.74741-1-maz@kernel.org
-
- 07 May 2020, 1 commit
-
-
Committed by Mark Rutland
The static analyzer in GCC 10 spotted that in huge_pte_alloc() we may pass a NULL pmdp into pte_alloc_map() when pmd_alloc() returns NULL:

|   CC      arch/arm64/mm/pageattr.o
|   CC      arch/arm64/mm/hugetlbpage.o
| from arch/arm64/mm/hugetlbpage.c:10:
| arch/arm64/mm/hugetlbpage.c: In function ‘huge_pte_alloc’:
| ./arch/arm64/include/asm/pgtable-types.h:28:24: warning: dereference of NULL ‘pmdp’ [CWE-690] [-Wanalyzer-null-dereference]
| ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’
| arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’
| |arch/arm64/mm/hugetlbpage.c:232:10:
| |./arch/arm64/include/asm/pgtable-types.h:28:24:
| ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’
| arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’

This can only occur when the kernel cannot allocate a page, and so is unlikely to happen in practice before other systems start failing. We can avoid this by bailing out if pmd_alloc() fails, as we do earlier in the function if pud_alloc() fails.

Fixes: 66b3923a ("arm64: hugetlb: add support for PTE contiguous bit")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Kyrill Tkachov <kyrylo.tkachov@arm.com>
Cc: <stable@vger.kernel.org> # 4.5.x-
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
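The shape of the fix, as a hedged sketch rather than the literal upstream diff (generic page-table helpers, contiguous-hugepage handling elided):

    /*
     * Sketch of huge_pte_alloc() with the missing NULL check: bail out
     * when pmd_alloc() fails, just as the function already does when
     * pud_alloc() fails, instead of handing a NULL pmdp to
     * pte_alloc_map().
     */
    pte_t *huge_pte_alloc_sketch(struct mm_struct *mm,
                                 unsigned long addr, unsigned long sz)
    {
        pgd_t *pgdp = pgd_offset(mm, addr);
        pud_t *pudp = pud_alloc(mm, pgdp, addr);
        pmd_t *pmdp;

        if (!pudp)
            return NULL;

        pmdp = pmd_alloc(mm, pudp, addr);
        if (!pmdp)      /* the case GCC 10's analyzer complained about */
            return NULL;

        return pte_alloc_map(mm, pmdp, addr);
    }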
-
- 01 May 2020, 2 commits
-
-
Committed by Marc Zyngier
In the unlikely event that a 32bit vcpu traps into the hypervisor on an instruction that is located right at the end of the 32bit range, the emulation of that instruction is going to increment PC past the 32bit range. This isn't great, as userspace can then observe this value and get a bit confused. Conversely, userspace can do things like (in the context of a 64bit guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0, set PC to a 64bit value, change PSTATE to AArch32-USR, and observe that PC hasn't been truncated. More confusion.

Fix both by:
- truncating PC increments for 32bit guests
- sanitizing all 32bit regs every time a core reg is changed by userspace and PSTATE indicates a 32bit mode

Cc: stable@vger.kernel.org
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
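An illustrative sketch of the first half of the fix (helper names match the KVM/arm64 accessors, but the body is simplified and hedged):

    /*
     * Keep PC inside the 32-bit range when advancing it for an AArch32
     * guest after emulating a trapped instruction.
     */
    static void skip_instr32_sketch(struct kvm_vcpu *vcpu)
    {
        u64 pc = *vcpu_pc(vcpu);

        pc += kvm_vcpu_trap_il_is32bit(vcpu) ? 4 : 2;

        if (vcpu_mode_is_32bit(vcpu))
            pc &= GENMASK(31, 0);   /* truncate: the AArch32 PC is 32 bits */

        *vcpu_pc(vcpu) = pc;
    }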
-
Committed by Vincenzo Frascino
On arm64 linux gcc uses -fasynchronous-unwind-tables -funwind-tables by default since gcc-8, so now the de facto platform ABI is to allow unwinding from async signal handlers. However on bare metal targets (aarch64-none-elf), and on old gcc, async and sync unwind tables are not enabled by default to avoid runtime memory costs. This means if linux is built with a baremetal toolchain the vdso.so may not have unwind tables which breaks the gcc platform ABI guarantee in userspace. Add -fasynchronous-unwind-tables explicitly to the vgettimeofday.o cflags to address the ABI change.

Fixes: 28b1a824 ("arm64: vdso: Substitute gettimeofday() with C implementation")
Cc: Will Deacon <will@kernel.org>
Reported-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
- 30 April 2020, 4 commits
-
-
Committed by Marc Zyngier
We currently save/restore sp_el0 in C code. This is a bit unsafe, as a lot of the C code expects 'current' to be accessible from there (and the opportunity to run kernel code in HYP is specially great with VHE). Instead, let's move the save/restore of sp_el0 to the assembly code (in __guest_enter), making sure that sp_el0 is correct very early on when we exit the guest, and is preserved as long as possible to its host value when we enter the guest.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Committed by Fangrui Song
SYM_CODE_START defines \label, so it is redundant to define \label again. A redefinition at the same place is accepted by GNU as (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=159fbb6088f17a341bcaaac960623cab881b4981) but rejected by the clang integrated assembler.

Fixes: 617a2f39 ("arm64: kvm: Annotate assembly using modern annoations")
Signed-off-by: Fangrui Song <maskray@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/988
Link: https://lore.kernel.org/r/20200413231016.250737-1-maskray@google.com
-
Committed by Jason A. Donenfeld
Rather than chunking via PAGE_SIZE, this commit changes the arch implementations to chunk in explicit 4k parts, so that calculations on maximum acceptable latency don't suddenly become invalid on platforms where PAGE_SIZE isn't 4k, such as arm64.

Fixes: 0f961f9f ("crypto: x86/nhpoly1305 - add AVX2 accelerated NHPoly1305")
Fixes: 012c8238 ("crypto: x86/nhpoly1305 - add SSE2 accelerated NHPoly1305")
Fixes: a00fa0c8 ("crypto: arm64/nhpoly1305 - add NEON-accelerated NHPoly1305")
Fixes: 16aae359 ("crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305")
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Committed by Jason A. Donenfeld
The initial Zinc patchset, after some mailing list discussion, contained code to ensure that kernel_fpu_enable would not be kept on for more than a 4k chunk, since it disables preemption. The choice of 4k isn't totally scientific, but it's not a bad guess either, and it's what's used in both the x86 poly1305, blake2s, and nhpoly1305 code already (in the form of PAGE_SIZE, which this commit corrects to be explicitly 4k for the former two). Ard did some back of the envelope calculations and found that at 5 cycles/byte (overestimate) on a 1ghz processor (pretty slow), 4k means we have a maximum preemption disabling of 20us, which Sebastian confirmed was probably a good limit. Unfortunately the chunking appears to have been left out of the final patchset that added the glue code. So, this commit adds it back in.

Fixes: 84e03fa3 ("crypto: x86/chacha - expose SIMD ChaCha routine as library function")
Fixes: b3aad5ba ("crypto: arm64/chacha - expose arm64 ChaCha routine as library function")
Fixes: a44a3430 ("crypto: arm/chacha - expose ARM ChaCha routine as library function")
Fixes: d7d7b853 ("crypto: x86/poly1305 - wire up faster implementations for kernel")
Fixes: f569ca16 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation")
Fixes: a6b803b3 ("crypto: arm/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation")
Fixes: ed0356ed ("crypto: blake2s - x86_64 SIMD implementation")
Cc: Eric Biggers <ebiggers@google.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
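A hedged sketch of the chunking pattern being restored, modelled on the arm64 ChaCha library glue (the inner helper name is taken from that glue code; details simplified, headers elided):

    /*
     * Sketch: never keep the NEON unit (and thus preemption disabled)
     * busy for more than 4 KiB of input at a time.
     */
    void chacha_crypt_arch_sketch(u32 *state, u8 *dst, const u8 *src,
                                  unsigned int bytes, int nrounds)
    {
        while (bytes > 0) {
            unsigned int todo = min_t(unsigned int, bytes, SZ_4K);

            kernel_neon_begin();
            chacha_doneon(state, dst, src, todo, nrounds);
            kernel_neon_end();

            bytes -= todo;
            src += todo;
            dst += todo;
        }
    }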
-
- 23 April 2020, 1 commit
-
-
Committed by Masahiro Yamada
As the bug report [1] pointed out, <linux/vermagic.h> must be included after <linux/module.h>. I believe we should not impose any include order restriction. We often sort include directives alphabetically, but it is just coding style convention. Technically, we can include header files in any order by making every header self-contained. Currently, arch-specific MODULE_ARCH_VERMAGIC is defined in <asm/module.h>, which is not included from <linux/vermagic.h>. Hence, the straight-forward fix-up would be as follows:

|--- a/include/linux/vermagic.h
|+++ b/include/linux/vermagic.h
|@@ -1,5 +1,6 @@
| /* SPDX-License-Identifier: GPL-2.0 */
| #include <generated/utsrelease.h>
|+#include <linux/module.h>
|
| /* Simply sanity version stamp for modules. */
| #ifdef CONFIG_SMP

This works enough, but for further cleanups, I split MODULE_ARCH_VERMAGIC definitions into <asm/vermagic.h>. With this, <linux/module.h> and <linux/vermagic.h> will be orthogonal, and the location of MODULE_ARCH_VERMAGIC definitions will be consistent. For arc and ia64, MODULE_PROC_FAMILY is only used for defining MODULE_ARCH_VERMAGIC. I squashed it. For hexagon, nds32, and xtensa, I removed <asm/modules.h> entirely because they contained nothing but MODULE_ARCH_VERMAGIC definition. Kbuild will automatically generate <asm/modules.h> at build-time, wrapping <asm-generic/module.h>.

[1] https://lore.kernel.org/lkml/20200411155623.GA22175@zn.tnic

Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
-
- 21 April 2020, 1 commit
-
-
Committed by Mark Rutland
A direct write to a APxxKey_EL1 register requires a context synchronization event to ensure that indirect reads made by subsequent instructions (e.g. AUTIASP, PACIASP) observe the new value. When we initialize the boot task's APIAKey in boot_init_stack_canary() via ptrauth_keys_switch_kernel() we miss the necessary ISB, and so there is a window where instructions are not guaranteed to use the new APIAKey value. This has been observed to result in boot-time crashes where PACIASP and AUTIASP within a function used a mixture of the old and new key values. Fix this by having ptrauth_keys_switch_kernel() synchronize the new key value with an ISB. At the same time, __ptrauth_key_install() is renamed to __ptrauth_key_install_nosync() so that it is obvious that this performs no synchronization itself.

Fixes: 28321582 ("arm64: initialize ptrauth keys for kernel booting task")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Will Deacon <will@kernel.org>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Will Deacon <will@kernel.org>
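A simplified sketch of the resulting shape (close to, but not claimed to be, the exact upstream code; surrounding key-management details elided):

    #define __ptrauth_key_install_nosync(k, v)                      \
    do {                                                            \
        struct ptrauth_key __pki_v = (v);                           \
        write_sysreg_s(__pki_v.lo, SYS_ ## k ## KEYLO_EL1);         \
        write_sysreg_s(__pki_v.hi, SYS_ ## k ## KEYHI_EL1);         \
    } while (0)

    static inline void ptrauth_keys_switch_kernel_sketch(struct ptrauth_keys_kernel *keys)
    {
        if (system_supports_address_auth())
            __ptrauth_key_install_nosync(APIA, keys->apia);
        isb();  /* context synchronization: later PACIASP/AUTIASP see the new key */
    }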
-
- 15 April 2020, 2 commits
-
-
Committed by Fangrui Song
In assembly, many instances of __emit_inst(x) expand to a directive. In a few places __emit_inst(x) is used as an assembler macro argument. For example, in arch/arm64/kvm/hyp/entry.S

  ALTERNATIVE(nop, SET_PSTATE_PAN(1), ARM64_HAS_PAN, CONFIG_ARM64_PAN)

expands to the following by the C preprocessor:

  alternative_insn nop, .inst (0xd500401f | ((0) << 16 | (4) << 5) | ((!!1) << 8)), 4, 1

Both comma and space are separators, with an exception that content inside a pair of parentheses/quotes is not split, so the clang integrated assembler splits the arguments to:

  nop, .inst, (0xd500401f | ((0) << 16 | (4) << 5) | ((!!1) << 8)), 4, 1

GNU as preprocesses the input with do_scrub_chars(). Its arm64 backend (along with many other non-x86 backends) sees:

  alternative_insn nop,.inst(0xd500401f|((0)<<16|(4)<<5)|((!!1)<<8)),4,1
  # .inst(...) is parsed as one argument

while its x86 backend sees:

  alternative_insn nop,.inst (0xd500401f|((0)<<16|(4)<<5)|((!!1)<<8)),4,1
  # The extra space before '(' makes the whole .inst (...) parsed as two arguments

The non-x86 backend's behavior is considered unintentional (https://sourceware.org/bugzilla/show_bug.cgi?id=25750). So drop the space separator inside `.inst (...)` to make the clang integrated assembler work.

Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Signed-off-by: Fangrui Song <maskray@google.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/939
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
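For reference, the essence of the change expressed as a hedged before/after of the assembly-side macro (simplified from asm/sysreg.h):

    /* before (shown commented out): the space lets some assembler
     * backends split ".inst" and "(...)" into two macro arguments */
    /*   #define __emit_inst(x)  .inst (x)   */

    /* after: no separator, so ".inst(...)" stays a single argument */
    #define __emit_inst(x)  .inst(x)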
-
Committed by Mark Rutland
The aarch32_vdso_pages[] array never has entries allocated in the C_VVAR or C_VDSO slots, and as the array is zero initialized these contain NULL. However in __aarch32_alloc_vdso_pages() when aarch32_alloc_kuser_vdso_page() fails we attempt to free the page whose struct page is at NULL, which is obviously nonsensical. This patch removes the erroneous page freeing.

Fixes: 7c1deeeb ("arm64: compat: VDSO setup for compat layer")
Cc: <stable@vger.kernel.org> # 5.3.x-
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
- 11 April 2020, 5 commits
-
-
Committed by Logan Gunthorpe
devm_memremap_pages() is currently used by the PCI P2PDMA code to create struct page mappings for IO memory. At present, these mappings are created with PAGE_KERNEL which implies setting the PAT bits to be WB. However, on x86, an mtrr register will typically override this and force the cache type to be UC-. In the case firmware doesn't set this register it is effectively WB and will typically result in a machine check exception when it's accessed. Other arches are not currently likely to function correctly seeing they don't have any MTRR registers to fall back on. To solve this, provide a way to specify the pgprot value explicitly to arch_add_memory(). Of the arches that support MEMORY_HOTPLUG: x86_64, and arm64 need a simple change to pass the pgprot_t down to their respective functions which set up the page tables. For x86_32, set the page tables explicitly using _set_memory_prot() (seeing they are already mapped). For ia64, s390 and sh, reject anything but PAGE_KERNEL settings -- this should be fine, for now, seeing these architectures don't support ZONE_DEVICE. A check in __add_pages() is also added to ensure the pgprot parameter was set for all arches.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
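A hedged sketch of what the arch side of such a change looks like; the struct is trimmed to the field this series cares about and the mapping helper is hypothetical:

    int map_hotplugged_range(u64 start, u64 size, pgprot_t prot); /* hypothetical */

    /* Trimmed stand-in for the extended hotplug parameters. */
    struct mhp_params_sketch {
        pgprot_t pgprot;        /* protection for the new direct mapping */
    };

    int arch_add_memory_sketch(int nid, u64 start, u64 size,
                               const struct mhp_params_sketch *params)
    {
        /* pass the caller-supplied protection down to the page-table
         * setup instead of hard-coding PAGE_KERNEL */
        return map_hotplugged_range(start, size, params->pgprot);
    }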
-
Committed by Logan Gunthorpe
The mhp_restrictions struct really doesn't specify anything resembling a restriction anymore so rename it to be mhp_params as it is a list of extended parameters.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Anshuman Khandual
There are many places where all basic VMA access flags (read, write, exec) are initialized or checked against as a group. One such example is during page fault. The existing vma_is_accessible() wrapper already creates the notion of VMA accessibility as a group of access permissions. Hence let's just create VM_ACCESS_FLAGS (VM_READ|VM_WRITE|VM_EXEC) which will not only reduce code duplication but also extend the VMA accessibility concept in general.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Springer <rspringer@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Link: http://lkml.kernel.org/r/1583391014-8170-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
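The macro itself is spelled out in the message above; the small helper below is purely illustrative (not upstream code) and shows the kind of check it collapses:

    #define VM_ACCESS_FLAGS (VM_READ | VM_WRITE | VM_EXEC)

    /* illustrative use: "does this VMA allow any access at all?" */
    static inline bool vma_has_access_sketch(struct vm_area_struct *vma)
    {
        return vma->vm_flags & VM_ACCESS_FLAGS;
    }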
-
Committed by Anshuman Khandual
There are many platforms with the exact same value for VM_DATA_DEFAULT_FLAGS. This creates a default value for VM_DATA_DEFAULT_FLAGS in line with the existing VM_STACK_DEFAULT_FLAGS. While here, also define some more macros with standard VMA access flag combinations that are used frequently across many platforms. Apart from simplification, this reduces code duplication as well.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Chris Zankel <chris@zankel.net>
Link: http://lkml.kernel.org/r/1583391014-8170-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Roman Gushchin
Commit 944d9fec ("hugetlb: add support for gigantic page allocation at runtime") has added the run-time allocation of gigantic pages. However it actually works only at early stages of the system loading, when the majority of memory is free. After some time the memory gets fragmented by non-movable pages, so the chances to find a contiguous 1GB block are getting close to zero. Even dropping caches manually doesn't help a lot. At large scale rebooting servers in order to allocate gigantic hugepages is quite expensive and complex. At the same time keeping some constant percentage of memory in reserved hugepages even if the workload isn't using it is a big waste: not all workloads can benefit from using 1 GB pages.

The following solution can solve the problem:

1) At boot time a dedicated cma area* is reserved. The size is passed as a kernel argument.
2) Run-time allocations of gigantic hugepages are performed using the cma allocator and the dedicated cma area.

In this case gigantic hugepages can be allocated successfully with a high probability, however the memory isn't completely wasted if nobody is using 1GB hugepages: it can be used for pagecache, anon memory, THPs, etc.

* On a multi-node machine a per-node cma area is allocated on each node. Subsequent gigantic hugetlb allocations use the first available NUMA node if the mask isn't specified by the user.

Usage:

1) Configure the kernel to allocate a cma area for hugetlb allocations: pass hugetlb_cma=10G as a kernel argument.
2) Allocate hugetlb pages as usual, e.g. echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

If the option isn't enabled or the allocation of the cma area failed, the current behavior of the system is preserved. x86 and arm-64 are covered by this patch, other architectures can be trivially added later. The patch contains clean-ups and fixes proposed and implemented by Aslan Bakirov and Randy Dunlap. It also contains ideas and suggestions proposed by Rik van Riel, Michal Hocko and Mike Kravetz. Thanks!

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 09 April 2020, 1 commit
-
-
Committed by Fredrik Strupe
For thumb instructions, call_undef_hook() in traps.c first reads a u16, and if the u16 indicates a T32 instruction (u16 >= 0xe800), a second u16 is read, which then makes up the lower half-word of a T32 instruction. For T16 instructions, the second u16 is not read, which makes the resulting u32 opcode always have the upper half set to 0. However, having the upper half of instr_mask in the undef_hook set to 0 masks out the upper half of all thumb instructions - both T16 and T32. This results in trapped T32 instructions with the lower half-word equal to the T16 encoding of setend (b650) being matched, even though the upper half-word is not 0000 and thus indicates a T32 opcode. An example of such a T32 instruction is eaa0b650, which should raise a SIGILL since T32 instructions with an eaa prefix are unallocated as per the Arm ARM, but instead works as a SETEND because the second half-word is set to b650. This patch fixes the issue by extending instr_mask to include the upper u32 half, which will still match T16 instructions where the upper half is 0, but not T32 instructions.

Fixes: 2d888f48 ("arm64: Emulate SETEND for AArch32 tasks")
Cc: <stable@vger.kernel.org> # 4.0.x-
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Fredrik Strupe <fredrik@strupe.net>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
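A hedged sketch of the resulting hook; the field values are reconstructed from the description above and the handler name is illustrative, so treat the exact constants as approximate:

    static int t16_setend_handler(struct pt_regs *regs, u32 instr); /* illustrative */

    static struct undef_hook setend_hook_sketch = {
        /*
         * The mask now covers the upper half-word too, so a T32 opcode
         * that merely ends in b650 (e.g. eaa0b650) no longer matches;
         * only a genuine 16-bit SETEND (upper half zero) does.
         */
        .instr_mask   = 0xfffffff7,
        .instr_val    = 0x0000b650,
        .pstate_mask  = PSR_AA32_T_BIT,
        .pstate_val   = PSR_AA32_T_BIT,
        .fn           = t16_setend_handler,
    };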
-
- 03 April 2020, 4 commits
-
-
Committed by Peter Xu
The idea comes from a discussion between Linus and Andrea [1]. Before this patch we only allow a page fault to retry once. We achieved this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing handle_mm_fault() the second time. This was majorly used to avoid unexpected starvation of the system by looping over forever to handle the page fault on a single page. However that should hardly happen, and after all for each code path to return a VM_FAULT_RETRY we'll first wait for a condition (during which time we should possibly yield the cpu) to happen before VM_FAULT_RETRY is really returned.

This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY flag when we receive VM_FAULT_RETRY. It means that the page fault handler now can retry the page fault for multiple times if necessary without the need to generate another page fault event. Meanwhile we still keep the FAULT_FLAG_TRIED flag so page fault handler can still identify whether a page fault is the first attempt or not.

Then we'll have these combinations of fault flags (only considering ALLOW_RETRY flag and TRIED flag):

- ALLOW_RETRY and !TRIED: this means the page fault allows to retry, and this is the first try
- ALLOW_RETRY and TRIED: this means the page fault allows to retry, and this is not the first try
- !ALLOW_RETRY and !TRIED: this means the page fault does not allow to retry at all
- !ALLOW_RETRY and TRIED: this is forbidden and should never be used

In existing code we have multiple places that has taken special care of the first condition above by checking against (fault_flags & FAULT_FLAG_ALLOW_RETRY). This patch introduces a simple helper to detect the first retry of a page fault by checking against both (fault_flags & FAULT_FLAG_ALLOW_RETRY) and !(fault_flag & FAULT_FLAG_TRIED) because now even the 2nd try will have the ALLOW_RETRY set, then use that helper in all existing special paths. One example is in __lock_page_or_retry(), now we'll drop the mmap_sem only in the first attempt of page fault and we'll keep it in follow up retries, so old locking behavior will be retained.

This will be a nice enhancement for current code [2] at the same time a supporting material for the future userfaultfd-writeprotect work, since in that work there will always be an explicit userfault writeprotect retry for protected pages, and if that cannot resolve the page fault (e.g., when userfaultfd-writeprotect is used in conjunction with swapped pages) then we'll possibly need a 3rd retry of the page fault. It might also benefit other potential users who will have similar requirement like userfault write-protection.

GUP code is not touched yet and will be covered in follow up patch. Please read the thread below for more information.

[1] https://lore.kernel.org/lkml/20171102193644.GB22686@redhat.com/
[2] https://lore.kernel.org/lkml/20181230154648.GB9832@redhat.com/

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Brian Geffon <bgeffon@google.com>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Martin Cracauer <cracauer@cons.org>
Cc: Marty McFadden <mcfadden8@llnl.gov>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Maya Gokhale <gokhale2@llnl.gov>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Link: http://lkml.kernel.org/r/20200220160246.9790-1-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Peter Xu
Although there're tons of arch-specific page fault handlers, most of them are still sharing the same initial value of the page fault flags. Say, merely all of the page fault handlers would allow the fault to be retried, and they also allow the fault to respond to SIGKILL. Let's define a default value for the fault flags to replace those initial page fault flags that were copied over. With this, it'll be far easier to introduce new fault flags that can be used by all the architectures instead of touching all the archs.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Brian Geffon <bgeffon@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Martin Cracauer <cracauer@cons.org>
Cc: Marty McFadden <mcfadden8@llnl.gov>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Maya Gokhale <gokhale2@llnl.gov>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Link: http://lkml.kernel.org/r/20200220160238.9694-1-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
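For concreteness, the default described above boils down to a definition along these lines (hedged; the authoritative flag list lives in include/linux/mm.h):

    #define FAULT_FLAG_DEFAULT  (FAULT_FLAG_ALLOW_RETRY | \
                                 FAULT_FLAG_KILLABLE | \
                                 FAULT_FLAG_INTERRUPTIBLE)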
-
Committed by Peter Xu
Let the arm64 fault handling use the new fault_signal_pending() helper, by moving the signal handling out of the retry logic.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Brian Geffon <bgeffon@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Martin Cracauer <cracauer@cons.org>
Cc: Marty McFadden <mcfadden8@llnl.gov>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Maya Gokhale <gokhale2@llnl.gov>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Link: http://lkml.kernel.org/r/20200220155927.9264-1-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Masahiro Yamada
Change a header to mandatory-y if both of the following are met:

[1] At least one architecture (except um) specifies it as generic-y in arch/*/include/asm/Kbuild
[2] Every architecture (except um) either has its own implementation (arch/*/include/asm/*.h) or specifies it as generic-y in arch/*/include/asm/Kbuild

This commit was generated by the following shell script.

----------------------------------->8-----------------------------------
arches=$(cd arch; ls -1 | sed -e '/Kconfig/d' -e '/um/d')

tmpfile=$(mktemp)

grep "^mandatory-y +=" include/asm-generic/Kbuild > $tmpfile

find arch -path 'arch/*/include/asm/Kbuild' |
    xargs sed -n 's/^generic-y += \(.*\)/\1/p' | sort -u |
while read header
do
    mandatory=yes

    for arch in $arches
    do
        if ! grep -q "generic-y += $header" arch/$arch/include/asm/Kbuild &&
            ! [ -f arch/$arch/include/asm/$header ]; then
            mandatory=no
            break
        fi
    done

    if [ "$mandatory" = yes ]; then
        echo "mandatory-y += $header" >> $tmpfile

        for arch in $arches
        do
            sed -i "/generic-y += $header/d" arch/$arch/include/asm/Kbuild
        done
    fi
done

sed -i '/^mandatory-y +=/d' include/asm-generic/Kbuild

LANG=C sort $tmpfile >> include/asm-generic/Kbuild
----------------------------------->8-----------------------------------

One obvious benefit is the diff stat: 25 files changed, 52 insertions(+), 557 deletions(-)

It is tedious to list generic-y for each arch that needs it. So, mandatory-y works like a fallback default (by just wrapping the asm-generic one) when the arch does not have a specific header implementation. See the following commits: def3f7ce a1b39bae

It is tedious to convert headers one by one, so I processed by a shell script.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/20200210175452.5030-1-masahiroy@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 02 April 2020, 3 commits
-
-
Committed by Ard Biesheuvel
When CONFIG_DEBUG_ALIGN_RODATA is enabled, kernel segments mapped with different permissions (r-x for .text, r-- for .rodata, rw- for .data, etc) are rounded up to 2 MiB so they can be mapped more efficiently. In particular, it permits the segments to be mapped using level 2 block entries when using 4k pages, which is expected to result in less TLB pressure. However, the mappings for the bulk of the kernel will use level 2 entries anyway, and the misaligned fringes are organized such that they can take advantage of the contiguous bit, and use far fewer level 3 entries than would be needed otherwise. This makes the value of this feature dubious at best, and since it is not enabled in defconfig or in the distro configs, it does not appear to be in wide use either. So let's just remove it.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Laura Abbott <labbott@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Committed by Mark Brown
Compilers with branch protection support can be configured to enable it by default, and it is likely that distributions will do this as part of deploying branch protection system wide. As well as the slight overhead from having some extra NOPs for unused branch protection features, this can cause more serious problems when the kernel is providing pointer authentication to userspace but not built for pointer authentication itself. In that case our switching of keys for userspace can affect the kernel unexpectedly, causing pointer authentication instructions in the kernel to corrupt addresses. To ensure that we get consistent and reliable behaviour, always explicitly initialise the branch protection mode, ensuring that the kernel is built the same way regardless of the compiler defaults.

Fixes: 75031975 (arm64: add basic pointer authentication support)
Reported-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
[catalin.marinas@arm.com: remove Kconfig option in favour of Makefile check]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Committed by Jason Wang
Currently, CONFIG_VHOST depends on CONFIG_VIRTUALIZATION. But vhost is not necessarily for VMs since it's a generic userspace and kernel communication protocol. Such a dependency may prevent archs without virtualization support from using vhost. To solve this, a dedicated vhost menu is created under drivers so CONFIG_VHOST can be decoupled out of CONFIG_VIRTUALIZATION. While at it, also squash Kconfig.vringh into the vhost Kconfig file. This avoids the trick of conditional inclusion from VOP or CAIF. Then it will be easier to introduce new vringh users and common dependencies for both vringh and vhost.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200326140125.19794-2-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
- 01 April 2020, 1 commit
-
-
Committed by Amit Daniel Kachhap
The recent addition of ARM64_PTR_AUTH exposed a mismatch issue with binutils. gcc versions 9.1+ insert a section note .note.gnu.property, but this can only be consumed properly by binutils versions greater than 2.33.1. If older binutils are used then the following warnings are generated:

aarch64-linux-ld: warning: arch/arm64/kernel/vdso/vgettimeofday.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000
aarch64-linux-objdump: warning: arch/arm64/lib/csum.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000
aarch64-linux-nm: warning: .tmp_vmlinux1: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000

This patch enables ARM64_PTR_AUTH when the gcc and binutils versions are compatible with each other. Older gcc versions which do not insert such a section continue to work as before. This scenario may not occur with clang, as a recent commit 3b446c7d ("arm64: Kconfig: verify binutils support for ARM64_PTR_AUTH") masks binutils versions older than 2.34.

Reported-by: kbuild test robot <lkp@intel.com>
Suggested-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com>
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
[catalin.marinas@arm.com: slight adjustment to the comment]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-