提交 · 9d0c8e793f0eb0613efe81d2cdca8c2efa0ad33c · openeuler / Kernel

14 3月, 2021 2 次提交

ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign · 61bf318e

In https://bugs.gentoo.org/769614 Dmitry noticed that
`ptrace(PTRACE_GET_SYSCALL_INFO)` does not return error sign properly.

The bug is in mismatch between get/set errors:

static inline long syscall_get_error(struct task_struct *task,
                                     struct pt_regs *regs)
{
        return regs->r10 == -1 ? regs->r8:0;
}

static inline long syscall_get_return_value(struct task_struct *task,
                                            struct pt_regs *regs)
{
        return regs->r8;
}

static inline void syscall_set_return_value(struct task_struct *task,
                                            struct pt_regs *regs,
                                            int error, long val)
{
        if (error) {
                /* error < 0, but ia64 uses > 0 return value */
                regs->r8 = -error;
                regs->r10 = -1;
        } else {
                regs->r8 = val;
                regs->r10 = 0;
        }
}

Tested on v5.10 on rx3600 machine (ia64 9040 CPU).

Link: https://lkml.kernel.org/r/20210221002554.333076-2-slyfox@gentoo.org
Link: https://bugs.gentoo.org/769614Signed-off-by: NSergei Trofimovich <slyfox@gentoo.org>
Reported-by: NDmitry V. Levin <ldv@altlinux.org>
Reviewed-by: NDmitry V. Levin <ldv@altlinux.org>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

61bf318e

ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls · 0ceb1ace

由 Sergei Trofimovich 提交于 3年前

In https://bugs.gentoo.org/769614 Dmitry noticed that
`ptrace(PTRACE_GET_SYSCALL_INFO)` does not work for syscalls called via
glibc's syscall() wrapper.

ia64 has two ways to call syscalls from userspace: via `break` and via
`eps` instructions.

The difference is in stack layout:

1. `eps` creates simple stack frame: no locals, in{0..7} == out{0..8}
2. `break` uses userspace stack frame: may be locals (glibc provides
   one), in{0..7} == out{0..8}.

Both work fine in syscall handling cde itself.

But `ptrace(PTRACE_GET_SYSCALL_INFO)` uses unwind mechanism to
re-extract syscall arguments but it does not account for locals.

The change always skips locals registers. It should not change `eps`
path as kernel's handler already enforces locals=0 and fixes `break`.

Tested on v5.10 on rx3600 machine (ia64 9040 CPU).

Link: https://lkml.kernel.org/r/20210221002554.333076-1-slyfox@gentoo.org
Link: https://bugs.gentoo.org/769614Signed-off-by: NSergei Trofimovich <slyfox@gentoo.org>
Reported-by: NDmitry V. Levin <ldv@altlinux.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0ceb1ace

13 3月, 2021 4 次提交

KVM: LAPIC: Advancing the timer expiration on guest initiated write · 35737d2d

由 Wanpeng Li 提交于 3年前

Advancing the timer expiration should only be necessary on guest initiated
writes. When we cancel the timer and clear .pending during state restore,
clear expired_tscdeadline as well.
Reviewed-by: NSean Christopherson <seanjc@google.com>
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Message-Id: <1614818118-965-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

35737d2d

KVM: x86/mmu: Skip !MMU-present SPTEs when removing SP in exclusive mode · 8df9f1af

由 Sean Christopherson 提交于 3年前

If mmu_lock is held for write, don't bother setting !PRESENT SPTEs to
REMOVED_SPTE when recursively zapping SPTEs as part of shadow page
removal.  The concurrent write protections provided by REMOVED_SPTE are
not needed, there are no backing page side effects to record, and MMIO
SPTEs can be left as is since they are protected by the memslot
generation, not by ensuring that the MMIO SPTE is unreachable (which
is racy with respect to lockless walks regardless of zapping behavior).

Skipping !PRESENT drastically reduces the number of updates needed to
tear down sparsely populated MMUs, e.g. when tearing down a 6gb VM that
didn't touch much memory, 6929/7168 (~96.6%) of SPTEs were '0' and could
be skipped.

Avoiding the write itself is likely close to a wash, but avoiding
__handle_changed_spte() is a clear-cut win as that involves saving and
restoring all non-volatile GPRs (it's a subtly big function), as well as
several conditional branches before bailing out.

Cc: Ben Gardon <bgardon@google.com>
Signed-off-by: NSean Christopherson <seanjc@google.com>
Message-Id: <20210310003029.1250571-1-seanjc@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8df9f1af

KVM: kvmclock: Fix vCPUs > 64 can't be online/hotpluged · d7eb79c6

由 Wanpeng Li 提交于 3年前

# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                88
On-line CPU(s) list:   0-63
Off-line CPU(s) list:  64-87

# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.10.0-rc3-tlinux2-0050+ root=/dev/mapper/cl-root ro
rd.lvm.lv=cl/root rhgb quiet console=ttyS0 LANG=en_US .UTF-8 no-kvmclock-vsyscall

# echo 1 > /sys/devices/system/cpu/cpu76/online
-bash: echo: write error: Cannot allocate memory

The per-cpu vsyscall pvclock data pointer assigns either an element of the
static array hv_clock_boot (#vCPU <= 64) or dynamically allocated memory
hvclock_mem (vCPU > 64), the dynamically memory will not be allocated if
kvmclock vsyscall is disabled, this can result in cpu hotpluged fails in
kvmclock_setup_percpu() which returns -ENOMEM. It's broken for no-vsyscall
and sometimes you end up with vsyscall disabled if the host does something
strange. This patch fixes it by allocating this dynamically memory
unconditionally even if vsyscall is disabled.

Fixes: 6a1cac56 ("x86/kvm: Use __bss_decrypted attribute in shared variables")
Reported-by: NZelin Deng <zelin.deng@linux.alibaba.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: stable@vger.kernel.org#v4.19-rc5+
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Message-Id: <1614130683-24137-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d7eb79c6

kvm: x86: annotate RCU pointers · 6fcd9cbc

由 Muhammad Usama Anjum 提交于 3年前

This patch adds the annotation to fix the following sparse errors:
arch/x86/kvm//x86.c:8147:15: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//x86.c:8147:15: struct kvm_apic_map [noderef] __rcu *
arch/x86/kvm//x86.c:8147:15: struct kvm_apic_map *
arch/x86/kvm//x86.c:10628:16: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//x86.c:10628:16: struct kvm_apic_map [noderef] __rcu *
arch/x86/kvm//x86.c:10628:16: struct kvm_apic_map *
arch/x86/kvm//x86.c:10629:15: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//x86.c:10629:15: struct kvm_pmu_event_filter [noderef] __rcu *
arch/x86/kvm//x86.c:10629:15: struct kvm_pmu_event_filter *
arch/x86/kvm//lapic.c:267:15: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//lapic.c:267:15: struct kvm_apic_map [noderef] __rcu *
arch/x86/kvm//lapic.c:267:15: struct kvm_apic_map *
arch/x86/kvm//lapic.c:269:9: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//lapic.c:269:9: struct kvm_apic_map [noderef] __rcu *
arch/x86/kvm//lapic.c:269:9: struct kvm_apic_map *
arch/x86/kvm//lapic.c:637:15: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//lapic.c:637:15: struct kvm_apic_map [noderef] __rcu *
arch/x86/kvm//lapic.c:637:15: struct kvm_apic_map *
arch/x86/kvm//lapic.c:994:15: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//lapic.c:994:15: struct kvm_apic_map [noderef] __rcu *
arch/x86/kvm//lapic.c:994:15: struct kvm_apic_map *
arch/x86/kvm//lapic.c:1036:15: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//lapic.c:1036:15: struct kvm_apic_map [noderef] __rcu *
arch/x86/kvm//lapic.c:1036:15: struct kvm_apic_map *
arch/x86/kvm//lapic.c:1173:15: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//lapic.c:1173:15: struct kvm_apic_map [noderef] __rcu *
arch/x86/kvm//lapic.c:1173:15: struct kvm_apic_map *
arch/x86/kvm//pmu.c:190:18: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//pmu.c:190:18: struct kvm_pmu_event_filter [noderef] __rcu *
arch/x86/kvm//pmu.c:190:18: struct kvm_pmu_event_filter *
arch/x86/kvm//pmu.c:251:18: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//pmu.c:251:18: struct kvm_pmu_event_filter [noderef] __rcu *
arch/x86/kvm//pmu.c:251:18: struct kvm_pmu_event_filter *
arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter [noderef] __rcu *
arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter *
arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces):
arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter [noderef] __rcu *
arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter *
Signed-off-by: NMuhammad Usama Anjum <musamaanjum@gmail.com>
Message-Id: <20210305191123.GA497469@LEGION>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

6fcd9cbc

12 3月, 2021 2 次提交

KVM: arm64: Fix exclusive limit for IPA size · 262b003d

由 Marc Zyngier 提交于 3年前

When registering a memslot, we check the size and location of that
memslot against the IPA size to ensure that we can provide guest
access to the whole of the memory.

Unfortunately, this check rejects memslot that end-up at the exact
limit of the addressing capability for a given IPA size. For example,
it refuses the creation of a 2GB memslot at 0x8000000 with a 32bit
IPA space.

Fix it by relaxing the check to accept a memslot reaching the
limit of the IPA space.

Fixes: c3058d5d ("arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE")
Reviewed-by: NEric Auger <eric.auger@redhat.com>
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20210311100016.3830038-3-maz@kernel.org

262b003d

KVM: arm64: Reject VM creation when the default IPA size is unsupported · 7d717558

由 Marc Zyngier 提交于 3年前

KVM/arm64 has forever used a 40bit default IPA space, partially
due to its 32bit heritage (where the only choice is 40bit).

However, there are implementations in the wild that have a *cough*
much smaller *cough* IPA space, which leads to a misprogramming of
VTCR_EL2, and a guest that is stuck on its first memory access
if userspace dares to ask for the default IPA setting (which most
VMMs do).

Instead, blundly reject the creation of such VM, as we can't
satisfy the requirements from userspace (with a one-off warning).
Also clarify the boot warning, and document that the VM creation
will fail when an unsupported IPA size is provided.

Although this is an ABI change, it doesn't really change much
for userspace:

- the guest couldn't run before this change, but no error was
  returned. At least userspace knows what is happening.

- a memory slot that was accepted because it did fit the default
  IPA space now doesn't even get a chance to be registered.

The other thing that is left doing is to convince userspace to
actually use the IPA space setting instead of relying on the
antiquated default.

Fixes: 233a7cb2 ("kvm: arm64: Allow tuning the physical address size for VM")
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Reviewed-by: NEric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20210311100016.3830038-2-maz@kernel.org

7d717558

11 3月, 2021 6 次提交

arm64: mm: remove unused __cpu_uses_extended_idmap[_level()] · 30b26757

由 Ard Biesheuvel 提交于 3年前

These routines lost all existing users during the latest merge window so
we can remove them. This avoids the need to fix them in the context of
fixing a regression related to the ID map on 52-bit VA kernels.
Signed-off-by: NArd Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20210310171515.416643-3-ardb@kernel.orgSigned-off-by: NWill Deacon <will@kernel.org>

30b26757

arm64: mm: use a 48-bit ID map when possible on 52-bit VA builds · 7ba8f2b2

由 Ard Biesheuvel 提交于 3年前

52-bit VA kernels can run on hardware that is only 48-bit capable, but
configure the ID map as 52-bit by default. This was not a problem until
recently, because the special T0SZ value for a 52-bit VA space was never
programmed into the TCR register anwyay, and because a 52-bit ID map
happens to use the same number of translation levels as a 48-bit one.

This behavior was changed by commit 1401bef7 ("arm64: mm: Always update
TCR_EL1 from __cpu_set_tcr_t0sz()"), which causes the unsupported T0SZ
value for a 52-bit VA to be programmed into TCR_EL1. While some hardware
simply ignores this, Mark reports that Amberwing systems choke on this,
resulting in a broken boot. But even before that commit, the unsupported
idmap_t0sz value was exposed to KVM and used to program TCR_EL2 incorrectly
as well.

Given that we already have to deal with address spaces being either 48-bit
or 52-bit in size, the cleanest approach seems to be to simply default to
a 48-bit VA ID map, and only switch to a 52-bit one if the placement of the
kernel in DRAM requires it. This is guaranteed not to happen unless the
system is actually 52-bit VA capable.

Fixes: 90ec95cd ("arm64: mm: Introduce VA_BITS_MIN")
Reported-by: NMark Salter <msalter@redhat.com>
Link: http://lore.kernel.org/r/20210310003216.410037-1-msalter@redhat.comSigned-off-by: NArd Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20210310171515.416643-2-ardb@kernel.orgSigned-off-by: NWill Deacon <will@kernel.org>

7ba8f2b2

kbuild: remove LLVM=1 test from HAS_LTO_CLANG · 4c273d23

由 Masahiro Yamada 提交于 3年前

As Documentation/kbuild/llvm.rst notes, LLVM=1 switches the default of
tools, but you can still override CC, LD, etc. individually. This LLVM=1
check is unneeded because each tool is already checked separately.

"make CC=clang LD=ld.lld NM=llvm-nm AR=llvm-ar LLVM_IAS=1 menuconfig"
should be able to enable Clang LTO.
Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
Reviewed-by: NNathan Chancellor <nathan@kernel.org>

4c273d23

kbuild: Allow LTO to be selected with KASAN_HW_TAGS · bf3c2551

由 Sami Tolvanen 提交于 3年前

While LTO with KASAN is normally not useful, hardware tag-based KASAN
can be used also in production kernels with ARM64_MTE. Therefore, allow
KASAN_HW_TAGS to be selected together with HAS_LTO_CLANG.
Reported-by: NAlistair Delva <adelva@google.com>
Signed-off-by: NSami Tolvanen <samitolvanen@google.com>
Reviewed-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>

bf3c2551

Xen/gnttab: introduce common INVALID_GRANT_{HANDLE,REF} · bce21a2b

由 Jan Beulich 提交于 3年前

It's not helpful if every driver has to cook its own. Generalize
xenbus'es INVALID_GRANT_HANDLE and pcifront's INVALID_GRANT_REF (which
shouldn't have expanded to zero to begin with). Use the constants in
p2m.c and gntdev.c right away, and update field types where necessary so
they would match with the constants' types (albeit without touching
struct ioctl_gntdev_grant_ref's ref field, as that's part of the public
interface of the kernel and would require introducing a dependency on
Xen's grant_table.h public header).
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Reviewed-by: NJuergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/db7c38a5-0d75-d5d1-19de-e5fe9f0b9c48@suse.comSigned-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>

bce21a2b

Xen: drop exports of {set,clear}_foreign_p2m_mapping() · 0f9b05b9

由 Jan Beulich 提交于 3年前

They're only used internally, and the layering violation they contain
(x86) or imply (Arm) of calling HYPERVISOR_grant_table_op() strongly
advise against any (uncontrolled) use from a module. The functions also
never had users except the ones from drivers/xen/grant-table.c forever
since their introduction in 3.15.
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Reviewed-by: NStefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/746a5cd6-1446-eda4-8b23-03c1cac30b8d@suse.comSigned-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>

0f9b05b9

10 3月, 2021 6 次提交

arm64: perf: Fix 64-bit event counter read truncation · 7bb8bc6e

由 Rob Herring 提交于 3年前

Commit 0fdf1bb7 ("arm64: perf: Avoid PMXEV* indirection") changed
armv8pmu_read_evcntr() to return a u32 instead of u64. The result is
silent truncation of the event counter when using 64-bit counters. Given
the offending commit appears to have passed thru several folks, it seems
likely this was a bad rebase after v8.5 PMU 64-bit counters landed.

Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: <stable@vger.kernel.org>
Fixes: 0fdf1bb7 ("arm64: perf: Avoid PMXEV* indirection")
Signed-off-by: NRob Herring <robh@kernel.org>
Acked-by: NMark Rutland <mark.rutland@arm.com>
Reviewed-by: NAlexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20210310004412.1450128-1-robh@kernel.orgSigned-off-by: NWill Deacon <will@kernel.org>

7bb8bc6e

arm64/mm: Fix __enable_mmu() for new TGRAN range values · 26f55386

由 James Morse 提交于 3年前

As per ARM ARM DDI 0487G.a, when FEAT_LPA2 is implemented, ID_AA64MMFR0_EL1
might contain a range of values to describe supported translation granules
(4K and 16K pages sizes in particular) instead of just enabled or disabled
values. This changes __enable_mmu() function to handle complete acceptable
range of values (depending on whether the field is signed or unsigned) now
represented with ID_AA64MMFR0_TGRAN_SUPPORTED_[MIN..MAX] pair. While here,
also fix similar situations in EFI stub and KVM as well.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-efi@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: NMarc Zyngier <maz@kernel.org>
Signed-off-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1615355590-21102-1-git-send-email-anshuman.khandual@arm.comSigned-off-by: NWill Deacon <will@kernel.org>

26f55386

arm64: mte: Map hotplugged memory as Normal Tagged · d15dfd31

由 Catalin Marinas 提交于 3年前

In a system supporting MTE, the linear map must allow reading/writing
allocation tags by setting the memory type as Normal Tagged. Currently,
this is only handled for memory present at boot. Hotplugged memory uses
Normal non-Tagged memory.

Introduce pgprot_mhp() for hotplugged memory and use it in
add_memory_resource(). The arm64 code maps pgprot_mhp() to
pgprot_tagged().

Note that ZONE_DEVICE memory should not be mapped as Tagged and
therefore setting the memory type in arch_add_memory() is not feasible.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Fixes: 0178dc76 ("arm64: mte: Use Normal Tagged attributes for the linear map")
Reported-by: NPatrick Daly <pdaly@codeaurora.org>
Tested-by: NPatrick Daly <pdaly@codeaurora.org>
Link: https://lore.kernel.org/r/1614745263-27827-1-git-send-email-pdaly@codeaurora.org
Cc: <stable@vger.kernel.org> # 5.10.x
Cc: Will Deacon <will@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NVincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: NAnshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20210309122601.5543-1-catalin.marinas@arm.comSigned-off-by: NWill Deacon <will@kernel.org>

d15dfd31

sparc: sparc64_defconfig: remove duplicate CONFIGs · 69264b4a

由 Corentin Labbe 提交于 3年前

After my patch there is CONFIG_ATA defined twice.
Remove the duplicate one.
Same problem for CONFIG_HAPPYMEAL, except I added as builtin for boot
test with NFS.
Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
Fixes: a57cdeb3 ("sparc: sparc64_defconfig: add necessary configs for qemu")
Signed-off-by: NCorentin Labbe <clabbe@baylibre.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

69264b4a

sparc64: Fix opcode filtering in handling of no fault loads · e5e8b80d

由 Rob Gardner 提交于 3年前

is_no_fault_exception() has two bugs which were discovered via random
opcode testing with stress-ng. Both are caused by improper filtering
of opcodes.

The first bug can be triggered by a floating point store with a no-fault
ASI, for instance "sta %f0, [%g0] #ASI_PNF", opcode C1A01040.

The code first tests op3[5] (0x1000000), which denotes a floating
point instruction, and then tests op3[2] (0x200000), which denotes a
store instruction. But these bits are not mutually exclusive, and the
above mentioned opcode has both bits set. The intent is to filter out
stores, so the test for stores must be done first in order to have
any effect.

The second bug can be triggered by a floating point load with one of
the invalid ASI values 0x8e or 0x8f, which pass this check in
is_no_fault_exception():
     if ((asi & 0xf2) == ASI_PNF)

An example instruction is "ldqa [%l7 + %o7] #ASI 0x8f, %f38",
opcode CF95D1EF. Asi values greater than 0x8b (ASI_SNFL) are fatal
in handle_ldf_stq(), and is_no_fault_exception() must not allow these
invalid asi values to make it that far.

In both of these cases, handle_ldf_stq() reacts by calling
sun4v_data_access_exception() or spitfire_data_access_exception(),
which call is_no_fault_exception() and results in an infinite
recursion.
Signed-off-by: NRob Gardner <rob.gardner@oracle.com>
Tested-by: NAnatoly Pugachev <matorola@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e5e8b80d

KVM: arm64: Ensure I-cache isolation between vcpus of a same VM · 01dc9262

由 Marc Zyngier 提交于 3年前

It recently became apparent that the ARMv8 architecture has interesting
rules regarding attributes being used when fetching instructions
if the MMU is off at Stage-1.

In this situation, the CPU is allowed to fetch from the PoC and
allocate into the I-cache (unless the memory is mapped with
the XN attribute at Stage-2).

If we transpose this to vcpus sharing a single physical CPU,
it is possible for a vcpu running with its MMU off to influence
another vcpu running with its MMU on, as the latter is expected to
fetch from the PoU (and self-patching code doesn't flush below that
level).

In order to solve this, reuse the vcpu-private TLB invalidation
code to apply the same policy to the I-cache, nuking it every time
the vcpu runs on a physical CPU that ran another vcpu of the same
VM in the past.

This involve renaming __kvm_tlb_flush_local_vmid() to
__kvm_flush_cpu_context(), and inserting a local i-cache invalidation
there.

Cc: stable@vger.kernel.org
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Acked-by: NWill Deacon <will@kernel.org>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210303164505.68492-1-maz@kernel.org

01dc9262

09 3月, 2021 6 次提交

arm64: kasan: fix page_alloc tagging with DEBUG_VIRTUAL · 86c83365

由 Andrey Konovalov 提交于 3年前

When CONFIG_DEBUG_VIRTUAL is enabled, the default page_to_virt() macro
implementation from include/linux/mm.h is used. That definition doesn't
account for KASAN tags, which leads to no tags on page_alloc allocations.

Provide an arm64-specific definition for page_to_virt() when
CONFIG_DEBUG_VIRTUAL is enabled that takes care of KASAN tags.

Fixes: 2813b9c0 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrey Konovalov <andreyknvl@google.com>
Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/4b55b35202706223d3118230701c6a59749d9b72.1615219501.git.andreyknvl@google.comSigned-off-by: NWill Deacon <will@kernel.org>

86c83365

MIPS: kernel: Reserve exception base early to prevent corruption · bd67b711

由 Thomas Bogendoerfer 提交于 3年前

BMIPS is one of the few platforms that do change the exception base.
After commit 2dcb3964 ("memblock: do not start bottom-up allocations
with kernel_end") we started seeing BMIPS boards fail to boot with the
built-in FDT being corrupted.

Before the cited commit, early allocations would be in the [kernel_end,
RAM_END] range, but after commit they would be within [RAM_START +
PAGE_SIZE, RAM_END].

The custom exception base handler that is installed by
bmips_ebase_setup() done for BMIPS5000 CPUs ends-up trampling on the
memory region allocated by unflatten_and_copy_device_tree() thus
corrupting the FDT used by the kernel.

To fix this, we need to perform an early reservation of the custom
exception space. Additional we reserve the first 4k (1k for R3k) for
either normal exception vector space (legacy CPUs) or special vectors
like cache exceptions.

Huge thanks to Serge for analysing and proposing a solution to this
issue.

Fixes: 2dcb3964 ("memblock: do not start bottom-up allocations with kernel_end")
Reported-by: NKamal Dasu <kdasu.kdev@gmail.com>
Debugged-by: NSerge Semin <Sergey.Semin@baikalelectronics.ru>
Acked-by: NMike Rapoport <rppt@linux.ibm.com>
Tested-by: NFlorian Fainelli <f.fainelli@gmail.com>
Reviewed-by: NSerge Semin <fancer.lancer@gmail.com>
Signed-off-by: NThomas Bogendoerfer <tsbogend@alpha.franken.de>

bd67b711

KVM: arm64: Don't use cbz/adr with external symbols · dbaee836

由 Sami Tolvanen 提交于 3年前

allmodconfig + CONFIG_LTO_CLANG_THIN=y fails to build due to following
linker errors:

  ld.lld: error: irqbypass.c:(function __guest_enter: .text+0x21CC):
  relocation R_AARCH64_CONDBR19 out of range: 2031220 is not in
  [-1048576, 1048575]; references hyp_panic
  >>> defined in vmlinux.o

  ld.lld: error: irqbypass.c:(function __guest_enter: .text+0x21E0):
  relocation R_AARCH64_ADR_PREL_LO21 out of range: 2031200 is not in
  [-1048576, 1048575]; references hyp_panic
  >>> defined in vmlinux.o

This is because with LTO, the compiler ends up placing hyp_panic()
more than 1MB away from __guest_enter(). Use an unconditional branch
and adr_l instead to fix the issue.

Link: https://github.com/ClangBuiltLinux/linux/issues/1317Reported-by: NNathan Chancellor <nathan@kernel.org>
Suggested-by: NMarc Zyngier <maz@kernel.org>
Suggested-by: NArd Biesheuvel <ardb@kernel.org>
Signed-off-by: NSami Tolvanen <samitolvanen@google.com>
Reviewed-by: NKees Cook <keescook@chromium.org>
Acked-by: NWill Deacon <will@kernel.org>
Tested-by: NNathan Chancellor <nathan@kernel.org>
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210305202124.3768527-1-samitolvanen@google.com

dbaee836

arm64/mm: Reorganize pfn_valid() · 093bbe21

由 Anshuman Khandual 提交于 3年前

There are multiple instances of pfn_to_section_nr() and __pfn_to_section()
when CONFIG_SPARSEMEM is enabled. This can be optimized if memory section
is fetched earlier. This replaces the open coded PFN and ADDR conversion
with PFN_PHYS() and PHYS_PFN() helpers. While there, also add a comment.
This does not cause any functional change.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/1614921898-4099-3-git-send-email-anshuman.khandual@arm.comSigned-off-by: NWill Deacon <will@kernel.org>

093bbe21

arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory · eeb0753b

由 Anshuman Khandual 提交于 3年前

pfn_valid() validates a pfn but basically it checks for a valid struct page
backing for that pfn. It should always return positive for memory ranges
backed with struct page mapping. But currently pfn_valid() fails for all
ZONE_DEVICE based memory types even though they have struct page mapping.

pfn_valid() asserts that there is a memblock entry for a given pfn without
MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
that they do not have memblock entries. Hence memblock_is_map_memory() will
invariably fail via memblock_search() for a ZONE_DEVICE based address. This
eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
into the system via memremap_pages() called from a driver, their respective
memory sections will not have SECTION_IS_EARLY set.

Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
for firmware reserved memory regions. memblock_is_map_memory() can just be
skipped as its always going to be positive and that will be an optimization
for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
hotplugged memory too will not have SECTION_IS_EARLY set for their sections

Skipping memblock_is_map_memory() for all non early memory sections would
fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
performance for normal hotplug memory as well.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Acked-by: NDavid Hildenbrand <david@redhat.com>
Fixes: 73b20c84 ("arm64: mm: implement pte_devmap support")
Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/1614921898-4099-2-git-send-email-anshuman.khandual@arm.comSigned-off-by: NWill Deacon <will@kernel.org>

eeb0753b

MIPS: vmlinux.lds.S: align raw appended dtb to 8 bytes · 6654111c

由 Bjørn Mork 提交于 3年前

The devicetree specification requires 8-byte alignment in
memory. This is now enforced by libfdt since commit 79edff12
("scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9")
which included the upstream commit 5e735860c478 ("libfdt: Check for
8-byte address alignment in fdt_ro_probe_()").

This broke the MIPS raw appended DTBs which would be appended to
the image immediately following the initramfs section.  This ends
with a 32bit size, resulting in a 4-byte alignment of the DTB.

Fix by padding with zeroes to 8-bytes when MIPS_RAW_APPENDED_DTB
is defined.

Fixes: 79edff12 ("scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9")
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: NBjørn Mork <bjorn@mork.no>
Signed-off-by: NThomas Bogendoerfer <tsbogend@alpha.franken.de>

6654111c

08 3月, 2021 13 次提交

arm64/mm: Drop THP conditionality from FORCE_MAX_ZONEORDER · 79cc2ed5

由 Anshuman Khandual 提交于 3年前

Currently without THP being enabled, MAX_ORDER via FORCE_MAX_ZONEORDER gets
reduced to 11, which falls below HUGETLB_PAGE_ORDER for certain 16K and 64K
page size configurations. This is problematic which throws up the following
warning during boot as pageblock_order via HUGETLB_PAGE_ORDER order exceeds
MAX_ORDER.

WARNING: CPU: 7 PID: 127 at mm/vmstat.c:1092 __fragmentation_index+0x58/0x70
Modules linked in:
CPU: 7 PID: 127 Comm: kswapd0 Not tainted 5.12.0-rc1-00005-g0221e3101a1 #237
Hardware name: linux,dummy-virt (DT)
pstate: 20400005 (nzCv daif +PAN -UAO -TCO BTYPE=--)
pc : __fragmentation_index+0x58/0x70
lr : fragmentation_index+0x88/0xa8
sp : ffff800016ccfc00
x29: ffff800016ccfc00 x28: 0000000000000000
x27: ffff800011fd4000 x26: 0000000000000002
x25: ffff800016ccfda0 x24: 0000000000000002
x23: 0000000000000640 x22: ffff0005ffcb5b18
x21: 0000000000000002 x20: 000000000000000d
x19: ffff0005ffcb3980 x18: 0000000000000004
x17: 0000000000000001 x16: 0000000000000019
x15: ffff800011ca7fb8 x14: 00000000000002b3
x13: 0000000000000000 x12: 00000000000005e0
x11: 0000000000000003 x10: 0000000000000080
x9 : ffff800011c93948 x8 : 0000000000000000
x7 : 0000000000000000 x6 : 0000000000007000
x5 : 0000000000007944 x4 : 0000000000000032
x3 : 000000000000001c x2 : 000000000000000b
x1 : ffff800016ccfc10 x0 : 000000000000000d
Call trace:
__fragmentation_index+0x58/0x70
compaction_suitable+0x58/0x78
wakeup_kcompactd+0x8c/0xd8
balance_pgdat+0x570/0x5d0
kswapd+0x1e0/0x388
kthread+0x154/0x158
ret_from_fork+0x10/0x30

This solves the problem via keeping FORCE_MAX_ZONEORDER unchanged with or
without THP on 16K and 64K page size configurations, making sure that the
HUGETLB_PAGE_ORDER (and pageblock_order) would never exceed MAX_ORDER.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/1614597914-28565-1-git-send-email-anshuman.khandual@arm.comSigned-off-by: NWill Deacon <will@kernel.org>

79cc2ed5

arm64/mm: Drop redundant ARCH_WANT_HUGE_PMD_SHARE · 07fb6dc3

由 Anshuman Khandual 提交于 3年前

There is already an ARCH_WANT_HUGE_PMD_SHARE which is being selected for
applicable configurations. Hence just drop the other redundant entry.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1614575192-21307-1-git-send-email-anshuman.khandual@arm.comSigned-off-by: NWill Deacon <will@kernel.org>

07fb6dc3

arm64: Drop support for CMDLINE_EXTEND · cae118b6

由 Will Deacon 提交于 3年前

The documented behaviour for CMDLINE_EXTEND is that the arguments from
the bootloader are appended to the built-in kernel command line. This
also matches the option parsing behaviour for the EFI stub and early ID
register overrides.

Bizarrely, the fdt behaviour is the other way around: appending the
built-in command line to the bootloader arguments, resulting in a
command-line that doesn't necessarily line-up with the parsing order and
definitely doesn't line-up with the documented behaviour.

As it turns out, there is a proposal [1] to replace CMDLINE_EXTEND with
CMDLINE_PREPEND and CMDLINE_APPEND options which should hopefully make
the intended behaviour much clearer. While we wait for those to land,
drop CMDLINE_EXTEND for now as there appears to be little enthusiasm for
changing the current FDT behaviour.

[1] https://lore.kernel.org/lkml/20190319232448.45964-2-danielwa@cisco.com/

Cc: Max Uvarov <muvarov@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: Tyler Hicks <tyhicks@linux.microsoft.com>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/CAL_JsqJX=TCCs7=gg486r9TN4NYscMTCLNfqJF9crskKPq-bTg@mail.gmail.com
Link: https://lore.kernel.org/r/20210303134927.18975-3-will@kernel.orgSigned-off-by: NWill Deacon <will@kernel.org>

cae118b6

arm64: cpufeatures: Fix handling of CONFIG_CMDLINE for idreg overrides · df304c2d

由 Will Deacon 提交于 3年前

The built-in kernel commandline (CONFIG_CMDLINE) can be configured in
three different ways:

1. CMDLINE_FORCE: Use CONFIG_CMDLINE instead of any bootloader args
2. CMDLINE_EXTEND: Append the bootloader args to CONFIG_CMDLINE
3. CMDLINE_FROM_BOOTLOADER: Only use CONFIG_CMDLINE if there aren't
any bootloader args.

The early cmdline parsing to detect idreg overrides gets (2) and (3)
slightly wrong: in the case of (2) the bootloader args are parsed first
and in the case of (3) the CMDLINE is always parsed.

Fix these issues by moving the bootargs parsing out into a helper
function and following the same logic as that used by the EFI stub.
Reviewed-by: NMarc Zyngier <maz@kernel.org>
Fixes: 33200303 ("arm64: cpufeature: Add an early command-line cpufeature override facility")
Link: https://lore.kernel.org/r/20210303134927.18975-2-will@kernel.orgSigned-off-by: NWill Deacon <will@kernel.org>

df304c2d

crypto: mips/poly1305 - enable for all MIPS processors · 6c810cf2

由 Maciej W. Rozycki 提交于 3年前

The MIPS Poly1305 implementation is generic MIPS code written such as to
support down to the original MIPS I and MIPS III ISA for the 32-bit and
64-bit variant respectively.  Lift the current limitation then to enable
code for MIPSr1 ISA or newer processors only and have it available for
all MIPS processors.
Signed-off-by: NMaciej W. Rozycki <macro@orcam.me.uk>
Fixes: a11d055e ("crypto: mips/poly1305 - incorporate OpenSSL/CRYPTOGAMS optimized implementation")
Cc: stable@vger.kernel.org # v5.5+
Acked-by: NJason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: NThomas Bogendoerfer <tsbogend@alpha.franken.de>

6c810cf2

MIPS: boot/compressed: Copy DTB to aligned address · 7a05293a

由 Paul Cercueil 提交于 3年前

Since 5.12-rc1, the Device Tree blob must now be properly aligned.

Therefore, the decompress routine must be careful to copy the blob at
the next aligned address after the kernel image.

This commit fixes the kernel sometimes not booting with a Device Tree
blob appended to it.

Fixes: 79edff12 ("scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9")
Signed-off-by: NPaul Cercueil <paul@crapouillou.net>
Acked-by: NRob Herring <robh@kernel.org>
Signed-off-by: NThomas Bogendoerfer <tsbogend@alpha.franken.de>

7a05293a

s390: remove IBM_PARTITION and CONFIGFS_FS from zfcpdump defconfig · 78c7ccca

由 Alexander Egorenkov 提交于 3年前

Remove by zfcpdump unused CONFIG_IBM_PARTITION and CONFIG_CONFIGFS_FS.
Signed-off-by: NAlexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: NSteffen Maier <maier@linux.ibm.com>
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>

78c7ccca

H
s390: update defconfigs · d50aa69d
由 Heiko Carstens 提交于 3年前
```
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>
```
d50aa69d

s390/cpumf: remove unneeded semicolon · 1c0a9c79

由 Jiapeng Chong 提交于 3年前

Fix the following coccicheck warnings:

./arch/s390/kernel/perf_cpum_cf.c:272:2-3: Unneeded semicolon.
Reported-by: NAbaci Robot <abaci@linux.alibaba.com>
Signed-off-by: NJiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/1614233736-87331-1-git-send-email-jiapeng.chong@linux.alibaba.comSigned-off-by: NHeiko Carstens <hca@linux.ibm.com>

1c0a9c79

s390/cpumf: rename header file to hwctrset.h · 46b635b6

由 Thomas Richter 提交于 3年前

Signed-off-by: NThomas Richter <tmricht@linux.ibm.com>
Suggested-by: NHendrick Brueckner <brueckner@linux.ibm.com>
Acked-by: NHeiko Carstens <hca@linux.ibm.com>
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>

46b635b6

s390/cpumf: remove 60 seconds read limit · c41b20de

由 Thomas Richter 提交于 3年前

Remove the 60 seconds read interval limit. Do not impose any limit
at all and allow read of complete counter sets.
Signed-off-by: NThomas Richter <tmricht@linux.ibm.com>
Acked-by: NHeiko Carstens <hca@linux.ibm.com>
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>

c41b20de

s390/topology: remove always false if check · f9d8cbf3

由 Heiko Carstens 提交于 3年前

The cpumask being checked in cpu_group_map() must have at least one
cpu set; therefore remove the check.
Reviewed-by: NAlexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>

f9d8cbf3

s390/time,idle: get rid of unsigned long long · eba8e1af

由 Heiko Carstens 提交于 3年前

Get rid of unsigned long long, and use unsigned long instead
everywhere. The usage of unsigned long long is a leftover from
31 bit kernel support.
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>

eba8e1af

06 3月, 2021 1 次提交

m68k: Fix virt_addr_valid() W=1 compiler warnings · a65a802a

由 Geert Uytterhoeven 提交于 3年前

If CONFIG_DEBUG_SG=y, and CONFIG_MMU=y:

    include/linux/scatterlist.h: In function ‘sg_set_buf’:
    arch/m68k/include/asm/page_mm.h:174:49: warning: ordered comparison of pointer with null pointer [-Wextra]
      174 | #define virt_addr_valid(kaddr) ((void *)(kaddr) >= (void *)PAGE_OFFSET && (void *)(kaddr) < high_memory)
	  |                                                 ^~

or CONFIG_MMU=n:

    include/linux/scatterlist.h: In function ‘sg_set_buf’:
    arch/m68k/include/asm/page_no.h:33:50: warning: ordered comparison of pointer with null pointer [-Wextra]
       33 | #define virt_addr_valid(kaddr) (((void *)(kaddr) >= (void *)PAGE_OFFSET) && \
	  |                                                  ^~

Fix this by doing the comparison in the "unsigned long" instead of the
"void *" domain.

Note that for now this is only seen when compiling btrfs, due to commit
e9aa7c28 ("btrfs: enable W=1 checks for btrfs"), but as people
are doing more W=1 compile testing, it will start to show up elsewhere,
too.
Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Acked-by: NGreg Ungerer <gerg@linux-m68k.org>
Link: https://lore.kernel.org/r/20210305084122.4118826-1-geert@linux-m68k.org

a65a802a

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功