Commits · 29a68c21649dddb69228b86bb94b9246e73d77b0 · openeuler / Kernel

12 Jan, 2021 4 commits

kvm: debugfs: aarch64 export cpu time related items to debugfs · 29a68c21

Committed by chenjiajun on Dec 23, 2020

virt inclusion
category: feature
bugzilla: 46853
CVE: NA

This patch exports CPU-time-related items to vcpu_stat.
They include:
	steal, st_max, utime, stime, gtime

The definitions of these items are:
steal: CPU time the VCPU waits for a PCPU while it is servicing another VCPU
st_max: max scheduling delay
utime: CPU time in userspace
stime: CPU time in sys
gtime: CPU time in guest

From these items, users can derive CPU usage information for a vcpu, such as:
CPU Usage of Guest = gtime_delta / delta_cputime
CPU Usage of Hyp = (utime_delta - gtime_delta + stime_delta) / delta_cputime
CPU Usage of Steal = steal_delta / delta_cputime
Max Scheduling Delay = st_max
Signed-off-by: liangpeng <liangpeng10@huawei.com>
Signed-off-by: chenjiajun <chenjiajun8@huawei.com>
Reviewed-by: Xiangyou Xie <xiexiangyou@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>

29a68c21
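
The usage formulas in the commit message above can be sketched in userspace C. This is only an illustration: the struct layout, the function name, and the assumption that all counters and delta_cputime share one time unit are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the cpu-time counters listed in the commit. */
struct vcpu_cputime {
	uint64_t steal;
	uint64_t utime;
	uint64_t stime;
	uint64_t gtime;
};

/* Usage ratios, each expected to fall between 0 and 1. */
struct vcpu_usage {
	double guest;
	double hyp;
	double steal;
};

/*
 * Apply the formulas from the commit message to two snapshots taken
 * delta_cputime apart (same unit as the counters, by assumption).
 */
static struct vcpu_usage vcpu_usage_between(const struct vcpu_cputime *prev,
					    const struct vcpu_cputime *cur,
					    uint64_t delta_cputime)
{
	struct vcpu_usage usage;
	double total = (double)delta_cputime;
	double steal_delta = (double)(cur->steal - prev->steal);
	double utime_delta = (double)(cur->utime - prev->utime);
	double stime_delta = (double)(cur->stime - prev->stime);
	double gtime_delta = (double)(cur->gtime - prev->gtime);

	assert(delta_cputime > 0);

	usage.guest = gtime_delta / total;
	/* Done in floating point so the subtraction cannot wrap around. */
	usage.hyp = (utime_delta - gtime_delta + stime_delta) / total;
	usage.steal = steal_delta / total;
	return usage;
}
```

A monitoring tool would sample vcpu_stat twice, fill two such snapshots, and pass the elapsed time as delta_cputime.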

kvm: debugfs: export remaining aarch64 kvm exit reasons to debugfs · cf9cc040

Committed by chenjiajun on Dec 23, 2020

virt inclusion
category: feature
bugzilla: 46853
CVE: NA

This patch exports the remaining aarch64 exit items to vcpu_stat via debugfs.
The items include:
	fp_asimd_exit_stat, irq_exit_stat, sys64_exit_stat,
	mabt_exit_stat, fail_entry_exit_stat, internal_error_exit_stat,
	unknown_ec_exit_stat, cp15_32_exit_stat, cp15_64_exit_stat,
	cp14_mr_exit_stat, cp14_ls_exit_stat, cp14_64_exit_stat,
	smc_exit_stat, sve_exit_stat, debug_exit_stat
Signed-off-by: Biaoxiang Ye <yebiaoxiang@huawei.com>
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: chenjiajun <chenjiajun8@huawei.com>
Reviewed-by: Xiangyou Xie <xiexiangyou@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>

cf9cc040

kvm: debugfs: Export vcpu stat via debugfs · 560c628b

Committed by chenjiajun on Dec 23, 2020

virt inclusion
category: feature
bugzilla: 46853
CVE: NA

This patch creates a debugfs entry for vcpu stat.
The entry path is /sys/kernel/debug/kvm/vcpu_stat.
vcpu_stat contains a subset of the per-vcpu KVM exit items, including:
	pid, hvc_exit_stat, wfe_exit_stat, wfi_exit_stat,
	mmio_exit_user, mmio_exit_kernel, exits

Currently, the maximum vcpu limit is 1024.

From vcpu_stat, users can get the counts of these KVM exit items
over a period of time, which is helpful for monitoring the virtual machine.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: chenjiajun <chenjiajun8@huawei.com>
Reviewed-by: Xiangyou Xie <xiexiangyou@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>

560c628b

KVM: arm64: Introduce handling of AArch32 TTBCR2 traps · 4c2c0bf6

Committed by Marc Zyngier on Jan 07, 2021

stable inclusion
from stable-5.10.4
commit e365b97a1576e2bb268664585533c1671e2f0709
bugzilla: 46903

--------------------------------

commit ca4e5147 upstream.

ARMv8.2 introduced TTBCR2, which shares TCR_EL1 with TTBCR.
Gracefully handle traps to this register when HCR_EL2.TVM is set.

Cc: stable@vger.kernel.org
Reported-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

4c2c0bf6

02 Dec, 2020 3 commits

KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort() · 7d894834

Committed by Yanan Wang on Dec 02, 2020

If we get a FSC_PERM fault, we only use (logging_active && writable) to
determine whether to call kvm_pgtable_stage2_map(). There are two more
cases we should consider.

(1) After logging_active is configured back to false from true. When we
get a FSC_PERM fault with write_fault and adjustment of hugepage is needed,
we should merge tables back to a block entry. This case is ignored by still
calling kvm_pgtable_stage2_relax_perms(), which will lead to an endless
loop and guest panic due to soft lockup.

(2) We use (FSC_PERM && logging_active && writable) to determine
collapsing a block entry into a table by calling kvm_pgtable_stage2_map().
But sometimes we may only need to relax permissions when trying to write
to a page other than a block.
In this condition, using kvm_pgtable_stage2_relax_perms() will be fine.

The ISS field bit[1:0] in the ESR_EL2 register indicates the stage2 lookup
level at which a D-abort or I-abort occurred. By comparing the granule of
the fault lookup level with vma_pagesize, we can strictly distinguish
conditions of calling kvm_pgtable_stage2_relax_perms() or
kvm_pgtable_stage2_map(), and the above two cases will be well considered.
Suggested-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201201201034.116760-4-wangyanan55@huawei.com

7d894834
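
The granule comparison described in the commit message above can be sketched as follows. This is a standalone illustration, not the kernel code: it assumes 4 KiB pages, and the enum, the helper names, and the separate is_perm_fault argument are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, so every table level resolves 9 bits. */
#define DEMO_PAGE_SHIFT 12u

enum s2_action { S2_RELAX_PERMS, S2_MAP };

/* Lookup level of the fault, taken from ISS bits [1:0] of ESR_EL2. */
static unsigned int fault_level(uint64_t esr)
{
	return (unsigned int)(esr & 0x3u);
}

/* Bytes covered by one entry at the given level (level 3 is a page). */
static uint64_t level_granule(unsigned int level)
{
	unsigned int shift = (DEMO_PAGE_SHIFT - 3u) * (4u - level) + 3u;

	return UINT64_C(1) << shift;
}

/*
 * Only relax permissions when the faulting entry already has the size
 * we want to map; otherwise the mapping has to be rebuilt, which covers
 * both merging tables into a block and splitting a block into tables.
 */
static enum s2_action choose_action(int is_perm_fault, uint64_t esr,
				    uint64_t vma_pagesize)
{
	if (is_perm_fault && level_granule(fault_level(esr)) == vma_pagesize)
		return S2_RELAX_PERMS;
	return S2_MAP;
}
```

With these assumptions, a permission fault at level 3 with a 2 MiB vma_pagesize selects the map path (case 1 above), while a level 3 fault with a 4 KiB vma_pagesize only relaxes permissions (case 2).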

KVM: arm64: Fix handling of merging tables into a block entry · 3a0b870e

Committed by Yanan Wang on Dec 02, 2020

When dirty logging is enabled, we collapse block entries into tables
as necessary. If dirty logging gets canceled, we can end-up merging
tables back into block entries.

When this happens, we must not only free the non-huge page-table
pages but also invalidate all the TLB entries that can potentially
cover the block. Otherwise, we end-up with multiple possible translations
for the same physical page, which can legitimately result in a TLB
conflict.

To address this, replace the bogus invalidation by IPA with a full
VM invalidation. Although this is pretty heavy handed, it happens
very infrequently and saves a bunch of invalidations by IPA.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
[maz: fixup commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201201201034.116760-3-wangyanan55@huawei.com

3a0b870e

KVM: arm64: Fix memory leak on stage2 update of a valid PTE · 5c646b7e

Committed by Yanan Wang on Dec 02, 2020

When installing a new leaf PTE onto an invalid ptep, we need to
get_page(ptep) to account for the new mapping.

However, simply updating a valid PTE shouldn't result in any
additional refcounting, as there is no new mapping. This otherwise
results in a page being forever wasted.

Address this by fixing-up the refcount in stage2_map_walker_try_leaf()
if the PTE was already valid, balancing out the later get_page()
in stage2_map_walk_leaf().
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
[maz: update commit message, add comment in the code]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201201201034.116760-2-wangyanan55@huawei.com

5c646b7e

18 Nov, 2020 1 commit

KVM: arm64: vgic-v3: Drop the reporting of GICR_TYPER.Last for userspace · 23bde347

Committed by Zenghui Yu on Nov 17, 2020

It was recently reported that if GICR_TYPER is accessed before the RD base
address is set, we'll suffer from the unset @rdreg dereferencing. Oops...

	gpa_t last_rdist_typer = rdreg->base + GICR_TYPER +
			(rdreg->free_index - 1) * KVM_VGIC_V3_REDIST_SIZE;

It's "expected" that users will access registers in the redistributor if
the RD has been properly configured (e.g., the RD base address is set). But
it hasn't yet been covered by the existing documentation.

Per discussion on the list [1], the reporting of the GICR_TYPER.Last bit
for userspace never actually worked. And it's difficult for us to emulate
it correctly given that userspace has the flexibility to access it any
time. Let's just drop the reporting of the Last bit for userspace for now
(userspace should have full knowledge about it anyway) and it at least
prevents kernel from panic ;-)

[1] https://lore.kernel.org/kvmarm/c20865a267e44d1e2c0d52ce4e012263@kernel.org/

Fixes: ba7b3f12 ("KVM: arm/arm64: Revisit Redistributor TYPER last bit computation")
Reported-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20201117151629.1738-1-yuzenghui@huawei.com
Cc: stable@vger.kernel.org

23bde347

16 Nov, 2020 1 commit

KVM: arm64: Correctly align nVHE percpu data · 7bab16a6

Committed by Jamie Iles on Nov 13, 2020

The nVHE percpu data is partially linked but the nVHE linker script did
not align the percpu section.  The PERCPU_INPUT macro would then align
the data to a page boundary:

  #define PERCPU_INPUT(cacheline)					\
  	__per_cpu_start = .;						\
  	*(.data..percpu..first)						\
  	. = ALIGN(PAGE_SIZE);						\
  	*(.data..percpu..page_aligned)					\
  	. = ALIGN(cacheline);						\
  	*(.data..percpu..read_mostly)					\
  	. = ALIGN(cacheline);						\
  	*(.data..percpu)						\
  	*(.data..percpu..shared_aligned)				\
  	PERCPU_DECRYPTED_SECTION					\
  	__per_cpu_end = .;

but then when the final vmlinux linking happens the hypervisor percpu
data is included after page alignment and so the offsets potentially
don't match.  On my build I saw that the .hyp.data..percpu section was
at address 0x20 and then the percpu data would begin at 0x1000 (because
of the page alignment in PERCPU_INPUT), but when linked into vmlinux,
everything would be shifted down by 0x20 bytes.

This manifests as one of the CPUs getting lost when running
kvm-unit-tests or starting any VM and subsequent soft lockup on a Cortex
A72 device.

Fixes: 30c95391 ("kvm: arm64: Set up hyp percpu data for nVHE")
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: David Brazdil <dbrazdil@google.com>
Cc: David Brazdil <dbrazdil@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201113150406.14314-1-jamie@nuviainc.com

7bab16a6

13 Nov, 2020 3 commits

KVM: arm64: Handle SCXTNUM_ELx traps · ed4ffaf4

Committed by Marc Zyngier on Nov 10, 2020

As the kernel never sets HCR_EL2.EnSCXT, accesses to SCXTNUM_ELx
will trap to EL2. Let's handle that as gracefully as possible
by injecting an UNDEF exception into the guest. This is consistent
with the guest's view of ID_AA64PFR0_EL1.CSV2 being at most 1.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201110141308.451654-4-maz@kernel.org

ed4ffaf4

KVM: arm64: Unify trap handlers injecting an UNDEF · 338b1793

Committed by Marc Zyngier on Nov 10, 2020

A large number of system register trap handlers only inject an
UNDEF exception, and yet each class of sysreg seems to provide its
own, identical function.

Let's unify them all, saving us introducing yet another one later.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201110141308.451654-3-maz@kernel.org

338b1793

KVM: arm64: Allow setting of ID_AA64PFR0_EL1.CSV2 from userspace · 23711a5e

Committed by Marc Zyngier on Nov 10, 2020

We now expose ID_AA64PFR0_EL1.CSV2=1 to guests running on hosts
that are immune to Spectre-v2, but that don't have this field set,
most likely because they predate the specification.

However, this prevents the migration of guests that have started on
a host that doesn't fake this CSV2 setting to one that does, as KVM
rejects the write to ID_AA64PFR0_EL1 on the grounds that it isn't
what is already there.

In order to fix this, allow userspace to set this field as long as
this doesn't result in promising more than what is already there
(setting CSV2 to 0 is acceptable, but setting it to 1 when it is
already set to 0 isn't).

Fixes: e1026237 ("KVM: arm64: Set CSV2 for guests on hardware unaffected by Spectre-v2")
Reported-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201110141308.451654-2-maz@kernel.org

23711a5e
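
The "never promise more than what is already there" rule from the commit message above can be sketched as a small check. This is a simplified illustration, not the kernel implementation: the function names are hypothetical, CSV2 is taken to be the 4-bit field at bits [59:56] of ID_AA64PFR0_EL1, and every other field is assumed to require an exact match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CSV2 field of ID_AA64PFR0_EL1, assumed to sit at bits [59:56]. */
#define DEMO_CSV2_SHIFT 56u
#define DEMO_CSV2_MASK  (UINT64_C(0xf) << DEMO_CSV2_SHIFT)

static unsigned int csv2_of(uint64_t pfr0)
{
	return (unsigned int)((pfr0 & DEMO_CSV2_MASK) >> DEMO_CSV2_SHIFT);
}

/*
 * Accept a userspace write only if it does not raise CSV2 above the
 * value currently exposed and leaves all other fields untouched.
 */
static bool pfr0_write_allowed(uint64_t current, uint64_t requested)
{
	if (csv2_of(requested) > csv2_of(current))
		return false;
	return (current & ~DEMO_CSV2_MASK) == (requested & ~DEMO_CSV2_MASK);
}
```

Under these assumptions a guest migrated from a host that reports CSV2 as 0 can have that value written on a host that exposes 1, while the reverse is rejected.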

07 Nov, 2020 5 commits

KVM: arm64: Remove AA64ZFR0_EL1 accessors · c512298e

Committed by Andrew Jones on Nov 05, 2020

The AA64ZFR0_EL1 accessors are just the general accessors with
their visibility function open-coded. They also skip the if-else
chain in read_id_reg, but there's no reason not to go there.
Indeed, consolidating ID register accessors and removing lines
of code make it worthwhile.

Remove the AA64ZFR0_EL1 accessors, replacing them with the
general accessors for sanitized ID registers.

No functional change intended.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201105091022.15373-5-drjones@redhat.com

c512298e

KVM: arm64: Check RAZ visibility in ID register accessors · 912dee57

Committed by Andrew Jones on Nov 05, 2020

The instruction encodings of ID registers are preallocated. Until an
encoding is assigned a purpose the register is RAZ. KVM's general ID
register accessor functions already support both paths, RAZ or not.
If for each ID register we can determine if it's RAZ or not, then all
ID registers can build on the general functions. The register visibility
function allows us to check whether a register should be completely
hidden or not. Extending it to also report when the register should
be RAZ allows us to use it for ID registers as well.

Check for RAZ visibility in the ID register accessor functions,
allowing the RAZ case to be handled in a generic way for all system
registers.

The new REG_RAZ flag will be used in a later patch. This patch has
no intended functional change.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201105091022.15373-4-drjones@redhat.com

912dee57
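
A rough sketch of how a visibility callback that reports a RAZ flag could feed a generic ID register read, as the commit message above describes. The flag names REG_HIDDEN and REG_RAZ come from this series, but the bit values, the struct, and the function signatures here are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag names follow the series; the bit positions are assumed. */
#define REG_HIDDEN (1u << 0)
#define REG_RAZ    (1u << 1)

/* Hypothetical, stripped-down ID register descriptor. */
struct demo_id_reg {
	uint64_t sanitised_value;
	unsigned int (*visibility)(int has_sve);
};

/* Example callback: the register reads as zero when SVE is absent. */
static unsigned int zfr0_visibility(int has_sve)
{
	return has_sve ? 0u : REG_RAZ;
}

/* Generic accessor: one RAZ check covers every ID register. */
static uint64_t demo_read_id_reg(const struct demo_id_reg *reg, int has_sve)
{
	unsigned int vis = reg->visibility ? reg->visibility(has_sve) : 0u;

	if (vis & REG_RAZ)
		return 0;
	return reg->sanitised_value;
}
```

Registers without a callback fall through to their sanitised value, so only registers with special behaviour need to provide one.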

KVM: arm64: Consolidate REG_HIDDEN_GUEST/USER · 01fe5ace

Committed by Andrew Jones on Nov 05, 2020

REG_HIDDEN_GUEST and REG_HIDDEN_USER are always used together.
Consolidate them into a single REG_HIDDEN flag. We can always
add another flag later if some register needs to expose itself
differently to the guest than it does to userspace.

No functional change intended.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201105091022.15373-3-drjones@redhat.com

01fe5ace

KVM: arm64: Don't hide ID registers from userspace · f81cb2c3

Committed by Andrew Jones on Nov 05, 2020

ID registers are RAZ until they've been allocated a purpose, but
that doesn't mean they should be removed from the KVM_GET_REG_LIST
list. So far we only have one register, SYS_ID_AA64ZFR0_EL1, that
is hidden from userspace when its function, SVE, is not present.

Expose SYS_ID_AA64ZFR0_EL1 to userspace as RAZ when SVE is not
implemented. Removing the userspace visibility checks is enough
to reexpose it, as it will already return zero to userspace when
SVE is not present. The register already behaves as RAZ for the
guest when SVE is not present.

Fixes: 73433762 ("KVM: arm64/sve: System register context switch and access support")
Reported-by: 张东旭 <xu910121@sina.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org # v5.2+
Link: https://lore.kernel.org/r/20201105091022.15373-2-drjones@redhat.com

f81cb2c3

KVM: arm64: Fix build error in user_mem_abort() · faf00039

Committed by Gavin Shan on Nov 03, 2020

The PUD and PMD are folded into PGD when the following options are
enabled. In that case, PUD_SHIFT is equal to PMD_SHIFT and we fail
to build with the indicated errors:

   CONFIG_ARM64_VA_BITS_42=y
   CONFIG_ARM64_PAGE_SHIFT=16
   CONFIG_PGTABLE_LEVELS=3

   arch/arm64/kvm/mmu.c: In function ‘user_mem_abort’:
   arch/arm64/kvm/mmu.c:798:2: error: duplicate case value
     case PMD_SHIFT:
     ^~~~
   arch/arm64/kvm/mmu.c:791:2: note: previously used here
     case PUD_SHIFT:
     ^~~~

This fixes the issue by skipping the check on PUD huge page when PUD
and PMD are folded into PGD.

Fixes: 2f40c460 ("KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes")
Reported-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201103003009.32955-1-gshan@redhat.com

faf00039

31 Oct, 2020 1 commit

KVM: arm64: Handle Asymmetric AArch32 systems · 22f55384

Committed by Qais Yousef on Oct 27, 2020

On a system without uniform support for AArch32 at EL0, it is possible
for the guest to force run AArch32 at EL0 and potentially cause an
illegal exception if running on a core without AArch32. Add an extra
check so that if we catch the guest doing that, then we prevent it from
running again by resetting vcpu->arch.target and return
ARM_EXCEPTION_IL.

We try to catch this misbehaviour as early as possible and not rely on
an illegal exception occurring to signal the problem. Attempting to run a
32bit app in the guest will produce an error from QEMU if the guest
exits while running in AArch32 EL0.

Tested on Juno by instrumenting the host to fake asym aarch32 and
instrumenting KVM to make the asymmetry visible to the guest.

[will: Incorporated feedback from Marc]
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201021104611.2744565-2-qais.yousef@arm.com
Link: https://lore.kernel.org/r/20201027215118.27003-2-will@kernel.org

22f55384
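
The extra check described in the commit message above amounts to testing whether the guest is in a 32-bit mode on a system that cannot guarantee AArch32 at EL0. A minimal sketch, assuming the PSTATE value and the host capability are passed in as plain arguments (the function name and signature are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PSTATE.nRW (bit 4) is set when the CPU runs in an AArch32 mode. */
#define DEMO_PSR_MODE32_BIT UINT64_C(0x10)

/*
 * Return true when the guest state is AArch32 although the system does
 * not uniformly support AArch32 at EL0. A caller would then refuse to
 * run the vcpu again and report an illegal exception.
 */
static bool guest_mode_is_bad_32bit(uint64_t pstate, bool supports_32bit_el0)
{
	if (!(pstate & DEMO_PSR_MODE32_BIT))
		return false;
	return !supports_32bit_el0;
}
```

The AArch64 case returns early, so the common path costs a single bit test.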

30 Oct, 2020 8 commits

KVM: arm64: Force PTE mapping on fault resulting in a device mapping · 91a2c34b

Committed by Santosh Shukla on Oct 26, 2020

VFIO allows a device driver to resolve a fault by mapping a MMIO
range. This can subsequently result in user_mem_abort() trying
to compute a huge mapping based on the MMIO pfn, which is
a sure recipe for things to go wrong.

Instead, force a PTE mapping when the pfn faulted in has a device
mapping.

Fixes: 6d674e28 ("KVM: arm/arm64: Properly handle faulting of device mappings")
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Santosh Shukla <sashukla@nvidia.com>
[maz: rewritten commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1603711447-11998-2-git-send-email-sashukla@nvidia.com

91a2c34b

KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes · 2f40c460

Committed by Gavin Shan on Oct 26, 2020

Although huge pages can be created out of multiple contiguous PMDs
or PTEs, the corresponding sizes are not supported at Stage-2 yet.

Instead of failing the mapping, fall back to the nearest supported
mapping size (CONT_PMD to PMD and CONT_PTE to PTE respectively).
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Gavin Shan <gshan@redhat.com>
[maz: rewritten commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201025230626.18501-1-gshan@redhat.com

2f40c460
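
The fallback described in the commit message above can be pictured as a mapping from the host VMA's page shift to a shift that stage 2 supports. The sketch below assumes a 4 KiB-page configuration; the DEMO_* constants and the function name are illustrative and not the kernel's definitions.

```c
#include <assert.h>

/* Assumed shifts for a 4 KiB-page configuration. */
#define DEMO_PAGE_SHIFT     12
#define DEMO_CONT_PTE_SHIFT 16
#define DEMO_PMD_SHIFT      21
#define DEMO_CONT_PMD_SHIFT 25
#define DEMO_PUD_SHIFT      30

/*
 * Contiguous huge page sizes are not mapped as such at stage 2, so
 * fall back to the next smaller supported size instead of failing.
 */
static int stage2_fallback_shift(int vma_shift)
{
	switch (vma_shift) {
	case DEMO_CONT_PMD_SHIFT:
		return DEMO_PMD_SHIFT;
	case DEMO_CONT_PTE_SHIFT:
		return DEMO_PAGE_SHIFT;
	default:
		return vma_shift;
	}
}
```

Sizes that stage 2 already supports, such as PMD or PUD blocks, pass through unchanged.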

KVM: arm64: Fix masks in stage2_pte_cacheable() · e2fc6a9f

Committed by Will Deacon on Oct 29, 2020

stage2_pte_cacheable() tries to figure out whether the mapping installed
in its 'pte' parameter is cacheable or not. Unfortunately, it fails
miserably because it extracts the memory attributes from the entry using
FIELD_GET(), which returns the attributes shifted down to bit 0, but then
compares this with the unshifted value generated by the PAGE_S2_MEMATTR()
macro.

A direct consequence of this bug is that cache maintenance is silently
skipped, which in turn causes 32-bit guests to crash early on when their
set/way maintenance is trapped but not emulated correctly.

Fix the broken masks by avoiding the use of FIELD_GET() altogether.

Fixes: 6d9d2115 ("KVM: arm64: Add support for stage-2 map()/unmap() in generic page-table")
Reported-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20201029144716.30476-1-will@kernel.org

e2fc6a9f
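
The mask bug described in the commit message above is easy to reproduce outside the kernel. The sketch below assumes a 4-bit attribute field at bits [5:2] and an illustrative "normal memory" encoding; the macro and function names are hypothetical, and the shift-down in the buggy variant stands in for what FIELD_GET() returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: a 4-bit memory-attribute field at bits [5:2]. */
#define DEMO_MEMATTR_SHIFT  2u
#define DEMO_MEMATTR_MASK   (UINT64_C(0xf) << DEMO_MEMATTR_SHIFT)
/* Assumed "normal memory" attribute, already shifted into place. */
#define DEMO_MEMATTR_NORMAL (UINT64_C(0xf) << DEMO_MEMATTR_SHIFT)

/* Buggy: a field shifted down to bit 0 is compared with an in-place value. */
static bool pte_cacheable_buggy(uint64_t pte)
{
	uint64_t attr = (pte & DEMO_MEMATTR_MASK) >> DEMO_MEMATTR_SHIFT;

	return attr == DEMO_MEMATTR_NORMAL;
}

/* Fixed: mask the entry and compare both values in place. */
static bool pte_cacheable_fixed(uint64_t pte)
{
	return (pte & DEMO_MEMATTR_MASK) == DEMO_MEMATTR_NORMAL;
}
```

In this layout the shifted-down field can never exceed 0xf while the in-place constant is 0x3c, so the buggy variant reports every mapping as non-cacheable.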

KVM: arm64: Fix AArch32 handling of DBGD{CCINT,SCRext} and DBGVCR · 4a1c2c7f

Committed by Marc Zyngier on Oct 29, 2020

The DBGD{CCINT,SCRext} and DBGVCR register entries in the cp14 array
are missing their target register, resulting in all accesses being
targeted at the guard sysreg (indexed by __INVALID_SYSREG__).

Point the emulation code at the actual register entries.

Fixes: bdfb4b38 ("arm64: KVM: add trap handlers for AArch32 debug registers")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201029172409.2768336-1-maz@kernel.org

4a1c2c7f

KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT · 7efe8ef2

Committed by Will Deacon on Oct 26, 2020

For consistency with the rest of the stage-2 page-table page allocations
(performed using a kvm_mmu_memory_cache), ensure that __GFP_ACCOUNT is
included in the GFP flags for the PGD pages.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20201026144423.24683-1-will@kernel.org

7efe8ef2

KVM: arm64: Drop useless PAN setting on host EL1 to EL2 transition · d2782505

Committed by Marc Zyngier on Oct 26, 2020

Setting PSTATE.PAN when entering EL2 on nVHE doesn't make much
sense as this bit only means something for translation regimes
that include EL0. This obviously isn't the case in the nVHE case,
so let's drop this setting.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Link: https://lore.kernel.org/r/20201026095116.72051-4-maz@kernel.org

d2782505

KVM: arm64: Remove leftover kern_hyp_va() in nVHE TLB invalidation · b6d6db4d

Committed by Marc Zyngier on Oct 26, 2020

The new calling convention says that pointers coming from the SMCCC
interface are turned into their HYP version in the host HVC handler.
However, there is still a stray kern_hyp_va() in the TLB invalidation
code, which could result in a corrupted pointer.

Drop the spurious conversion.

Fixes: a071261d ("KVM: arm64: nVHE: Fix pointers during SMCCC convertion")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201026095116.72051-3-maz@kernel.org

b6d6db4d

KVM: arm64: Don't corrupt tpidr_el2 on failed HVC call · 28e81c62

Committed by Marc Zyngier on Oct 26, 2020

The hyp-init code starts by stashing a register in TPIDR_EL2
in order to free a register. This happens no matter if the
HVC call is legal or not.

Although nothing wrong seems to come out of it, it feels odd
to alter the EL2 state for something that eventually returns
an error.

Instead, use the fact that we know exactly which bits of the
__kvm_hyp_init call are non-zero to perform the check with
a series of EOR/ROR instructions, combined with a build-time
check that the value is the one we expect.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201026095116.72051-2-maz@kernel.org

28e81c62

29 Oct, 2020 1 commit

arm64: Add workaround for Arm Cortex-A77 erratum 1508412 · 96d389ca

Committed by Rob Herring on Oct 28, 2020

On Cortex-A77 r0p0 and r1p0, a sequence of a non-cacheable or device load
and a store exclusive or PAR_EL1 read can cause a deadlock.

The workaround requires a DMB SY before and after a PAR_EL1 register
read. In addition, it's possible an interrupt (doing a device read) or
KVM guest exit could be taken between the DMB and PAR read, so we
also need a DMB before returning from interrupt and before returning to
a guest.

A deadlock is still possible with the workaround as KVM guests must also
have the workaround. IOW, a malicious guest can deadlock an affected
system.

This workaround also depends on a firmware counterpart to enable the h/w
to insert DMB SY after load and store exclusive instructions. See the
errata document SDEN-1152370 v10 [1] for more information.

[1] https://static.docs.arm.com/101992/0010/Arm_Cortex_A77_MP074_Software_Developer_Errata_Notice_v10.pdf

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: kvmarm@lists.cs.columbia.edu
Link: https://lore.kernel.org/r/20201028182839.166037-2-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>

96d389ca

28 Oct, 2020 1 commit

KVM: arm64: ARM_SMCCC_ARCH_WORKAROUND_1 doesn't return SMCCC_RET_NOT_REQUIRED · 1de111b5

Committed by Stephen Boyd on Oct 23, 2020

According to the SMCCC spec[1](7.5.2 Discovery) the
ARM_SMCCC_ARCH_WORKAROUND_1 function id only returns 0, 1, and
SMCCC_RET_NOT_SUPPORTED.

 0 is "workaround required and safe to call this function"
 1 is "workaround not required but safe to call this function"
 SMCCC_RET_NOT_SUPPORTED is "might be vulnerable or might not be, who knows, I give up!"

SMCCC_RET_NOT_SUPPORTED might as well mean "workaround required, except
calling this function may not work because it isn't implemented in some
cases". Wonderful. We map this SMC call to

 0 is SPECTRE_MITIGATED
 1 is SPECTRE_UNAFFECTED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

For KVM hypercalls (hvc), we've implemented this function id to return
SMCCC_RET_NOT_SUPPORTED, 0, and SMCCC_RET_NOT_REQUIRED. One of those
isn't supposed to be there. Per the code we call
arm64_get_spectre_v2_state() to figure out what to return for this
feature discovery call.

 0 is SPECTRE_MITIGATED
 SMCCC_RET_NOT_REQUIRED is SPECTRE_UNAFFECTED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

Let's clean this up so that KVM tells the guest this mapping:

 0 is SPECTRE_MITIGATED
 1 is SPECTRE_UNAFFECTED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

Note: SMCCC_RET_NOT_AFFECTED is 1 but isn't part of the SMCCC spec

Fixes: c118bbb5 ("arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://developer.arm.com/documentation/den0028/latest [1]
Link: https://lore.kernel.org/r/20201023154751.1973872-1-swboyd@chromium.org
Signed-off-by: Will Deacon <will@kernel.org>

1de111b5
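
The cleaned-up mapping that the commit message above asks KVM to report can be written as a small lookup. This is a sketch only: the enum ordering and the function name are assumptions, and SMCCC_RET_NOT_SUPPORTED is taken to be -1 as in the SMCCC specification.

```c
#include <assert.h>

/* Return code from the SMCCC specification. */
#define SMCCC_RET_NOT_SUPPORTED (-1)

/* State names as used in the commit message; the ordering is assumed. */
enum spectre_state {
	SPECTRE_UNAFFECTED,
	SPECTRE_MITIGATED,
	SPECTRE_VULNERABLE,
};

/* Value reported to the guest for ARM_SMCCC_ARCH_WORKAROUND_1 discovery. */
static long workaround_1_discovery(enum spectre_state state)
{
	switch (state) {
	case SPECTRE_MITIGATED:
		return 0;	/* workaround required and safe to call */
	case SPECTRE_UNAFFECTED:
		return 1;	/* workaround not required but safe to call */
	case SPECTRE_VULNERABLE:
	default:
		return SMCCC_RET_NOT_SUPPORTED;
	}
}
```

Unknown states fall into the NOT_SUPPORTED branch, which is the conservative answer for a guest.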

02 Oct, 2020 2 commits

KVM: arm64: Ensure user_mem_abort() return value is initialised · ffd1b63a

Committed by Will Deacon on Sep 30, 2020

If a change in the MMU notifier sequence number forces user_mem_abort()
to return early when attempting to handle a stage-2 fault, we return
uninitialised stack to kvm_handle_guest_abort(), which could potentially
result in the injection of an external abort into the guest or a spurious
return to userspace. Neither of these is what we want to do.

Initialise 'ret' to 0 in user_mem_abort() so that bailing due to a
change in the MMU notifier sequence number is treated as though the
fault was handled.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20200930102442.16142-1-will@kernel.org

ffd1b63a

KVM: arm64: Pass level hint to TLBI during stage-2 permission fault · b259d137

Committed by Will Deacon on Sep 30, 2020

Alex pointed out that we don't pass a level hint to the TLBI instruction
when handling a stage-2 permission fault, even though the walker does
at some point have the level information in its hands.

Rework stage2_update_leaf_attrs() so that it can optionally return the
level of the updated pte to its caller, which can in turn be used to
provide the correct TLBI level hint.
Reported-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/595cc73e-636e-8b3a-f93a-b4e9fb218db8@arm.com
Link: https://lore.kernel.org/r/20200930131801.16889-1-will@kernel.org

b259d137

01 Oct, 2020 1 commit

KVM: arm64: Restore missing ISB on nVHE __tlb_switch_to_guest · 452d6222

Committed by Marc Zyngier on Jul 13, 2020

Commit a0e50aa3 ("KVM: arm64: Factor out stage 2 page table
data from struct kvm") dropped the ISB after __load_guest_stage2(),
only leaving the one that is required when the speculative AT
workaround is in effect.

As Andrew points out: "This alternative is 'backwards' to avoid a
double ISB as there is one in __load_guest_stage2 when the workaround
is active."

Restore the missing ISB, conditioned on the AT workaround not being
active.

Fixes: a0e50aa3 ("KVM: arm64: Factor out stage 2 page table data from struct kvm")
Reported-by: Andrew Scull <ascull@google.com>
Reported-by: Thomas Tai <thomas.tai@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

452d6222

30 Sep, 2020 7 commits

kvm: arm64: Remove unnecessary hyp mappings · a3bb9c3a

Committed by David Brazdil on Sep 22, 2020

With all nVHE per-CPU variables being part of the hyp per-CPU region,
mapping them individually is not necessary any longer. They are mapped to hyp
as part of the overall per-CPU region.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Andrew Scull <ascull@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-11-dbrazdil@google.com

a3bb9c3a

kvm: arm64: Set up hyp percpu data for nVHE · 30c95391

Committed by David Brazdil on Sep 22, 2020

Add hyp percpu section to linker script and rename the corresponding ELF
sections of hyp/nvhe object files. This moves all nVHE-specific percpu
variables to the new hyp percpu section.

Allocate sufficient amount of memory for all percpu hyp regions at global KVM
init time and create corresponding hyp mappings.

The base addresses of hyp percpu regions are kept in a dynamically allocated
array in the kernel.

Add NULL checks in PMU event-reset code as it may run before KVM memory is
initialized.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-10-dbrazdil@google.com

30c95391

kvm: arm64: Create separate instances of kvm_host_data for VHE/nVHE · 2a1198c9

Committed by David Brazdil on Sep 22, 2020

Host CPU context is stored in a global per-cpu variable `kvm_host_data`.
In preparation for introducing independent per-CPU region for nVHE hyp,
create two separate instances of `kvm_host_data`, one for VHE and one
for nVHE.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-9-dbrazdil@google.com

2a1198c9

kvm: arm64: Duplicate arm64_ssbd_callback_required for nVHE hyp · df4c8214

Committed by David Brazdil on Sep 22, 2020

Hyp keeps track of which cores require SSBD callback by accessing a
kernel-proper global variable. Create an nVHE symbol of the same name
and copy the value from kernel proper to nVHE as KVM is being enabled
on a core.

Done in preparation for separating percpu memory owned by kernel
proper and nVHE.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-8-dbrazdil@google.com

df4c8214

kvm: arm64: Remove hyp_adr/ldr_this_cpu · ea391027

Committed by David Brazdil on Sep 22, 2020

The hyp_adr/ldr_this_cpu helpers were introduced for use in hyp code
because they always needed to use TPIDR_EL2 for base, while
adr/ldr_this_cpu from kernel proper would select between TPIDR_EL2 and
_EL1 based on VHE/nVHE.

Simplify this now that the hyp mode case can be handled using the
__KVM_VHE/NVHE_HYPERVISOR__ macros.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Andrew Scull <ascull@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-6-dbrazdil@google.com

ea391027

kvm: arm64: Remove __hyp_this_cpu_read · 717cf94a

Committed by David Brazdil on Sep 22, 2020

this_cpu_ptr is meant for use in kernel proper because it selects between
TPIDR_EL1/2 based on nVHE/VHE. __hyp_this_cpu_ptr was used in hyp to always
select TPIDR_EL2. Unify all users behind this_cpu_ptr and friends by
selecting _EL2 register under __KVM_NVHE_HYPERVISOR__. VHE continues
selecting the register using alternatives.

Under CONFIG_DEBUG_PREEMPT, the kernel helpers perform a preemption check
which is omitted by the hyp helpers. Preserve the behavior for nVHE by
overriding the corresponding macros under __KVM_NVHE_HYPERVISOR__. Extend
the checks into VHE hyp code.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Andrew Scull <ascull@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-5-dbrazdil@google.com

717cf94a

kvm: arm64: Partially link nVHE hyp code, simplify HYPCOPY · ab25464b

Committed by David Brazdil on Sep 22, 2020

Relying on objcopy to prefix the ELF section names of the nVHE hyp code
is brittle and prevents us from using wildcards to match specific
section names.

Improve the build rules by partially linking all '.nvhe.o' files and
prefixing their ELF section names using a linker script. Continue using
objcopy for prefixing ELF symbol names.

One immediate advantage of this approach is that all subsections
matching a pattern can be merged into a single prefixed section, eg.
.text and .text.* can be linked into a single '.hyp.text'. This removes
the need for -fno-reorder-functions on GCC and will be useful in the
future too: LTO builds use .text subsections, compilers routinely
generate .rodata subsections, etc.

Partially linking all hyp code into a single object file also makes it
easier to analyze.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-2-dbrazdil@google.com

ab25464b
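
As a rough illustration of the section merging described in commit ab25464b, a linker script used for a partial link might look like the fragment below. This is a minimal sketch in generic GNU ld syntax, not the script the commit adds: the `.hyp.rodata` rule and the file names in the usage note are assumptions made for illustration only.

```
/* Hypothetical sketch; not the linker script added by the commit. */
SECTIONS {
	/* Merge .text and every .text.* subsection into one prefixed section. */
	.hyp.text : { *(.text .text.*) }

	/* Assumed example: the same wildcard pattern applied to read-only data. */
	.hyp.rodata : { *(.rodata .rodata.*) }
}
```

Such a script could be passed to a relocatable link, for example `ld -r -T hyp.lds -o hyp.o a.nvhe.o b.nvhe.o`, where `hyp.lds`, `hyp.o`, `a.nvhe.o` and `b.nvhe.o` are placeholder names. Because the wildcard matches subsections, the result no longer depends on how the compiler splits functions across `.text.*` sections.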

29 Sep, 2020 2 commits

KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled · 9ef2b48b

Committed by Will Deacon on Sep 28, 2020

Patching the EL2 exception vectors is integral to the Spectre-v2
workaround, where it can be necessary to execute CPU-specific sequences
to nobble the branch predictor before running the hypervisor text proper.

Remove the dependency on CONFIG_RANDOMIZE_BASE and allow the EL2 vectors
to be patched even when KASLR is not enabled.

Fixes: 7a132017e7a5 ("KVM: arm64: Replace CONFIG_KVM_INDIRECT_VECTORS with CONFIG_RANDOMIZE_BASE")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/202009221053.Jv1XsQUZ%lkp@intel.com
Signed-off-by: Will Deacon <will@kernel.org>

9ef2b48b

KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state() · d63d975a

Committed by Marc Zyngier on Sep 18, 2020

Convert the KVM WA2 code to using the Spectre infrastructure,
making the code much more readable. It also allows us to
take SSBS into account for the mitigation.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>

d63d975a

openeuler / Kernel: synced successfully nearly 2 years ago
