- 25 May 2020, 1 commit
-
-
By David Brazdil

Pull bits of code to the only place where it is used. Remove the empty function __cpu_init_stage2(). Remove the redundant has_vhe() check, since this function is nVHE-only. No functional changes intended.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200515152056.83158-1-dbrazdil@google.com
-
- 16 May 2020, 1 commit
-
-
By Keqian Zhu

There is already support for enabling the dirty log gradually in small chunks for x86, added in commit 3c9bd400 ("KVM: x86: enable dirty log gradually in small chunks"). This adds the same support for arm64. x86 still write-protects all huge pages when DIRTY_LOG_INITIALLY_ALL_SET is enabled; on arm64, however, both huge pages and normal pages can be write-protected gradually by userspace.

On a Huawei Kunpeng 920 2.6GHz platform, I ran some tests on 128G Linux VMs with different page sizes. The memory pressure is 127G in each case. The time taken by memory_global_dirty_log_start in QEMU is listed below:

  Page size    Before    After optimization
  4K           650ms     1.8ms
  2M           4ms       1.8ms
  1G           2ms       1.8ms

Besides the time reduction, the biggest improvement is that we minimize the performance side effect on the guest after enabling the dirty log (caused by dissolving huge pages and marking memslots dirty).

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200413122023.52583-1-zhukeqian1@huawei.com
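On the userspace side, the lazy behaviour is opted into when enabling manual dirty-log protection. A minimal sketch of how a VMM such as QEMU would request it (an illustration, not part of this patch; error handling omitted):

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Ask KVM to start with all pages marked dirty and nothing write-protected. */
static int enable_lazy_dirty_log(int vm_fd)
{
	struct kvm_enable_cap cap = {
		.cap  = KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2,
		.args = { KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
			  KVM_DIRTY_LOG_INITIALLY_SET },
	};

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}
```

Write protection is then applied piecewise as userspace clears dirty bits with KVM_CLEAR_DIRTY_LOG, which is what shrinks memory_global_dirty_log_start to the figures above.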
-
- 24 March 2020, 1 commit
-
-
By Marc Zyngier

Each time a Group-enable bit gets flipped, the state of these bits needs to be forwarded to the hardware. This is a pretty heavy-handed operation, requiring all vcpus to reload their GICv4 configuration. It is thus implemented as a new request type. These enable bits are programmed into the HW by setting the VGrp{0,1}En fields of GICR_VPENDBASER when the vPEs are made resident again. Of course, we only support Group-1 for now...

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20200304203330.4967-22-maz@kernel.org
-
- 17 February 2020, 1 commit
-
-
By Mark Rutland

With VHE, running a vCPU always requires the sequence:

1. kvm_arm_vhe_guest_enter();
2. kvm_vcpu_run_vhe();
3. kvm_arm_vhe_guest_exit()

... and as we invoke this from the shared arm/arm64 KVM code, 32-bit arm has to provide stubs for all three functions. To simplify the common code, and make it easier to make further modifications to the arm64-specific portions in the near future, let's fold kvm_arm_vhe_guest_enter() and kvm_arm_vhe_guest_exit() into kvm_vcpu_run_vhe(). The 32-bit stubs for kvm_arm_vhe_guest_enter() and kvm_arm_vhe_guest_exit() are removed, as they are no longer used. The 32-bit stub for kvm_vcpu_run_vhe() is left as-is. There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200210114757.2889-1-mark.rutland@arm.com
-
- 28 January 2020, 3 commits
-
-
By Paolo Bonzini

For ring-based dirty log tracking, it will be more efficient to account writes during schedule-out or schedule-in to the currently running VCPU. We would like to do this even if the write doesn't use the current VCPU's address space, as is the case for cached writes (see commit 4e335d9e, "Revert "KVM: Support vCPU-based gfn->hva cache"", 2017-05-02). Therefore, add a mechanism to track the currently loaded kvm_vcpu struct. There is already something similar in KVM/ARM; one important difference is that kvm_arch_vcpu_{load,put} have two callers in virt/kvm/kvm_main.c: we have to update both the architecture-independent vcpu_{load,put} and the preempt notifiers. Another change made in the process is to allow using kvm_get_running_vcpu() in preemptible code. This is allowed because the preempt notifiers ensure that the value does not change even after the VCPU thread is migrated.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
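The mechanism itself is small; a simplified sketch (not the verbatim upstream code) of the per-CPU tracking described above:

```c
#include <linux/kvm_host.h>

static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_running_vcpu);

void vcpu_load(struct kvm_vcpu *vcpu)
{
	int cpu = get_cpu();

	/* Record the loaded vcpu before the arch load hook runs. */
	__this_cpu_write(kvm_running_vcpu, vcpu);
	preempt_notifier_register(&vcpu->preempt_notifier);
	kvm_arch_vcpu_load(vcpu, cpu);
	put_cpu();
}

struct kvm_vcpu *kvm_get_running_vcpu(void)
{
	struct kvm_vcpu *vcpu;

	/*
	 * Safe to call from preemptible context: the preempt notifiers
	 * rewrite the per-CPU pointer when the thread migrates.
	 */
	preempt_disable();
	vcpu = __this_cpu_read(kvm_running_vcpu);
	preempt_enable();

	return vcpu;
}
```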
-
By Sean Christopherson

Remove kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() now that all arch-specific implementations are nops.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
By Sean Christopherson

Add an arm-specific hook to free the arm64-only sve_state. Doing so eliminates the last functional code from kvm_arch_vcpu_uninit() across all architectures and paves the way for removing kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() entirely.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 23 January 2020, 1 commit
-
-
By Marc Zyngier

Our MMIO handling is a bit odd, in the sense that it uses an intermediate per-vcpu structure to store the various decoded pieces of information that describe the access. But the same information is readily available in the HSR/ESR_EL2 field, and we actually use this field to populate the structure. Let's simplify the whole thing by getting rid of the superfluous structure and save a (tiny) bit of space in the vcpu structure.

[32bit fix courtesy of Olof Johansson <olof@lixom.net>]
Signed-off-by: Marc Zyngier <maz@kernel.org>
-
- 16 January 2020, 1 commit
-
-
By Steven Price

Cortex-A55 is affected by a similar erratum, so rename the existing workaround for erratum 1165522 so it can be used for both errata.

Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
- 15 January 2020, 1 commit
-
-
By Suzuki K Poulose

We finalize the system-wide capabilities after the SMP CPUs are booted by the kernel. This is used as a marker for deciding various checks in the kernel, e.g., sanity-checking the hotplugged CPUs for missing mandatory features. However, there is no explicit helper available for this in the kernel. There is sys_caps_initialised, which is not exposed. The other closest thing we have is the jump label arm64_const_caps_ready, which denotes that the capabilities are set and the capability checks can use the individual jump labels for the fast path. This is performed before setting the ELF hwcaps, which must be checked against the new CPUs. We also perform some of the other initialization at this point, e.g., SVE setup, which is important for the use of FP/SIMD where SVE is supported. Normally userspace doesn't get to run before we finish this. However, in-kernel users may potentially start using NEON mode, so we need to reject uses of NEON mode before the setup is complete. Instead of defining a new marker for the completion of SVE setup, we can simply reuse arm64_const_caps_ready and enable it once we have finished all the setup. We also expose this to the various users as "system_capabilities_finalized()", to make it more meaningful than "const_caps_ready".

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
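The resulting helper is essentially a rename of the existing jump-label check. A rough sketch of the idea, with a hypothetical in-kernel NEON caller shown only to illustrate the intended use:

```c
#include <linux/jump_label.h>

DECLARE_STATIC_KEY_FALSE(arm64_const_caps_ready);

static inline bool system_capabilities_finalized(void)
{
	return static_branch_likely(&arm64_const_caps_ready);
}

/* Hypothetical in-kernel NEON user: bail out until SVE setup has completed. */
static bool neon_usable(void)
{
	return system_capabilities_finalized();
}
```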
-
- 22 October 2019, 4 commits
-
-
By Steven Price

Allow user space to inform the KVM host where in the physical memory map the paravirtualized time structures should be located. User space can set an attribute on the VCPU providing the IPA base address of the stolen time structure for that VCPU. This must be repeated for every VCPU in the VM. The address is given in terms of the physical address visible to the guest and must be 64-byte aligned. The guest will discover the address via a hypercall.

Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
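The attribute is set through the vCPU device-attribute interface. A hedged sketch of the userspace side, using the PVTIME attribute group names introduced by this series (error handling omitted):

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Tell KVM where this vCPU's 64-byte stolen-time record lives in guest IPA space. */
static int set_stolen_time_ipa(int vcpu_fd, __u64 ipa)
{
	struct kvm_device_attr attr = {
		.group = KVM_ARM_VCPU_PVTIME_CTRL,
		.attr  = KVM_ARM_VCPU_PVTIME_IPA,
		.addr  = (__u64)(unsigned long)&ipa,	/* ipa must be 64-byte aligned */
	};

	return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
}
```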
-
By Steven Price

Implement the service call for configuring a shared structure between a VCPU and the hypervisor, in which the hypervisor can write the time stolen from the VCPU's execution time by other tasks on the host. User space allocates memory which is placed at an IPA also chosen by user space. The hypervisor then updates the shared structure using kvm_put_guest() to ensure single-copy atomicity of the 64-bit value reporting the stolen time in nanoseconds. Whenever stolen time is enabled by the guest, the stolen time counter is reset. The stolen time itself is retrieved from the sched_info structure maintained by the Linux scheduler code. We enable SCHEDSTATS when selecting KVM in Kconfig to ensure this value is meaningful.

Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
-
By Steven Price

This provides a mechanism for querying which paravirtualized time features are available in this hypervisor. Also add the header file which defines the ABI for the paravirtualized time features we're about to add.

Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
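On the guest side, the discovery flow ends up looking roughly like the sketch below; the SMCCC function IDs are those defined by the new ABI header, and the exact names here should be treated as assumptions of this sketch:

```c
#include <linux/arm-smccc.h>

/* Ask the hypervisor whether the stolen-time (ST) feature is implemented. */
static bool has_pv_stolen_time(void)
{
	struct arm_smccc_res res;

	arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_TIME_FEATURES,
			     ARM_SMCCC_HV_PV_TIME_ST, &res);

	return res.a0 == SMCCC_RET_SUCCESS;
}

/* Retrieve the IPA of this vCPU's stolen-time record, as set up by userspace. */
static u64 stolen_time_ipa(void)
{
	struct arm_smccc_res res;

	arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_TIME_ST, &res);

	return res.a0;
}
```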
-
By Christoffer Dall

For a long time, if a guest accessed memory outside of a memslot using any of the load/store instructions in the architecture which don't supply decoding information in ESR_EL2 (the ISV bit is not set), the kernel would print the following message and terminate the VM as a result of returning -ENOSYS to userspace:

  load/store instruction decoding not implemented

The reason behind this message is that KVM assumes that all accesses outside a memslot are MMIO accesses which should be handled by userspace, and we originally expected to eventually implement some sort of decoding of load/store instructions where the ISV bit was not set.

However, it turns out that many of the instructions which don't provide decoding information on abort are not safe to use for MMIO accesses, and the remaining few that would potentially make sense to use on MMIO accesses, such as those with register writeback, are not used in practice. It also turns out that fetching an instruction from guest memory can be a pretty horrible affair, involving stopping all CPUs on SMP systems, handling multiple corner cases of address translation in software, and more. It doesn't appear likely that we'll ever implement this in the kernel.

What is much more common is that a user has misconfigured his/her guest and is actually not accessing an MMIO region, but just hitting some random hole in the IPA space. In this scenario, the error message above is rather misleading and has led to a great deal of confusion over the years. It is, nevertheless, ABI to userspace, and we therefore need to introduce a new capability that userspace explicitly enables to change behavior.

This patch introduces KVM_CAP_ARM_NISV_TO_USER (NISV meaning Non-ISV) which does exactly that, and introduces a new exit reason to report the event to userspace. User space can then emulate an exception to the guest, restart the guest, suspend the guest, or take any other appropriate action as per the policy of the running system.

Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
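For a VMM, the new ABI boils down to enabling the capability on the VM and handling the new exit reason. A hedged sketch, where handle_nisv() is a hypothetical placeholder for whatever policy the VMM implements:

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* VMM policy hook: hypothetical, stands in for abort injection, logging, etc. */
void handle_nisv(__u64 esr_iss, __u64 fault_ipa);

static int enable_nisv_to_user(int vm_fd)
{
	struct kvm_enable_cap cap = { .cap = KVM_CAP_ARM_NISV_TO_USER };

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}

static void handle_vcpu_exit(struct kvm_run *run)
{
	if (run->exit_reason == KVM_EXIT_ARM_NISV)
		/* No instruction syndrome was provided: apply VMM policy. */
		handle_nisv(run->arm_nisv.esr_iss, run->arm_nisv.fault_ipa);
}
```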
-
- 15 October 2019, 1 commit
-
-
By Marc Zyngier

The GICv3 architecture specification is incredibly misleading when it comes to PMR and the requirement for a DSB. It turns out that this DSB is only required if the CPU interface sends an Upstream Control message to the redistributor in order to update the RD's view of PMR. This message is only sent when ICC_CTLR_EL1.PMHE is set, which isn't the case in Linux. It can still be set from EL3, so some special care is required. But the upshot is that in the (hopefully large) majority of cases, we can drop the DSB altogether. This relies on a new static key being set if the boot CPU has PMHE set. The drawback is that this static key has to be exported to modules.

Cc: Will Deacon <will@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
- 08 July 2019, 1 commit
-
-
By Marc Zyngier

As part of setting up the host context, we populate its MPIDR by using cpu_logical_map(). It turns out that, contrary to arm64, cpu_logical_map() on 32-bit ARM doesn't return the *full* MPIDR, but a truncated version. This leaves the host MPIDR slightly corrupted after the first run of a VM, since we won't correctly restore the MPIDR on exit. Oops. Since we cannot trust cpu_logical_map(), let's adopt a different strategy. We move the initialization of the host CPU context to the per-CPU initialization (which, in retrospect, makes a lot of sense), and directly read the MPIDR from the HW. This is guaranteed to work on both arm and arm64.

Reported-by: Andre Przywara <Andre.Przywara@arm.com>
Tested-by: Andre Przywara <Andre.Przywara@arm.com>
Fixes: 32f13955 ("arm/arm64: KVM: Statically configure the host's view of MPIDR")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 05 July 2019, 1 commit
-
-
By Andre Przywara

Recent commits added the explicit notion of "workaround not required" to the state of the Spectre v2 (aka BP_HARDENING) workaround, where we just had "needed" and "unknown" before. Export this knowledge to the rest of the kernel and enhance the existing kvm_arm_harden_branch_predictor() to report this new state as well. Export this new state to guests when they use KVM's firmware interface emulation.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 21 June 2019, 1 commit
-
-
By Julien Thierry

When using IRQ priority masking to disable interrupts, in order to deal with the PSR.I state, local_irq_save() would convert the I bit into a PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore() potentially modifying the value of PMR in undesired locations due to the state of PSR.I upon flag saving [1]. In an attempt to solve this issue in a less hackish manner, introduce a bit (GIC_PRIO_PSR_I_SET) in the PMR values that can represent whether PSR.I is being used to disable interrupts, in which case it takes precedence over the status of interrupt masking via PMR. GIC_PRIO_PSR_I_SET is chosen such that (<pmr_value> | GIC_PRIO_PSR_I_SET) does not mask more interrupts than <pmr_value>, as some sections (e.g. arch_cpu_idle(), the interrupt acknowledge path) require PMR not to mask interrupts that could be signaled to the CPU when using only PSR.I.

[1] https://www.spinics.net/lists/arm-kernel/msg716956.html

Fixes: 4a503217 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking")
Cc: <stable@vger.kernel.org> # 5.1.x-
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Pouloze <suzuki.poulose@arm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
- 19 June 2019, 1 commit
-
-
By Thomas Gleixner

Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses

extracted by the scancode license scanner, the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 503 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 24 May 2019, 1 commit
-
-
By James Morse

KVM's pmu.c contains the __hyp_text needed to switch the pmu registers between host and guest. Because this isn't covered by the 'hyp' Makefile, it can be built with kasan and friends when these are enabled in Kconfig. When starting a guest, this results in:

| Kernel panic - not syncing: HYP panic:
| PS:a00003c9 PC:000083000028ada0 ESR:86000007
| FAR:000083000028ada0 HPFAR:0000000029df5300 PAR:0000000000000000
| VCPU:000000004e10b7d6
| CPU: 0 PID: 3088 Comm: qemu-system-aar Not tainted 5.2.0-rc1 #11026
| Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Plat
| Call trace:
|  dump_backtrace+0x0/0x200
|  show_stack+0x20/0x30
|  dump_stack+0xec/0x158
|  panic+0x1ec/0x420
|  panic+0x0/0x420
| SMP: stopping secondary CPUs
| Kernel Offset: disabled
| CPU features: 0x002,25006082
| Memory Limit: none
| ---[ end Kernel panic - not syncing: HYP panic:

This is caused by functions in pmu.c calling the instrumented code, which isn't mapped to hyp. From objdump -r:

| RELOCATION RECORDS FOR [.hyp.text]:
| OFFSET           TYPE              VALUE
| 0000000000000010 R_AARCH64_CALL26  __sanitizer_cov_trace_pc
| 0000000000000018 R_AARCH64_CALL26  __asan_load4_noabort
| 0000000000000024 R_AARCH64_CALL26  __asan_load4_noabort

Move the affected code to a new file under 'hyp's Makefile.

Fixes: 3d91befb ("arm64: KVM: Enable !VHE support for :G/:H perf event modifiers")
Cc: Andrew Murray <Andrew.Murray@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 24 April 2019, 6 commits
-
-
By Andrew Murray

With VHE, different exception levels are used between the host (EL2) and guest (EL1), with a shared exception level for userspace (EL0). We can take advantage of this and use the PMU's exception-level filtering to avoid enabling/disabling counters in the world-switch code. Instead we just modify the counter type to include or exclude EL0 at vcpu_{load,put} time. We also ensure that trapped PMU system register writes do not re-enable EL0 when reconfiguring the backing perf events. This approach completely avoids the blackout windows seen with !VHE.

Suggested-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
By Andrew Murray

Enable/disable event counters as appropriate when entering and exiting the guest, to support guest-only or host-only event counting. For both VHE and non-VHE we switch the counters between host/guest at EL2. The PMU may be on when we change which counters are enabled; however, we avoid adding an ISB, as we instead rely on existing context synchronisation events: the eret to enter the guest (__guest_enter) and the eret in kvm_call_hyp for __kvm_vcpu_run_nvhe on returning.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
By Andrew Murray

In order to efficiently switch the events_{guest,host} perf counters at guest entry/exit, we add bitfields to kvm_cpu_context for guest and host events, as well as accessors for updating them. A function is also provided which allows the PMU driver to determine whether a counter should start counting when it is enabled. With exclude_host, we may only start counting when entering the guest.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
By Andrew Murray

The virt/arm core allocates a kvm_cpu_context_t percpu; at present this is a typedef to kvm_cpu_context and is used to store host CPU context. The kvm_cpu_context structure is also used elsewhere to hold vCPU context. In order to use the percpu to hold additional future host information, we encapsulate kvm_cpu_context in a new structure and rename the typedef and percpu to match.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
By Amit Daniel Kachhap

Now that the building blocks of pointer authentication are present, let's add the userspace flags KVM_ARM_VCPU_PTRAUTH_ADDRESS and KVM_ARM_VCPU_PTRAUTH_GENERIC. These flags enable pointer authentication for the KVM guest on a per-vcpu basis through the ioctl KVM_ARM_VCPU_INIT. This feature allows the KVM guest to handle pointer-authentication instructions; if the flags are not set, those instructions are treated as undefined. The necessary documentation is added to reflect the changes.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
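From userspace, opting in is simply a matter of setting the new feature bits before calling KVM_ARM_VCPU_INIT; a sketch (error handling omitted):

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

static int vcpu_init_with_ptrauth(int vcpu_fd, struct kvm_vcpu_init *init)
{
	/* init is assumed to be pre-filled from KVM_ARM_PREFERRED_TARGET. */
	init->features[0] |= 1U << KVM_ARM_VCPU_PTRAUTH_ADDRESS;
	init->features[0] |= 1U << KVM_ARM_VCPU_PTRAUTH_GENERIC;

	return ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, init);
}
```

If the host does not uniformly support the corresponding authentication flavours, the KVM_ARM_VCPU_INIT call is expected to fail; both flavours are exposed together, as explained in the infrastructure patch below.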
-
By Mark Rutland

When pointer authentication is supported, a guest may wish to use it. This patch adds the necessary KVM infrastructure for this to work, with a semi-lazy context switch of the pointer auth state.

The pointer authentication feature is only enabled when VHE is built into the kernel and present in the CPU implementation, so only VHE code paths are modified. When we schedule a vcpu, we disable guest usage of pointer authentication instructions and accesses to the keys. While these are disabled, we avoid context-switching the keys. When we trap the guest trying to use pointer authentication functionality, we change to eagerly context-switching the keys and enable the feature. The next time the vcpu is scheduled out/in, we start again. However, the host key save is optimized and implemented inside the ptrauth instruction/register access trap.

Pointer authentication consists of address authentication and generic authentication, and CPUs in a system might have varied support for either. Where support for either feature is not uniform, it is hidden from guests via ID register emulation, as a result of the cpufeature framework in the host. Unfortunately, address authentication and generic authentication cannot be trapped separately, as the architecture provides a single EL2 trap covering both. If we wished to expose one without the other, we could not prevent a (badly written) guest from intermittently using a feature which is not uniformly supported (when scheduled on a physical CPU which supports the relevant feature). Hence, this patch expects both types of authentication to be present in a CPU.

This switch of keys is done from the guest enter/exit assembly as preparation for the upcoming in-kernel pointer authentication support. Hence, these key-switching routines are not implemented in C code, as they may cause pointer authentication key signing errors in some situations.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[Only VHE, key switch in full assembly, vcpu_has_ptrauth checks, save host key in ptrauth exception trap]
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
[maz: various fixups]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 23 April 2019, 1 commit
-
-
By Amit Daniel Kachhap

A per-vcpu flag is added to check whether pointer authentication is enabled for the vcpu. This flag may be enabled according to the necessary user policies and host capabilities. This patch also adds a helper to check the flag.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 19 April 2019, 2 commits
-
-
By Dave Martin

Currently, the internal vcpu finalization functions use a different name ("what") for the feature parameter than the name ("feature") used in the documentation. To avoid future confusion, this patch converts everything to use the name "feature" consistently. No functional change.

Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
By Dave Martin

The introduction of kvm_arm_init_arch_resources() looks like premature factoring, since nothing else uses this hook yet and it is not clear what will use it in the future. For now, let's not pretend that this is a general thing: this patch simply renames the function to kvm_arm_init_sve(), retaining the arm stub version under the new name.

Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 29 March 2019, 9 commits
-
-
By Dave Martin

Now that all the pieces are in place, this patch offers a new flag, KVM_ARM_VCPU_SVE, that userspace can pass to KVM_ARM_VCPU_INIT to turn on SVE for the guest on a per-vcpu basis. As part of this, support for initialisation and reset of the SVE vector length set and registers is added in the appropriate places, and the KVM_ARM64_GUEST_HAS_SVE vcpu flag is finally set, turning on the SVE support code. Allocation of the SVE register storage in vcpu->arch.sve_state is deferred until the SVE configuration is finalized, by which time the size of the registers is known. Setting the vector lengths supported by the vcpu is considered configuration of the emulated hardware rather than runtime configuration, so no support is offered for changing the vector lengths available to an existing vcpu across reset.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
By Dave Martin

This patch adds a new pseudo-register, KVM_REG_ARM64_SVE_VLS, to allow userspace to set and query the set of vector lengths visible to the guest. In the future, multiple register slices per SVE register may be visible through the ioctl interface. Once the set of slices has been determined, we would not be able to allow the vector length set to be changed any more, in order to avoid userspace seeing inconsistent sets of registers. For this reason, this patch adds support for explicit finalization of the SVE configuration via the KVM_ARM_VCPU_FINALIZE ioctl. Finalization is the proper place to allocate the SVE register state storage in vcpu->arch.sve_state, so this patch adds that as appropriate. The data is freed via kvm_arch_vcpu_uninit(), which was previously a no-op on arm64. To simplify the logic for determining what vector lengths can be supported, some code is added to KVM init to work this out, in the kvm_arm_init_arch_resources() hook. The KVM_REG_ARM64_SVE_VLS pseudo-register is not exposed yet. Subsequent patches will allow SVE to be turned on for guest vcpus, making it visible.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
By Dave Martin

Some aspects of vcpu configuration may be too complex to be completed inside KVM_ARM_VCPU_INIT. Thus, there may be a requirement for userspace to do some additional configuration before various other ioctls will work in a consistent way. In particular, this will be the case for SVE, where userspace will need to negotiate the set of vector lengths to be made available to the guest before the vcpu becomes fully usable.

In order to provide an explicit way for userspace to confirm that it has finished setting up a particular vcpu feature, this patch adds a new ioctl, KVM_ARM_VCPU_FINALIZE. When userspace has opted into a feature that requires finalization, typically by means of a feature flag passed to KVM_ARM_VCPU_INIT, a matching call to KVM_ARM_VCPU_FINALIZE is now required before KVM_RUN or KVM_GET_REG_LIST is allowed. Individual features may impose additional restrictions where appropriate. No existing vcpu features are affected by this, so current userspace implementations will continue to work exactly as before, with no need to issue KVM_ARM_VCPU_FINALIZE.

As implemented in this patch, KVM_ARM_VCPU_FINALIZE is currently a placeholder: no finalizable features exist yet, so the ioctl is not required and will always yield EINVAL. Subsequent patches will add the finalization logic to make use of this ioctl for SVE. No functional change for existing userspace.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
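Taken together with the SVE patches above, the eventual userspace sequence is roughly the following sketch (error handling omitted):

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

static int vcpu_setup_sve(int vcpu_fd, struct kvm_vcpu_init *init)
{
	int feature = KVM_ARM_VCPU_SVE;
	int ret;

	/* 1. Opt in to SVE at vcpu init time. */
	init->features[0] |= 1U << KVM_ARM_VCPU_SVE;
	ret = ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, init);
	if (ret)
		return ret;

	/* 2. Optionally narrow the vector-length set via KVM_REG_ARM64_SVE_VLS here. */

	/* 3. Finalize: KVM_RUN and KVM_GET_REG_LIST are refused until this succeeds. */
	return ioctl(vcpu_fd, KVM_ARM_VCPU_FINALIZE, &feature);
}
```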
-
By Dave Martin

This patch adds a kvm_arm_init_arch_resources() hook to perform subarch-specific initialisation when starting up KVM. This will be used in a subsequent patch for global SVE-related setup on arm64. No functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
By Dave Martin

This patch adds the following registers for access via the KVM_{GET,SET}_ONE_REG interface:

* KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
* KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
* KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)

In order to adapt gracefully to future architectural extensions, the registers are logically divided into slices as noted above: the i parameter denotes the slice index. This allows us to reserve space in the ABI for future expansion of these registers. However, as of today the architecture does not permit registers to be larger than a single slice, so no code is needed in the kernel to expose additional slices for now. The code can be extended later as needed to expose them, up to a maximum of 32 slices (as carved out in the architecture itself), should they ever exist.

The registers are only visible for vcpus that have SVE enabled. They are not enumerated by KVM_GET_REG_LIST on vcpus that do not have SVE. Access to the FPSIMD registers via KVM_REG_ARM_CORE is not allowed for SVE-enabled vcpus: SVE-aware userspace can use the KVM_REG_ARM64_SVE_ZREG() interface instead to access the same register state. This avoids some complex and pointless emulation in the kernel to convert between the two views of these aliased registers.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
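As an illustration of the new encoding, reading one slice of a Z register might look like the sketch below; each Z-register slice is 2048 bits (256 bytes), and only slice 0 exists today:

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <stdint.h>

/* Read Z0, slice 0, from a vcpu that has SVE enabled and finalized. */
static int get_sve_z0(int vcpu_fd, uint8_t buf[256])
{
	struct kvm_one_reg reg = {
		.id   = KVM_REG_ARM64_SVE_ZREG(0, 0),
		.addr = (uint64_t)(unsigned long)buf,
	};

	return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
}
```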
-
By Dave Martin

In order to give each vcpu its own view of the SVE registers, this patch adds context storage via a new sve_state pointer in struct vcpu_arch. An additional member, sve_max_vl, is also added for each vcpu, to determine the maximum vector length visible to the guest and thus the value to be configured in ZCR_EL2.LEN while the vcpu is active. This also determines the layout and size of the storage in sve_state, which is read and written by the same backend functions that are used for context-switching the SVE state for host tasks.

On SVE-enabled vcpus, SVE access traps are now handled by switching in the vcpu's SVE context and disabling the trap before returning to the guest. On other vcpus, the trap is not handled and an exit back to the host occurs, where the handle_sve() fallback path reflects an undefined instruction exception back to the guest, consistently with the behaviour of non-SVE-capable hardware (as was done unconditionally prior to this patch). No SVE handling is added on non-VHE-only paths, since VHE is an architectural and Kconfig prerequisite of SVE.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
By Dave Martin

This patch adds the necessary support for context-switching ZCR_EL1 for each vcpu. ZCR_EL1 is trapped alongside the FPSIMD/SVE registers, so it makes sense for it to be handled as part of the guest FPSIMD/SVE context for context-switch purposes instead of handling it as a general system register. This means that it can be switched in lazily at the appropriate time. No effort is made to track host context for this register, since SVE requires VHE: thus the host's value for this register lives permanently in ZCR_EL2 and does not alias the guest's value at any time.

The Hyp switch and fpsimd context handling code is extended appropriately. Accessors are added in sys_regs.c to expose the SVE system registers and ID register fields. Because these need to be conditionally visible based on the guest configuration, they are implemented separately for now rather than by use of the generic system register helpers. This may be abstracted better later on when/if there are more features requiring this model.

ID_AA64ZFR0_EL1 is RO-RAZ for MRS/MSR when SVE is disabled for the guest, but for compatibility with non-SVE-aware KVM implementations the register should not be enumerated at all for KVM_GET_REG_LIST in this case. For consistency we also reject ioctl access to the register. This ensures that a non-SVE-enabled guest looks the same to userspace, irrespective of whether the kernel KVM implementation supports SVE.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
By Dave Martin

Since SVE will be enabled or disabled on a per-vcpu basis, a flag is needed in order to track which vcpus have it enabled. This patch adds a suitable flag and a helper for checking it.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
By Dave Martin

kvm_host.h uses some declarations from other headers that are currently included by accident, without an explicit #include. This patch adds a few #includes that are clearly missing. Although the header builds without them today, this should help to avoid future surprises.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 20 February 2019, 2 commits
-
-
By Christoffer Dall

In preparation for nested virtualization, where we are going to have more than a single VMID per VM, let's factor out the VMID data into a separate VMID data structure and change the VMID allocator to operate on this new structure instead of using a struct kvm. This also means that update_vttbr now becomes update_vmid, and that the vttbr itself is generated on the fly based on the stage 2 page table base address and the vmid.

We cache the physical address of the pgd when allocating the pgd, to avoid doing the calculation on every entry to the guest and to avoid calling into potentially non-hyp-mapped code from hyp/EL2. If we wanted to merge the VMID allocator with the arm64 ASID allocator at some point in the future, it should actually become easier to do that after this patch. Note that to avoid mapping the kvm_vmid_bits variable into hyp, we simply forego the masking of the vmid value in kvm_get_vttbr and rely on update_vmid to always assign a valid vmid value (within the supported range).

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
[maz: minor cleanups]
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
By Marc Zyngier

We currently eagerly save/restore MPIDR. It turns out to be slightly pointless:

- On the host, this value is known as soon as we're scheduled on a physical CPU.
- In the guest, this value cannot change, as it is set by KVM (and this is a read-only register).

The result of the above is that we can perfectly well avoid the eager saving of MPIDR_EL1 and only keep the restore. We just have to set up the host contexts appropriately at boot time.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
-