提交 · 9a6e594869b29ccec4f99db83c071e4f2dbfc11f · openanolis / cloud-kernel

25 5月, 2018 2 次提交

KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing · e6b673b7

由 Dave Martin 提交于 4月 06, 2018

This patch refactors KVM to align the host and guest FPSIMD
save/restore logic with each other for arm64.  This reduces the
number of redundant save/restore operations that must occur, and
reduces the common-case IRQ blackout time during guest exit storms
by saving the host state lazily and optimising away the need to
restore the host state before returning to the run loop.

Four hooks are defined in order to enable this:

 * kvm_arch_vcpu_run_map_fp():
   Called on PID change to map necessary bits of current to Hyp.

 * kvm_arch_vcpu_load_fp():
   Set up FP/SIMD for entering the KVM run loop (parse as
   "vcpu_load fp").

 * kvm_arch_vcpu_ctxsync_fp():
   Get FP/SIMD into a safe state for re-enabling interrupts after a
   guest exit back to the run loop.

   For arm64 specifically, this involves updating the host kernel's
   FPSIMD context tracking metadata so that kernel-mode NEON use
   will cause the vcpu's FPSIMD state to be saved back correctly
   into the vcpu struct.  This must be done before re-enabling
   interrupts because kernel-mode NEON may be used by softirqs.

 * kvm_arch_vcpu_put_fp():
   Save guest FP/SIMD state back to memory and dissociate from the
   CPU ("vcpu_put fp").

Also, the arm64 FPSIMD context switch code is updated to enable it
to save back FPSIMD state for a vcpu, not just current.  A few
helpers drive this:

 * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp):
   mark this CPU as having context fp (which may belong to a vcpu)
   currently loaded in its registers.  This is the non-task
   equivalent of the static function fpsimd_bind_to_cpu() in
   fpsimd.c.

 * task_fpsimd_save():
   exported to allow KVM to save the guest's FPSIMD state back to
   memory on exit from the run loop.

 * fpsimd_flush_state():
   invalidate any context's FPSIMD state that is currently loaded.
   Used to disassociate the vcpu from the CPU regs on run loop exit.

These changes allow the run loop to enable interrupts (and thus
softirqs that may use kernel-mode NEON) without having to save the
guest's FPSIMD state eagerly.

Some new vcpu_arch fields are added to make all this work.  Because
host FPSIMD state can now be saved back directly into current's
thread_struct as appropriate, host_cpu_context is no longer used
for preserving the FPSIMD state.  However, it is still needed for
preserving other things such as the host's system registers.  To
avoid ABI churn, the redundant storage space in host_cpu_context is
not removed for now.

arch/arm is not addressed by this patch and continues to use its
current save/restore logic.  It could provide implementations of
the helpers later if desired.
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NChristoffer Dall <christoffer.dall@arm.com>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

e6b673b7

KVM: arm64: Repurpose vcpu_arch.debug_flags for general-purpose flags · fa89d31c

由 Dave Martin 提交于 5月 08, 2018

In struct vcpu_arch, the debug_flags field is used to store
debug-related flags about the vcpu state.

Since we are about to add some more flags related to FPSIMD and
SVE, it makes sense to add them to the existing flags field rather
than adding new fields.  Since there is only one debug_flags flag
defined so far, there is plenty of free space for expansion.

In preparation for adding more flags, this patch renames the
debug_flags field to simply "flags", and updates comments
appropriately.

The flag definitions are also moved to <asm/kvm_host.h>, since
their presence in <asm/kvm_asm.h> was for purely historical
reasons:  these definitions are not used from asm any more, and not
very likely to be as more Hyp asm is migrated to C.

KVM_ARM64_DEBUG_DIRTY_SHIFT has not been used since commit
1ea66d27 ("arm64: KVM: Move away from the assembly version of
the world switch"), so this patch gets rid of that too.

No functional change.
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Acked-by: NChristoffer Dall <christoffer.dall@arm.com>
[maz: fixed minor conflict]
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

fa89d31c

20 4月, 2018 1 次提交

arm/arm64: KVM: Add PSCI version selection API · 85bd0ba1

由 Marc Zyngier 提交于 1月 21, 2018

Although we've implemented PSCI 0.1, 0.2 and 1.0, we expose either 0.1
or 1.0 to a guest, defaulting to the latest version of the PSCI
implementation that is compatible with the requested version. This is
no different from doing a firmware upgrade on KVM.

But in order to give a chance to hypothetical badly implemented guests
that would have a fit by discovering something other than PSCI 0.2,
let's provide a new API that allows userspace to pick one particular
version of the API.

This is implemented as a new class of "firmware" registers, where
we expose the PSCI version. This allows the PSCI version to be
save/restored as part of a guest migration, and also set to
any supported version if the guest requires it.

Cc: stable@vger.kernel.org #4.16
Reviewed-by: NChristoffer Dall <cdall@kernel.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

85bd0ba1

19 3月, 2018 7 次提交

KVM: arm64: Introduce framework for accessing deferred sysregs · d47533da

由 Christoffer Dall 提交于 12月 23, 2017

We are about to defer saving and restoring some groups of system
registers to vcpu_put and vcpu_load on supported systems.  This means
that we need some infrastructure to access system registes which
supports either accessing the memory backing of the register or directly
accessing the system registers, depending on the state of the system
when we access the register.

We do this by defining read/write accessor functions, which can handle
both "immediate" and "deferrable" system registers.  Immediate registers
are always saved/restored in the world-switch path, but deferrable
registers are only saved/restored in vcpu_put/vcpu_load when supported
and sysregs_loaded_on_cpu will be set in that case.

Note that we don't use the deferred mechanism yet in this patch, but only
introduce infrastructure.  This is to improve convenience of review in
the subsequent patches where it is clear which registers become
deferred.
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

d47533da

KVM: arm64: Rewrite system register accessors to read/write functions · 8d404c4c

由 Christoffer Dall 提交于 3月 16, 2016

Currently we access the system registers array via the vcpu_sys_reg()
macro.  However, we are about to change the behavior to some times
modify the register file directly, so let's change this to two
primitives:

 * Accessor macros vcpu_write_sys_reg() and vcpu_read_sys_reg()
 * Direct array access macro __vcpu_sys_reg()

The accessor macros should be used in places where the code needs to
access the currently loaded VCPU's state as observed by the guest.  For
example, when trapping on cache related registers, a write to a system
register should go directly to the VCPU version of the register.

The direct array access macro can be used in places where the VCPU is
known to never be running (for example userspace access) or for
registers which are never context switched (for example all the PMU
system registers).

This rewrites all users of vcpu_sys_regs to one of the macros described
above.

No functional change.
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NChristoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

8d404c4c

KVM: arm64: Change 32-bit handling of VM system registers · 52f6c4f0

由 Christoffer Dall 提交于 10月 11, 2017

We currently handle 32-bit accesses to trapped VM system registers using
the 32-bit index into the coproc array on the vcpu structure, which is a
union of the coproc array and the sysreg array.

Since all the 32-bit coproc indices are created to correspond to the
architectural mapping between 64-bit system registers and 32-bit
coprocessor registers, and because the AArch64 system registers are the
double in size of the AArch32 coprocessor registers, we can always find
the system register entry that we must update by dividing the 32-bit
coproc index by 2.

This is going to make our lives much easier when we have to start
accessing system registers that use deferred save/restore and might
have to be read directly from the physical CPU.
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

52f6c4f0

KVM: arm64: Introduce VHE-specific kvm_vcpu_run · 3f5c90b8

由 Christoffer Dall 提交于 10月 03, 2017

So far this is mostly (see below) a copy of the legacy non-VHE switch
function, but we will start reworking these functions in separate
directions to work on VHE and non-VHE in the most optimal way in later
patches.

The only difference after this patch between the VHE and non-VHE run
functions is that we omit the branch-predictor variant-2 hardening for
QC Falkor CPUs, because this workaround is specific to a series of
non-VHE ARMv8.0 CPUs.
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

3f5c90b8

KVM: arm/arm64: Add kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs · bc192cee

由 Christoffer Dall 提交于 10月 10, 2017

As we are about to move a bunch of save/restore logic for VHE kernels to
the load and put functions, we need some infrastructure to do this.
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

bc192cee

KVM: arm/arm64: Get rid of vcpu->arch.irq_lines · 3df59d8d

由 Christoffer Dall 提交于 8月 03, 2017

We currently have a separate read-modify-write of the HCR_EL2 on entry
to the guest for the sole purpose of setting the VF and VI bits, if set.
Since this is most rarely the case (only when using userspace IRQ chip
and interrupts are in flight), let's get rid of this operation and
instead modify the bits in the vcpu->arch.hcr[_el2] directly when
needed.
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Reviewed-by: NJulien Thierry <julien.thierry@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

3df59d8d

KVM: arm64: Avoid storing the vcpu pointer on the stack · 4464e210

由 Christoffer Dall 提交于 10月 08, 2017

We already have the percpu area for the host cpu state, which points to
the VCPU, so there's no need to store the VCPU pointer on the stack on
every context switch.  We can be a little more clever and just use
tpidr_el2 for the percpu offset and load the VCPU pointer from the host
context.

This has the benefit of being able to retrieve the host context even
when our stack is corrupted, and it has a potential performance benefit
because we trade a store plus a load for an mrs and a load on a round
trip to the guest.

This does require us to calculate the percpu offset without including
the offset from the kernel mapping of the percpu array to the linear
mapping of the array (which is what we store in tpidr_el1), because a
PC-relative generated address in EL2 is already giving us the hyp alias
of the linear mapping of a kernel address.  We do this in
__cpu_init_hyp_mode() by using kvm_ksym_ref().

The code that accesses ESR_EL2 was previously using an alternative to
use the _EL1 accessor on VHE systems, but this was actually unnecessary
as the _EL1 accessor aliases the ESR_EL2 register on VHE, and the _EL2
accessor does the same thing on both systems.

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

4464e210

07 2月, 2018 1 次提交

arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support · 6167ec5c

由 Marc Zyngier 提交于 2月 06, 2018

A new feature of SMCCC 1.1 is that it offers firmware-based CPU
workarounds. In particular, SMCCC_ARCH_WORKAROUND_1 provides
BP hardening for CVE-2017-5715.

If the host has some mitigation for this issue, report that
we deal with it using SMCCC_ARCH_WORKAROUND_1, as we apply the
host workaround on every guest exit.
Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

6167ec5c

16 1月, 2018 5 次提交

KVM: arm64: Handle RAS SErrors from EL2 on guest exit · 0067df41

由 James Morse 提交于 1月 15, 2018

We expect to have firmware-first handling of RAS SErrors, with errors
notified via an APEI method. For systems without firmware-first, add
some minimal handling to KVM.

There are two ways KVM can take an SError due to a guest, either may be a
RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.

The current SError from EL2 code unmasks SError and tries to fence any
pending SError into a single instruction window. It then leaves SError
unmasked.

With the v8.2 RAS Extensions we may take an SError for a 'corrected'
error, but KVM is only able to handle SError from EL2 if they occur
during this single instruction window...

The RAS Extensions give us a new instruction to synchronise and
consume SErrors. The RAS Extensions document (ARM DDI0587),
'2.4.1 ESB and Unrecoverable errors' describes ESB as synchronising
SError interrupts generated by 'instructions, translation table walks,
hardware updates to the translation tables, and instruction fetches on
the same PE'. This makes ESB equivalent to KVMs existing
'dsb, mrs-daifclr, isb' sequence.

Use the alternatives to synchronise and consume any SError using ESB
instead of unmasking and taking the SError. Set ARM_EXIT_WITH_SERROR_BIT
in the exit_code so that we can restart the vcpu if it turns out this
SError has no impact on the vcpu.
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

0067df41

KVM: arm64: Handle RAS SErrors from EL1 on guest exit · 3368bd80

由 James Morse 提交于 1月 15, 2018

We expect to have firmware-first handling of RAS SErrors, with errors
notified via an APEI method. For systems without firmware-first, add
some minimal handling to KVM.

There are two ways KVM can take an SError due to a guest, either may be a
RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.

For SError that interrupt a guest and are routed to EL2 the existing
behaviour is to inject an impdef SError into the guest.

Add code to handle RAS SError based on the ESR. For uncontained and
uncategorized errors arm64_is_fatal_ras_serror() will panic(), these
errors compromise the host too. All other error types are contained:
For the fatal errors the vCPU can't make progress, so we inject a virtual
SError. We ignore contained errors where we can make progress as if
we're lucky, we may not hit them again.

If only some of the CPUs support RAS the guest will see the cpufeature
sanitised version of the id registers, but we may still take RAS SError
on this CPU. Move the SError handling out of handle_exit() into a new
handler that runs before we can be preempted. This allows us to use
this_cpu_has_cap(), via arm64_is_ras_serror().
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

3368bd80

KVM: arm64: Save/Restore guest DISR_EL1 · c773ae2b

由 James Morse 提交于 1月 15, 2018

If we deliver a virtual SError to the guest, the guest may defer it
with an ESB instruction. The guest reads the deferred value via DISR_EL1,
but the guests view of DISR_EL1 is re-mapped to VDISR_EL2 when HCR_EL2.AMO
is set.

Add the KVM code to save/restore VDISR_EL2, and make it accessible to
userspace as DISR_EL1.
Signed-off-by: NJames Morse <james.morse@arm.com>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

c773ae2b

KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2. · 4715c14b

由 James Morse 提交于 1月 15, 2018

Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature
generated an SError with an implementation defined ESR_EL1.ISS, because we
had no mechanism to specify the ESR value.

On Juno this generates an all-zero ESR, the most significant bit 'ISV'
is clear indicating the remainder of the ISS field is invalid.

With the RAS Extensions we have a mechanism to specify this value, and the
most significant bit has a new meaning: 'IDS - Implementation Defined
Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized'
instead of 'no valid ISS'.

Add KVM support for the VSESR_EL2 register to specify an ESR value when
HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to
specify an implementation-defined value.

We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set, KVM
save/restores this bit during __{,de}activate_traps() and hardware clears the
bit once the guest has consumed the virtual-SError.

Future patches may add an API (or KVM CAP) to pend a virtual SError with
a specified ESR.

Cc: Dongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NJames Morse <james.morse@arm.com>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

4715c14b

KVM: arm/arm64: mask/unmask daif around VHE guests · 4f5abad9

由 James Morse 提交于 1月 15, 2018

Non-VHE systems take an exception to EL2 in order to world-switch into the
guest. When returning from the guest KVM implicitly restores the DAIF
flags when it returns to the kernel at EL1.

With VHE none of this exception-level jumping happens, so KVMs
world-switch code is exposed to the host kernel's DAIF values, and KVM
spills the guest-exit DAIF values back into the host kernel.
On entry to a guest we have Debug and SError exceptions unmasked, KVM
has switched VBAR but isn't prepared to handle these. On guest exit
Debug exceptions are left disabled once we return to the host and will
stay this way until we enter user space.

Add a helper to mask/unmask DAIF around VHE guests. The unmask can only
happen after the hosts VBAR value has been synchronised by the isb in
__vhe_hyp_call (via kvm_call_hyp()). Masking could be as late as
setting KVMs VBAR value, but is kept here for symmetry.
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NJames Morse <james.morse@arm.com>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

4f5abad9

13 1月, 2018 1 次提交

KVM: arm64: Change hyp_panic()s dependency on tpidr_el2 · c97e166e

由 James Morse 提交于 1月 08, 2018

Make tpidr_el2 a cpu-offset for per-cpu variables in the same way the
host uses tpidr_el1. This lets tpidr_el{1,2} have the same value, and
on VHE they can be the same register.

KVM calls hyp_panic() when anything unexpected happens. This may occur
while a guest owns the EL1 registers. KVM stashes the vcpu pointer in
tpidr_el2, which it uses to find the host context in order to restore
the host EL1 registers before parachuting into the host's panic().

The host context is a struct kvm_cpu_context allocated in the per-cpu
area, and mapped to hyp. Given the per-cpu offset for this CPU, this is
easy to find. Change hyp_panic() to take a pointer to the
struct kvm_cpu_context. Wrap these calls with an asm function that
retrieves the struct kvm_cpu_context from the host's per-cpu area.

Copy the per-cpu offset from the hosts tpidr_el1 into tpidr_el2 during
kvm init. (Later patches will make this unnecessary for VHE hosts)

We print out the vcpu pointer as part of the panic message. Add a back
reference to the 'running vcpu' in the host cpu context to preserve this.
Signed-off-by: NJames Morse <james.morse@arm.com>
Reviewed-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

c97e166e

02 1月, 2018 1 次提交

KVM: arm/arm64: Avoid work when userspace iqchips are not used · 61bbe380

由 Christoffer Dall 提交于 10月 27, 2017

We currently check if the VM has a userspace irqchip in several places
along the critical path, and if so, we do some work which is only
required for having an irqchip in userspace.  This is unfortunate, as we
could avoid doing any work entirely, if we didn't have to support
irqchip in userspace.

Realizing the userspace irqchip on ARM is mostly a developer or hobby
feature, and is unlikely to be used in servers or other scenarios where
performance is a priority, we can use a refcounted static key to only
check the irqchip configuration when we have at least one VM that uses
an irqchip in userspace.
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

61bbe380

29 11月, 2017 1 次提交

KVM: arm/arm64: debug: Introduce helper for single-step · 696673d1

由 Alex Bennée 提交于 11月 16, 2017

After emulating instructions we may want return to user-space to handle
single-step debugging. Introduce a helper function, which, if
single-step is enabled, sets the run structure for return and returns
true.
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NJulien Thierry <julien.thierry@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

696673d1

03 11月, 2017 1 次提交

arm64/sve: KVM: Prevent guests from using SVE · 17eed27b

由 Dave Martin 提交于 10月 31, 2017

Until KVM has full SVE support, guests must not be allowed to
execute SVE instructions.

This patch enables the necessary traps, and also ensures that the
traps are disabled again on exit from the guest so that the host
can still use SVE if it wants to.

On guest exit, high bits of the SVE Zn registers may have been
clobbered as a side-effect the execution of FPSIMD instructions in
the guest.  The existing KVM host FPSIMD restore code is not
sufficient to restore these bits, so this patch explicitly marks
the CPU as not containing cached vector state for any task, thus
forcing a reload on the next return to userspace.  This is an
interim measure, in advance of adding full SVE awareness to KVM.

This marking of cached vector state in the CPU as invalid is done
using __this_cpu_write(fpsimd_last_state, NULL) in fpsimd.c.  Due
to the repeated use of this rather obscure operation, it makes
sense to factor it out as a separate helper with a clearer name.
This patch factors it out as fpsimd_flush_cpu_state(), and ports
all callers to use it.

As a side effect of this refactoring, a this_cpu_write() in
fpsimd_cpu_pm_notifier() is changed to __this_cpu_write().  This
should be fine, since cpu_pm_enter() is supposed to be called only
with interrupts disabled.
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

17eed27b

01 9月, 2017 1 次提交

KVM: update to new mmu_notifier semantic v2 · fb1522e0

由 Jérôme Glisse 提交于 8月 31, 2017

Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Changed since v1 (Linus Torvalds)
    - remove now useless kvm_arch_mmu_notifier_invalidate_page()
Signed-off-by: NJérôme Glisse <jglisse@redhat.com>
Tested-by: NMike Galbraith <efault@gmx.de>
Tested-by: NAdam Borowski <kilobyte@angband.pl>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fb1522e0

04 6月, 2017 3 次提交

KVM: arm/arm64: use vcpu requests for irq injection · 325f9c64

由 Andrew Jones 提交于 6月 04, 2017

Don't use request-less VCPU kicks when injecting IRQs, as a VCPU
kick meant to trigger the interrupt injection could be sent while
the VCPU is outside guest mode, which means no IPI is sent, and
after it has called kvm_vgic_flush_hwstate(), meaning it won't see
the updated GIC state until its next exit some time later for some
other reason.  The receiving VCPU only needs to check this request
in VCPU RUN to handle it.  By checking it, if it's pending, a
memory barrier will be issued that ensures all state is visible.
See "Ensuring Requests Are Seen" of
Documentation/virtual/kvm/vcpu-requests.rst
Signed-off-by: NAndrew Jones <drjones@redhat.com>
Reviewed-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

325f9c64

KVM: arm/arm64: change exit request to sleep request · 7b244e2b

由 Andrew Jones 提交于 6月 04, 2017

A request called EXIT is too generic. All requests are meant to cause
exits, but different requests have different flags. Let's not make
it difficult to decide if the EXIT request is correct for some case
by just always providing unique requests for each case. This patch
changes EXIT to SLEEP, because that's what the request is asking the
VCPU to do.
Signed-off-by: NAndrew Jones <drjones@redhat.com>
Acked-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

7b244e2b

KVM: improve arch vcpu request defining · 2387149e

由 Andrew Jones 提交于 6月 04, 2017

Marc Zyngier suggested that we define the arch specific VCPU request
base, rather than requiring each arch to remember to start from 8.
That suggestion, along with Radim Krcmar's recent VCPU request flag
addition, snowballed into defining something of an arch VCPU request
defining API.

No functional change.

(Looks like x86 is running out of arch VCPU request bits.  Maybe
 someday we'll need to extend to 64.)
Signed-off-by: NAndrew Jones <drjones@redhat.com>
Acked-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

2387149e

23 5月, 2017 1 次提交

KVM: arm/arm64: Simplify active_change_prepare and plug race · abd72296

由 Christoffer Dall 提交于 5月 06, 2017

We don't need to stop a specific VCPU when changing the active state,
because private IRQs can only be modified by a running VCPU for the
VCPU itself and it is therefore already stopped.

However, it is also possible for two VCPUs to be modifying the active
state of SPIs at the same time, which can cause the thread being stuck
in the loop that checks other VCPU threads for a potentially very long
time, or to modify the active state of a running VCPU.  Fix this by
serializing all accesses to setting and clearing the active state of
interrupts using the KVM mutex.
Reported-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>

abd72296

18 5月, 2017 1 次提交

arm64/cpufeature: don't use mutex in bringup path · 63a1e1c9

由 Mark Rutland 提交于 5月 16, 2017

Currently, cpus_set_cap() calls static_branch_enable_cpuslocked(), which
must take the jump_label mutex.

We call cpus_set_cap() in the secondary bringup path, from the idle
thread where interrupts are disabled. Taking a mutex in this path "is a
NONO" regardless of whether it's contended, and something we must avoid.
We didn't spot this until recently, as ___might_sleep() won't warn for
this case until all CPUs have been brought up.

This patch avoids taking the mutex in the secondary bringup path. The
poking of static keys is deferred until enable_cpu_capabilities(), which
runs in a suitable context on the boot CPU. To account for the static
keys being set later, cpus_have_const_cap() is updated to use another
static key to check whether the const cap keys have been initialised,
falling back to the caps bitmap until this is the case.

This means that users of cpus_have_const_cap() gain should only gain a
single additional NOP in the fast path once the const caps are
initialised, but should always see the current cap value.

The hyp code should never dereference the caps array, since the caps are
initialized before we run the module initcall to initialise hyp. A check
is added to the hyp init code to document this requirement.

This change will sidestep a number of issues when the upcoming hotplug
locking rework is merged.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Reviewed-by: NMarc Zyniger <marc.zyngier@arm.com>
Reviewed-by: NSuzuki Poulose <suzuki.poulose@arm.com>
Acked-by: NWill Deacon <will.deacon@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

63a1e1c9

27 4月, 2017 2 次提交

KVM: mark requests that need synchronization · 7a97cec2

由 Paolo Bonzini 提交于 4月 27, 2017

kvm_make_all_requests() provides a synchronization that waits until all
kicked VCPUs have acknowledged the kick.  This is important for
KVM_REQ_MMU_RELOAD as it prevents freeing while lockless paging is
underway.

This patch adds the synchronization property into all requests that are
currently being used with kvm_make_all_requests() in order to preserve
the current behavior and only introduce a new framework.  Removing it
from requests where it is not necessary is left for future patches.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7a97cec2

KVM: mark requests that do not need a wakeup · 930f7fd6

由 Radim Krčmář 提交于 4月 26, 2017

Some operations must ensure that the guest is not running with stale
data, but if the guest is halted, then the update can wait until another
event happens.  kvm_make_all_requests() currently doesn't wake up, so we
can mark all requests used with it.

First 8 bits were arbitrarily reserved for request numbers.

Most uses of requests have the request type as a constant, so a compiler
will optimize the '&'.

An alternative would be to have an inline function that would return
whether the request needs a wake-up or not, but I like this one better
even though it might produce worse assembly.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

930f7fd6

09 4月, 2017 2 次提交

arm/arm64: KVM: Use __hyp_reset_vectors() directly · 0fb26593

由 Marc Zyngier 提交于 4月 03, 2017

__cpu_reset_hyp_mode doesn't need to be passed any argument now,
as the hyp-stub implementations are self-contained, and is now
reduced to just calling __hyp_reset_vectors(). Let's drop the
wrapper and use the stub hypercall directly.
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

0fb26593

arm64: KVM: Convert __cpu_reset_hyp_mode to using __hyp_reset_vectors · 4adb1341

由 Marc Zyngier 提交于 4月 03, 2017

We are now able to use the hyp stub to reset HYP mode. Time to
kiss __kvm_hyp_reset goodbye, and use __hyp_reset_vectors.
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Reviewed-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

4adb1341

07 4月, 2017 1 次提交

kvm: make KVM_COALESCED_MMIO_PAGE_OFFSET public · 4b4357e0

由 Paolo Bonzini 提交于 3月 31, 2017

Its value has never changed; we might as well make it part of the ABI instead
of using the return value of KVM_CHECK_EXTENSION(KVM_CAP_COALESCED_MMIO).

Because PPC does not always make MMIO available, the code has to be made
dependent on CONFIG_KVM_MMIO rather than KVM_COALESCED_MMIO_PAGE_OFFSET.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

4b4357e0

09 3月, 2017 2 次提交

KVM: arm64: Increase number of user memslots to 512 · 955a3fc6

由 Linu Cherian 提交于 3月 08, 2017

Having only 32 memslots is a real constraint for the maximum
number of PCI devices that can be assigned to a single guest.
Assuming each PCI device/virtual function having two memory BAR
regions, we could assign only 15 devices/virtual functions to a
guest.

Hence increase KVM_USER_MEM_SLOTS to 512 as done in other archs like
powerpc.
Reviewed-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NLinu Cherian <linu.cherian@cavium.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

955a3fc6

KVM: arm/arm64: Remove KVM_PRIVATE_MEM_SLOTS definition that are unused · 3e92f94a

由 Linu Cherian 提交于 3月 08, 2017

arm/arm64 architecture doesnt use private memslots, hence removing
KVM_PRIVATE_MEM_SLOTS macro definition.
Reviewed-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NLinu Cherian <linu.cherian@cavium.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

3e92f94a

08 2月, 2017 1 次提交

KVM: arm/arm64: Move cntvoff to each timer context · 90de943a

由 Jintack Lim 提交于 2月 03, 2017

Make cntvoff per each timer context. This is helpful to abstract kvm
timer functions to work with timer context without considering timer
types (e.g. physical timer or virtual timer).

This also would pave the way for ever doing adjustments of the cntvoff
on a per-CPU basis if that should ever make sense.
Signed-off-by: NJintack Lim <jintack@cs.columbia.edu>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

90de943a

03 2月, 2017 1 次提交

arm64: KVM: Save/restore the host SPE state when entering/leaving a VM · f85279b4

由 Will Deacon 提交于 9月 22, 2016

The SPE buffer is virtually addressed, using the page tables of the CPU
MMU. Unusually, this means that the EL0/1 page table may be live whilst
we're executing at EL2 on non-VHE configurations. When VHE is in use,
we can use the same property to profile the guest behind its back.

This patch adds the relevant disabling and flushing code to KVM so that
the host can make use of SPE without corrupting guest memory, and any
attempts by a guest to use SPE will result in a trap.
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

f85279b4

05 11月, 2016 1 次提交

arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU · 94d0e598

由 Marc Zyngier 提交于 10月 18, 2016

Architecturally, TLBs are private to the (physical) CPU they're
associated with. But when multiple vcpus from the same VM are
being multiplexed on the same CPU, the TLBs are not private
to the vcpus (and are actually shared across the VMID).

Let's consider the following scenario:

- vcpu-0 maps PA to VA
- vcpu-1 maps PA' to VA

If run on the same physical CPU, vcpu-1 can hit TLB entries generated
by vcpu-0 accesses, and access the wrong physical page.

The solution to this is to keep a per-VM map of which vcpu ran last
on each given physical CPU, and invalidate local TLBs when switching
to a different vcpu from the same VM.
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

94d0e598

08 9月, 2016 1 次提交

KVM: Add provisioning for ulong vm stats and u64 vcpu stats · 8a7e75d4

由 Suraj Jitindar Singh 提交于 8月 02, 2016

vms and vcpus have statistics associated with them which can be viewed
within the debugfs. Currently it is assumed within the vcpu_stat_get() and
vm_stat_get() functions that all of these statistics are represented as
u32s, however the next patch adds some u64 vcpu statistics.

Change all vcpu statistics to u64 and modify vcpu_stat_get() accordingly.
Since vcpu statistics are per vcpu, they will only be updated by a single
vcpu at a time so this shouldn't present a problem on 32-bit machines
which can't atomically increment 64-bit numbers. However vm statistics
could potentially be updated by multiple vcpus from that vm at a time.
To avoid the overhead of atomics make all vm statistics ulong such that
they are 64-bit on 64-bit systems where they can be atomically incremented
and are 32-bit on 32-bit systems which may not be able to atomically
increment 64-bit numbers. Modify vm_stat_get() to expect ulongs.
Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: NDavid Matlack <dmatlack@google.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

8a7e75d4

19 7月, 2016 1 次提交

KVM: arm/arm64: Extend arch CAP checks to allow per-VM capabilities · b46f01ce

由 Andre Przywara 提交于 7月 15, 2016

KVM capabilities can be a per-VM property, though ARM/ARM64 currently
does not pass on the VM pointer to the architecture specific
capability handlers.
Add a "struct kvm*" parameter to those function to later allow proper
per-VM capability reporting.
Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
Reviewed-by: NEric Auger <eric.auger@linaro.org>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
Tested-by: NEric Auger <eric.auger@redhat.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

b46f01ce

04 7月, 2016 2 次提交

arm: KVM: Allow hyp teardown · e537ecd7

由 Marc Zyngier 提交于 6月 30, 2016

So far, KVM was getting in the way of kexec on 32bit (and the arm64
kexec hackers couldn't be bothered to fix it on 32bit...).

With simpler page tables, tearing KVM down becomes very easy, so
let's just do it.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

e537ecd7

arm/arm64: KVM: Drop boot_pgd · 12fda812

由 Marc Zyngier 提交于 6月 30, 2016

Since we now only have one set of page tables, the concept of
boot_pgd is useless and can be removed. We still keep it as
an element of the "extended idmap" thing.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

12fda812

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功