提交 · df635c5b184da05145db1d63e1dcb1c853f425c2 · openanolis / cloud-kernel

02 1月, 2018 7 次提交

KVM: arm/arm64: Support VGIC dist pend/active changes for mapped IRQs · df635c5b

由 Christoffer Dall 提交于 9月 01, 2017

For mapped IRQs (with the HW bit set in the LR) we have to follow some
rules of the architecture.  One of these rules is that VM must not be
allowed to deactivate a virtual interrupt with the HW bit set unless the
physical interrupt is also active.

This works fine when injecting mapped interrupts, because we leave it up
to the injector to either set EOImode==1 or manually set the active
state of the physical interrupt.

However, the guest can set virtual interrupt to be pending or active by
writing to the virtual distributor, which could lead to deactivating a
virtual interrupt with the HW bit set without the physical interrupt
being active.

We could set the physical interrupt to active whenever we are about to
enter the VM with a HW interrupt either pending or active, but that
would be really slow, especially on GICv2.  So we take the long way
around and do the hard work when needed, which is expected to be
extremely rare.

When the VM sets the pending state for a HW interrupt on the virtual
distributor we set the active state on the physical distributor, because
the virtual interrupt can become active and then the guest can
deactivate it.

When the VM clears the pending state we also clear it on the physical
side, because the injector might otherwise raise the interrupt.  We also
clear the physical active state when the virtual interrupt is not
active, since otherwise a SPEND/CPEND sequence from the guest would
prevent signaling of future interrupts.

Changing the state of mapped interrupts from userspace is not supported,
and it's expected that userspace unmaps devices from VFIO before
attempting to set the interrupt state, because the interrupt state is
driven by hardware.
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NEric Auger <eric.auger@redhat.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

df635c5b

KVM: arm/arm64: Support a vgic interrupt line level sample function · b6909a65

由 Christoffer Dall 提交于 10月 27, 2017

The GIC sometimes need to sample the physical line of a mapped
interrupt.  As we know this to be notoriously slow, provide a callback
function for devices (such as the timer) which can do this much faster
than talking to the distributor, for example by comparing a few
in-memory values.  Fall back to the good old method of poking the
physical GIC if no callback is provided.
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NEric Auger <eric.auger@redhat.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

b6909a65

KVM: arm/arm64: vgic: Support level-triggered mapped interrupts · e40cc57b

由 Christoffer Dall 提交于 8月 29, 2017

Level-triggered mapped IRQs are special because we only observe rising
edges as input to the VGIC, and we don't set the EOI flag and therefore
are not told when the level goes down, so that we can re-queue a new
interrupt when the level goes up.

One way to solve this problem is to side-step the logic of the VGIC and
special case the validation in the injection path, but it has the
unfortunate drawback of having to peak into the physical GIC state
whenever we want to know if the interrupt is pending on the virtual
distributor.

Instead, we can maintain the current semantics of a level triggered
interrupt by sort of treating it as an edge-triggered interrupt,
following from the fact that we only observe an asserting edge.  This
requires us to be a bit careful when populating the LRs and when folding
the state back in though:

 * We lower the line level when populating the LR, so that when
   subsequently observing an asserting edge, the VGIC will do the right
   thing.

 * If the guest never acked the interrupt while running (for example if
   it had masked interrupts at the CPU level while running), we have
   to preserve the pending state of the LR and move it back to the
   line_level field of the struct irq when folding LR state.

   If the guest never acked the interrupt while running, but changed the
   device state and lowered the line (again with interrupts masked) then
   we need to observe this change in the line_level.

   Both of the above situations are solved by sampling the physical line
   and set the line level when folding the LR back.

 * Finally, if the guest never acked the interrupt while running and
   sampling the line reveals that the device state has changed and the
   line has been lowered, we must clear the physical active state, since
   we will otherwise never be told when the interrupt becomes asserted
   again.

This has the added benefit of making the timer optimization patches
(https://lists.cs.columbia.edu/pipermail/kvmarm/2017-July/026343.html) a
bit simpler, because the timer code doesn't have to clear the active
state on the sync anymore.  It also potentially improves the performance
of the timer implementation because the GIC knows the state or the LR
and only needs to clear the
active state when the pending bit in the LR is still set, where the
timer has to always clear it when returning from running the guest with
an injected timer interrupt.
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NEric Auger <eric.auger@redhat.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

e40cc57b

KVM: arm/arm64: Don't cache the timer IRQ level · 70450a9f

由 Christoffer Dall 提交于 9月 04, 2017

The timer logic was designed after a strict idea of modeling an
interrupt line level in software, meaning that only transitions in the
level need to be reported to the VGIC. This works well for the timer,
because the arch timer code is in complete control of the device and can
track the transitions of the line.

However, as we are about to support using the HW bit in the VGIC not
just for the timer, but also for VFIO which cannot track transitions of
the interrupt line, we have to decide on an interface between the GIC
and other subsystems for level triggered mapped interrupts, which both
the timer and VFIO can use.

VFIO only sees an asserting transition of the physical interrupt line,
and tells the VGIC when that happens. That means that part of the
interrupt flow is offloaded to the hardware.

To use the same interface for VFIO devices and the timer, we therefore
have to change the timer (we cannot change VFIO because it doesn't know
the details of the device it is assigning to a VM).

Luckily, changing the timer is simple, we just need to stop 'caching'
the line level, but instead let the VGIC know the state of the timer
every time there is a potential change in the line level, and when the
line level should be asserted from the timer ISR. The VGIC can ignore
extra notifications using its validate mechanism.
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NAndre Przywara <andre.przywara@arm.com>
Reviewed-by: NJulien Thierry <julien.thierry@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

70450a9f

KVM: arm/arm64: Factor out functionality to get vgic mmio requester_vcpu · 6c1b7521

由 Christoffer Dall 提交于 9月 14, 2017

We are about to distinguish between userspace accesses and mmio traps
for a number of the mmio handlers.  When the requester vcpu is NULL, it
means we are handling a userspace access.

Factor out the functionality to get the request vcpu into its own
function, mostly so we have a common place to document the semantics of
the return value.

Also take the chance to move the functionality outside of holding a
spinlock and instead explicitly disable and enable preemption.  This
supports PREEMPT_RT kernels as well.
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NAndre Przywara <andre.przywara@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

6c1b7521

KVM: arm/arm64: Remove redundant preemptible checks · 5a245750

由 Christoffer Dall 提交于 9月 08, 2017

The __this_cpu_read() and __this_cpu_write() functions already implement
checks for the required preemption levels when using
CONFIG_DEBUG_PREEMPT which gives you nice error messages and such.
Therefore there is no need to explicitly check this using a BUG_ON() in
the code (which we don't do for other uses of per cpu variables either).
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NAndre Przywara <andre.przywara@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

5a245750

KVM: arm: Use PTR_ERR_OR_ZERO() · 4404b336

由 Vasyl Gomonovych 提交于 11月 28, 2017

Fix ptr_ret.cocci warnings:
virt/kvm/arm/vgic/vgic-its.c:971:1-3: WARNING: PTR_ERR_OR_ZERO can be used

Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR

Generated by: scripts/coccinelle/api/ptr_ret.cocci
Signed-off-by: NVasyl Gomonovych <gomonovych@gmail.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

4404b336

04 12月, 2017 1 次提交

KVM: arm/arm64: Fix broken GICH_ELRSR big endian conversion · fc396e06

由 Christoffer Dall 提交于 12月 03, 2017

We are incorrectly rearranging 32-bit words inside a 64-bit typed value
for big endian systems, which would result in never marking a virtual
interrupt as inactive on big endian systems (assuming 32 or fewer LRs on
the hardware).  Fix this by not doing any word order manipulation for
the typed values.

Cc: <stable@vger.kernel.org>
Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

fc396e06

01 12月, 2017 2 次提交

KVM: arm/arm64: kvm_arch_destroy_vm cleanups · 6b2ad81b

由 Andrew Jones 提交于 11月 27, 2017

kvm_vgic_vcpu_destroy already gets called from kvm_vgic_destroy for
each vcpu, so we don't have to call it from kvm_arch_vcpu_free.

Additionally the other architectures set kvm->online_vcpus to zero
after freeing them. We might as well do that for ARM too.
Signed-off-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

6b2ad81b

KVM: arm/arm64: Fix spinlock acquisition in vgic_set_owner · 7465894e

由 Marc Zyngier 提交于 11月 30, 2017

vgic_set_owner acquires the irq lock without disabling interrupts,
resulting in a lockdep splat (an interrupt could fire and result
in the same lock being taken if the same virtual irq is to be
injected).

In practice, it is almost impossible to trigger this bug, but
better safe than sorry. Convert the lock acquisition to a
spin_lock_irqsave() and keep lockdep happy.
Reported-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

7465894e

30 11月, 2017 2 次提交

kvm: arm: don't treat unavailable HYP mode as an error · 58d0d19a

由 Ard Biesheuvel 提交于 11月 28, 2017

Since it is perfectly legal to run the kernel at EL1, it is not
actually an error if HYP mode is not available when attempting to
initialize KVM, given that KVM support cannot be built as a module.
So demote the kvm_err() to kvm_info(), which prevents the error from
appearing on an otherwise 'quiet' console.
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

58d0d19a

KVM: arm/arm64: Avoid attempting to load timer vgic state without a vgic · 22601127

由 Christoffer Dall 提交于 11月 29, 2017

The timer optimization patches inadvertendly changed the logic to always
load the timer state as if we have a vgic, even if we don't have a vgic.

Fix this by doing the usual irqchip_in_kernel() check and call the
appropriate load function.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

22601127

29 11月, 2017 8 次提交

kvm: arm64: handle single-step of userspace mmio instructions · 1eb59128

由 Alex Bennée 提交于 11月 16, 2017

The system state of KVM when using userspace emulation is not complete
until we return into KVM_RUN. To handle mmio related updates we wait
until they have been committed and then schedule our KVM_EXIT_DEBUG.

The kvm_arm_handle_step_debug() helper tells us if we need to return
and sets up the exit_reason for us.
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

1eb59128

KVM: arm/arm64: vgic-v4: Only perform an unmap for valid vLPIs · a05d1c0d

由 Marc Zyngier 提交于 11月 16, 2017

Before performing an unmap, let's check that what we have was
really mapped the first place.
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

a05d1c0d

KVM: arm/arm64: vgic-its: Check result of allocation before use · 686f294f

由 Marc Zyngier 提交于 11月 16, 2017

We miss a test against NULL after allocation.

Fixes: 6d03a68f ("KVM: arm64: vgic-its: Turn device_id validation into generic ID validation")
Cc: stable@vger.kernel.org # 4.8
Reported-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

686f294f

KVM: arm/arm64: vgic-its: Preserve the revious read from the pending table · 64afe6e9

由 Marc Zyngier 提交于 11月 16, 2017

The current pending table parsing code assumes that we keep the
previous read of the pending bits, but keep that variable in
the current block, making sure it is discarded on each loop.

We end-up using whatever is on the stack. Who knows, it might
just be the right thing...

Fixes: 33d3bc95 ("KVM: arm64: vgic-its: Read initial LPI pending table")
Cc: stable@vger.kernel.org # 4.8
Reported-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

64afe6e9

KVM: arm/arm64: vgic: Preserve the revious read from the pending table · ddb4b010

由 Marc Zyngier 提交于 11月 16, 2017

The current pending table parsing code assumes that we keep the
previous read of the pending bits, but keep that variable in
the current block, making sure it is discarded on each loop.

We end-up using whatever is on the stack. Who knows, it might
just be the right thing...

Fixes: 28077125 ("KVM: arm64: vgic-v3: KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES")
Cc: stable@vger.kernel.org # 4.12
Reported-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

ddb4b010

KVM: arm/arm64: vgic-irqfd: Fix MSI entry allocation · 150009e2

由 Marc Zyngier 提交于 11月 16, 2017

Using the size of the structure we're allocating is a good idea
and avoids any surprise... In this case, we're happilly confusing
kvm_kernel_irq_routing_entry and kvm_irq_routing_entry...

Fixes: 95b110ab ("KVM: arm/arm64: Enable irqchip routing")
Cc: stable@vger.kernel.org # 4.8
Reported-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

150009e2

KVM: arm/arm64: VGIC: extend !vgic_is_initialized guard · 285a90e3

由 Andre Przywara 提交于 11月 17, 2017

Commit f39d16cb ("KVM: arm/arm64: Guard kvm_vgic_map_is_active against
!vgic_initialized") introduced a check whether the VGIC has been
initialized before accessing the spinlock and the VGIC data structure.
However the vgic_get_irq() call in the variable declaration sneaked
through the net, so lets make sure that this also gets called only after
we actually allocated the arrays this function accesses.
Reviewed-by: NEric Auger <eric.auger@redhat.com>
Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

285a90e3

KVM: arm/arm64: Don't enable/disable physical timer access on VHE · ec6449a9

由 Christoffer Dall 提交于 11月 20, 2017

After the timer optimization rework we accidentally end up calling
physical timer enable/disable functions on VHE systems, which is neither
needed nor correct, since the CNTHCTL_EL2 register format is
different when HCR_EL2.E2H is set.

The CNTHCTL_EL2 is initialized when CPUs become online in
kvm_timer_init_vhe() and we don't have to call these functions on VHE
systems, which also allows us to inline the non-VHE functionality.
Reported-by: NJintack Lim <jintack@cs.columbia.edu>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

ec6449a9

28 11月, 2017 1 次提交

KVM: Let KVM_SET_SIGNAL_MASK work as advertised · 20b7035c

由 Jan H. Schönherr 提交于 11月 24, 2017

KVM API says for the signal mask you set via KVM_SET_SIGNAL_MASK, that
"any unblocked signal received [...] will cause KVM_RUN to return with
-EINTR" and that "the signal will only be delivered if not blocked by
the original signal mask".

This, however, is only true, when the calling task has a signal handler
registered for a signal. If not, signal evaluation is short-circuited for
SIG_IGN and SIG_DFL, and the signal is either ignored without KVM_RUN
returning or the whole process is terminated.

Make KVM_SET_SIGNAL_MASK behave as advertised by utilizing logic similar
to that in do_sigtimedwait() to avoid short-circuiting of signals.
Signed-off-by: NJan H. SchÃ¶nherr <jschoenh@amazon.de>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

20b7035c

10 11月, 2017 19 次提交

KVM: arm/arm64: Don't queue VLPIs on INV/INVALL · 95b110ab

由 Christoffer Dall 提交于 11月 10, 2017

Since VLPIs are injected directly by the hardware there's no need to
mark these as pending in software and queue them on the AP list.
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

95b110ab

KVM: arm/arm64: Fix GICv4 ITS initialization issues · 3d1ad640

由 Christoffer Dall 提交于 11月 10, 2017

We should only try to initialize GICv4 data structures on a GICv4
capable system.  Move the vgic_supports_direct_msis() check inito
vgic_v4_init() so that any KVM VGIC initialization path does not fail
on non-GICv4 systems.

Also be slightly more strict in the checking of the return value in
vgic_its_create, and only error out on negative return values from the
vgic_v4_init() function.  This is important because the kvm device code
only treats negative values as errors and only cleans up in this case.
Errornously treating a positive return value as an error from the
vgic_v4_init() function can lead to NULL pointer dereferences, as has
recently been observed.
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

3d1ad640

KVM: arm/arm64: GICv4: Theory of operations · ed8703a5