Commits · c2f58514cfb374d5368c9da945f1765cd48eb0da · openanolis / cloud-kernel

17 Sep, 2015 1 commit

arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources · c2f58514

Committed by Pavel Fedin on Aug 05, 2015

Until b26e5fda ("arm/arm64: KVM: introduce per-VM ops"),
kvm_vgic_map_resources() used to include a check on irqchip_in_kernel(),
and vgic_v2_map_resources() still has it.

But now vm_ops are not initialized until we call kvm_vgic_create().
Therefore kvm_vgic_map_resources() can be called without a VGIC,
and we die because vm_ops.map_resources is NULL.

Fixing this restores QEMU's kernel-irqchip=off option to a working state,
allowing GIC emulation to be used in userspace.

Fixes: b26e5fda ("arm/arm64: KVM: introduce per-VM ops")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
[maz: reworked commit message]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

c2f58514
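The guard described in this commit can be sketched as follows. This is a minimal, user-space illustration and not the kernel code: `struct vm`, `struct vm_ops` and `map_resources_checked()` are hypothetical stand-ins, assumed only to mirror the idea that the ops pointer stays NULL until a VGIC has been created.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct vm;

struct vm_ops {
    int (*map_resources)(struct vm *vm);   /* stays NULL until a VGIC is created */
};

struct vm {
    bool irqchip_in_kernel;                /* true once an in-kernel VGIC exists */
    struct vm_ops ops;
};

/* Map VGIC resources, doing nothing when the GIC is emulated in userspace. */
int map_resources_checked(struct vm *vm)
{
    if (!vm->irqchip_in_kernel)
        return 0;                          /* no VGIC: never touch the NULL op */

    return vm->ops.map_resources(vm);
}
```

With the early return in place, a VM created without an in-kernel irqchip never dereferences the unset function pointer.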

16 Sep, 2015 1 commit

arm: KVM: Fix incorrect device to IPA mapping · ca09f02f

Committed by Marek Majtyka on Sep 16, 2015

A critical bug has been found in device memory stage1 translation for
VMs with more than 4GB of address space. Because vm_pgoff is narrower
than pa (which is true for the LPAE case: u32 and u64 respectively), some
of the more significant bits of pa may be lost, as the shift operation is
performed on a u32 and only later cast to u64.

Example: vm_pgoff(u32)=0x00210030, PAGE_SHIFT=12
        expected pa(u64):   0x0000000210030000
        produced pa(u64):   0x0000000010030000

The fix is to change the order of operations (casting first onto phys_addr_t
and then shifting).
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
[maz: fixed changelog and patch formatting]
Cc: stable@vger.kernel.org
Signed-off-by: Marek Majtyka <marek.majtyka@tieto.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

ca09f02f
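The order-of-operations problem can be reproduced in plain C. The sketch below is illustrative only: the function names are hypothetical, `PAGE_SHIFT` is fixed at 12 as in the commit's example, and `uint32_t`/`uint64_t` stand in for the kernel's `u32` and `phys_addr_t` on an LPAE build.

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/* Buggy order: the shift is evaluated in 32 bits, so the upper bits
 * are already gone by the time the result is widened. */
uint64_t pa_shift_then_cast(uint32_t vm_pgoff)
{
    return (uint64_t)(vm_pgoff << PAGE_SHIFT);
}

/* Fixed order: widen to 64 bits first, then shift. */
uint64_t pa_cast_then_shift(uint32_t vm_pgoff)
{
    return (uint64_t)vm_pgoff << PAGE_SHIFT;
}
```

For vm_pgoff = 0x00210030, the first function yields 0x10030000 while the second yields 0x210030000.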

05 Sep, 2015 1 commit

arm/arm64: KVM: Fix PSCI affinity info return value for non valid cores · 0c067292

Committed by Alexander Spyridakis on Sep 04, 2015

If a guest requests the affinity info for a non-existing vCPU we need to
properly return an error, instead of erroneously reporting an off state.
Signed-off-by: Alexander Spyridakis <a.spyridakis@virtualopensystems.com>
Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

0c067292
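A rough sketch of the behaviour this commit describes is given below. It is not the kernel implementation: the vCPU record and `affinity_info()` are hypothetical, affinity levels are ignored, and the use of INVALID_PARAMETERS as the returned error is an assumption, since the message only says that an error is returned. The numeric values follow the PSCI specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Numeric values as defined by the PSCI specification. */
#define PSCI_AFFINITY_ON          0
#define PSCI_AFFINITY_OFF         1
#define PSCI_RET_INVALID_PARAMS  (-2)

/* Hypothetical, simplified vCPU record. */
struct vcpu {
    uint64_t mpidr;
    bool     powered_off;
};

/* Report the power state of the vCPU matching target_mpidr. */
int affinity_info(const struct vcpu *vcpus, size_t n, uint64_t target_mpidr)
{
    for (size_t i = 0; i < n; i++) {
        if (vcpus[i].mpidr == target_mpidr)
            return vcpus[i].powered_off ? PSCI_AFFINITY_OFF
                                        : PSCI_AFFINITY_ON;
    }

    /* No vCPU matched: report an error instead of pretending it is off. */
    return PSCI_RET_INVALID_PARAMS;
}
```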

20 Aug, 2015 1 commit

arm: KVM: keep arm vfp/simd exit handling consistent with arm64 · 054167b3

Committed by Mario Smarduch on Jul 16, 2015

After enhancing arm64 FP/SIMD exit handling, ARMv7 VFP exit branch is moved
to guest trap handling. This allows us to keep exit handling flow between both
architectures consistent.
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

054167b3

12 Aug, 2015 4 commits

KVM: arm/arm64: timer: Allow the timer to control the active state · f120cd65

Committed by Marc Zyngier on Jun 23, 2014

In order to remove the crude hack where we sneak the masked bit
into the timer's control register, make use of the phys_irq_map
API to control the active state of the interrupt.

This causes some limited changes to allow for potential error
propagation.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

f120cd65

KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts · 6c3d63c9

Committed by Marc Zyngier on Jun 23, 2014

In order to be able to feed physical interrupts to a guest, we need
to be able to establish the virtual-physical mapping between the two
worlds.

The mappings are kept in a set of RCU lists, indexed by virtual interrupts.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

6c3d63c9

arm/arm64: KVM: Move vgic handling to a non-preemptible section · abdf5843

Committed by Marc Zyngier on Jun 08, 2015

As we're about to introduce some serious GIC-poking to the vgic code,
it is important to make sure that we're going to poke the part of
the GIC that belongs to the CPU we're about to run on (otherwise,
we'd end up with some unexpected interrupts firing)...

Introducing a non-preemptible section in kvm_arch_vcpu_ioctl_run
prevents the problem from occurring.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

abdf5843

arm/arm64: KVM: Fix ordering of timer/GIC on guest entry · 9a99d050

Committed by Marc Zyngier on Jun 05, 2015

As we now inject the timer interrupt when we're about to enter
the guest, it makes a lot more sense to make sure this happens
before the vgic code queues the pending interrupts.

Otherwise, we get the interrupt on the following exit, which is
not great for latency (and leads to all kinds of bizarre issues
when used with active interrupts at the HW level).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

9a99d050

21 Jul, 2015 3 commits

KVM: arm64: introduce vcpu->arch.debug_ptr · 84e690bf

Committed by Alex Bennée on Jul 07, 2015

This introduces a level of indirection for the debug registers. Instead
of using the sys_regs[] directly we store registers in a structure in
the vcpu. The new kvm_arm_reset_debug_ptr() sets the debug ptr to the
guest context.

Because we no longer give the sys_regs offset for the sys_reg_desc->reg
field, but instead the index into a debug-specific struct we need to
add a number of additional trap functions for each register. Also, as the
generic user-space access code no longer works, we have
introduced a new pair of function pointers to the sys_reg_desc structure
to override the generic code when needed.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

84e690bf

KVM: arm: introduce kvm_arm_init/setup/clear_debug · 56c7f5e7

Committed by Alex Bennée on Jul 07, 2015

This is a precursor for later patches which will need to do more to
set up debug state before entering the hyp.S switch code. The existing
functionality for setting mdcr_el2 has been moved out of hyp.S and now
uses the value kept in vcpu->arch.mdcr_el2.

As the assembler used to previously mask and preserve MDCR_EL2.HPMN I've
had to add a mechanism to save the value of mdcr_el2 as a per-cpu
variable during the initialisation code. The kernel never sets this
number so we are assuming the bootcode has set up the correct value
here.

This also moves the conditional setting of the TDA bit from the hyp code
into the C code which is currently used for the lazy debug register
context switch code.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

56c7f5e7

KVM: arm: guest debug, add stub KVM_SET_GUEST_DEBUG ioctl · 0e6f07f2

Committed by Alex Bennée on Jul 07, 2015

This commit adds a stub function to support the KVM_SET_GUEST_DEBUG
ioctl. Any unsupported flag will return -EINVAL. For now, only
KVM_GUESTDBG_ENABLE is supported, although it won't have any effects.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

0e6f07f2
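The flag validation described above might look roughly like the following. This is a hedged sketch rather than the kernel function: `set_guest_debug_stub()` and `GUESTDBG_VALID_MASK` are hypothetical names, and `GUESTDBG_ENABLE` is assumed to mirror the value of `KVM_GUESTDBG_ENABLE` (0x1) from the KVM UAPI header.

```c
#include <errno.h>
#include <stdint.h>

#define GUESTDBG_ENABLE      0x1u             /* mirrors KVM_GUESTDBG_ENABLE */
#define GUESTDBG_VALID_MASK  GUESTDBG_ENABLE  /* hypothetical: only flag accepted so far */

/* Stub handler: reject unknown flags, accept the rest without acting on them. */
int set_guest_debug_stub(uint32_t control)
{
    if (control & ~GUESTDBG_VALID_MASK)
        return -EINVAL;

    return 0;                                 /* accepted, but has no effect yet */
}
```

Keeping the accepted flags in a single mask means later patches only need to widen the mask as they add support for more flags.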

17 Jun, 2015 6 commits

arm/arm64: KVM: vgic: Do not save GICH_HCR / ICH_HCR_EL2 · 4642019d

Committed by Marc Zyngier on Jun 11, 2015

The GIC Hypervisor Configuration Register is used to enable
the delivery of virtual interrupts to a guest, as well as to
define in which conditions maintenance interrupts are delivered
to the host.

This register doesn't contain any information that we need to
read back (the EOIcount is utterly useless for us).

So let's save ourselves some cycles, and not save it before
writing zero to it.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

4642019d

ARM: kvm: psci: fix handling of unimplemented functions · e2d99736

Committed by Lorenzo Pieralisi on Jun 10, 2015

According to the PSCI specification and the SMC/HVC calling
convention, PSCI function_ids that are not implemented must
return NOT_SUPPORTED as return value.

Current KVM implementation takes an unhandled PSCI function_id
as an error and injects an undefined instruction into the guest
if PSCI implementation is called with a function_id that is not
handled by the resident PSCI version (ie it is not implemented),
which is not the behaviour expected by a guest when calling a
PSCI function_id that is not implemented.

This patch fixes this issue by returning NOT_SUPPORTED whenever
the kvm PSCI call is executed for a function_id that is not
implemented by the PSCI kvm layer.

Cc: <stable@vger.kernel.org> # 3.18+
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

e2d99736
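The dispatch rule this commit establishes can be illustrated with a small sketch. The function IDs and `psci_call()` below are hypothetical placeholders and not the real PSCI function_id encodings; only the NOT_SUPPORTED return value (-1) is taken from the PSCI specification.

```c
#include <stdint.h>

#define PSCI_RET_SUCCESS         0
#define PSCI_RET_NOT_SUPPORTED  (-1)   /* value defined by the PSCI specification */

/* Hypothetical function IDs, for illustration only. */
enum {
    FN_VERSION = 1,
    FN_CPU_OFF = 2,
};

/* Dispatch a PSCI call; unknown IDs are reported, not treated as a fault. */
long psci_call(uint32_t function_id)
{
    switch (function_id) {
    case FN_VERSION:
    case FN_CPU_OFF:
        return PSCI_RET_SUCCESS;        /* placeholder for the real handlers */
    default:
        return PSCI_RET_NOT_SUPPORTED;  /* instead of injecting an undefined instruction */
    }
}
```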

KVM: arm/arm64: Enable the KVM-VFIO device · 8889583c

Committed by Kim Phillips on Jun 05, 2015

The KVM-VFIO device is used by the QEMU VFIO device. It is used to
record the list of in-use VFIO groups so that KVM can manipulate
them.
Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

8889583c

arm/arm64: KVM: Properly account for guest CPU time · 1b3d546d

Committed by Christoffer Dall on May 28, 2015

Until now we have been calling kvm_guest_exit after re-enabling
interrupts when we come back from the guest, but this has the
unfortunate effect that CPU time accounting done in the context of timer
interrupts occurring while the guest is running doesn't properly notice
that the time since the last tick was spent in the guest.

Inspired by the comment in the x86 code, move the kvm_guest_exit() call
below the local_irq_enable() call and change __kvm_guest_exit() to
kvm_guest_exit(), because we are now calling this function with
interrupts enabled. We have to now explicitly disable preemption and
not enable preemption before we've called kvm_guest_exit(), since
otherwise we could be preempted and everything happening before we
eventually get scheduled again would be accounted for as guest time.

At the same time, move the trace_kvm_exit() call outside of the atomic
section, since there is no reason for us to do that with interrupts
disabled.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

1b3d546d

kvm: remove one useless check extension · ea2c6d97

Committed by Tiejun Chen on May 04, 2015

We already check KVM_CAP_IRQFD in the generic code once CONFIG_HAVE_KVM_IRQFD is enabled:

kvm_vm_ioctl_check_extension_generic()
    |
    + switch (arg) {
    +   ...
    +   #ifdef CONFIG_HAVE_KVM_IRQFD
    +       case KVM_CAP_IRQFD:
    +   #endif
    +   ...
    +   return 1;
    +   ...
    + }
    |
    + kvm_vm_ioctl_check_extension()

So it's not necessary to check this in the arch code again. Also fix one typo,
s/emlation/emulation.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

ea2c6d97

arm: KVM: force execution of HCPTR access on VM exit · 85e84ba3

Committed by Marc Zyngier on Mar 16, 2015

On VM entry, we disable access to the VFP registers in order to
perform a lazy save/restore of these registers.

On VM exit, we restore access, test if we did enable them before,
and save/restore the guest/host registers if necessary. In this
sequence, the FPEXC register is always accessed, irrespective
of the trapping configuration.

If the guest didn't touch the VFP registers, then the HCPTR access
has now enabled such access, but we're missing a barrier to ensure
architectural execution of the new HCPTR configuration. If the HCPTR
access has been delayed/reordered, the subsequent access to FPEXC
will cause a trap, which we aren't prepared to handle at all.

The same condition exists when trapping to enable VFP for the guest.

The fix is to introduce a barrier after enabling VFP access. In the
vmexit case, it can be relaxed to only take place if the guest hasn't
accessed its view of the VFP registers, making the access to FPEXC safe.

The set_hcptr macro is modified to deal with both vmenter/vmexit and
vmtrap operations, and now takes an optional label that is branched to
when the guest hasn't touched the VFP registers.
Reported-by: Vikram Sethi <vikrams@codeaurora.org>
Cc: stable@kernel.org # v3.9+
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

85e84ba3

10 Jun, 2015 1 commit

ARM: KVM: Remove pointless void pointer cast · a5f56ba3

Committed by Firo Yang on Apr 23, 2015

No need to cast the void pointer returned by kmalloc() in
arch/arm/kvm/mmu.c::kvm_alloc_stage2_pgd().
Signed-off-by: Firo Yang <firogm@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

a5f56ba3

28 May, 2015 1 commit

KVM: add "new" argument to kvm_arch_commit_memory_region · f36f3f28

Committed by Paolo Bonzini on May 18, 2015

This lets the function access the new memory slot without going through
kvm_memslots and id_to_memslot.  It will simplify the code when more
than one address space will be supported.

Unfortunately, the "const"ness of the new argument must be cast
away in two places.  Fixing KVM to accept const struct kvm_memory_slot
pointers would require modifications in pretty much all architectures,
and is left for later.
Reviewed-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f36f3f28

27 May, 2015 1 commit

arm/arm64: kvm: add missing PSCI include · 538b9b25

Committed by Mark Rutland on May 01, 2015

We make use of the PSCI function IDs, but don't explicitly include the
header which defines them. Relying on transitive header includes is
fragile and will be broken as headers are refactored.

This patch includes the relevant header file directly so as to avoid
future breakage.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>

538b9b25

26 May, 2015 3 commits

KVM: add memslots argument to kvm_arch_memslots_updated · 15f46015

Committed by Paolo Bonzini on May 17, 2015

Prepare for the case of multiple address spaces.
Reviewed-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

15f46015

KVM: const-ify uses of struct kvm_userspace_memory_region · 09170a49

Committed by Paolo Bonzini on May 18, 2015

Architecture-specific helpers are not supposed to muck with
struct kvm_userspace_memory_region contents.  Add const to
enforce this.

In order to eliminate the only write in __kvm_set_memory_region,
the cleaning of deleted slots is pulled up from update_memslots
to __kvm_set_memory_region.
Reviewed-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Reviewed-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

09170a49

KVM: use kvm_memslots whenever possible · 9f6b8029

Committed by Paolo Bonzini on May 17, 2015

kvm_memslots provides lockdep checking.  Use it consistently instead of
explicit dereferencing of kvm->memslots.
Reviewed-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

9f6b8029

09 May, 2015 1 commit

ARM: kvm: fix a bad BSYM() usage · 5890298a

Committed by Russell King on Apr 21, 2015

BSYM() should only be used when referring to local symbols in the same
assembly file which are resolved by the assembler, and not for
linker-fixed up symbols.  The use of BSYM() with panic is incorrect as
the linker is involved in fixing up this relocation, and it knows
whether panic() is ARM or Thumb.
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

5890298a

07 May, 2015 1 commit

KVM: arm/mips/x86/power use __kvm_guest_{enter|exit} · ccf73aaf

Committed by Christian Borntraeger on Apr 30, 2015

Use __kvm_guest_{enter|exit} instead of kvm_guest_{enter|exit}
where interrupts are disabled.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ccf73aaf

22 Apr, 2015 1 commit

KVM: arm/arm64: check IRQ number on userland injection · fd1d0ddf

Committed by Andre Przywara on Apr 10, 2015

When userland injects a SPI via the KVM_IRQ_LINE ioctl we currently
only check it against a fixed limit, which historically is set
to 127. With the new dynamic IRQ allocation the effective limit may
actually be smaller (64).
So when a malicious or buggy userland now injects a SPI in that
range, we spill over on our VGIC bitmaps and bytemaps memory.
I could trigger a host kernel NULL pointer dereference with current
mainline by injecting some bogus IRQ number from a hacked kvmtool:
-----------------
....
DEBUG: kvm_vgic_inject_irq(kvm, cpu=0, irq=114, level=1)
DEBUG: vgic_update_irq_pending(kvm, cpu=0, irq=114, level=1)
DEBUG: IRQ #114 still in the game, writing to bytemap now...
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = ffffffc07652e000
[00000000] *pgd=00000000f658b003, *pud=00000000f658b003, *pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 1053 Comm: lkvm-msi-irqinj Not tainted 4.0.0-rc7+ #3027
Hardware name: FVP Base (DT)
task: ffffffc0774e9680 ti: ffffffc0765a8000 task.ti: ffffffc0765a8000
PC is at kvm_vgic_inject_irq+0x234/0x310
LR is at kvm_vgic_inject_irq+0x30c/0x310
pc : [<ffffffc0000ae0a8>] lr : [<ffffffc0000ae180>] pstate: 80000145
.....

So this patch fixes this by checking the SPI number against the
actual limit. Also we remove the former legacy hard limit of
127 in the ioctl code.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
CC: <stable@vger.kernel.org> # 4.0, 3.19, 3.18
[maz: wrap KVM_ARM_IRQ_GIC_MAX with #ifndef __KERNEL__,
as suggested by Christopher Covington]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

fd1d0ddf
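The range check this commit introduces can be sketched as below. The structure and function names are hypothetical simplifications; the assumption is that the number of allocated interrupts is stored per VM (here `nr_irqs`) and that interrupt IDs 0 to 31 are private (SGIs and PPIs), so SPIs start at 32.

```c
#include <errno.h>

#define NR_PRIVATE_IRQS 32      /* IDs 0..31 are SGIs/PPIs; SPIs start at 32 */

/* Hypothetical, simplified distributor state. */
struct vgic_dist {
    unsigned int nr_irqs;       /* interrupts actually allocated for this VM */
};

/* Validate a userland-supplied SPI number against the per-VM limit. */
int check_spi(const struct vgic_dist *dist, unsigned int irq)
{
    if (irq < NR_PRIVATE_IRQS || irq >= dist->nr_irqs)
        return -EINVAL;         /* would index past the bitmaps and bytemaps */

    return 0;
}
```

With `nr_irqs` set to 64, the IRQ 114 from the log above is rejected before any bitmap is touched.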

31 Mar, 2015 2 commits

KVM: arm/arm64: enable KVM_CAP_IOEVENTFD · d44758c0

Committed by Nikolay Nikolaev on Jan 24, 2015

As the infrastructure for eventfd has now been merged, report the
ioeventfd capability as being supported.
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
[maz: grouped the case entry with the others, fixed commit log]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

d44758c0

KVM: arm/arm64: rework MMIO abort handling to use KVM MMIO bus · 950324ab

Committed by Andre Przywara on Mar 28, 2015

Currently we have struct kvm_exit_mmio for encapsulating MMIO abort
data to be passed on from syndrome decoding all the way down to the
VGIC register handlers. Now as we switch the MMIO handling to be
routed through the KVM MMIO bus, it does not make sense anymore to
use that structure already from the beginning. So we keep the data in
local variables until we put them into the kvm_io_bus framework.
Then we fill kvm_exit_mmio in the VGIC only, making it a VGIC private
structure. In doing so, we replace the data buffer in that structure
with a pointer to a single location in a local variable, so
we get rid of some copying on the way.
With all of the virtual GIC emulation code now being registered with
the kvm_io_bus, we can remove all of the old MMIO handling code and
its dispatching functionality.

I didn't bother to rename kvm_exit_mmio (to vgic_mmio or something),
because that touches a lot of code lines without any good reason.

This is based on an original patch by Nikolay.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

950324ab

27 Mar, 2015 2 commits

ARM: kvm: round HYP section to page size instead of log2 upper bound · a9fea8b3

Committed by Ard Biesheuvel on Mar 27, 2015

Older binutils do not support expressions involving the values of
external symbols so just round up the HYP region to the page size.
Tested-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: when will this ever end?!]
Signed-off-by: Will Deacon <will.deacon@arm.com>

a9fea8b3

KVM: arm/arm64: remove now unneeded include directory from Makefile · 5d9d15af

Committed by Andre Przywara on Mar 26, 2015

virt/kvm was never really a good include directory for anything other
than locally included headers.
With the move of iodev.h there is no longer any need to add this
directory to the compiler's include path, so remove it from the arm and
arm64 kvm Makefile.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

5d9d15af

23 Mar, 2015 1 commit

arm64: KVM: use ID map with increased VA range if required · e4c5a685

Committed by Ard Biesheuvel on Mar 19, 2015

This patch modifies the HYP init code so it can deal with system
RAM residing at an offset which exceeds the reach of VA_BITS.

Like for EL1, this involves configuring an additional level of
translation for the ID map. However, in case of EL2, this implies
that all translations use the extra level, as we cannot seamlessly
switch between translation tables with different numbers of
translation levels.

So add an extra translation table at the root level. Since the
ID map and the runtime HYP map are guaranteed not to overlap, they
can share this root level, and we can essentially merge these two
tables into one.
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

e4c5a685

20 Mar, 2015 1 commit

ARM, arm64: kvm: get rid of the bounce page · 06f75a1f

Committed by Ard Biesheuvel on Mar 19, 2015

The HYP init bounce page is a runtime construct that ensures that the
HYP init code does not cross a page boundary. However, this is something
we can do perfectly well at build time, by aligning the code appropriately.

For arm64, we just align to 4 KB, and enforce that the code size is less
than 4 KB, regardless of the chosen page size.

For ARM, the whole code is less than 256 bytes, so we tweak the linker
script to align at a power of 2 upper bound of the code size.

Note that this also fixes a benign off-by-one error in the original bounce
page code, where a bounce page would be allocated unnecessarily if the code
was exactly 1 page in size.

On ARM, it also fixes an issue with very large kernels reported by Arnd
Bergmann, where stub sections with linker emitted veneers could erroneously
trigger the size/alignment ASSERT() in the linker script.
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

06f75a1f

14 Mar, 2015 2 commits

arm/arm64: KVM: Fix migration race in the arch timer · 1a748478

Committed by Christoffer Dall on Mar 13, 2015

When a VCPU is no longer running, we currently check to see if it has a
timer scheduled in the future, and if it does, we schedule a host
hrtimer to notify us in case the timer expires while the VCPU is still
not running.  When the hrtimer fires, we mask the guest's timer and
inject the timer IRQ (still relying on the guest unmasking the timer when
it receives the IRQ).

This is all good and fine, but when migrating a VM (checkpoint/restore)
this introduces a race.  It is unlikely, but possible, for the following
sequence of events to happen:

 1. Userspace stops the VM
 2. Hrtimer for VCPU is scheduled
 3. Userspace checkpoints the VGIC state (no pending timer interrupts)
 4. The hrtimer fires, schedules work in a workqueue
 5. Workqueue function runs, masks the timer and injects timer interrupt
 6. Userspace checkpoints the timer state (timer masked)

At restore time, you end up with a masked timer without any timer
interrupts, and your guest halts, never receiving timer interrupts.

Fix this by only kicking the VCPU in the workqueue function, and by
sampling the expired state of the timer when entering the guest again,
injecting the interrupt and masking the timer only then.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

1a748478

arm/arm64: KVM: export VCPU power state via MP_STATE ioctl · ecccf0cc

Committed by Alex Bennée on Mar 13, 2015

To cleanly restore an SMP VM we need to ensure that the current pause
state of each vcpu is correctly recorded. Things could get confused if
the CPU starts running after migration restore completes when it was
paused before its state was captured.

We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
interface is a lot simpler as the only valid states are
KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_STOPPED.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

ecccf0cc
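The two-state mapping described in this commit could be sketched as follows. The `struct vcpu_power` type and both functions are hypothetical simplifications; the numeric constants are assumed to mirror `KVM_MP_STATE_RUNNABLE` (0) and `KVM_MP_STATE_STOPPED` (5) from the KVM UAPI header.

```c
#include <errno.h>
#include <stdbool.h>

#define MP_STATE_RUNNABLE 0     /* mirrors KVM_MP_STATE_RUNNABLE */
#define MP_STATE_STOPPED  5     /* mirrors KVM_MP_STATE_STOPPED */

/* Hypothetical per-vCPU power state. */
struct vcpu_power {
    bool pause;                 /* true while the vCPU is powered off */
};

/* Report the vCPU's pause state as an MP state. */
int get_mp_state(const struct vcpu_power *v)
{
    return v->pause ? MP_STATE_STOPPED : MP_STATE_RUNNABLE;
}

/* Apply an MP state; anything other than the two valid states is rejected. */
int set_mp_state(struct vcpu_power *v, int state)
{
    switch (state) {
    case MP_STATE_RUNNABLE:
        v->pause = false;
        return 0;
    case MP_STATE_STOPPED:
        v->pause = true;
        return 0;
    default:
        return -EINVAL;
    }
}
```

Saving the value from the getter before migration and feeding it to the setter on restore keeps a paused vCPU paused.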

13 Mar, 2015 3 commits

arm/arm64: KVM: Optimize handling of Access Flag faults · aeda9130

Committed by Marc Zyngier on Mar 12, 2015

Now that we have page aging in Stage-2, it becomes obvious that
we're doing way too much work handling the fault.

The page is not going anywhere (it is still mapped), the page
tables are already allocated, and all we want is to flip a bit
in the PMD or PTE. Also, we can avoid any form of TLB invalidation,
since a page with the AF bit off is not allowed to be cached.

An obvious solution is to have a separate handler for FSC_ACCESS,
where we pride ourselves on doing only the very minimum amount of
work.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

aeda9130

arm/arm64: KVM: Implement Stage-2 page aging · 35307b9a

Committed by Marc Zyngier on Mar 12, 2015

Until now, KVM/arm didn't care much for page aging (who was swapping
anyway?), and simply provided empty hooks to the core KVM code. With
server-type systems now being available, things are quite different.

This patch implements very simple support for page aging, by clearing
the Access flag in the Stage-2 page tables. On access fault, the current
fault handling will write the PTE or PMD again, putting the Access flag
back on.

It should be possible to implement a much faster handling for Access
faults, but that's left for a later patch.

With this in place, performance in VMs is degraded much more gracefully.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

35307b9a
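The aging mechanism can be pictured with a small sketch operating on a raw 64-bit descriptor. The helper names are hypothetical and the sketch omits locking, PMD-level entries and everything else the kernel has to handle; it assumes only that the Access flag is bit 10 of the descriptor, as in the ARM long-descriptor format.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_AF (UINT64_C(1) << 10)   /* Access flag in a long-descriptor entry */

/* Aging step: report whether the page was accessed, then clear the flag
 * so that the next guest access faults and marks it young again. */
bool test_and_clear_young(uint64_t *pte)
{
    bool young = (*pte & PTE_AF) != 0;

    *pte &= ~PTE_AF;
    return young;
}

/* Access fault path: put the Access flag back on. */
void mark_young(uint64_t *pte)
{
    *pte |= PTE_AF;
}
```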

arm/arm64: KVM: Allow handle_hva_to_gpa to return a value · 1d2ebacc

Committed by Marc Zyngier on Mar 12, 2015

So far, handle_hva_to_gpa was never required to return a value.
As we prepare to age pages at Stage-2, we need to be able to
return a value from the iterator (kvm_test_age_hva).

Adapt the code to handle this situation. No semantic change.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

1d2ebacc

12 Mar, 2015 3 commits

KVM: arm/arm64: add irqfd support · 174178fe

Committed by Eric Auger on Mar 04, 2015

This patch enables irqfd on arm/arm64.

Both irqfd and resamplefd are supported. Injection is implemented
in vgic.c without routing.

This patch enables CONFIG_HAVE_KVM_EVENTFD and CONFIG_HAVE_KVM_IRQFD.

KVM_CAP_IRQFD is now advertised. The KVM_CAP_IRQFD_RESAMPLE capability
is automatically advertised as soon as CONFIG_HAVE_KVM_IRQFD is set.

Irqfd injection is restricted to SPI. The rationale behind not
supporting PPI irqfd injection is that any device using a PPI would
be a private-to-the-CPU device (timer for instance), so its state
would have to be context-switched along with the VCPU and would
require in-kernel wiring anyhow. It is not a relevant use case for
irqfds.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

174178fe

KVM: arm/arm64: implement kvm_arch_intc_initialized · c1426e4c

Committed by Eric Auger on Mar 04, 2015

On arm/arm64 the VGIC is dynamically instantiated and it is useful
to expose its state, especially for irqfd setup.

This patch defines __KVM_HAVE_ARCH_INTC_INITIALIZED and
implements kvm_arch_intc_initialized.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

c1426e4c

KVM: arm/arm64: unset CONFIG_HAVE_KVM_IRQCHIP · df2bd1ac

Committed by Eric Auger on Mar 04, 2015

CONFIG_HAVE_KVM_IRQCHIP is needed to support IRQ routing (along
with irq_comm.c and irqchip.c usage). This is not the case for
arm/arm64 currently.

This patch unsets the flag for both arm and arm64.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

df2bd1ac

openanolis / cloud-kernel synced successfully more than 1 year ago
