- 19 April 2019, 1 commit
Submitted by Dave Martin

The introduction of kvm_arm_init_arch_resources() looks like premature factoring, since nothing else uses this hook yet and it is not clear what will use it in the future. For now, let's not pretend that this is a general thing: this patch simply renames the function to kvm_arm_init_sve(), retaining the arm stub version under the new name.

Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
- 29 March 2019, 2 commits
Submitted by Dave Martin

Some aspects of vcpu configuration may be too complex to be completed inside KVM_ARM_VCPU_INIT. Thus, there may be a requirement for userspace to do some additional configuration before various other ioctls will work in a consistent way. In particular this will be the case for SVE, where userspace will need to negotiate the set of vector lengths to be made available to the guest before the vcpu becomes fully usable.

In order to provide an explicit way for userspace to confirm that it has finished setting up a particular vcpu feature, this patch adds a new ioctl KVM_ARM_VCPU_FINALIZE.

When userspace has opted into a feature that requires finalization, typically by means of a feature flag passed to KVM_ARM_VCPU_INIT, a matching call to KVM_ARM_VCPU_FINALIZE is now required before KVM_RUN or KVM_GET_REG_LIST is allowed. Individual features may impose additional restrictions where appropriate.

No existing vcpu features are affected by this, so current userspace implementations will continue to work exactly as before, with no need to issue KVM_ARM_VCPU_FINALIZE.

As implemented in this patch, KVM_ARM_VCPU_FINALIZE is currently a placeholder: no finalizable features exist yet, so the ioctl is not required and will always yield EINVAL. Subsequent patches will add the finalization logic to make use of this ioctl for SVE.

No functional change for existing userspace.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
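For illustration, the intended userspace ordering looks roughly like the sketch below. The feature constant is an assumption: no finalizable feature exists yet at this point in the series (SVE adds one later), so KVM_ARM_VCPU_SVE is shown purely as an example.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Sketch of the expected call ordering; error handling is minimal.
 * KVM_ARM_VCPU_SVE is an example feature that only exists once the
 * later SVE patches land. */
static int setup_vcpu(int vcpu_fd, struct kvm_vcpu_init *init)
{
        int feature = KVM_ARM_VCPU_SVE;   /* hypothetical at this point */

        /* 1. Opt into the feature via the usual init feature flags. */
        if (ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, init) < 0)
                return -1;

        /* 2. Perform any feature-specific configuration here
         *    (for SVE: negotiate the vector-length set). */

        /* 3. Tell KVM the feature is fully configured. */
        if (ioctl(vcpu_fd, KVM_ARM_VCPU_FINALIZE, &feature) < 0)
                return -1;

        /* 4. Only now are KVM_RUN / KVM_GET_REG_LIST permitted. */
        return 0;
}
```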
Submitted by Dave Martin

This patch adds a kvm_arm_init_arch_resources() hook to perform subarch-specific initialisation when starting up KVM. This will be used in a subsequent patch for global SVE-related setup on arm64.

No functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
- 20 February 2019, 5 commits
Submitted by Colin Ian King

There is a spelling mistake in a kvm_err error message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Christoffer Dall

Instead of calling into kvm_timer_[un]schedule from the main kvm blocking path, test if the VCPU is on the wait queue from the load/put path and perform the background timer setup/cancel in this path. This has the distinct advantage that we no longer race between load/put and schedule/unschedule, and programming and canceling of the bg_timer always happens when the timer state is not loaded.

Note that we must now remove the checks in kvm_timer_blocking that do not schedule a background timer if one of the timers can fire, because we no longer have a guarantee that kvm_vcpu_check_block() will be called before kvm_timer_blocking.

Reported-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Christoffer Dall

In preparation for nested virtualization, where we are going to have more than a single VMID per VM, let's factor out the VMID data into a separate VMID data structure and change the VMID allocator to operate on this new structure instead of using a struct kvm.

This also means that update_vttbr now becomes update_vmid, and that the vttbr itself is generated on the fly based on the stage 2 page table base address and the vmid.

We cache the physical address of the pgd when allocating the pgd to avoid doing the calculation on every entry to the guest and to avoid calling into potentially non-hyp-mapped code from hyp/EL2.

If we wanted to merge the VMID allocator with the arm64 ASID allocator at some point in the future, it should actually become easier to do that after this patch.

Note that to avoid mapping the kvm_vmid_bits variable into hyp, we simply forego the masking of the vmid value in kvm_get_vttbr and rely on update_vmid to always assign a valid vmid value (within the supported range).

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
[maz: minor cleanups]
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
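The factored-out structure is small; roughly as below. This is a sketch of the shape described above, not necessarily the exact upstream definition.

```c
#include <linux/types.h>

/* Per-VM (and, with nesting, potentially per-nesting-level) VMID state,
 * roughly as factored out of struct kvm_arch by this patch. */
struct kvm_vmid {
        u64 vmid_gen;   /* generation in which this VMID was allocated */
        u32 vmid;       /* the hardware VMID value itself */
};
```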
Submitted by Marc Zyngier

We currently eagerly save/restore MPIDR. It turns out to be slightly pointless:
- On the host, this value is known as soon as we're scheduled on a physical CPU.
- In the guest, this value cannot change, as it is set by KVM (and this is a read-only register).

The result of the above is that we can perfectly avoid the eager saving of MPIDR_EL1 and only keep the restore. We just have to set up the host contexts appropriately at boot time.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Submitted by Marc Zyngier

Until now, we haven't differentiated between HYP calls that have a return value and those that don't. As we're about to change this, introduce kvm_call_hyp_ret(), and change all call sites that actually make use of a return value.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
- 07 February 2019, 1 commit
Submitted by Marc Zyngier

The current kvm_psci_vcpu_on implementation will directly try to manipulate the state of the VCPU to reset it. However, since this is not done on the thread that runs the VCPU, we can end up in a strangely corrupted state when the source and target VCPUs are running at the same time.

Fix this by factoring out all reset logic from the PSCI implementation and forwarding the required information along with a request to the target VCPU.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
- 18 December 2018, 2 commits
Submitted by Christoffer Dall

We recently addressed a VMID generation race by introducing a read/write lock around accesses and updates to the vmid generation values.

However, kvm_arch_vcpu_ioctl_run() also calls need_new_vmid_gen() but does so without taking the read lock. As far as I can tell, this can lead to the same kind of race:

  VM 0, VCPU 0                                VM 0, VCPU 1
  ------------                                ------------
  update_vttbr (vmid 254)
                                              update_vttbr (vmid 1) // roll over
                                              read_lock(kvm_vmid_lock);
                                              force_vm_exit()
  local_irq_disable
  need_new_vmid_gen == false // because vmid gen matches
  enter_guest (vmid 254)
                                              kvm_arch.vttbr = <PGD>:<VMID 1>
                                              read_unlock(kvm_vmid_lock);
                                              enter_guest (vmid 1)

Which results in running two VCPUs in the same VM with different VMIDs and (even worse) other VCPUs from other VMs could now allocate clashing VMID 254 from the new generation as long as VCPU 0 is not exiting.

Attempt to solve this by making sure vttbr is updated before another CPU can observe the updated VMID generation.

Cc: stable@vger.kernel.org
Fixes: f0cf47d9 "KVM: arm/arm64: Close VMID generation race"
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Mark Rutland

When we emulate a guest instruction, we don't advance the hardware singlestep state machine, and thus the guest will receive a software step exception after the next instruction which is not emulated by the host.

We bodge around this in an ad-hoc fashion. Sometimes we explicitly check whether userspace requested a single step, and fake a debug exception from within the kernel. Other times, we advance the HW singlestep state and rely on the HW to generate the exception for us. Thus, the observed step behaviour differs for host and guest.

Let's make this simpler and consistent by always advancing the HW singlestep state machine when we skip an instruction. Thus we can rely on the hardware to generate the singlestep exception for us, and never need to explicitly check for an active-pending step, nor do we need to fake a debug exception from the guest.

Cc: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
- 14 December 2018, 2 commits
Submitted by Paolo Bonzini

There are two problems with KVM_GET_DIRTY_LOG. First, and less important, it can take kvm->mmu_lock for an extended period of time. Second, its user can actually see many false positives in some cases. The latter is due to a benign race like this:

  1. KVM_GET_DIRTY_LOG returns a set of dirty pages and write protects them.
  2. The guest modifies the pages, causing them to be marked dirty.
  3. Userspace actually copies the pages.
  4. KVM_GET_DIRTY_LOG returns those pages as dirty again, even though they were not written to since (3).

This is especially a problem for large guests, where the time between (1) and (3) can be substantial. This patch introduces a new capability which, when enabled, makes KVM_GET_DIRTY_LOG not write-protect the pages it returns. Instead, userspace has to explicitly clear the dirty log bits just before using the content of the page. The new KVM_CLEAR_DIRTY_LOG ioctl can also operate on a 64-page granularity rather than requiring to sync a full memslot; this way, the mmu_lock is taken for small amounts of time, and only a small amount of time will pass between write protection of pages and the sending of their content.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
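A rough userspace sketch of the new flow follows. The capability name, the enable-argument usage, and the struct field names reflect this editor's understanding of the series and should be checked against the final uapi headers before being relied on.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Sketch: enable manual dirty-log reprotection, fetch the log (which no
 * longer write-protects), copy the pages, then clear only what was used. */
static int sync_dirty_pages(int vm_fd, __u32 slot, void *bitmap, __u64 npages)
{
        struct kvm_enable_cap cap = {
                .cap  = KVM_CAP_MANUAL_DIRTY_LOG_PROTECT, /* name per this series */
                .args = { 1 },  /* enable; exact args semantics are an assumption */
        };
        struct kvm_dirty_log log = { .slot = slot, .dirty_bitmap = bitmap };
        struct kvm_clear_dirty_log clear = {
                .slot = slot,
                .first_page = 0,
                .num_pages = npages,   /* may also be issued in 64-page chunks */
                .dirty_bitmap = bitmap,
        };

        if (ioctl(vm_fd, KVM_ENABLE_CAP, &cap) < 0)
                return -1;
        if (ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log) < 0)
                return -1;

        /* ... copy out the dirty pages indicated in 'bitmap' here ... */

        return ioctl(vm_fd, KVM_CLEAR_DIRTY_LOG, &clear);
}
```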
Submitted by Paolo Bonzini

When manual dirty log reprotection is enabled, kvm_get_dirty_log_protect's pointer argument will always be false on exit, because no TLB flush is needed until the manual re-protection operation. Rename it from "is_dirty" to "flush", which more accurately tells the caller what they have to do with it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- 10 December 2018, 1 commit
Submitted by Marc Zyngier

An SVE system is so far the only case where we mandate VHE. As we're starting to grow this set of requirements, let's slightly rework the way we deal with that situation, allowing for easy extension of this check.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
- 18 October 2018, 3 commits
Submitted by Dongjiu Geng

Commit 539aee0e ("KVM: arm64: Share the parts of get/set events useful to 32bit") shares the get/set events helper for arm64 and arm32, but forgot to share the cap extension code. User space will check whether KVM supports vcpu events by checking the KVM_CAP_VCPU_EVENTS extension.

Acked-by: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
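A minimal userspace probe for this capability would look roughly like this; the query uses the standard KVM_CHECK_EXTENSION mechanism rather than anything introduced by this patch, and the VM-fd form is shown because arm routes the check through the VM ioctl path.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Returns 1 if KVM_GET/SET_VCPU_EVENTS are available for this VM. */
static int have_vcpu_events(int vm_fd)
{
        int ret = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_VCPU_EVENTS);

        return ret > 0;
}
```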
Submitted by Dongjiu Geng

Rename kvm_arch_dev_ioctl_check_extension() to kvm_arch_vm_ioctl_check_extension(), because it does not have any relationship with the device. Renaming this function makes the code more readable.

Cc: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Mark Rutland

At boot time, KVM stashes the host MDCR_EL2 value, but only does this when the kernel is not running in hyp mode (i.e. is non-VHE). In these cases, the stashed value of MDCR_EL2.HPMN happens to be zero, which can lead to CONSTRAINED UNPREDICTABLE behaviour.

Since we use this value to derive the MDCR_EL2 value when switching to/from a guest, after a guest has been run the performance counters do not behave as expected. This has been observed to result in accesses via PMXEVTYPER_EL0 and PMXEVCNTR_EL0 not affecting the relevant counters, resulting in events not being counted. In these cases, only the fixed-purpose cycle counter appears to work as expected.

Fix this by always stashing the host MDCR_EL2 value, regardless of VHE.

Cc: Christopher Dall <christoffer.dall@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Fixes: 1e947bad ("arm64: KVM: Skip HYP setup when already running in HYP")
Tested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
- 03 October 2018, 3 commits
Submitted by Marc Zyngier

__cpu_init_stage2 doesn't do anything anymore on arm64, and is totally nonsensical if running VHE (as VHE is 64bit only).

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Marc Zyngier

VM tends to be a very overloaded term in KVM, so let's keep it to describe the virtual machine. For the virtual memory setup, let's use the "stage2" suffix.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Suzuki K Poulose

So far we have restricted the IPA size of the VM to the default value (40bits). Now that we can manage the IPA size per VM and support dynamic stage2 page tables, we can allow VMs to have larger IPA. This patch introduces the maximum IPA size supported on the host, which is decided by the following factors:

  1) Maximum PARange supported by the CPUs - this can be inferred from the system wide safe value.
  2) Maximum PA size supported by the host kernel (48 vs 52).
  3) Number of levels in the host page table (as we base our stage2 tables on the host table helpers).

Since the stage2 page table code is dependent on the stage1 page table, we always ensure that:

  Number of Levels at Stage1 >= Number of Levels at Stage2

So we limit the IPA to make sure that the above condition is satisfied. This will affect the following combinations of VA_BITS and IPA for different page sizes:

  Host configuration | Unsupported IPA ranges
  39bit VA, 4K       | [44, 48]
  36bit VA, 16K      | [41, 48]
  42bit VA, 64K      | [47, 52]

Supporting the above combinations needs independent stage2 page table manipulation code, which would require substantial changes. We could pursue that solution independently and switch the page table code once we have it ready.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <cdall@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
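For context, the larger IPA eventually becomes selectable from userspace in later patches of this series (not this one) through the type argument of KVM_CREATE_VM. The sketch below assumes the KVM_CAP_ARM_VM_IPA_SIZE / KVM_VM_TYPE_ARM_IPA_SIZE interface the series introduces.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Sketch: create a VM whose guest physical address space is 'ipa_bits'
 * wide, falling back to the default 40-bit IPA if unsupported. */
static int create_vm_with_ipa(int kvm_fd, int ipa_bits)
{
        int max = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_ARM_VM_IPA_SIZE);

        if (max > 0 && ipa_bits <= max)
                return ioctl(kvm_fd, KVM_CREATE_VM,
                             KVM_VM_TYPE_ARM_IPA_SIZE(ipa_bits));

        return ioctl(kvm_fd, KVM_CREATE_VM, 0);  /* default 40-bit IPA */
}
```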
- 01 October 2018, 2 commits
Submitted by Suzuki K Poulose

Right now the stage2 page table for a VM is hard coded, assuming an IPA of 40bits. As we are about to add support for per-VM IPA, prepare the stage2 page table helpers to accept the kvm instance to make the right decision for the VM. No functional changes.

Adds stage2_pgd_size(kvm) to replace S2_PGD_SIZE. Also moves some of the definitions in arm32 to align with the arm64 ones, and drops the _AC() specifier constants wherever possible.

Cc: Christoffer Dall <cdall@kernel.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Suzuki K Poulose

Allow the arch backends to perform VM specific initialisation. This will later be used to handle IPA size configuration and per-VM VTCR configuration on arm64.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <cdall@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
- 18 September 2018, 1 commit
Submitted by Vladimir Murzin

We rely on the cpufeature framework to detect and enable CNP, so for KVM we need to patch hyp to set the CNP bit just before TTBR0_EL2 gets written. For the guest we encode the CNP bit while building the vttbr, so we don't need to bother with that in the world switch.

Reviewed-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- 21 July 2018, 3 commits
Submitted by James Morse

arm64's new use of KVM's get_events/set_events API calls isn't just for RAS; it allows an SError that has been made pending by KVM as part of its device emulation to be migrated. Wire this up for 32bit too.

We only need to read/write the HCR_VA bit, and check that no esr has been provided, as we don't yet support VDFSR.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by James Morse

The get/set events helpers do some work to check that reserved and padding fields are zero. This is useful on 32bit too.

Move this code into virt/kvm/arm/arm.c, and give the arch code some underscores. This is temporarily hidden behind __KVM_HAVE_VCPU_EVENTS until 32bit is wired up.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Dongjiu Geng

For migrating VMs, user space may need to know the exception state. For example, if on machine A KVM makes an SError pending, then when migrating to machine B, KVM also needs to pend an SError.

This new IOCTL exports user-invisible states related to SError. Together with appropriate user space changes, user space can get/set the SError exception state to do migrate/snapshot/suspend.

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: James Morse <james.morse@arm.com>
[expanded documentation wording]
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
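A userspace sketch of saving and restoring the SError state across migration might look as follows. The serror_* field names match the arm64 struct kvm_vcpu_events as this editor recalls it; verify against the uapi header before relying on them.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Sketch: capture pending-SError state on the source vcpu and replay
 * it on the destination vcpu after migration. */
static int migrate_serror_state(int src_vcpu_fd, int dst_vcpu_fd)
{
        struct kvm_vcpu_events ev;

        if (ioctl(src_vcpu_fd, KVM_GET_VCPU_EVENTS, &ev) < 0)
                return -1;

        /* Only the SError-related fields matter here, e.g.
         * ev.exception.serror_pending, ev.exception.serror_has_esr,
         * ev.exception.serror_esr (field names assumed). */
        return ioctl(dst_vcpu_fd, KVM_SET_VCPU_EVENTS, &ev);
}
```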
- 09 July 2018, 1 commit
Submitted by Marc Zyngier

Trapping blocking WFE is extremely beneficial in situations where the system is oversubscribed, as it allows another thread to run while being blocked. In a non-oversubscribed environment, this is the complete opposite, and trapping WFE is just unnecessary overhead.

Let's only enable WFE trapping if the CPU has more than a single task to run (that is, more than just the vcpu thread).

Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
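Conceptually the check boils down to something like the sketch below, evaluated when the vcpu thread is loaded onto a physical CPU. The helper name is hypothetical; single_task_running(), HCR_TWE, and vcpu->arch.hcr_el2 are the real kernel pieces the idea builds on, but this is not the literal patch.

```c
#include <linux/kvm_host.h>
#include <linux/sched/stat.h>
#include <asm/kvm_arm.h>

/* Sketch of the idea (arm64 side); helper name is hypothetical. */
static void update_wfe_trap(struct kvm_vcpu *vcpu)
{
        if (single_task_running())
                /* Nothing else wants this CPU: let the guest WFE natively. */
                vcpu->arch.hcr_el2 &= ~HCR_TWE;
        else
                /* Oversubscribed: trap WFE so we can schedule someone else. */
                vcpu->arch.hcr_el2 |= HCR_TWE;
}
```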
- 20 June 2018, 1 commit
Submitted by Peter Zijlstra

Since swait basically implemented exclusive waits only, make sure the API reflects that.

  $ git grep -l -e "\<swake_up\>" -e "\<swait_event[^ (]*" -e "\<prepare_to_swait\>" |
    while read file; do sed -i -e 's/\<swake_up\>/&_one/g' -e 's/\<swait_event[^ (]*/&_exclusive/g' -e 's/\<prepare_to_swait\>/&_exclusive/g' $file; done

With a few manual touch-ups.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: bigeasy@linutronix.de
Cc: oleg@redhat.com
Cc: paulmck@linux.vnet.ibm.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180612083909.261946548@infradead.org
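The call-site effect of the rename can be illustrated with a small sketch; the queue and wait-entry variables are placeholders rather than anything from a specific subsystem.

```c
#include <linux/sched.h>
#include <linux/swait.h>

/* Waiter side: the exclusive nature of the wait is now explicit. */
static void waiter_example(struct swait_queue_head *wq,
                           struct swait_queue *wait)
{
        /* Old: prepare_to_swait(wq, wait, TASK_INTERRUPTIBLE); */
        prepare_to_swait_exclusive(wq, wait, TASK_INTERRUPTIBLE);
        /* ... check condition, schedule() ... */
        finish_swait(wq, wait);
}

/* Waker side: only one waiter is woken, and the name now says so. */
static void waker_example(struct swait_queue_head *wq)
{
        /* Old: swake_up(wq); */
        swake_up_one(wq);
}
```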
- 02 June 2018, 2 commits
Submitted by Marc Orr

The kvm struct has been bloating. For example, it's tens of kilobytes for x86, which turns out to be a large amount of memory to allocate contiguously via kzalloc. Thus, this patch does the following:

  1. Uses architecture-specific routines to allocate the kvm struct via vzalloc for x86.
  2. Switches arm to __KVM_HAVE_ARCH_VM_ALLOC so that it can use vzalloc when has_vhe() is true.

Other architectures continue to default to kalloc, as they have a dependency on kalloc or have a small-enough struct kvm.

Signed-off-by: Marc Orr <marcorr@google.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
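On the arm side, the override enabled by __KVM_HAVE_ARCH_VM_ALLOC ends up looking roughly like the sketch below, paired with a matching free routine. Treat it as an approximation of the patch rather than a verbatim excerpt.

```c
#include <linux/kvm_host.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>

/* Roughly what the arm/arm64 override looks like: stick with kzalloc on
 * non-VHE systems (where the struct must be hyp-mappable and contiguous),
 * use vzalloc otherwise to avoid large contiguous allocations. */
struct kvm *kvm_arch_alloc_vm(void)
{
        if (!has_vhe())
                return kzalloc(sizeof(struct kvm), GFP_KERNEL);

        return vzalloc(sizeof(struct kvm));
}

void kvm_arch_free_vm(struct kvm *kvm)
{
        if (!has_vhe())
                kfree(kvm);
        else
                vfree(kvm);
}
```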
Submitted by Souptick Joarder

Use the new return type vm_fault_t for fault handlers. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type.

See commit 1c8f4220 ("mm: change return type to vm_fault_t").

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
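The conversion is mechanical; KVM's vcpu mmap fault handler is a convenient illustration, since its body is the trivial one that returns SIGBUS for any fault.

```c
#include <linux/kvm_host.h>
#include <linux/mm.h>

/* Before this change the handler was declared as returning int.
 * The new signature documents that a VM_FAULT_* code is returned. */
vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
{
        return VM_FAULT_SIGBUS;
}
```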
- 01 June 2018, 1 commit
Submitted by Marc Zyngier

In order to offer ARCH_WORKAROUND_2 support to guests, we need a bit of infrastructure. Let's add a flag indicating whether or not the guest uses SSBD mitigation. Depending on the state of this flag, allow KVM to disable ARCH_WORKAROUND_2 before entering the guest, and enable it when exiting it.

Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- 25 May 2018, 4 commits
Submitted by Eric Auger

kvm_vgic_vcpu_early_init gets called after kvm_vgic_vcpu_init, which is confusing. The call path is as follows:

  kvm_vm_ioctl_create_vcpu
    |_ kvm_arch_vcpu_create
       |_ kvm_vcpu_init
          |_ kvm_arch_vcpu_init
             |_ kvm_vgic_vcpu_init
    |_ kvm_arch_vcpu_postcreate
       |_ kvm_vgic_vcpu_early_init

Static initialization currently done in kvm_vgic_vcpu_early_init() can be moved to kvm_vgic_vcpu_init(). So let's move the code and remove kvm_vgic_vcpu_early_init(). kvm_arch_vcpu_postcreate() then does nothing.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Dave Martin

Now that the host SVE context can be saved on demand from Hyp, there is no longer any need to save this state in advance before entering the guest. This patch removes the relevant call to kvm_fpsimd_flush_cpu_state().

Since the problem that function was intended to solve no longer exists, the function and its dependencies are also deleted.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Dave Martin

This patch adds SVE context saving to the hyp FPSIMD context switch path. This means that it is no longer necessary to save the host SVE state in advance of entering the guest, when SVE is in use.

In order to avoid adding pointless complexity to the code, VHE is assumed if SVE is in use. VHE is an architectural prerequisite for SVE, so there is no good reason to turn CONFIG_ARM64_VHE off in kernels that support both SVE and KVM.

Historically, software models exist that can expose the architecturally invalid configuration of SVE without VHE, so if this situation is detected at kvm_init() time then KVM will be disabled.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Dave Martin

This patch refactors KVM to align the host and guest FPSIMD save/restore logic with each other for arm64. This reduces the number of redundant save/restore operations that must occur, and reduces the common-case IRQ blackout time during guest exit storms by saving the host state lazily and optimising away the need to restore the host state before returning to the run loop.

Four hooks are defined in order to enable this (see the placement sketch after this list):

  * kvm_arch_vcpu_run_map_fp(): Called on PID change to map necessary bits of current to Hyp.
  * kvm_arch_vcpu_load_fp(): Set up FP/SIMD for entering the KVM run loop (parse as "vcpu_load fp").
  * kvm_arch_vcpu_ctxsync_fp(): Get FP/SIMD into a safe state for re-enabling interrupts after a guest exit back to the run loop. For arm64 specifically, this involves updating the host kernel's FPSIMD context tracking metadata so that kernel-mode NEON use will cause the vcpu's FPSIMD state to be saved back correctly into the vcpu struct. This must be done before re-enabling interrupts because kernel-mode NEON may be used by softirqs.
  * kvm_arch_vcpu_put_fp(): Save guest FP/SIMD state back to memory and dissociate from the CPU ("vcpu_put fp").

Also, the arm64 FPSIMD context switch code is updated to enable it to save back FPSIMD state for a vcpu, not just current. A few helpers drive this:

  * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp): mark this CPU as having context fp (which may belong to a vcpu) currently loaded in its registers. This is the non-task equivalent of the static function fpsimd_bind_to_cpu() in fpsimd.c.
  * task_fpsimd_save(): exported to allow KVM to save the guest's FPSIMD state back to memory on exit from the run loop.
  * fpsimd_flush_state(): invalidate any context's FPSIMD state that is currently loaded. Used to disassociate the vcpu from the CPU regs on run loop exit.

These changes allow the run loop to enable interrupts (and thus softirqs that may use kernel-mode NEON) without having to save the guest's FPSIMD state eagerly.

Some new vcpu_arch fields are added to make all this work. Because host FPSIMD state can now be saved back directly into current's thread_struct as appropriate, host_cpu_context is no longer used for preserving the FPSIMD state. However, it is still needed for preserving other things such as the host's system registers. To avoid ABI churn, the redundant storage space in host_cpu_context is not removed for now.

arch/arm is not addressed by this patch and continues to use its current save/restore logic. It could provide implementations of the helpers later if desired.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
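As an orientation aid, the intended call sites of the four hooks can be sketched as below. This is a simplified reconstruction from the description above, not the literal patch; the surrounding function names are the usual KVM/arm entry points of that era.

```c
#include <linux/kvm_host.h>

/* Simplified placement of the FP/SIMD hooks described above. */

void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
        /* ... existing load work ... */
        kvm_arch_vcpu_load_fp(vcpu);        /* "vcpu_load fp" */
}

void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
{
        kvm_arch_vcpu_put_fp(vcpu);         /* save guest FP/SIMD, dissociate CPU */
        /* ... existing put work ... */
}

int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
        int ret = 0;

        /* On PID change, map the needed bits of current into Hyp. */
        kvm_arch_vcpu_run_map_fp(vcpu);

        while (ret == 0) {
                local_irq_disable();
                /* ... enter guest, exit guest, set ret on exit reason ... */

                /* Make FP/SIMD safe before IRQs (and softirq NEON) return. */
                kvm_arch_vcpu_ctxsync_fp(vcpu);
                local_irq_enable();
        }
        return ret;
}
```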
- 17 April 2018, 1 commit
Submitted by Marc Zyngier

Before entering the guest, we check whether our VMID is still part of the current generation. In order to avoid taking a lock, we start with checking that the generation is still current, and only if not current do we take the lock, recheck, and update the generation and VMID.

This leaves open a small race: a vcpu can bump up the global generation number as well as the VM's, but has not updated the VMID itself yet. At that point another vcpu from the same VM comes in, checks the generation (and finds it not needing anything), and jumps into the guest. At this point, we end up with two vcpus belonging to the same VM running with two different VMIDs. Eventually, the VMID used by the second vcpu will get reassigned, and things will really go wrong...

A simple solution would be to drop this initial check and always take the lock, but this is likely to cause performance issues. A middle ground is to convert the spinlock to a rwlock, and only take the read lock on the fast path. If the check fails at that point, drop it and acquire the write lock, rechecking the condition. This ensures that the above scenario doesn't occur.

Cc: stable@vger.kernel.org
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
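The resulting update path follows the classic double-checked locking shape. A sketch, using the need_new_vmid_gen() predicate named elsewhere in this log and eliding the rollover details, is:

```c
/* Sketch of the rwlock fast/slow path described above; details elided. */
static void update_vttbr(struct kvm *kvm)
{
        read_lock(&kvm_vmid_lock);
        if (!need_new_vmid_gen(kvm)) {
                /* Fast path: generation still current, nothing to do. */
                read_unlock(&kvm_vmid_lock);
                return;
        }
        read_unlock(&kvm_vmid_lock);

        write_lock(&kvm_vmid_lock);
        if (need_new_vmid_gen(kvm)) {
                /* Recheck under the write lock before bumping the
                 * generation and assigning a fresh VMID. */
                /* ... roll over generation, allocate VMID, update vttbr ... */
        }
        write_unlock(&kvm_vmid_lock);
}
```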
- 19 March 2018, 4 commits
Submitted by Christoffer Dall

Just like we can program the GICv2 hypervisor control interface directly from the core vgic code, we can do the same for the GICv3 hypervisor control interface on VHE systems.

We do this by simply calling the save/restore functions when we have VHE, and we can then get rid of the save/restore function calls from the VHE world switch function.

One caveat is that we now write GICv3 system register state before the potential early exit path in the run loop, and because we sync back state in the early exit path, we have to ensure that we read a consistent GIC state from the sync path, even though we have never actually run the guest with the newly written GIC state. We solve this by inserting an ISB in the early exit path.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Christoffer Dall

So far this is mostly (see below) a copy of the legacy non-VHE switch function, but we will start reworking these functions in separate directions to work on VHE and non-VHE in the most optimal way in later patches.

The only difference after this patch between the VHE and non-VHE run functions is that we omit the branch-predictor variant-2 hardening for QC Falkor CPUs, because this workaround is specific to a series of non-VHE ARMv8.0 CPUs.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Christoffer Dall

As we are about to move a bunch of save/restore logic for VHE kernels to the load and put functions, we need some infrastructure to do this.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Submitted by Christoffer Dall

We currently have a separate read-modify-write of HCR_EL2 on entry to the guest for the sole purpose of setting the VF and VI bits, if set. Since this is rarely the case (only when using a userspace IRQ chip and interrupts are in flight), let's get rid of this operation and instead modify the bits in vcpu->arch.hcr[_el2] directly when needed.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>