1. 19 April 2019, 1 commit
  2. 29 March 2019, 2 commits
    • KVM: arm/arm64: Add KVM_ARM_VCPU_FINALIZE ioctl · 7dd32a0d
      Committed by Dave Martin
      Some aspects of vcpu configuration may be too complex to be
      completed inside KVM_ARM_VCPU_INIT.  Thus, there may be a
      requirement for userspace to do some additional configuration
      before various other ioctls will work in a consistent way.
      
      In particular this will be the case for SVE, where userspace will
      need to negotiate the set of vector lengths to be made available to
      the guest before the vcpu becomes fully usable.
      
      In order to provide an explicit way for userspace to confirm that
      it has finished setting up a particular vcpu feature, this patch
      adds a new ioctl KVM_ARM_VCPU_FINALIZE.
      
      When userspace has opted into a feature that requires finalization,
      typically by means of a feature flag passed to KVM_ARM_VCPU_INIT, a
      matching call to KVM_ARM_VCPU_FINALIZE is now required before
      KVM_RUN or KVM_GET_REG_LIST is allowed.  Individual features may
      impose additional restrictions where appropriate.
      
      No existing vcpu features are affected by this, so current
      userspace implementations will continue to work exactly as before,
      with no need to issue KVM_ARM_VCPU_FINALIZE.
      
      As implemented in this patch, KVM_ARM_VCPU_FINALIZE is currently a
      placeholder: no finalizable features exist yet, so the ioctl is not
      required and will always yield EINVAL.  Subsequent patches will add
      the finalization logic to make use of this ioctl for SVE.
      
      No functional change for existing userspace.
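      
      A minimal userspace sketch of the resulting call order (hedged: the
      KVM_ARM_VCPU_SVE feature flag is only added by the later SVE patches
      and is used here purely for illustration):
      
        #include <sys/ioctl.h>
        #include <linux/kvm.h>
        
        /* Sketch only: error handling elided. */
        int init_and_finalize(int vcpu_fd, struct kvm_vcpu_init *init)
        {
                int feature = KVM_ARM_VCPU_SVE; /* assumed finalizable feature */
        
                /* Opt into the feature at init time... */
                init->features[0] |= 1 << KVM_ARM_VCPU_SVE;
                if (ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, init))
                        return -1;
        
                /* ...then a matching finalize is required before KVM_RUN
                 * or KVM_GET_REG_LIST will succeed. */
                return ioctl(vcpu_fd, KVM_ARM_VCPU_FINALIZE, &feature);
        }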
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Julien Thierry <julien.thierry@arm.com>
      Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • KVM: arm/arm64: Add hook for arch-specific KVM initialisation · 0f062bfe
      Committed by Dave Martin
      This patch adds a kvm_arm_init_arch_resources() hook to perform
      subarch-specific initialisation when starting up KVM.
      
      This will be used in a subsequent patch for global SVE-related
      setup on arm64.
      
      No functional change.
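      
      A rough sketch of the shape of such a hook (hedged: the wiring and
      placement are simplified; the patch itself only adds the hook and a
      call to it during init):
      
        /* Default for subarches that have nothing to set up. */
        static inline int kvm_arm_init_arch_resources(void) { return 0; }
        
        /* In the common KVM startup path: */
        int kvm_arch_init(void *opaque)
        {
                int err = kvm_arm_init_arch_resources();
        
                if (err)
                        return err;
                /* ... remainder of KVM initialisation ... */
                return 0;
        }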
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Julien Thierry <julien.thierry@arm.com>
      Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  3. 20 February 2019, 5 commits
  4. 07 February 2019, 1 commit
  5. 18 December 2018, 2 commits
    • KVM: arm/arm64: Fix VMID alloc race by reverting to lock-less · fb544d1c
      Committed by Christoffer Dall
      We recently addressed a VMID generation race by introducing a read/write
      lock around accesses and updates to the vmid generation values.
      
      However, kvm_arch_vcpu_ioctl_run() also calls need_new_vmid_gen() but
      does so without taking the read lock.
      
      As far as I can tell, this can lead to the same kind of race:
      
        VM 0, VCPU 0			VM 0, VCPU 1
        ------------			------------
        update_vttbr (vmid 254)
        				update_vttbr (vmid 1) // roll over
      				read_lock(kvm_vmid_lock);
      				force_vm_exit()
        local_irq_disable
        need_new_vmid_gen == false //because vmid gen matches
      
        enter_guest (vmid 254)
        				kvm_arch.vttbr = <PGD>:<VMID 1>
      				read_unlock(kvm_vmid_lock);
      
        				enter_guest (vmid 1)
      
      Which results in running two VCPUs in the same VM with different VMIDs
      and (even worse) other VCPUs from other VMs could now allocate clashing
      VMID 254 from the new generation as long as VCPU 0 is not exiting.
      
      Attempt to solve this by making sure vttbr is updated before another CPU
      can observe the updated VMID generation.
      
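      A sketch of the resulting ordering (a hedged simplification of the
      lock-less scheme; names follow the surrounding code, details are
      elided):
      
        static bool need_new_vmid_gen(struct kvm *kvm)
        {
                u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
        
                /* Orders the read of kvm_vmid_gen before kvm->arch.vmid_gen */
                smp_rmb();
                return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
        }
        
        /* In update_vttbr(), after allocating the new VMID: */
        kvm->arch.vttbr = pgd_phys | vmid;
        smp_wmb(); /* publish vttbr before the new generation becomes visible */
        WRITE_ONCE(kvm->arch.vmid_gen, atomic64_read(&kvm_vmid_gen));
      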
      Cc: stable@vger.kernel.org
      Fixes: f0cf47d9 ("KVM: arm/arm64: Close VMID generation race")
      Reviewed-by: Julien Thierry <julien.thierry@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • arm64: KVM: Consistently advance singlestep when emulating instructions · bd7d95ca
      Committed by Mark Rutland
      When we emulate a guest instruction, we don't advance the hardware
      singlestep state machine, and thus the guest will receive a software
      step exception after the next instruction that is not emulated by the
      host.
      
      We bodge around this in an ad-hoc fashion. Sometimes we explicitly check
      whether userspace requested a single step, and fake a debug exception
      from within the kernel. Other times, we advance the HW singlestep state
      machine and rely on the HW to generate the exception for us. Thus, the
      observed step behaviour differs for host and guest.
      
      Let's make this simpler and consistent by always advancing the HW
      singlestep state machine when we skip an instruction. Thus we can rely
      on the hardware to generate the singlestep exception for us, and never
      need to explicitly check for an active-pending step, nor do we need to
      fake a debug exception from the guest.
      
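      A hedged sketch of the idea (the helper name is illustrative, not
      the exact patch):
      
        /* When skipping an emulated instruction, also clear PSTATE.SS so
         * the HW singlestep state machine advances exactly as it would
         * for a natively executed instruction; the step exception then
         * fires on the next instruction with no software faking. */
        static void skip_instr_and_advance_step(struct kvm_vcpu *vcpu, bool is_wide)
        {
                *vcpu_pc(vcpu) += is_wide ? 4 : 2;
                *vcpu_cpsr(vcpu) &= ~DBG_SPSR_SS;
        }
      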
      Cc: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  6. 14 December 2018, 2 commits
    • kvm: introduce manual dirty log reprotect · 2a31b9db
      Committed by Paolo Bonzini
      There are two problems with KVM_GET_DIRTY_LOG.  First, and less important,
      it can take kvm->mmu_lock for an extended period of time.  Second, its user
      can actually see many false positives in some cases.  The latter is due
      to a benign race like this:
      
        1. KVM_GET_DIRTY_LOG returns a set of dirty pages and write protects
           them.
        2. The guest modifies the pages, causing them to be marked dirty.
        3. Userspace actually copies the pages.
        4. KVM_GET_DIRTY_LOG returns those pages as dirty again, even though
           they were not written to since (3).
      
      This is especially a problem for large guests, where the time between
      (1) and (3) can be substantial.  This patch introduces a new
      capability which, when enabled, makes KVM_GET_DIRTY_LOG not
      write-protect the pages it returns.  Instead, userspace has to
      explicitly clear the dirty log bits just before using the content
      of the page.  The new KVM_CLEAR_DIRTY_LOG ioctl can also operate at a
      64-page granularity rather than requiring a sync of the full memslot;
      this way, the mmu_lock is taken for small amounts of time, and
      only a small amount of time will pass between write protection
      of pages and the sending of their content.
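      
      A hedged userspace sketch of the new flow (the capability name here,
      KVM_CAP_MANUAL_DIRTY_LOG_PROTECT, is the one introduced by this
      patch; it was revised in later kernels):
      
        #include <sys/ioctl.h>
        #include <linux/kvm.h>
        
        /* Sketch only: error handling and the actual page copying are
         * elided, and npages is assumed to be a multiple of 64. */
        void sync_dirty_slot(int vm_fd, __u32 slot, void *bitmap, __u64 npages)
        {
                struct kvm_dirty_log log = { .slot = slot };
        
                log.dirty_bitmap = bitmap;
                /* With the capability enabled, this no longer write-protects. */
                ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);
        
                for (__u64 first = 0; first < npages; first += 64) {
                        struct kvm_clear_dirty_log clr = {
                                .slot = slot,
                                .first_page = first,
                                .num_pages = 64,
                        };
        
                        /* ... copy out the pages covered by this window ... */
                        clr.dirty_bitmap = bitmap; /* bits to re-protect */
                        ioctl(vm_fd, KVM_CLEAR_DIRTY_LOG, &clr);
                }
        }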
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: rename last argument to kvm_get_dirty_log_protect · 8fe65a82
      Committed by Paolo Bonzini
      Once manual dirty log reprotection is enabled, kvm_get_dirty_log_protect's
      pointer argument will always be false on exit, because no TLB flush is needed
      until the manual re-protection operation.  Rename it from "is_dirty" to "flush",
      which more accurately tells the caller what they have to do with it.
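      
      For reference, a sketch of the renamed prototype:
      
        int kvm_get_dirty_log_protect(struct kvm *kvm,
                                      struct kvm_dirty_log *log, bool *flush);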
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  7. 10 December 2018, 1 commit
  8. 18 October 2018, 3 commits
  9. 03 October 2018, 3 commits
  10. 01 October 2018, 2 commits
  11. 18 September 2018, 1 commit
  12. 21 July 2018, 3 commits
  13. 09 July 2018, 1 commit
  14. 20 June 2018, 1 commit
  15. 02 June 2018, 2 commits
  16. 01 June 2018, 1 commit
  17. 25 May 2018, 4 commits
    • KVM: arm/arm64: Remove kvm_vgic_vcpu_early_init · 5ec17fba
      Committed by Eric Auger
      kvm_vgic_vcpu_early_init gets called after kvm_vgic_vcpu_init, which
      is confusing. The call path is as follows:
      kvm_vm_ioctl_create_vcpu
      |_ kvm_arch_vcpu_create
         |_ kvm_vcpu_init
            |_ kvm_arch_vcpu_init
               |_ kvm_vgic_vcpu_init
      |_ kvm_arch_vcpu_postcreate
         |_ kvm_vgic_vcpu_early_init
      
      The static initialization currently done in kvm_vgic_vcpu_early_init()
      can be moved to kvm_vgic_vcpu_init(). So let's move the code and
      remove kvm_vgic_vcpu_early_init(); kvm_arch_vcpu_postcreate() is then
      left doing nothing.
      Signed-off-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • KVM: arm64: Remove eager host SVE state saving · 21cdd7fd
      Committed by Dave Martin
      Now that the host SVE context can be saved on demand from Hyp,
      there is no longer any need to save this state in advance before
      entering the guest.
      
      This patch removes the relevant call to
      kvm_fpsimd_flush_cpu_state().
      
      Since the problem that function was intended to solve no longer
      exists, the function and its dependencies are also deleted.
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Acked-by: Christoffer Dall <christoffer.dall@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • KVM: arm64: Save host SVE context as appropriate · 85acda3b
      Committed by Dave Martin
      This patch adds SVE context saving to the hyp FPSIMD context switch
      path.  This means that it is no longer necessary to save the host
      SVE state in advance of entering the guest when SVE is in use.
      
      In order to avoid adding pointless complexity to the code, VHE is
      assumed if SVE is in use.  VHE is an architectural prerequisite for
      SVE, so there is no good reason to turn CONFIG_ARM64_VHE off in
      kernels that support both SVE and KVM.
      
      Historically, software models exist that can expose the
      architecturally invalid configuration of SVE without VHE, so if
      this situation is detected at kvm_init() time then KVM will be
      disabled.
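      
      A sketch of that detection (hedged: the message text and its exact
      placement in the init path are illustrative):
      
        /* Refuse to initialise KVM on the architecturally invalid
         * combination of SVE support without VHE. */
        if (system_supports_sve() && !has_vhe()) {
                kvm_pr_unimpl("SVE system without VHE unsupported, not initialising KVM\n");
                return -ENODEV;
        }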
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing · e6b673b7
      Committed by Dave Martin
      This patch refactors KVM to align the host and guest FPSIMD
      save/restore logic with each other for arm64.  This reduces the
      number of redundant save/restore operations that must occur, and
      reduces the common-case IRQ blackout time during guest exit storms
      by saving the host state lazily and optimising away the need to
      restore the host state before returning to the run loop.
      
      Four hooks are defined in order to enable this:
      
       * kvm_arch_vcpu_run_map_fp():
         Called on PID change to map necessary bits of current to Hyp.
      
       * kvm_arch_vcpu_load_fp():
         Set up FP/SIMD for entering the KVM run loop (parse as
         "vcpu_load fp").
      
       * kvm_arch_vcpu_ctxsync_fp():
         Get FP/SIMD into a safe state for re-enabling interrupts after a
         guest exit back to the run loop.
      
         For arm64 specifically, this involves updating the host kernel's
         FPSIMD context tracking metadata so that kernel-mode NEON use
         will cause the vcpu's FPSIMD state to be saved back correctly
         into the vcpu struct.  This must be done before re-enabling
         interrupts because kernel-mode NEON may be used by softirqs.
      
       * kvm_arch_vcpu_put_fp():
         Save guest FP/SIMD state back to memory and dissociate from the
         CPU ("vcpu_put fp").
      
      Also, the arm64 FPSIMD context switch code is updated to enable it
      to save back FPSIMD state for a vcpu, not just current.  A few
      helpers drive this:
      
       * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp):
         mark this CPU as having context fp (which may belong to a vcpu)
         currently loaded in its registers.  This is the non-task
         equivalent of the static function fpsimd_bind_to_cpu() in
         fpsimd.c.
      
       * task_fpsimd_save():
         exported to allow KVM to save the guest's FPSIMD state back to
         memory on exit from the run loop.
      
       * fpsimd_flush_state():
         invalidate any context's FPSIMD state that is currently loaded.
         Used to disassociate the vcpu from the CPU regs on run loop exit.
      
      These changes allow the run loop to enable interrupts (and thus
      softirqs that may use kernel-mode NEON) without having to save the
      guest's FPSIMD state eagerly.
      
      Some new vcpu_arch fields are added to make all this work.  Because
      host FPSIMD state can now be saved back directly into current's
      thread_struct as appropriate, host_cpu_context is no longer used
      for preserving the FPSIMD state.  However, it is still needed for
      preserving other things such as the host's system registers.  To
      avoid ABI churn, the redundant storage space in host_cpu_context is
      not removed for now.
      
      arch/arm is not addressed by this patch and continues to use its
      current save/restore logic.  It could provide implementations of
      the helpers later if desired.
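      
      A hedged sketch of the relative ordering of the hooks (not the
      actual call sites: kvm_arch_vcpu_load_fp()/kvm_arch_vcpu_put_fp()
      are really invoked from the vcpu_load/vcpu_put paths, and the loop
      condition here is a stand-in):
      
        /* Simplified view of the vcpu run loop. */
        kvm_arch_vcpu_load_fp(vcpu);            /* entering the run loop */
        while (run_loop_should_continue) {
                local_irq_disable();
                /* ... enter and exit the guest ... */
                kvm_arch_vcpu_ctxsync_fp(vcpu); /* before IRQs come back on,
                                                 * so softirq NEON is safe */
                local_irq_enable();
        }
        kvm_arch_vcpu_put_fp(vcpu);             /* leaving: save guest FP and
                                                 * dissociate from the CPU */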
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  18. 17 April 2018, 1 commit
    • KVM: arm/arm64: Close VMID generation race · f0cf47d9
      Committed by Marc Zyngier
      Before entering the guest, we check whether our VMID is still
      part of the current generation. In order to avoid taking a lock,
      we start with checking that the generation is still current, and
      only if not current do we take the lock, recheck, and update the
      generation and VMID.
      
      This leaves open a small race: A vcpu can bump up the global
      generation number as well as the VM's, but has not updated
      the VMID itself yet.
      
      At that point another vcpu from the same VM comes in, checks
      the generation (and finds it not needing anything), and jumps
      into the guest. At this point, we end-up with two vcpus belonging
      to the same VM running with two different VMIDs. Eventually, the
      VMID used by the second vcpu will get reassigned, and things will
      really go wrong...
      
      A simple solution would be to drop this initial check, and always take
      the lock. This is likely to cause performance issues. A middle ground
      is to convert the spinlock to a rwlock, and only take the read lock
      on the fast path. If the check fails at that point, drop it and
      acquire the write lock, rechecking the condition.
      
      This ensures that the above scenario doesn't occur.
      
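      A hedged sketch of the fast/slow path described above (simplified
      from the actual update_vttbr() change):
      
        static DEFINE_RWLOCK(kvm_vmid_lock);
        
        static void update_vttbr(struct kvm *kvm)
        {
                /* Fast path: only the read lock is taken. */
                read_lock(&kvm_vmid_lock);
                if (!need_new_vmid_gen(kvm)) {
                        read_unlock(&kvm_vmid_lock);
                        return;
                }
                read_unlock(&kvm_vmid_lock);
        
                /* Slow path: generation rollover under the write lock. */
                write_lock(&kvm_vmid_lock);
                if (need_new_vmid_gen(kvm)) { /* recheck: we may have raced */
                        /* bump kvm_vmid_gen, force vm exits, allocate a
                         * fresh VMID and update kvm->arch.vttbr */
                }
                write_unlock(&kvm_vmid_lock);
        }
      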
      Cc: stable@vger.kernel.org
      Reported-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Shannon Zhao <zhaoshenglong@huawei.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  19. 19 March 2018, 4 commits