1. 25 May 2018, 2 commits
    • KVM: arm64: Save host SVE context as appropriate · 85acda3b
      Authored by Dave Martin
      This patch adds SVE context saving to the hyp FPSIMD context
      switch path.  This means that the host's SVE state no longer needs
      to be saved in advance of entering the guest when SVE is in use.
      
      In order to avoid adding pointless complexity to the code, VHE is
      assumed if SVE is in use.  VHE is an architectural prerequisite for
      SVE, so there is no good reason to turn CONFIG_ARM64_VHE off in
      kernels that support both SVE and KVM.
      
      Historically, software models have existed that can expose the
      architecturally invalid configuration of SVE without VHE, so if
      this situation is detected at kvm_init() time, KVM is disabled.
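      
      As a minimal sketch (assuming the existing arm64 cpufeature
      helpers system_supports_sve() and has_vhe(); the function name,
      message, and error path below are illustrative, not the verbatim
      patch), the init-time check could look like:
      
        /* Illustrative: reject the architecturally invalid
         * SVE-without-VHE combination when KVM initialises. */
        static int kvm_check_sve_vhe(void)
        {
                if (system_supports_sve() && !has_vhe()) {
                        pr_warn("KVM: SVE without VHE unsupported, KVM disabled\n");
                        return -ENODEV; /* KVM stays disabled */
                }
                return 0;
        }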
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing · e6b673b7
      Authored by Dave Martin
      This patch refactors KVM to align the host and guest FPSIMD
      save/restore logic with each other for arm64.  This reduces the
      number of redundant save/restore operations that must occur, and
      reduces the common-case IRQ blackout time during guest exit storms
      by saving the host state lazily and optimising away the need to
      restore the host state before returning to the run loop.
      
      Four hooks are defined in order to enable this (possible
      signatures are sketched after the list):
      
       * kvm_arch_vcpu_run_map_fp():
         Called on PID change to map necessary bits of current to Hyp.
      
       * kvm_arch_vcpu_load_fp():
         Set up FP/SIMD for entering the KVM run loop (parse as
         "vcpu_load fp").
      
       * kvm_arch_vcpu_ctxsync_fp():
         Get FP/SIMD into a safe state for re-enabling interrupts after a
         guest exit back to the run loop.
      
         For arm64 specifically, this involves updating the host kernel's
         FPSIMD context tracking metadata so that kernel-mode NEON use
         will cause the vcpu's FPSIMD state to be saved back correctly
         into the vcpu struct.  This must be done before re-enabling
         interrupts because kernel-mode NEON may be used by softirqs.
      
       * kvm_arch_vcpu_put_fp():
         Save guest FP/SIMD state back to memory and dissociate from the
         CPU ("vcpu_put fp").
      
      Also, the arm64 FPSIMD context switch code is updated to enable it
      to save back FPSIMD state for a vcpu, not just current.  A few
      helpers drive this (a simplified model follows the list):
      
       * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp):
         mark this CPU as having context fp (which may belong to a vcpu)
         currently loaded in its registers.  This is the non-task
         equivalent of the static function fpsimd_bind_to_cpu() in
         fpsimd.c.
      
       * task_fpsimd_save():
         exported to allow KVM to save the guest's FPSIMD state back to
         memory on exit from the run loop.
      
       * fpsimd_flush_state():
         invalidate any context's FPSIMD state that is currently loaded.
         Used to disassociate the vcpu from the CPU regs on run loop exit.
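      
      A simplified model of the per-CPU binding these helpers manage is
      shown below: a per-CPU pointer records which FPSIMD context (task
      or vcpu) currently owns the registers, and NULL means "nothing
      trusted in the registers".  The names here are illustrative, not
      the kernel's exact implementation.
      
        #include <linux/percpu.h>
        
        static DEFINE_PER_CPU(struct user_fpsimd_state *, fp_owner);
        
        /* non-task analogue of fpsimd_bind_to_cpu(): the registers on
         * this CPU now hold the context fp (which may be a vcpu's). */
        static void bind_state_to_cpu(struct user_fpsimd_state *fp)
        {
                __this_cpu_write(fp_owner, fp);
        }
        
        /* invalidate whatever context is loaded, forcing a reload the
         * next time that context's owner needs its registers. */
        static void flush_loaded_state(void)
        {
                __this_cpu_write(fp_owner, NULL);
        }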
      
      These changes allow the run loop to enable interrupts (and thus
      softirqs that may use kernel-mode NEON) without having to save the
      guest's FPSIMD state eagerly.
      
      Some new vcpu_arch fields are added to make all this work.  Because
      host FPSIMD state can now be saved back directly into current's
      thread_struct as appropriate, host_cpu_context is no longer used
      for preserving the FPSIMD state.  However, it is still needed for
      preserving other things such as the host's system registers.  To
      avoid ABI churn, the redundant storage space in host_cpu_context is
      not removed for now.
      
      arch/arm is not addressed by this patch and continues to use its
      current save/restore logic.  It could provide implementations of
      the helpers later if desired.
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  2. 17 April 2018, 1 commit
    • KVM: arm/arm64: Close VMID generation race · f0cf47d9
      Authored by Marc Zyngier
      Before entering the guest, we check whether our VMID is still
      part of the current generation. In order to avoid taking a lock,
      we start with checking that the generation is still current, and
      only if not current do we take the lock, recheck, and update the
      generation and VMID.
      
      This leaves open a small race: a vcpu can bump up the global
      generation number as well as the VM's, but may not yet have
      updated the VMID itself.
      
      At that point another vcpu from the same VM comes in, checks the
      generation (finds it current, so nothing needs doing), and jumps
      into the guest.  We end up with two vcpus belonging to the same VM
      running with two different VMIDs.  Eventually, the VMID used by
      the second vcpu will get reassigned, and things will really go
      wrong...
      
      A simple solution would be to drop this initial check, and always take
      the lock. This is likely to cause performance issues. A middle ground
      is to convert the spinlock to a rwlock, and only take the read lock
      on the fast path. If the check fails at that point, drop it and
      acquire the write lock, rechecking the condition.
      
      This ensures that the above scenario doesn't occur.
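      
      A sketch of that read-fast-path/write-slow-path pattern follows;
      the lock, structure, and helper names are illustrative stand-ins
      for the real VMID allocator, not the actual patch:
      
        static DEFINE_RWLOCK(vmid_lock);
        static atomic64_t global_vmid_gen = ATOMIC64_INIT(1);
        
        static void update_vmid(struct kvm_vmid_ctx *vmid) /* hypothetical type */
        {
                /* Fast path: generation still current, read lock suffices. */
                read_lock(&vmid_lock);
                if (vmid->gen == atomic64_read(&global_vmid_gen)) {
                        read_unlock(&vmid_lock);
                        return;
                }
                read_unlock(&vmid_lock);
        
                /* Slow path: take the write lock and recheck, as another
                 * vcpu may have completed the update meanwhile. */
                write_lock(&vmid_lock);
                if (vmid->gen != atomic64_read(&global_vmid_gen))
                        assign_new_vmid(vmid);  /* hypothetical helper */
                write_unlock(&vmid_lock);
        }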
      
      Cc: stable@vger.kernel.org
      Reported-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Shannon Zhao <zhaoshenglong@huawei.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  3. 19 March 2018, 6 commits
  4. 15 March 2018, 1 commit
    • KVM: arm/arm64: Avoid vcpu_load for other vcpu ioctls than KVM_RUN · e21a4f3a
      Authored by Christoffer Dall
      Calling vcpu_load() registers preempt notifiers for this vcpu and calls
      kvm_arch_vcpu_load().  The latter will soon be doing a lot of heavy
      lifting on arm/arm64 and will try to do things such as enabling the
      virtual timer and setting us up to handle interrupts from the timer
      hardware.
      
      Loading state onto hardware registers and enabling hardware to signal
      interrupts can be problematic when we're not actually about to run the
      VCPU, because it makes it difficult to establish the right context when
      handling interrupts from the timer, and it makes the register access
      code difficult to reason about.
      
      Luckily, now that we call vcpu_load() in each ioctl
      implementation, we can simply remove the call from the non-KVM_RUN
      vcpu ioctls, and our kvm_arch_vcpu_load() is only used to load
      vcpu state onto the physical CPU when we are actually going to run
      the vcpu.
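      
      Schematically, only the KVM_RUN path keeps the load/put pairing.
      In the sketch below the function bodies are reduced to the calls
      that matter, and run_vcpu()/copy_regs_to_user() are hypothetical
      stand-ins for the real run loop and register copy-out:
      
        int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
        {
                int ret;
        
                vcpu_load(vcpu);   /* preempt notifiers + kvm_arch_vcpu_load() */
                ret = run_vcpu(vcpu, run);
                vcpu_put(vcpu);
                return ret;
        }
        
        /* Non-KVM_RUN vcpu ioctls no longer call vcpu_load() at all. */
        static long vcpu_get_regs_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
        {
                return copy_regs_to_user(vcpu, argp);
        }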
      
      Cc: stable@vger.kernel.org
      Fixes: 9b062471 ("KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl")
      Reviewed-by: Julien Grall <julien.grall@arm.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  5. 07 February 2018, 1 commit
  6. 31 January 2018, 1 commit
  7. 23 January 2018, 1 commit
    • KVM: arm/arm64: Handle CPU_PM_ENTER_FAILED · 58d6b15e
      Authored by James Morse
      cpu_pm_enter() calls the pm notifier chain with CPU_PM_ENTER, then if
      there is a failure: CPU_PM_ENTER_FAILED.
      
      When KVM receives CPU_PM_ENTER it calls cpu_hyp_reset() which will
      return us to the hyp-stub. If we subsequently get a CPU_PM_ENTER_FAILED,
      KVM does nothing, leaving the CPU running with the hyp-stub, at odds
      with kvm_arm_hardware_enabled.
      
      Add CPU_PM_ENTER_FAILED as a fallthrough to CPU_PM_EXIT; this
      reloads KVM based on kvm_arm_hardware_enabled.  This is safe even
      if CPU_PM_ENTER never gets as far as KVM, as cpu_hyp_reinit()
      calls cpu_hyp_reset() to make sure the hyp-stub is loaded before
      reloading KVM.
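      
      In notifier terms the fix amounts to letting the failed-entry
      event share the exit path; a minimal sketch, with the KVM work
      reduced to hypothetical helpers:
      
        #include <linux/cpu_pm.h>
        #include <linux/notifier.h>
        
        static int hyp_cpu_pm_notify(struct notifier_block *self,
                                     unsigned long cmd, void *v)
        {
                switch (cmd) {
                case CPU_PM_ENTER:
                        kvm_cpu_suspend();      /* hypothetical: back to hyp-stub */
                        return NOTIFY_OK;
                case CPU_PM_ENTER_FAILED:       /* treat a failed entry ... */
                case CPU_PM_EXIT:               /* ... exactly like a resume */
                        kvm_cpu_resume();       /* hypothetical: re-init if enabled */
                        return NOTIFY_OK;
                default:
                        return NOTIFY_DONE;
                }
        }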
      
      Fixes: 67f69197 ("arm64: kvm: allows kvm cpu hotplug")
      Cc: <stable@vger.kernel.org> # v4.7+
      CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
  8. 16 January 2018, 2 commits
    • KVM: arm64: Handle RAS SErrors from EL1 on guest exit · 3368bd80
      Authored by James Morse
      We expect to have firmware-first handling of RAS SErrors, with errors
      notified via an APEI method. For systems without firmware-first, add
      some minimal handling to KVM.
      
      There are two ways KVM can take an SError due to a guest, either
      of which may be a RAS error: we exit the guest due to an SError
      routed to EL2 by HCR_EL2.AMO, or we take an SError from EL2 when
      we unmask PSTATE.A in __guest_exit.
      
      For SErrors that interrupt a guest and are routed to EL2, the
      existing behaviour is to inject an impdef SError into the guest.
      
      Add code to handle RAS SErrors based on the ESR.  For uncontained
      and uncategorized errors arm64_is_fatal_ras_serror() will panic(),
      as these errors compromise the host too.  All other error types
      are contained: for the fatal errors the vCPU can't make progress,
      so we inject a virtual SError.  We ignore contained errors where
      we can make progress, since, if we're lucky, we may not hit them
      again.
      
      If only some of the CPUs support RAS, the guest will see the
      cpufeature-sanitised version of the id registers, but we may still
      take a RAS SError on this CPU.  Move the SError handling out of
      handle_exit() into a new handler that runs before we can be
      preempted.  This allows us to use this_cpu_has_cap(), via
      arm64_is_ras_serror().
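      
      The triage described above might be modelled as below; apart from
      panic() and kvm_inject_vabt(), the helper names are stand-ins for
      the real classification routines:
      
        static void handle_guest_serror(struct kvm_vcpu *vcpu, u32 esr)
        {
                /* Uncontained/uncategorized: the host is compromised too. */
                if (serror_is_uncontained(esr))         /* hypothetical */
                        panic("Uncontained RAS SError");
        
                /* Remaining error types are contained.  If the vCPU cannot
                 * make progress, deliver a virtual SError to the guest. */
                if (!vcpu_can_make_progress(vcpu, esr)) /* hypothetical */
                        kvm_inject_vabt(vcpu);
        
                /* Otherwise ignore: with luck we won't see the error again. */
        }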
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • KVM: arm/arm64: mask/unmask daif around VHE guests · 4f5abad9
      Authored by James Morse
      Non-VHE systems take an exception to EL2 in order to world-switch into the
      guest. When returning from the guest KVM implicitly restores the DAIF
      flags when it returns to the kernel at EL1.
      
      With VHE none of this exception-level jumping happens, so KVM's
      world-switch code is exposed to the host kernel's DAIF values, and
      KVM spills the guest-exit DAIF values back into the host kernel.
      On entry to a guest we have Debug and SError exceptions unmasked,
      but KVM has switched VBAR and isn't prepared to handle them.  On
      guest exit, Debug exceptions are left disabled once we return to
      the host and will stay this way until we enter user space.
      
      Add a helper to mask/unmask DAIF around VHE guests.  The unmask
      can only happen after the host's VBAR value has been synchronised
      by the isb in __vhe_hyp_call (via kvm_call_hyp()).  Masking could
      be done as late as setting KVM's VBAR value, but is kept here for
      symmetry.
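      
      A minimal sketch of such a helper pair, assuming the arm64
      local_daif_mask()/local_daif_restore() primitives (the helper
      names and the restore value used here are assumptions for
      illustration):
      
        #include <asm/daifflags.h>
        
        /* Mask everything before world-switching into a VHE guest, so
         * the switch code never runs exposed to host DAIF values. */
        static inline void vhe_guest_enter(void)
        {
                local_daif_mask();
        }
        
        /* Restore process-context DAIF values on the way out, once the
         * host VBAR has been synchronised. */
        static inline void vhe_guest_exit(void)
        {
                local_daif_restore(DAIF_PROCCTX_NOIRQ);
        }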
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  9. 13 January 2018, 1 commit
  10. 09 January 2018, 1 commit
  11. 02 January 2018, 2 commits
  12. 23 December 2017, 1 commit
  13. 18 December 2017, 1 commit
  14. 14 December 2017, 4 commits
  15. 01 December 2017, 1 commit
  16. 30 November 2017, 1 commit
  17. 29 November 2017, 1 commit
  18. 28 November 2017, 1 commit
    • KVM: Let KVM_SET_SIGNAL_MASK work as advertised · 20b7035c
      Authored by Jan H. Schönherr
      KVM API says for the signal mask you set via KVM_SET_SIGNAL_MASK, that
      "any unblocked signal received [...] will cause KVM_RUN to return with
      -EINTR" and that "the signal will only be delivered if not blocked by
      the original signal mask".
      
      This, however, is only true when the calling task has a signal
      handler registered for the signal.  If not, signal evaluation is
      short-circuited for SIG_IGN and SIG_DFL, and the signal is either
      ignored without KVM_RUN returning, or the whole process is
      terminated.
      
      Make KVM_SET_SIGNAL_MASK behave as advertised by utilizing logic similar
      to that in do_sigtimedwait() to avoid short-circuiting of signals.
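      
      The do_sigtimedwait()-style trick is to park the caller's original
      blocked set while KVM_RUN executes; a hedged sketch of such an
      activate/deactivate pair (the helper names follow this description
      and are not guaranteed to match the patch verbatim):
      
        #include <linux/sched/signal.h>
        
        /* Swap in the user-supplied sigset for the duration of KVM_RUN,
         * stashing the original mask so that an unblocked signal makes
         * KVM_RUN return -EINTR instead of being short-circuited. */
        static void kvm_sigset_activate(struct kvm_vcpu *vcpu)
        {
                if (!vcpu->sigset_active)
                        return;
                sigprocmask(SIG_SETMASK, &vcpu->sigset, &current->real_blocked);
        }
        
        /* Restore the original mask on the way back to userspace. */
        static void kvm_sigset_deactivate(struct kvm_vcpu *vcpu)
        {
                if (!vcpu->sigset_active)
                        return;
                sigprocmask(SIG_SETMASK, &current->real_blocked, NULL);
                sigemptyset(&current->real_blocked);
        }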
      Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  19. 10 November 2017, 2 commits
  20. 07 November 2017, 2 commits
  21. 06 November 2017, 4 commits
    • KVM: arm/arm64: Rework kvm_timer_should_fire · 1c88ab7e
      Authored by Christoffer Dall
      kvm_timer_should_fire() can be called in two different situations
      from kvm_vcpu_block().
      
      The first case is before calling kvm_timer_schedule(), used for
      wait polling; in this case the VCPU thread is running and the
      timer state is loaded onto the hardware, so all we have to do is
      check whether the virtual interrupt lines are asserted, because
      the timer interrupt handler functions will raise those lines as
      appropriate.
      
      The second case is inside the wait loop of kvm_vcpu_block(), where
      we have already called kvm_timer_schedule(); the hardware will
      therefore be disabled and the software view of the timer state is
      up to date (timer->loaded is false), so we can simply check
      whether the timer should fire by looking at the software state.
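      
      A hedged sketch of the resulting two-branch logic; the field and
      helper names are modelled on the description above rather than
      copied from the kernel:
      
        static bool timer_should_fire(struct arch_timer_cpu *timer)
        {
                if (timer->loaded)
                        /* State is live on hardware: the interrupt handlers
                         * keep the virtual lines up to date, so read them. */
                        return virq_lines_asserted(timer);      /* hypothetical */
        
                /* Hardware is disabled (kvm_timer_schedule() ran): the
                 * software view is current, so compute expiry from it. */
                return swstate_says_expired(timer);             /* hypothetical */
        }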
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
    • KVM: arm/arm64: Get rid of kvm_timer_flush_hwstate · 7e90c8e5
      Authored by Christoffer Dall
      Now that both the vtimer and the ptimer are driven by the timer
      signals and by the vcpu load/put boundaries, whether we are using
      the in-kernel vgic emulation or a userspace IRQ chip, we no longer
      recompute the timer state on every entry to and exit from the
      guest, and we can get rid of the flush hwstate function entirely.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
    • KVM: arm/arm64: Avoid timer save/restore in vcpu entry/exit · b103cc3f
      Authored by Christoffer Dall
      We don't need to save and restore the hardware timer state and
      examine whether it generates interrupts on every entry to and exit
      from the guest.  The timer hardware is perfectly capable of
      telling us when it has expired by signaling interrupts.
      
      When taking a vtimer interrupt in the host, we don't want to mess with
      the timer configuration, we just want to forward the physical interrupt
      to the guest as a virtual interrupt.  We can use the split priority drop
      and deactivate feature of the GIC to do this, which leaves an EOI'ed
      interrupt active on the physical distributor, making sure we don't keep
      taking timer interrupts which would prevent the guest from running.  We
      can then forward the physical interrupt to the VM using the HW bit in
      the LR of the GIC, like we do already, which lets the guest directly
      deactivate both the physical and virtual timer simultaneously, allowing
      the timer hardware to exit the VM and generate a new physical interrupt
      when the timer output is again asserted later on.
      
      We do need to capture this state when migrating VCPUs between physical
      CPUs, however, which we use the vcpu put/load functions for, which are
      called through preempt notifiers whenever the thread is scheduled away
      from the CPU or called directly if we return from the ioctl to
      userspace.
      
      One caveat is that we have to save and restore the timer state in both
      kvm_timer_vcpu_[put/load] and kvm_timer_[schedule/unschedule], because
      we can have the following flows:
      
        1. kvm_vcpu_block
        2. kvm_timer_schedule
        3. schedule
        4. kvm_timer_vcpu_put (preempt notifier)
        5. schedule (vcpu thread gets scheduled back)
        6. kvm_timer_vcpu_load (preempt notifier)
        7. kvm_timer_unschedule
      
      And a version where we don't actually call schedule:
      
        1. kvm_vcpu_block
        2. kvm_timer_schedule
        7. kvm_timer_unschedule
      
      Since kvm_timer_[schedule/unschedule] may not be followed by
      put/load, but put/load may also be called independently, we call
      the timer save/restore functions from both paths.  Since they rely
      on the loaded flag to never save/restore when unnecessary, this
      doesn't cause any harm, and we ensure that all invocations of
      either set of functions work as intended.
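      
      The loaded flag is what makes the save/restore pair idempotent, so
      calling them from both paths is harmless.  A simplified sketch
      (the context type and register accessors are illustrative):
      
        static void timer_save_state(struct timer_ctx *t)
        {
                if (!t->loaded)
                        return;         /* nothing on hardware: no-op */
                t->cnt_ctl  = read_timer_ctl();         /* hypothetical */
                t->cnt_cval = read_timer_cval();        /* hypothetical */
                disable_hw_timer();
                t->loaded = false;
        }
        
        static void timer_restore_state(struct timer_ctx *t)
        {
                if (t->loaded)
                        return;         /* already on hardware: no-op */
                write_timer_cval(t->cnt_cval);
                write_timer_ctl(t->cnt_ctl);
                t->loaded = true;
        }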
      
      An added benefit beyond not having to read and write the timer sysregs
      on every entry and exit is that we no longer have to actively write the
      active state to the physical distributor, because we configured the
      irq for the vtimer to only get a priority drop when handling the
      interrupt in the GIC driver (we called irq_set_vcpu_affinity()), and
      the interrupt stays active after firing on the host.
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
    • KVM: arm/arm64: Move timer/vgic flush/sync under disabled irq · ee9bb9a1
      Authored by Christoffer Dall
      As we are about to play tricks with the timer to be lazier about
      saving and restoring state, we need to move the timer sync and
      flush functions into a section with interrupts disabled; and since
      we have to flush the vgic state after the timer and PMU state, we
      do the whole flush/sync sequence with interrupts disabled.
      
      The only downside is a slightly longer delay before being able to
      process hardware interrupts and run softirqs.
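      
      In the run loop this amounts to pulling the flush calls inside the
      interrupts-disabled window; a hedged sketch of the ordering
      described above (the wrapper function is illustrative):
      
        static void flush_hwstate_before_entry(struct kvm_vcpu *vcpu)
        {
                local_irq_disable();
        
                kvm_pmu_flush_hwstate(vcpu);    /* PMU first ... */
                kvm_timer_flush_hwstate(vcpu);  /* ... then the timer ... */
                kvm_vgic_flush_hwstate(vcpu);   /* ... vgic last, after the
                                                 * timer and PMU state */
        
                /* world switch follows; irqs stay off until after sync */
        }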
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
  22. 03 November 2017, 1 commit
    • arm64/sve: KVM: Prevent guests from using SVE · 17eed27b
      Authored by Dave Martin
      Until KVM has full SVE support, guests must not be allowed to
      execute SVE instructions.
      
      This patch enables the necessary traps, and also ensures that the
      traps are disabled again on exit from the guest so that the host
      can still use SVE if it wants to.
      
      On guest exit, high bits of the SVE Zn registers may have been
      clobbered as a side-effect of the execution of FPSIMD instructions
      in the guest.  The existing KVM host FPSIMD restore code is not
      sufficient to restore these bits, so this patch explicitly marks
      the CPU as not containing cached vector state for any task, thus
      forcing a reload on the next return to userspace.  This is an
      interim measure, in advance of adding full SVE awareness to KVM.
      
      This marking of cached vector state in the CPU as invalid is done
      using __this_cpu_write(fpsimd_last_state, NULL) in fpsimd.c.  Due
      to the repeated use of this rather obscure operation, it makes
      sense to factor it out as a separate helper with a clearer name.
      This patch factors it out as fpsimd_flush_cpu_state(), and ports
      all callers to use it.
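      
      Given that description, the factored-out helper is essentially a
      named wrapper around the per-CPU write (a sketch modelled directly
      on the operation quoted above):
      
        /* Invalidate any cached vector state for this CPU, forcing a
         * reload from memory on the next return to userspace. */
        void fpsimd_flush_cpu_state(void)
        {
                __this_cpu_write(fpsimd_last_state, NULL);
        }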
      
      As a side effect of this refactoring, a this_cpu_write() in
      fpsimd_cpu_pm_notifier() is changed to __this_cpu_write().  This
      should be fine, since cpu_pm_enter() is supposed to be called only
      with interrupts disabled.
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  23. 21 October 2017, 1 commit
  24. 08 August 2017, 1 commit