1. 25 Aug, 2019 1 commit
    • M
      KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block · 8c7053d1
      Marc Zyngier committed
      commit 5eeaf10eec394b28fad2c58f1f5c3a5da0e87d1c upstream.
      
      Since commit 328e5664 ("KVM: arm/arm64: vgic: Defer
      touching GICH_VMCR to vcpu_load/put"), we leave ICH_VMCR_EL2 (or
      its GICv2 equivalent) loaded as long as we can, only syncing it
      back when we're scheduled out.
      
      There is a small snag with that though: kvm_vgic_vcpu_pending_irq(),
      which is indirectly called from kvm_vcpu_check_block(), needs to
      evaluate the guest's view of ICC_PMR_EL1. At the point where we
      call kvm_vcpu_check_block(), the vcpu is still loaded, and any
      change to PMR is not visible in memory until we do a vcpu_put().
      
      Things go really south if the guest does the following:
      
      	mov x0, #0	// or any small value masking interrupts
      	msr ICC_PMR_EL1, x0
      
      	[vcpu preempted, then rescheduled, VMCR sampled]
      
      	mov x0, #0xff	// allow all interrupts
      	msr ICC_PMR_EL1, x0
      	wfi		// traps to EL2, but no sampling of VMCR
      
      	[interrupt arrives just after WFI]
      
      Here, the hypervisor's view of PMR is zero, while the guest has enabled
      its interrupts. kvm_vgic_vcpu_pending_irq() will then say that no
      interrupts are pending (despite an interrupt being received) and we'll
      block for no reason. If the guest doesn't have a periodic interrupt
      firing once it has blocked, it will stay there forever.
      
      To avoid this unfortunate situation, let's resync VMCR from
      kvm_arch_vcpu_blocking(), ensuring that a following kvm_vcpu_check_block()
      will observe the latest value of PMR.
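
The race can be replayed with a small toy model. This is only a sketch, not kernel code: `toy_vcpu`, `toy_sync_vmcr()`, `toy_pending_irq()`, `toy_vcpu_block()` and `toy_replay()` are hypothetical names, and the priority value 0x80 is an arbitrary assumption. The only architectural fact relied upon is that an interrupt is deliverable when its priority value is lower than PMR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not kernel code: every name here is hypothetical. */
struct toy_vcpu {
	uint8_t hw_pmr;      /* PMR as held in the loaded VMCR */
	uint8_t mem_pmr;     /* in-memory copy, refreshed only on a sync */
	uint8_t irq_prio;    /* priority of the single interrupt we track */
	bool    irq_pending;
};

/* Stand-in for the sync done at vcpu_put() time. */
static void toy_sync_vmcr(struct toy_vcpu *v)
{
	v->mem_pmr = v->hw_pmr;
}

/* Stand-in for the pending check: it only sees the in-memory PMR. */
static bool toy_pending_irq(const struct toy_vcpu *v)
{
	return v->irq_pending && v->irq_prio < v->mem_pmr;
}

/* Returns true if the vcpu would go to sleep. */
static bool toy_vcpu_block(struct toy_vcpu *v, bool resync_on_block)
{
	if (resync_on_block)
		toy_sync_vmcr(v);   /* the behaviour this commit adds */
	return !toy_pending_irq(v);
}

/* Replays the guest sequence described in the commit message. */
static bool toy_replay(bool resync_on_block)
{
	struct toy_vcpu v = { .irq_prio = 0x80 };

	v.hw_pmr = 0x00;        /* guest masks interrupts */
	toy_sync_vmcr(&v);      /* vcpu preempted: VMCR sampled */
	v.hw_pmr = 0xff;        /* guest allows all interrupts */
	v.irq_pending = true;   /* interrupt arrives just after WFI */
	return toy_vcpu_block(&v, resync_on_block);
}
```

Without the resync, `toy_replay(false)` reports that the vcpu blocks although an interrupt is deliverable; with it, `toy_replay(true)` does not block.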
      
      This has been found by booting an arm64 Linux guest with the pseudo NMI
      feature, and thus using interrupt priorities to mask interrupts instead
      of the usual PSTATE masking.
      
      Cc: stable@vger.kernel.org # 4.12
      Fixes: 328e5664 ("KVM: arm/arm64: vgic: Defer touching GICH_VMCR to vcpu_load/put")
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
  2. 24 Mar, 2019 1 commit
  3. 12 Aug, 2018 1 commit
  4. 21 Jul, 2018 3 commits
  5. 25 May, 2018 3 commits
  6. 27 Apr, 2018 1 commit
    • M
      KVM: arm/arm64: vgic: Fix source vcpu issues for GICv2 SGI · 53692908
      Marc Zyngier committed
      Now that we make sure we don't inject multiple instances of the
      same GICv2 SGI at the same time, we've made another bug more
      obvious:
      
      If we exit with an active SGI, we completely lose track of which
      vcpu it came from. On the next entry, we restore it with 0 as a
      source, and if that wasn't the right one, too bad. While this
      doesn't seem to trouble GIC-400, the architectural model gets
      offended and doesn't deactivate the interrupt on EOI.
      
      Another connected issue is that we will happily make pending
      an interrupt from another vcpu, overriding the above zero with
      something that is just as inconsistent. Don't do that.
      
      The final issue is that we signal a maintenance interrupt when
      no pending interrupts are present in the LR. Assuming we've fixed
      the two issues above, we end up in a situation where we keep
      exiting as soon as we've reached the active state, and are not
      able to inject the following pending interrupt.
      
      The fix comes in 3 parts:
      - GICv2 SGIs have their source vcpu saved if they are active on
        exit, and restored on entry
      - Multi-SGIs cannot go via the Pending+Active state, as this would
        corrupt the source field
      - Multi-SGIs are converted to using MI on EOI instead of NPIE
      
      Fixes: 16ca6a60 ("KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid")
      Reported-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  7. 20 Apr, 2018 1 commit
    • M
      arm/arm64: KVM: Add PSCI version selection API · 85bd0ba1
      Marc Zyngier committed
      Although we've implemented PSCI 0.1, 0.2 and 1.0, we expose either 0.1
      or 1.0 to a guest, defaulting to the latest version of the PSCI
      implementation that is compatible with the requested version. This is
      no different from doing a firmware upgrade on KVM.
      
      But in order to give a chance to hypothetical badly implemented guests
      that would have a fit by discovering something other than PSCI 0.2,
      let's provide a new API that allows userspace to pick one particular
      version of the API.
      
      This is implemented as a new class of "firmware" registers, where
      we expose the PSCI version. This allows the PSCI version to be
      saved/restored as part of a guest migration, and also set to
      any supported version if the guest requires it.
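
The idea of a settable version register can be sketched as follows. This is a toy model rather than the KVM implementation: `toy_vm`, `toy_get_fw_reg()` and `toy_set_fw_reg()` are hypothetical names, and the acceptance rule is simplified to "any of 0.1, 0.2 or 1.0". The encoding with the major number in the upper 16 bits and the minor number in the lower 16 bits follows the PSCI version format.

```c
#include <assert.h>
#include <stdint.h>

/* PSCI version encoding: major in bits [31:16], minor in bits [15:0]. */
#define TOY_PSCI_VERSION(maj, min) (((uint32_t)(maj) << 16) | (uint32_t)(min))
#define TOY_PSCI_0_1 TOY_PSCI_VERSION(0, 1)
#define TOY_PSCI_0_2 TOY_PSCI_VERSION(0, 2)
#define TOY_PSCI_1_0 TOY_PSCI_VERSION(1, 0)

/* Hypothetical per-VM state. */
struct toy_vm {
	uint32_t psci_version;
};

/* Default to the latest version, as the commit message describes. */
static void toy_vm_init(struct toy_vm *vm)
{
	vm->psci_version = TOY_PSCI_1_0;
}

/* Read the "firmware register", e.g. on the migration source. */
static uint64_t toy_get_fw_reg(const struct toy_vm *vm)
{
	return vm->psci_version;
}

/* Write it, e.g. on the migration destination; reject unknown versions. */
static int toy_set_fw_reg(struct toy_vm *vm, uint64_t val)
{
	switch (val) {
	case TOY_PSCI_0_1:
	case TOY_PSCI_0_2:
	case TOY_PSCI_1_0:
		vm->psci_version = (uint32_t)val;
		return 0;
	default:
		return -1;
	}
}
```

A migration then amounts to reading the register on the source and writing the same value on the destination, so the guest keeps seeing the version it booted with.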
      
      Cc: stable@vger.kernel.org #4.16
      Reviewed-by: Christoffer Dall <cdall@kernel.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  8. 19 Mar, 2018 2 commits
  9. 15 Mar, 2018 1 commit
    • C
      KVM: arm/arm64: Reset mapped IRQs on VM reset · 413aa807
      Christoffer Dall committed
      We currently don't allow resetting mapped IRQs from userspace, because
      their state is controlled by the hardware.  But we do need to reset the
      state when the VM is reset, so we provide a function for the 'owner' of
      the mapped interrupt to reset the interrupt state.
      
      Currently only the timer uses mapped interrupts, so we call this
      function from the timer reset logic.
      
      Cc: stable@vger.kernel.org
      Fixes: 4c60e360 ("KVM: arm/arm64: Provide a get_input_level for the arch timer")
      Signed-off-by: Christoffer Dall <cdall@kernel.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  10. 07 Feb, 2018 5 commits
  11. 02 Jan, 2018 2 commits
    • C
      KVM: arm/arm64: Provide a get_input_level for the arch timer · 4c60e360
      Christoffer Dall committed
      The VGIC can now support the life-cycle of mapped level-triggered
      interrupts, and we no longer have to read back the timer state on every
      exit from the VM if we had an asserted timer interrupt signal, because
      the VGIC already knows if we hit the unlikely case where the guest
      disables the timer without ACKing the virtual timer interrupt.
      
      This means we rework a bit of the code to factor out the functionality
      to snapshot the timer state from vtimer_save_state(), and we can reuse
      this functionality in the sync path when we have an irqchip in
      userspace, and also to support our implementation of the
      get_input_level() function for the timer.
      
      This change also means that we can no longer rely on the timer's view of
      the interrupt line to set the active state, because we no longer
      maintain this state for mapped interrupts when exiting from the guest.
      Instead, we only set the active state if the virtual interrupt is
      active, and otherwise we simply let the timer fire again and raise the
      virtual interrupt from the ISR.
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • C
      KVM: arm/arm64: Support a vgic interrupt line level sample function · b6909a65
      Christoffer Dall committed
      The GIC sometimes needs to sample the physical line of a mapped
      interrupt.  As we know this to be notoriously slow, provide a callback
      function for devices (such as the timer) which can do this much faster
      than talking to the distributor, for example by comparing a few
      in-memory values.  Fall back to the good old method of poking the
      physical GIC if no callback is provided.
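
The callback-with-fallback pattern can be sketched like this. It is a toy model under stated assumptions: `toy_irq`, `toy_line_level()`, `toy_read_distributor()` and `toy_timer_level()` are hypothetical names, and the "slow" distributor read is reduced to a counter so that the fallback is observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapped interrupt with an optional fast-path callback. */
struct toy_irq {
	int  intid;
	bool (*get_input_level)(int intid);
};

static int      toy_slow_reads;   /* counts fallback accesses */
static uint64_t toy_now;          /* stand-in for the current counter value */
static uint64_t toy_timer_cval;   /* stand-in for the timer compare value */

/* Fallback: pretend to poke the physical GIC (slow). */
static bool toy_read_distributor(int intid)
{
	(void)intid;
	toy_slow_reads++;
	return false;
}

/* Fast path offered by the timer: compare a few in-memory values. */
static bool toy_timer_level(int intid)
{
	(void)intid;
	return toy_timer_cval <= toy_now;
}

/* Sample the line: use the owner's callback if there is one. */
static bool toy_line_level(const struct toy_irq *irq)
{
	if (irq->get_input_level)
		return irq->get_input_level(irq->intid);
	return toy_read_distributor(irq->intid);
}
```

A device that registers a callback never touches the slow path; an interrupt without one still gets a correct, if slower, answer.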
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
  12. 18 Dec, 2017 1 commit
  13. 29 Nov, 2017 1 commit
  14. 10 Nov, 2017 4 commits
  15. 07 Nov, 2017 1 commit
  16. 06 Nov, 2017 6 commits
    • C
      KVM: arm/arm64: Rework kvm_timer_should_fire · 1c88ab7e
      Christoffer Dall committed
      kvm_timer_should_fire() can be called in two different situations from
      kvm_vcpu_block().
      
      The first case is before calling kvm_timer_schedule(), used for wait
      polling, and in this case the VCPU thread is running and the timer state
      is loaded onto the hardware so all we have to do is check if the virtual
      interrupt lines are asserted, because the timer interrupt handler
      functions will raise those lines as appropriate.
      
      The second case is inside the wait loop of kvm_vcpu_block(), where we
      have already called kvm_timer_schedule() and therefore the hardware will
      be disabled and the software view of the timer state is up to date
      (timer->loaded is false), and so we can simply check if the timer should
      fire by looking at the software state.
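
The two cases can be sketched as a single predicate keyed on the loaded flag. This is a simplified toy, not the kernel function: `toy_timer` and `toy_timer_should_fire()` are hypothetical names, and the software-state check is reduced to "enabled, not masked, and compare value reached".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-vcpu timer state. */
struct toy_timer {
	bool     loaded;     /* timer state currently lives in hardware */
	bool     irq_level;  /* line level as raised by the interrupt handler */
	bool     enabled;    /* software view: timer enabled */
	bool     masked;     /* software view: timer output masked */
	uint64_t cval;       /* software view: compare value */
};

static bool toy_timer_should_fire(const struct toy_timer *t, uint64_t now)
{
	/* Case 1: state is on the hardware, trust the interrupt line. */
	if (t->loaded)
		return t->irq_level;

	/* Case 2: hardware disabled, software state is up to date. */
	return t->enabled && !t->masked && t->cval <= now;
}
```

In the first case the compare value is deliberately ignored, because the software copy may be stale while the timer is loaded.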
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
    • C
      KVM: arm/arm64: Get rid of kvm_timer_flush_hwstate · 7e90c8e5
      Christoffer Dall committed
      Now that both the vtimer and the ptimer, with either the in-kernel
      vgic emulation or a userspace IRQ chip, are driven by the timer signals
      and at the vcpu load/put boundaries, instead of recomputing the timer
      state at every entry/exit to/from the guest, we can get rid of the
      flush hwstate function entirely.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
    • C
      KVM: arm/arm64: Avoid timer save/restore in vcpu entry/exit · b103cc3f
      Christoffer Dall committed
      We don't need to save and restore the hardware timer state and examine
      if it generates interrupts on every entry/exit to the guest.  The
      timer hardware is perfectly capable of telling us when it has expired
      by signaling interrupts.
      
      When taking a vtimer interrupt in the host, we don't want to mess with
      the timer configuration, we just want to forward the physical interrupt
      to the guest as a virtual interrupt.  We can use the split priority drop
      and deactivate feature of the GIC to do this, which leaves an EOI'ed
      interrupt active on the physical distributor, making sure we don't keep
      taking timer interrupts which would prevent the guest from running.  We
      can then forward the physical interrupt to the VM using the HW bit in
      the LR of the GIC, like we do already, which lets the guest directly
      deactivate both the physical and virtual timer simultaneously, allowing
      the timer hardware to exit the VM and generate a new physical interrupt
      when the timer output is again asserted later on.
      
      We do need to capture this state when migrating VCPUs between physical
      CPUs, however, which we use the vcpu put/load functions for, which are
      called through preempt notifiers whenever the thread is scheduled away
      from the CPU or called directly if we return from the ioctl to
      userspace.
      
      One caveat is that we have to save and restore the timer state in both
      kvm_timer_vcpu_[put/load] and kvm_timer_[schedule/unschedule], because
      we can have the following flows:
      
        1. kvm_vcpu_block
        2. kvm_timer_schedule
        3. schedule
        4. kvm_timer_vcpu_put (preempt notifier)
        5. schedule (vcpu thread gets scheduled back)
        6. kvm_timer_vcpu_load (preempt notifier)
        7. kvm_timer_unschedule
      
      And a version where we don't actually call schedule:
      
        1. kvm_vcpu_block
        2. kvm_timer_schedule
        7. kvm_timer_unschedule
      
      Since kvm_timer_[schedule/unschedule] may not be followed by put/load,
      but put/load also may be called independently, we call the timer
      save/restore functions from both paths.  Since they rely on the loaded
      flag to never save/restore when unnecessary, this doesn't cause any
      harm, and we ensure that all invocations of either set of functions work
      as intended.
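
Both flows above can be replayed against a toy model in which save and restore are guarded by the loaded flag. This is only a sketch: `toy_timer` and the `toy_*` functions are hypothetical stand-ins for the kernel paths named in the commit message, and the timer state is reduced to one compare value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timer state with counters to observe redundant work. */
struct toy_timer {
	bool     loaded;
	uint64_t hw_cval;   /* value held in the (pretend) hardware */
	uint64_t sw_cval;   /* value held in memory */
	int      saves;
	int      restores;
};

/* Save only if the state is currently on the hardware. */
static void toy_timer_save(struct toy_timer *t)
{
	if (!t->loaded)
		return;
	t->sw_cval = t->hw_cval;
	t->loaded = false;
	t->saves++;
}

/* Restore only if the state is not already on the hardware. */
static void toy_timer_restore(struct toy_timer *t)
{
	if (t->loaded)
		return;
	t->hw_cval = t->sw_cval;
	t->loaded = true;
	t->restores++;
}

/* Both pairs of entry points share the same guarded helpers. */
static void toy_timer_schedule(struct toy_timer *t)   { toy_timer_save(t); }
static void toy_timer_unschedule(struct toy_timer *t) { toy_timer_restore(t); }
static void toy_timer_vcpu_put(struct toy_timer *t)   { toy_timer_save(t); }
static void toy_timer_vcpu_load(struct toy_timer *t)  { toy_timer_restore(t); }
```

Whether or not put/load happens between schedule and unschedule, the state is saved exactly once and restored exactly once.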
      
      An added benefit beyond not having to read and write the timer sysregs
      on every entry and exit is that we no longer have to actively write the
      active state to the physical distributor, because we configured the
      irq for the vtimer to only get a priority drop when handling the
      interrupt in the GIC driver (we called irq_set_vcpu_affinity()), and
      the interrupt stays active after firing on the host.
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
    • C
      KVM: arm/arm64: Use separate timer for phys timer emulation · f2a2129e
      Christoffer Dall committed
      We were using the same hrtimer for emulating the physical timer and for
      making sure a blocking VCPU thread would be eventually woken up.  That
      worked fine in the previous arch timer design, but as we are about to
      actually use the soft timer expire function for the physical timer
      emulation, change the logic to use a dedicated hrtimer.
      
      This has the added benefit of not having to cancel any work in the sync
      path, which in turn allows us to run the flush and sync with IRQs
      disabled.
      
      Note that the hrtimer used to program the host kernel's timer to
      generate an exit from the guest when the emulated physical timer fires
      never has to inject any work, and to share the soft_timer_cancel()
      function with the bg_timer, we change the function to only cancel any
      pending work if the pointer to the work struct is not null.
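
The shared cancel helper can be sketched as below. This is a toy model, not the kernel helper: `toy_hrtimer`, `toy_work` and `toy_soft_timer_cancel()` are hypothetical names standing in for the hrtimer, the work struct and the function mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for an hrtimer and a work item. */
struct toy_hrtimer { bool armed; };
struct toy_work    { bool queued; };

/*
 * Cancel the timer and, only if a work struct was supplied, any work it
 * may have queued. The physical timer emulation passes NULL because it
 * never queues work.
 */
static void toy_soft_timer_cancel(struct toy_hrtimer *hrt, struct toy_work *work)
{
	hrt->armed = false;
	if (work)
		work->queued = false;
}
```

The background timer would pass its work struct, while the emulated physical timer would pass a null pointer and skip the work cancellation.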
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
    • C
      KVM: arm/arm64: Rename soft timer to bg_timer · 14d61fa9
      Christoffer Dall committed
      As we are about to introduce a separate hrtimer for the physical timer,
      call this timer bg_timer, because we refer to this timer as the
      background timer in the code and comments elsewhere.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
    • C
      KVM: arm/arm64: Make timer_arm and timer_disarm helpers more generic · 8409a06f
      Christoffer Dall committed
      We are about to add an additional soft timer to the arch timer state for
      a VCPU and would like to be able to reuse the functions to program and
      cancel a timer, so we make them slightly more generic and rename to make
      it more clear that these functions work on soft timers and not the
      hardware resource that this code is managing.
      
      The armed flag on the timer state is only used to assert a condition,
      and we don't rely on this assertion in any meaningful way, so we can
      simply get rid of this flag and slightly reduce complexity.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
  17. 25 Jul, 2017 1 commit
    • A
      KVM: arm/arm64: PMU: Fix overflow interrupt injection · d9f89b4e
      Andrew Jones committed
      kvm_pmu_overflow_set() is called from perf's interrupt handler,
      making the call of kvm_vgic_inject_irq() from it introduced with
      "KVM: arm/arm64: PMU: remove request-less vcpu kick" a really bad
      idea, as it's quite easy to try and retake a lock that the
      interrupted context is already holding. The fix is to use a vcpu
      kick, leaving the interrupt injection to kvm_pmu_sync_hwstate(),
      like it was doing before the refactoring. We don't just revert,
      though, because before the kick was request-less, leaving the vcpu
      exposed to the request-less vcpu kick race, and also because the
      kick was used unnecessarily from register access handlers.
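
The deferral can be sketched with a toy model in which the lock is a plain flag. This is an illustration only: `toy_vcpu`, `toy_inject_irq()`, `toy_pmu_overflow_set()` and `toy_pmu_sync_hwstate()` are hypothetical stand-ins for the functions named above, and the assertion inside the injection helper plays the role of the deadlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical vcpu state; lock_held models a lock taken by the
 * context that the perf interrupt may have interrupted. */
struct toy_vcpu {
	bool lock_held;
	bool overflow_level;   /* level recorded by the overflow handler */
	bool kick_requested;   /* request raised for the vcpu */
	bool injected_level;   /* what the interrupt controller last saw */
};

/* Injection needs the lock, so it must never run while it is held. */
static void toy_inject_irq(struct toy_vcpu *v, bool level)
{
	assert(!v->lock_held);
	v->lock_held = true;
	v->injected_level = level;
	v->lock_held = false;
}

/* Interrupt context: only record the state and request a kick. */
static void toy_pmu_overflow_set(struct toy_vcpu *v)
{
	v->overflow_level = true;
	v->kick_requested = true;
}

/* Process context, on the way back into the guest: do the injection. */
static void toy_pmu_sync_hwstate(struct toy_vcpu *v)
{
	toy_inject_irq(v, v->overflow_level);
	v->kick_requested = false;
}
```

The overflow handler stays safe even when it interrupts a context that holds the lock, because the actual injection is postponed to the sync step.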
      Reviewed-by: Christoffer Dall <cdall@linaro.org>
      Signed-off-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  18. 15 Jun, 2017 1 commit
  19. 08 Jun, 2017 4 commits