1. 02 Jan, 2018 4 commits
    • C
      KVM: arm/arm64: Avoid work when userspace iqchips are not used · 61bbe380
      Christoffer Dall committed
      We currently check if the VM has a userspace irqchip in several places
      along the critical path, and if so, we do some work which is only
      required for having an irqchip in userspace.  This is unfortunate, as we
      could avoid doing any work entirely, if we didn't have to support
      irqchip in userspace.
      
      Realizing the userspace irqchip on ARM is mostly a developer or hobby
      feature, and is unlikely to be used in servers or other scenarios where
      performance is a priority, we can use a refcounted static key to only
      check the irqchip configuration when we have at least one VM that uses
      an irqchip in userspace.
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
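The idea of gating a per-VM check behind a global reference count can be sketched in plain C as follows. This is only an illustration under stated assumptions: the kernel uses a refcounted static key, whereas this sketch substitutes an ordinary atomic counter, and every name here (`userspace_irqchip_users`, `struct vm`, the helper functions) is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted static key: the number of VMs
 * that currently use a userspace irqchip. */
static atomic_int userspace_irqchip_users;

struct vm {
    bool userspace_irqchip;
};

/* Called when a VM opts in to a userspace irqchip. */
static void vm_enable_userspace_irqchip(struct vm *vm)
{
    if (!vm->userspace_irqchip) {
        vm->userspace_irqchip = true;
        atomic_fetch_add(&userspace_irqchip_users, 1);
    }
}

/* Called when the VM goes away; drops its reference. */
static void vm_destroy(struct vm *vm)
{
    if (vm->userspace_irqchip) {
        vm->userspace_irqchip = false;
        atomic_fetch_sub(&userspace_irqchip_users, 1);
    }
}

/* Critical path: the per-VM configuration is only inspected when at
 * least one VM in the system uses a userspace irqchip. */
static bool needs_userspace_irqchip_work(const struct vm *vm)
{
    if (atomic_load(&userspace_irqchip_users) == 0)
        return false;
    return vm->userspace_irqchip;
}
```

With a real static key the first test would be patched into the instruction stream rather than read from memory, which is what makes the common case nearly free.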
    • C
      KVM: arm/arm64: Provide a get_input_level for the arch timer · 4c60e360
      Christoffer Dall committed
      The VGIC can now support the life-cycle of mapped level-triggered
      interrupts, and we no longer have to read back the timer state on every
      exit from the VM if we had an asserted timer interrupt signal, because
      the VGIC already knows if we hit the unlikely case where the guest
      disables the timer without ACKing the virtual timer interrupt.
      
      This means we rework a bit of the code to factor out the functionality
      to snapshot the timer state from vtimer_save_state(), and we can reuse
      this functionality in the sync path when we have an irqchip in
      userspace, and also to support our implementation of the
      get_input_level() function for the timer.
      
      This change also means that we can no longer rely on the timer's view of
      the interrupt line to set the active state, because we no longer
      maintain this state for mapped interrupts when exiting from the guest.
      Instead, we only set the active state if the virtual interrupt is
      active, and otherwise we simply let the timer fire again and raise the
      virtual interrupt from the ISR.
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • C
      KVM: arm/arm64: Support a vgic interrupt line level sample function · b6909a65
      Christoffer Dall committed
      The GIC sometimes needs to sample the physical line of a mapped
      interrupt.  As we know this to be notoriously slow, provide a callback
      function for devices (such as the timer) which can do this much faster
      than talking to the distributor, for example by comparing a few
      in-memory values.  Fall back to the good old method of poking the
      physical GIC if no callback is provided.
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
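The callback-with-fallback pattern described above might look roughly like this. It is a sketch, not the VGIC code: `struct mapped_irq`, `read_physical_line_slow()` and `timer_get_input_level()` are hypothetical stand-ins, and the "slow" path merely counts how often it is taken.

```c
#include <stdbool.h>
#include <stddef.h>

struct mapped_irq {
    int intid;
    /* Optional fast path supplied by the device; may be NULL. */
    bool (*get_input_level)(int intid);
};

static int slow_reads; /* how often the slow path was taken */

/* Stand-in for asking the physical GIC about the line state. */
static bool read_physical_line_slow(int intid)
{
    (void)intid;
    slow_reads++;
    return false;
}

/* Example device callback: the timer answers from an in-memory value. */
static bool timer_line;

static bool timer_get_input_level(int intid)
{
    (void)intid;
    return timer_line;
}

/* Prefer the device callback; otherwise fall back to the slow read. */
static bool sample_line_level(const struct mapped_irq *irq)
{
    if (irq->get_input_level)
        return irq->get_input_level(irq->intid);
    return read_physical_line_slow(irq->intid);
}
```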
    • C
      KVM: arm/arm64: Don't cache the timer IRQ level · 70450a9f
      Christoffer Dall committed
      The timer logic was designed after a strict idea of modeling an
      interrupt line level in software, meaning that only transitions in the
      level need to be reported to the VGIC.  This works well for the timer,
      because the arch timer code is in complete control of the device and can
      track the transitions of the line.
      
      However, as we are about to support using the HW bit in the VGIC not
      just for the timer, but also for VFIO which cannot track transitions of
      the interrupt line, we have to decide on an interface between the GIC
      and other subsystems for level triggered mapped interrupts, which both
      the timer and VFIO can use.
      
      VFIO only sees an asserting transition of the physical interrupt line,
      and tells the VGIC when that happens.  That means that part of the
      interrupt flow is offloaded to the hardware.
      
      To use the same interface for VFIO devices and the timer, we therefore
      have to change the timer (we cannot change VFIO because it doesn't know
      the details of the device it is assigning to a VM).
      
      Luckily, changing the timer is simple: we just need to stop 'caching'
      the line level and instead let the VGIC know the state of the timer
      every time there is a potential change in the line level, and when the
      line level should be asserted from the timer ISR.  The VGIC can ignore
      extra notifications using its validate mechanism.
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Andre Przywara <andre.przywara@arm.com>
      Reviewed-by: Julien Thierry <julien.thierry@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
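A minimal sketch of the resulting division of labour, assuming the receiving side simply drops notifications that do not change the level: the timer keeps no cached copy of the line and reports its output every time, while the interrupt controller model filters the redundant calls. All names (`struct vgic_line`, `vgic_notify_level()`, `timer_update_irq()`) are hypothetical.

```c
#include <stdbool.h>

struct vgic_line {
    bool level;        /* last level the interrupt controller accepted */
    unsigned accepted; /* notifications that actually changed state */
};

/* Controller side: ignore a notification that repeats the current level. */
static bool vgic_notify_level(struct vgic_line *line, bool level)
{
    if (line->level == level)
        return false;
    line->level = level;
    line->accepted++;
    return true;
}

/* Timer side: no cached level, just report the current output. */
static bool timer_update_irq(struct vgic_line *line, bool timer_output)
{
    return vgic_notify_level(line, timer_output);
}
```

Moving the filtering to the receiver is what lets a caller that cannot track transitions, such as VFIO, share the same interface.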
  2. 30 Nov, 2017 1 commit
  3. 29 Nov, 2017 1 commit
  4. 07 Nov, 2017 1 commit
  5. 06 Nov, 2017 12 commits
    • C
      arm/arm64: KVM: Load the timer state when enabling the timer · 4a2c4da1
      Christoffer Dall committed
      After being lazy with saving/restoring the timer state, we defer that
      work to vcpu_load and vcpu_put, which ensure that the timer state is
      loaded on the hardware timers whenever the VCPU runs.
      
      Unfortunately, we are failing to do that the first time vcpu_load()
      runs, because the timer has not yet been enabled at that time.  As long
      as the initialized timer state matches what happens to be in the
      hardware (a disabled timer, because we never leave the timer screaming),
      this does not show up as a problem, but is nevertheless incorrect.
      
      The solution is simple; disable preemption while setting the timer to be
      enabled, and call the timer load function when first enabling the timer.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • C
      KVM: arm/arm64: Rework kvm_timer_should_fire · 1c88ab7e
      Christoffer Dall committed
      kvm_timer_should_fire() can be called in two different situations from
      kvm_vcpu_block().
      
      The first case is before calling kvm_timer_schedule(), used for wait
      polling, and in this case the VCPU thread is running and the timer state
      is loaded onto the hardware so all we have to do is check if the virtual
      interrupt lines are asserted, because the timer interrupt handler
      functions will raise those lines as appropriate.
      
      The second case is inside the wait loop of kvm_vcpu_block(), where we
      have already called kvm_timer_schedule() and therefore the hardware will
      be disabled and the software view of the timer state is up to date
      (timer->loaded is false), and so we can simply check if the timer should
      fire by looking at the software state.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
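The two cases can be sketched as a single predicate that branches on whether the timer state is currently loaded on the hardware. This is an illustration only; `struct timer_state` and its fields are hypothetical and the software check is reduced to "enabled, not masked, compare value reached".

```c
#include <stdbool.h>
#include <stdint.h>

struct timer_state {
    bool loaded;    /* state is live on the hardware timer */
    bool irq_level; /* virtual line as raised by the interrupt handler */
    bool enabled;   /* software copy of the control bits */
    bool masked;
    uint64_t cval;  /* software copy of the compare value */
};

static bool timer_should_fire(const struct timer_state *t, uint64_t now)
{
    /* Case 1: VCPU thread running, hardware owns the state, so the
     * interrupt handler has already raised the line if needed. */
    if (t->loaded)
        return t->irq_level;

    /* Case 2: state was saved to memory, so evaluate it in software. */
    return t->enabled && !t->masked && t->cval <= now;
}
```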
    • C
      KVM: arm/arm64: Get rid of kvm_timer_flush_hwstate · 7e90c8e5
      Christoffer Dall committed
      Now that both the vtimer and the ptimer, with either the in-kernel
      vgic emulation or a userspace IRQ chip, are driven by the timer signals
      and at the vcpu load/put boundaries, instead of recomputing the timer
      state at every entry to and exit from the guest, we can get rid of the
      flush hwstate function entirely.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
    • C
      KVM: arm/arm64: Avoid phys timer emulation in vcpu entry/exit · bbdd52cf
      Christoffer Dall committed
      There is no need to schedule and cancel a hrtimer when entering and
      exiting the guest, because we know when the physical timer is going to
      fire when the guest programs it, and we can simply program the hrtimer
      at that point.
      
      Now that the register modifications from the guest go through the
      kvm_arm_timer_set/get_reg functions, which always call
      kvm_timer_update_state(), we can simply consider the timer state in this
      function and schedule and cancel the timers as needed.
      
      This avoids looking at the physical timer emulation state when entering
      and exiting the VCPU, allowing for faster servicing of the VM when
      needed.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
    • C
      KVM: arm/arm64: Move phys_timer_emulate function · cda93b7a
      Christoffer Dall committed
      We are about to call phys_timer_emulate() from kvm_timer_update_state()
      and modify phys_timer_emulate() at the same time.  Moving the function
      and modifying it in a single patch makes the diff hard to read, so do
      this separately first.
      
      No functional change.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
    • C
      KVM: arm/arm64: Support EL1 phys timer register access in set/get reg · 5c5196da
      Christoffer Dall committed
      Add support for the physical timer registers in kvm_arm_timer_set_reg and
      kvm_arm_timer_get_reg so that these functions can be reused to interact
      with the rest of the system.
      
      Note that this paves part of the way for the physical timer state
      save/restore, but we still need to add those registers to
      KVM_GET_REG_LIST before we support migrating the physical timer state.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
    • C
      KVM: arm/arm64: Avoid timer save/restore in vcpu entry/exit · b103cc3f
      Christoffer Dall committed
      We don't need to save and restore the hardware timer state and examine
      if it generates interrupts on every entry/exit to the guest.  The
      timer hardware is perfectly capable of telling us when it has expired
      by signaling interrupts.
      
      When taking a vtimer interrupt in the host, we don't want to mess with
      the timer configuration, we just want to forward the physical interrupt
      to the guest as a virtual interrupt.  We can use the split priority drop
      and deactivate feature of the GIC to do this, which leaves an EOI'ed
      interrupt active on the physical distributor, making sure we don't keep
      taking timer interrupts which would prevent the guest from running.  We
      can then forward the physical interrupt to the VM using the HW bit in
      the LR of the GIC, like we do already, which lets the guest directly
      deactivate both the physical and virtual timer simultaneously, allowing
      the timer hardware to exit the VM and generate a new physical interrupt
      when the timer output is again asserted later on.
      
      We do need to capture this state when migrating VCPUs between physical
      CPUs, however, which we use the vcpu put/load functions for, which are
      called through preempt notifiers whenever the thread is scheduled away
      from the CPU or called directly if we return from the ioctl to
      userspace.
      
      One caveat is that we have to save and restore the timer state in both
      kvm_timer_vcpu_[put/load] and kvm_timer_[schedule/unschedule], because
      we can have the following flows:
      
        1. kvm_vcpu_block
        2. kvm_timer_schedule
        3. schedule
        4. kvm_timer_vcpu_put (preempt notifier)
        5. schedule (vcpu thread gets scheduled back)
        6. kvm_timer_vcpu_load (preempt notifier)
        7. kvm_timer_unschedule
      
      And a version where we don't actually call schedule:
      
        1. kvm_vcpu_block
        2. kvm_timer_schedule
        7. kvm_timer_unschedule
      
      Since kvm_timer_[schedule/unschedule] may not be followed by put/load,
      but put/load also may be called independently, we call the timer
      save/restore functions from both paths.  Since they rely on the loaded
      flag to never save/restore when unnecessary, this doesn't cause any
      harm, and we ensure that all invocations of either set of functions work
      as intended.
      
      An added benefit beyond not having to read and write the timer sysregs
      on every entry and exit is that we no longer have to actively write the
      active state to the physical distributor, because we configured the
      irq for the vtimer to only get a priority drop when handling the
      interrupt in the GIC driver (we called irq_set_vcpu_affinity()), and
      the interrupt stays active after firing on the host.
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
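The role of the loaded flag in the two flows listed above can be sketched as follows. This is a simplified model, not the kernel code: `struct vtimer` and the `model_*` wrappers are hypothetical, and the actual register accesses are replaced by counters so the effect of each path is visible.

```c
#include <stdbool.h>

struct vtimer {
    bool loaded;       /* state currently lives on the hardware */
    unsigned saves;    /* real save operations performed */
    unsigned restores; /* real restore operations performed */
};

/* Save only if the state is actually on the hardware. */
static void timer_save(struct vtimer *t)
{
    if (!t->loaded)
        return;
    t->saves++;
    t->loaded = false;
}

/* Restore only if the state is not already on the hardware. */
static void timer_restore(struct vtimer *t)
{
    if (t->loaded)
        return;
    t->restores++;
    t->loaded = true;
}

/* Both pairs of entry points funnel into the same guarded helpers. */
static void model_timer_schedule(struct vtimer *t)   { timer_save(t); }
static void model_timer_unschedule(struct vtimer *t) { timer_restore(t); }
static void model_vcpu_put(struct vtimer *t)         { timer_save(t); }
static void model_vcpu_load(struct vtimer *t)        { timer_restore(t); }
```

In the first flow the put/load pair becomes a no-op because schedule already saved the state; in the second flow schedule/unschedule do the work on their own.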
    • C
      KVM: arm/arm64: Set VCPU affinity for virt timer irq · 40f4cba9
      Christoffer Dall committed
      As we are about to take physical interrupts for the virtual timer on the
      host but want to leave those active while running the VM (and let the VM
      deactivate them), we need to set the vtimer PPI affinity accordingly.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
    • C
      KVM: arm/arm64: Move timer save/restore out of the hyp code · 688c50aa
      Christoffer Dall committed
      As we are about to be lazy with saving and restoring the timer
      registers, we prepare by moving all possible timer configuration logic
      out of the hyp code.  All virtual timer registers can be programmed from
      EL1 and since the arch timer is always a level triggered interrupt we
      can safely do this with interrupts disabled in the host kernel on the
      way to the guest without taking vtimer interrupts in the host kernel
      (yet).
      
      The downside is that the cntvoff register can only be programmed from
      hyp mode, so we jump into hyp mode and back to program it.  This is also
      safe, because the host kernel doesn't use the virtual timer in the KVM
      code.  It may add a little performance penalty, but only
      until following commits where we move this operation to vcpu load/put.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
    • C
      KVM: arm/arm64: Use separate timer for phys timer emulation · f2a2129e
      Christoffer Dall committed
      We were using the same hrtimer for emulating the physical timer and for
      making sure a blocking VCPU thread would be eventually woken up.  That
      worked fine in the previous arch timer design, but as we are about to
      actually use the soft timer expire function for the physical timer
      emulation, change the logic to use a dedicated hrtimer.
      
      This has the added benefit of not having to cancel any work in the sync
      path, which in turn allows us to run the flush and sync with IRQs
      disabled.
      
      Note that the hrtimer used to program the host kernel's timer to
      generate an exit from the guest when the emulated physical timer fires
      never has to inject any work, and to share the soft_timer_cancel()
      function with the bg_timer, we change the function to only cancel any
      pending work if the pointer to the work struct is not null.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
    • C
      KVM: arm/arm64: Rename soft timer to bg_timer · 14d61fa9
      Christoffer Dall committed
      As we are about to introduce a separate hrtimer for the physical timer,
      call this timer bg_timer, because we refer to this timer as the
      background timer in the code and comments elsewhere.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
    • C
      KVM: arm/arm64: Make timer_arm and timer_disarm helpers more generic · 8409a06f
      Christoffer Dall committed
      We are about to add an additional soft timer to the arch timer state for
      a VCPU and would like to be able to reuse the functions to program and
      cancel a timer, so we make them slightly more generic and rename to make
      it more clear that these functions work on soft timers and not the
      hardware resource that this code is managing.
      
      The armed flag on the timer state is only used to assert a condition,
      and we don't rely on this assertion in any meaningful way, so we can
      simply get rid of this flag and slightly reduce complexity.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
  6. 08 Jun, 2017 4 commits
    • C
      KVM: arm/arm64: Disallow userspace control of in-kernel IRQ lines · cb3f0ad8
      Christoffer Dall committed
      When injecting an IRQ to the VGIC, you now have to present an owner
      token for that IRQ line to show that you are the owner of that line.
      
      IRQ lines driven from userspace or via an irqfd do not have an owner and
      will simply pass a NULL pointer.
      
      Also get rid of the unused kvm_vgic_inject_mapped_irq prototype.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
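The owner-token check can be illustrated with a small sketch. The structure and function names below (`struct irq_line`, `inject_irq()`) are hypothetical and the error value is a plain -1 rather than a kernel errno; the point is only that an injection succeeds when the presented token matches the line's owner, with NULL meaning "no in-kernel owner".

```c
#include <stdbool.h>
#include <stddef.h>

struct irq_line {
    const void *owner; /* NULL for lines driven by userspace or an irqfd */
    bool level;
};

/* Returns 0 on success, -1 if the caller does not own the line. */
static int inject_irq(struct irq_line *line, bool level, const void *owner)
{
    if (line->owner != owner)
        return -1;
    line->level = level;
    return 0;
}
```

Because userspace always passes NULL, it can never drive a line that an in-kernel device has claimed.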
    • C
      KVM: arm/arm64: Check if irq lines to the GIC are already used · abcb851d
      Christoffer Dall committed
      We check if other in-kernel devices have already been connected to the
      GIC for a particular interrupt line when possible.
      
      For the PMU, we can do this whenever setting the PMU interrupt number
      from userspace.
      
      For the timers, we have to wait until we try to enable the timer,
      because we have a concept of default IRQ numbers that userspace
      shouldn't have to work around in the initialization phase.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
    • C
      KVM: arm/arm64: Allow setting the timer IRQ numbers from userspace · 99a1db7a
      Christoffer Dall committed
      First we define an ABI using the vcpu devices that lets userspace set
      the interrupt numbers for the various timers on both the 32-bit and
      64-bit KVM/ARM implementations.
      
      Second, we add the definitions for the groups and attributes introduced
      by the above ABI.  (We add the PMU define on the 32-bit side as well for
      symmetry and it may get used some day.)
      
      Third, we set up the arch-specific vcpu device operation handlers to
      call into the timer code for anything related to the
      KVM_ARM_VCPU_TIMER_CTRL group.
      
      Fourth, we implement support for getting and setting the timer interrupt
      numbers using the above defined ABI in the arch timer code.
      
      Fifth, we introduce error checking upon enabling the arch timer (which
      is called when first running a VCPU) to check that all VCPUs are
      configured to use the same PPI for the timer (as mandated by the
      architecture) and that the virtual and physical timers are not
      configured to use the same IRQ number.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
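The fifth step, the consistency check performed when the timer is enabled, could be sketched like this. It is an assumption-laden illustration: `struct vcpu_timer_irqs` and `timer_irqs_valid()` are hypothetical names, and only the two conditions named in the message are checked (same IRQ numbers on every VCPU, and distinct numbers for the virtual and physical timers).

```c
#include <stdbool.h>
#include <stddef.h>

struct vcpu_timer_irqs {
    int vtimer_irq;
    int ptimer_irq;
};

/* Returns true if the per-VCPU timer IRQ configuration is consistent. */
static bool timer_irqs_valid(const struct vcpu_timer_irqs *vcpus, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* The two timers must not share an interrupt number. */
        if (vcpus[i].vtimer_irq == vcpus[i].ptimer_irq)
            return false;
        /* Every VCPU must use the same numbers as VCPU 0. */
        if (vcpus[i].vtimer_irq != vcpus[0].vtimer_irq ||
            vcpus[i].ptimer_irq != vcpus[0].ptimer_irq)
            return false;
    }
    return true;
}
```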
    • C
      KVM: arm/arm64: Move timer IRQ default init to arch_timer.c · 85e69ad7
      Christoffer Dall committed
      We currently initialize the arch timer IRQ numbers from the reset code,
      presumably because we once intended to model multiple CPU or SoC types
      from within the kernel and have hard-coded reset values in the reset
      code.
      
      As we are moving towards userspace being in charge of more fine-grained
      CPU emulation and stitching together the pieces needed to emulate a
      particular type of CPU, we should no longer have a tight coupling
      between resetting a VCPU and setting IRQ numbers.
      
      Therefore, move the logic to define and use the default IRQ numbers to
      the timer code and set the IRQ number immediately when creating the
      VCPU.
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
  7. 04 Jun, 2017 1 commit
  8. 09 Apr, 2017 3 commits
    • C
      KVM: arm/arm64: Report PMU overflow interrupts to userspace irqchip · 3dbbdf78
      Christoffer Dall committed
      When not using an in-kernel VGIC, but instead emulating an interrupt
      controller in userspace, we should report the PMU overflow status to
      that userspace interrupt controller using the KVM_CAP_ARM_USER_IRQ
      feature.
      Reviewed-by: Alexander Graf <agraf@suse.de>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • A
      KVM: arm/arm64: Support arch timers with a userspace gic · d9e13977
      Alexander Graf committed
      If you're running with a userspace gic or other interrupt controller
      (that is no vgic in the kernel), then you have so far not been able to
      use the architected timers, because the output of the architected
      timers, which are driven inside the kernel, was a kernel-only construct
      between the arch timer code and the vgic.
      
      This patch implements the new KVM_CAP_ARM_USER_IRQ feature, where we use a
      side channel on the kvm_run structure, run->s.regs.device_irq_level, to
      always notify userspace of the timer output levels when using a userspace
      irqchip.
      
      This works by ensuring that before we enter the guest, if the timer
      output level has changed compared to what we last told userspace, we
      don't enter the guest, but instead return to userspace to notify it of
      the new level.  If we are exiting, because of an MMIO for example, and
      the level changed at the same time, the value is also updated and
      userspace can sample the line as it needs.  This is nicely achieved
      by simply always updating the timer_irq_level field after the main run
      loop.
      
      Note that the kvm_timer_update_irq trace event is changed to show the
      host IRQ number for the timer instead of the guest IRQ number, because
      the kernel no longer knows which IRQ userspace wires up the timer signal
      to.
      
      Also note that this patch implements all required functionality but does
      not yet advertise the capability.
      Reviewed-by: Alexander Graf <agraf@suse.de>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
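The "compare against what userspace was last told" logic might be sketched as below. This is a simplified model: `struct run_regs`, the `DEV_VTIMER` bit and `try_enter_guest()` are hypothetical names, only a single timer line is tracked, and the guest entry itself is reduced to a boolean result.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEV_VTIMER (UINT64_C(1) << 0) /* hypothetical bit for the vtimer */

struct run_regs {
    uint64_t device_irq_level; /* levels last reported to userspace */
};

/* Has the timer output changed since userspace was last told? */
static bool level_changed(const struct run_regs *run, bool vtimer_level)
{
    bool reported = (run->device_irq_level & DEV_VTIMER) != 0;
    return reported != vtimer_level;
}

/* Record the current level in the shared structure. */
static void publish_level(struct run_regs *run, bool vtimer_level)
{
    if (vtimer_level)
        run->device_irq_level |= DEV_VTIMER;
    else
        run->device_irq_level &= ~DEV_VTIMER;
}

/* Returns true if the guest may be entered, false if we must first
 * return to userspace so it can observe the new level. */
static bool try_enter_guest(struct run_regs *run, bool vtimer_level)
{
    if (level_changed(run, vtimer_level)) {
        publish_level(run, vtimer_level);
        return false;
    }
    return true;
}
```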
    • C
      KVM: arm/arm64: Cleanup the arch timer code's irqchip checking · b22e7df2
      Christoffer Dall committed
      Currently we check if we have an in-kernel irqchip and if the vgic was
      properly implemented in several places in the arch timer code.  But, we
      already predicate our enablement of the arm timers on having a valid
      and initialized gic, so we can simply check if the timers are enabled or
      not.
      
      This also gets rid of the ugly "error that's not an error but used to
      signal that the timer shouldn't poke the gic" construct we have.
      Reviewed-by: Alexander Graf <agraf@suse.de>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
  9. 08 Feb, 2017 8 commits
  10. 01 Feb, 2017 1 commit
  11. 13 Jan, 2017 2 commits
    • J
      KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems · 488f94d7
      Jintack Lim committed
      Current KVM world switch code is unintentionally setting wrong bits to
      CNTHCTL_EL2 when E2H == 1, which may allow guest OS to access physical
      timer.  Bit positions of CNTHCTL_EL2 are changing depending on
      HCR_EL2.E2H bit.  EL1PCEN and EL1PCTEN are 1st and 0th bits when E2H is
      not set, but they are 11th and 10th bits respectively when E2H is set.
      
      In fact, on VHE we only need to set those bits once, not for every world
      switch. This is because the host kernel runs in EL2 with HCR_EL2.TGE ==
      1, which makes those bits have no effect for the host kernel execution.
      So we just set those bits once for guests, and that's it.
      Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
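The bit positions given in the message can be written down as two small helpers. The function names are hypothetical; the shift amounts are exactly the ones stated above (bits 0 and 1 without E2H, bits 10 and 11 with E2H).

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask for EL1PCTEN in CNTHCTL_EL2, depending on HCR_EL2.E2H. */
static uint64_t cnthctl_el1pcten_mask(bool e2h)
{
    return UINT64_C(1) << (e2h ? 10 : 0);
}

/* Mask for EL1PCEN in CNTHCTL_EL2, depending on HCR_EL2.E2H. */
static uint64_t cnthctl_el1pcen_mask(bool e2h)
{
    return UINT64_C(1) << (e2h ? 11 : 1);
}
```

Using the non-E2H masks on a VHE system would touch bits 0 and 1 instead of 10 and 11, which is the mix-up the fix addresses.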
    • C
      KVM: arm/arm64: Fix occasional warning from the timer work function · 63e41226
      Christoffer Dall committed
      When a VCPU blocks (WFI) and has programmed the vtimer, we program a
      soft timer to expire in the future to wake up the vcpu thread when
      appropriate.  Because such a wake up involves a vcpu kick, and the
      timer expire function can get called from interrupt context, and the
      kick may sleep, we have to schedule the kick in the work function.
      
      The work function currently has a warning that gets raised if it turns
      out that the timer shouldn't fire when it's run, which was added because
      the idea was that in that case the work should never have been cancelled.
      
      However, it turns out that this whole thing is racy and we can get
      spurious warnings.  The problem is that we clear the armed flag in the
      work function, which may run in parallel with the
      kvm_timer_unschedule->timer_disarm() call.  This results in a possible
      situation where the timer_disarm() call does not call
      cancel_work_sync(), which effectively synchronizes the completion of the
      work function with running the VCPU.  As a result, the VCPU thread
      proceeds before the work function completes, causing changes to the
      timer state such that kvm_timer_should_fire(vcpu) returns false in the
      work function.
      
      All we do in the work function is to kick the VCPU, and an occasional
      rare extra kick never harmed anyone.  Since the race above is extremely
      rare, we don't bother checking if the race happens but simply remove the
      check and the clearing of the armed flag from the work function.
      Reported-by: Matthias Brugger <mbrugger@suse.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  12. 25 Dec, 2016 2 commits