1. 17 9月, 2015 1 次提交
  2. 16 9月, 2015 1 次提交
    • M
      arm: KVM: Fix incorrect device to IPA mapping · ca09f02f
      Marek Majtyka 提交于
      A critical bug has been found in device memory stage1 translation for
      VMs with more then 4GB of address space. Once vm_pgoff size is smaller
      then pa (which is true for LPAE case, u32 and u64 respectively) some
      more significant bits of pa may be lost as a shift operation is performed
      on u32 and later cast onto u64.
      
      Example: vm_pgoff(u32)=0x00210030, PAGE_SHIFT=12
              expected pa(u64):   0x0000002010030000
              produced pa(u64):   0x0000000010030000
      
      The fix is to change the order of operations (casting first onto phys_addr_t
      and then shifting).
      Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
      [maz: fixed changelog and patch formatting]
      Cc: stable@vger.kernel.org
      Signed-off-by: NMarek Majtyka <marek.majtyka@tieto.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      ca09f02f
  3. 05 9月, 2015 1 次提交
  4. 20 8月, 2015 1 次提交
  5. 12 8月, 2015 4 次提交
  6. 21 7月, 2015 3 次提交
  7. 17 6月, 2015 6 次提交
    • M
      arm/arm64: KVM: vgic: Do not save GICH_HCR / ICH_HCR_EL2 · 4642019d
      Marc Zyngier 提交于
      The GIC Hypervisor Configuration Register is used to enable
      the delivery of virtual interupts to a guest, as well as to
      define in which conditions maintenance interrupts are delivered
      to the host.
      
      This register doesn't contain any information that we need to
      read back (the EOIcount is utterly useless for us).
      
      So let's save ourselves some cycles, and not save it before
      writing zero to it.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      4642019d
    • L
      ARM: kvm: psci: fix handling of unimplemented functions · e2d99736
      Lorenzo Pieralisi 提交于
      According to the PSCI specification and the SMC/HVC calling
      convention, PSCI function_ids that are not implemented must
      return NOT_SUPPORTED as return value.
      
      Current KVM implementation takes an unhandled PSCI function_id
      as an error and injects an undefined instruction into the guest
      if PSCI implementation is called with a function_id that is not
      handled by the resident PSCI version (ie it is not implemented),
      which is not the behaviour expected by a guest when calling a
      PSCI function_id that is not implemented.
      
      This patch fixes this issue by returning NOT_SUPPORTED whenever
      the kvm PSCI call is executed for a function_id that is not
      implemented by the PSCI kvm layer.
      
      Cc: <stable@vger.kernel.org> # 3.18+
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Acked-by: NSudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      e2d99736
    • K
      KVM: arm/arm64: Enable the KVM-VFIO device · 8889583c
      Kim Phillips 提交于
      The KVM-VFIO device is used by the QEMU VFIO device. It is used to
      record the list of in-use VFIO groups so that KVM can manipulate
      them.
      Signed-off-by: NKim Phillips <kim.phillips@linaro.org>
      Signed-off-by: NEric Auger <eric.auger@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      8889583c
    • C
      arm/arm64: KVM: Properly account for guest CPU time · 1b3d546d
      Christoffer Dall 提交于
      Until now we have been calling kvm_guest_exit after re-enabling
      interrupts when we come back from the guest, but this has the
      unfortunate effect that CPU time accounting done in the context of timer
      interrupts occurring while the guest is running doesn't properly notice
      that the time since the last tick was spent in the guest.
      
      Inspired by the comment in the x86 code, move the kvm_guest_exit() call
      below the local_irq_enable() call and change __kvm_guest_exit() to
      kvm_guest_exit(), because we are now calling this function with
      interrupts enabled.  We have to now explicitly disable preemption and
      not enable preemption before we've called kvm_guest_exit(), since
      otherwise we could be preempted and everything happening before we
      eventually get scheduled again would be accounted for as guest time.
      
      At the same time, move the trace_kvm_exit() call outside of the atomic
      section, since there is no reason for us to do that with interrupts
      disabled.
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      1b3d546d
    • T
      kvm: remove one useless check extension · ea2c6d97
      Tiejun Chen 提交于
      We already check KVM_CAP_IRQFD in generic once enable CONFIG_HAVE_KVM_IRQFD,
      
      kvm_vm_ioctl_check_extension_generic()
          |
          + switch (arg) {
          +   ...
          +   #ifdef CONFIG_HAVE_KVM_IRQFD
          +       case KVM_CAP_IRQFD:
          +   #endif
          +   ...
          +   return 1;
          +   ...
          + }
          |
          + kvm_vm_ioctl_check_extension()
      
      So its not necessary to check this in arch again, and also fix one typo,
      s/emlation/emulation.
      Signed-off-by: NTiejun Chen <tiejun.chen@intel.com>
      Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      ea2c6d97
    • M
      arm: KVM: force execution of HCPTR access on VM exit · 85e84ba3
      Marc Zyngier 提交于
      On VM entry, we disable access to the VFP registers in order to
      perform a lazy save/restore of these registers.
      
      On VM exit, we restore access, test if we did enable them before,
      and save/restore the guest/host registers if necessary. In this
      sequence, the FPEXC register is always accessed, irrespective
      of the trapping configuration.
      
      If the guest didn't touch the VFP registers, then the HCPTR access
      has now enabled such access, but we're missing a barrier to ensure
      architectural execution of the new HCPTR configuration. If the HCPTR
      access has been delayed/reordered, the subsequent access to FPEXC
      will cause a trap, which we aren't prepared to handle at all.
      
      The same condition exists when trapping to enable VFP for the guest.
      
      The fix is to introduce a barrier after enabling VFP access. In the
      vmexit case, it can be relaxed to only takes place if the guest hasn't
      accessed its view of the VFP registers, making the access to FPEXC safe.
      
      The set_hcptr macro is modified to deal with both vmenter/vmexit and
      vmtrap operations, and now takes an optional label that is branched to
      when the guest hasn't touched the VFP registers.
      Reported-by: NVikram Sethi <vikrams@codeaurora.org>
      Cc: stable@kernel.org	# v3.9+
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      85e84ba3
  8. 10 6月, 2015 1 次提交
  9. 28 5月, 2015 1 次提交
  10. 27 5月, 2015 1 次提交
  11. 26 5月, 2015 3 次提交
  12. 09 5月, 2015 1 次提交
  13. 07 5月, 2015 1 次提交
  14. 22 4月, 2015 1 次提交
    • A
      KVM: arm/arm64: check IRQ number on userland injection · fd1d0ddf
      Andre Przywara 提交于
      When userland injects a SPI via the KVM_IRQ_LINE ioctl we currently
      only check it against a fixed limit, which historically is set
      to 127. With the new dynamic IRQ allocation the effective limit may
      actually be smaller (64).
      So when now a malicious or buggy userland injects a SPI in that
      range, we spill over on our VGIC bitmaps and bytemaps memory.
      I could trigger a host kernel NULL pointer dereference with current
      mainline by injecting some bogus IRQ number from a hacked kvmtool:
      -----------------
      ....
      DEBUG: kvm_vgic_inject_irq(kvm, cpu=0, irq=114, level=1)
      DEBUG: vgic_update_irq_pending(kvm, cpu=0, irq=114, level=1)
      DEBUG: IRQ #114 still in the game, writing to bytemap now...
      Unable to handle kernel NULL pointer dereference at virtual address 00000000
      pgd = ffffffc07652e000
      [00000000] *pgd=00000000f658b003, *pud=00000000f658b003, *pmd=0000000000000000
      Internal error: Oops: 96000006 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 1 PID: 1053 Comm: lkvm-msi-irqinj Not tainted 4.0.0-rc7+ #3027
      Hardware name: FVP Base (DT)
      task: ffffffc0774e9680 ti: ffffffc0765a8000 task.ti: ffffffc0765a8000
      PC is at kvm_vgic_inject_irq+0x234/0x310
      LR is at kvm_vgic_inject_irq+0x30c/0x310
      pc : [<ffffffc0000ae0a8>] lr : [<ffffffc0000ae180>] pstate: 80000145
      .....
      
      So this patch fixes this by checking the SPI number against the
      actual limit. Also we remove the former legacy hard limit of
      127 in the ioctl code.
      Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      CC: <stable@vger.kernel.org> # 4.0, 3.19, 3.18
      [maz: wrap KVM_ARM_IRQ_GIC_MAX with #ifndef __KERNEL__,
      as suggested by Christopher Covington]
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      fd1d0ddf
  15. 31 3月, 2015 2 次提交
    • N
      KVM: arm/arm64: enable KVM_CAP_IOEVENTFD · d44758c0
      Nikolay Nikolaev 提交于
      As the infrastructure for eventfd has now been merged, report the
      ioeventfd capability as being supported.
      Signed-off-by: NNikolay Nikolaev <n.nikolaev@virtualopensystems.com>
      [maz: grouped the case entry with the others, fixed commit log]
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      d44758c0
    • A
      KVM: arm/arm64: rework MMIO abort handling to use KVM MMIO bus · 950324ab
      Andre Przywara 提交于
      Currently we have struct kvm_exit_mmio for encapsulating MMIO abort
      data to be passed on from syndrome decoding all the way down to the
      VGIC register handlers. Now as we switch the MMIO handling to be
      routed through the KVM MMIO bus, it does not make sense anymore to
      use that structure already from the beginning. So we keep the data in
      local variables until we put them into the kvm_io_bus framework.
      Then we fill kvm_exit_mmio in the VGIC only, making it a VGIC private
      structure. On that way we replace the data buffer in that structure
      with a pointer pointing to a single location in a local variable, so
      we get rid of some copying on the way.
      With all of the virtual GIC emulation code now being registered with
      the kvm_io_bus, we can remove all of the old MMIO handling code and
      its dispatching functionality.
      
      I didn't bother to rename kvm_exit_mmio (to vgic_mmio or something),
      because that touches a lot of code lines without any good reason.
      
      This is based on an original patch by Nikolay.
      Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
      Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
      Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      950324ab
  16. 27 3月, 2015 2 次提交
  17. 23 3月, 2015 1 次提交
  18. 20 3月, 2015 1 次提交
    • A
      ARM, arm64: kvm: get rid of the bounce page · 06f75a1f
      Ard Biesheuvel 提交于
      The HYP init bounce page is a runtime construct that ensures that the
      HYP init code does not cross a page boundary. However, this is something
      we can do perfectly well at build time, by aligning the code appropriately.
      
      For arm64, we just align to 4 KB, and enforce that the code size is less
      than 4 KB, regardless of the chosen page size.
      
      For ARM, the whole code is less than 256 bytes, so we tweak the linker
      script to align at a power of 2 upper bound of the code size
      
      Note that this also fixes a benign off-by-one error in the original bounce
      page code, where a bounce page would be allocated unnecessarily if the code
      was exactly 1 page in size.
      
      On ARM, it also fixes an issue with very large kernels reported by Arnd
      Bergmann, where stub sections with linker emitted veneers could erroneously
      trigger the size/alignment ASSERT() in the linker script.
      Tested-by: NMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      06f75a1f
  19. 14 3月, 2015 2 次提交
    • C
      arm/arm64: KVM: Fix migration race in the arch timer · 1a748478
      Christoffer Dall 提交于
      When a VCPU is no longer running, we currently check to see if it has a
      timer scheduled in the future, and if it does, we schedule a host
      hrtimer to notify is in case the timer expires while the VCPU is still
      not running.  When the hrtimer fires, we mask the guest's timer and
      inject the timer IRQ (still relying on the guest unmasking the time when
      it receives the IRQ).
      
      This is all good and fine, but when migration a VM (checkpoint/restore)
      this introduces a race.  It is unlikely, but possible, for the following
      sequence of events to happen:
      
       1. Userspace stops the VM
       2. Hrtimer for VCPU is scheduled
       3. Userspace checkpoints the VGIC state (no pending timer interrupts)
       4. The hrtimer fires, schedules work in a workqueue
       5. Workqueue function runs, masks the timer and injects timer interrupt
       6. Userspace checkpoints the timer state (timer masked)
      
      At restore time, you end up with a masked timer without any timer
      interrupts and your guest halts never receiving timer interrupts.
      
      Fix this by only kicking the VCPU in the workqueue function, and sample
      the expired state of the timer when entering the guest again and inject
      the interrupt and mask the timer only then.
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      1a748478
    • A
      arm/arm64: KVM: export VCPU power state via MP_STATE ioctl · ecccf0cc
      Alex Bennée 提交于
      To cleanly restore an SMP VM we need to ensure that the current pause
      state of each vcpu is correctly recorded. Things could get confused if
      the CPU starts running after migration restore completes when it was
      paused before it state was captured.
      
      We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
      interface is a lot simpler as the only valid states are
      KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_STOPPED.
      Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      ecccf0cc
  20. 13 3月, 2015 3 次提交
  21. 12 3月, 2015 3 次提交