1. 12 5月, 2016 3 次提交
    • P
      KVM: PPC: Book3S HV: Re-enable XICS fast path for irqfd-generated interrupts · b1a4286b
      Paul Mackerras 提交于
      Commit c9a5ecca ("kvm/eventfd: add arch-specific set_irq",
      2015-10-16) added the possibility for architecture-specific code
      to handle the generation of virtual interrupts in atomic context
      where possible, without having to schedule a work function.
      
      Since we can easily generate virtual interrupts on XICS without
      having to do anything worse than take a spinlock, we define a
      kvm_arch_set_irq_inatomic() for XICS.  We also remove kvm_set_msi()
      since it is not used any more.
      
      The one slightly tricky thing is that with the new interface, we
      don't get told whether the interrupt is an MSI (or other edge
      sensitive interrupt) vs. level-sensitive.  The difference as far
      as interrupt generation is concerned is that for LSIs we have to
      set the asserted flag so it will continue to fire until it is
      explicitly cleared.
      
      In fact the XICS code gets told which interrupts are LSIs by userspace
      when it configures the interrupt via the KVM_DEV_XICS_GRP_SOURCES
      attribute group on the XICS device.  To store this information, we add
      a new "lsi" field to struct ics_irq_state.  With that we can also do a
      better job of returning accurate values when reading the attribute
      group.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      b1a4286b
    • A
      kvm: Conditionally register IRQ bypass consumer · 14717e20
      Alex Williamson 提交于
      If we don't support a mechanism for bypassing IRQs, don't register as
      a consumer.  This eliminates meaningless dev_info()s when the connect
      fails between producer and consumer, such as on AMD systems where
      kvm_x86_ops->update_pi_irte is not implemented
      Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      14717e20
    • G
      kvm: introduce KVM_MAX_VCPU_ID · 0b1b1dfd
      Greg Kurz 提交于
      The KVM_MAX_VCPUS define provides the maximum number of vCPUs per guest, and
      also the upper limit for vCPU ids. This is okay for all archs except PowerPC
      which can have higher ids, depending on the cpu/core/thread topology. In the
      worst case (single threaded guest, host with 8 threads per core), it limits
      the maximum number of vCPUS to KVM_MAX_VCPUS / 8.
      
      This patch separates the vCPU numbering from the total number of vCPUs, with
      the introduction of KVM_MAX_VCPU_ID, as the maximal valid value for vCPU ids
      plus one.
      
      The corresponding KVM_CAP_MAX_VCPU_ID allows userspace to validate vCPU ids
      before passing them to KVM_CREATE_VCPU.
      
      This patch only implements KVM_MAX_VCPU_ID with a specific value for PowerPC.
      Other archs continue to return KVM_MAX_VCPUS instead.
      Suggested-by: NRadim Krcmar <rkrcmar@redhat.com>
      Signed-off-by: NGreg Kurz <gkurz@linux.vnet.ibm.com>
      Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      0b1b1dfd
  2. 11 5月, 2016 4 次提交
  3. 10 5月, 2016 6 次提交
    • J
      MIPS: KVM: Add missing disable FPU hazard barriers · 4ac33429
      James Hogan 提交于
      Add the necessary hazard barriers after disabling the FPU in
      kvm_lose_fpu(), just to be safe.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      4ac33429
    • J
      MIPS: KVM: Fix preemption warning reading FPU capability · 556f2a52
      James Hogan 提交于
      Reading the KVM_CAP_MIPS_FPU capability returns cpu_has_fpu, however
      this uses smp_processor_id() to read the current CPU capabilities (since
      some old MIPS systems could have FPUs present on only a subset of CPUs).
      
      We don't support any such systems, so work around the warning by using
      raw_cpu_has_fpu instead.
      
      We should probably instead claim not to support FPU at all if any one
      CPU is lacking an FPU, but this should do for now.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      556f2a52
    • J
      MIPS: KVM: Fix preemptable kvm_mips_get_*_asid() calls · f049729c
      James Hogan 提交于
      There are a couple of places in KVM fault handling code which implicitly
      use smp_processor_id() via kvm_mips_get_kernel_asid() and
      kvm_mips_get_user_asid() from preemptable context. This is unsafe as a
      preemption could cause the guest kernel ASID to be changed, resulting in
      a host TLB entry being written with the wrong ASID.
      
      Fix by disabling preemption around the kvm_mips_get_*_asid() call and
      the corresponding kvm_mips_host_tlb_write().
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f049729c
    • J
      MIPS: KVM: Fix timer IRQ race when writing CP0_Compare · b45bacd2
      James Hogan 提交于
      Writing CP0_Compare clears the timer interrupt pending bit
      (CP0_Cause.TI), but this wasn't being done atomically. If a timer
      interrupt raced with the write of the guest CP0_Compare, the timer
      interrupt could end up being pending even though the new CP0_Compare is
      nowhere near CP0_Count.
      
      We were already updating the hrtimer expiry with
      kvm_mips_update_hrtimer(), which used both kvm_mips_freeze_hrtimer() and
      kvm_mips_resume_hrtimer(). Close the race window by expanding out
      kvm_mips_update_hrtimer(), and clearing CP0_Cause.TI and setting
      CP0_Compare between the freeze and resume. Since the pending timer
      interrupt should not be cleared when CP0_Compare is written via the KVM
      user API, an ack argument is added to distinguish the source of the
      write.
      
      Fixes: e30492bb ("MIPS: KVM: Rewrite count/compare timer emulation")
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.16.x-
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      b45bacd2
    • J
      MIPS: KVM: Fix timer IRQ race when freezing timer · 4355c44f
      James Hogan 提交于
      There's a particularly narrow and subtle race condition when the
      software emulated guest timer is frozen which can allow a guest timer
      interrupt to be missed.
      
      This happens due to the hrtimer expiry being inexact, so very
      occasionally the freeze time will be after the moment when the emulated
      CP0_Count transitions to the same value as CP0_Compare (so an IRQ should
      be generated), but before the moment when the hrtimer is due to expire
      (so no IRQ is generated). The IRQ won't be generated when the timer is
      resumed either, since the resume CP0_Count will already match CP0_Compare.
      
      With VZ guests in particular this is far more likely to happen, since
      the soft timer may be frozen frequently in order to restore the timer
      state to the hardware guest timer. This happens after 5-10 hours of
      guest soak testing, resulting in an overflow in guest kernel timekeeping
      calculations, hanging the guest. A more focussed test case to
      intentionally hit the race (with the help of a new hypcall to cause the
      timer state to migrated between hardware & software) hits the condition
      fairly reliably within around 30 seconds.
      
      Instead of relying purely on the inexact hrtimer expiry to determine
      whether an IRQ should be generated, read the guest CP0_Compare and
      directly check whether the freeze time is before or after it. Only if
      CP0_Count is on or after CP0_Compare do we check the hrtimer expiry to
      determine whether the last IRQ has already been generated (which will
      have pushed back the expiry by one timer period).
      
      Fixes: e30492bb ("MIPS: KVM: Rewrite count/compare timer emulation")
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.16.x-
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      4355c44f
    • C
      kvm: arm64: Enable hardware updates of the Access Flag for Stage 2 page tables · 06485053
      Catalin Marinas 提交于
      The ARMv8.1 architecture extensions introduce support for hardware
      updates of the access and dirty information in page table entries. With
      VTCR_EL2.HA enabled (bit 21), when the CPU accesses an IPA with the
      PTE_AF bit cleared in the stage 2 page table, instead of raising an
      Access Flag fault to EL2 the CPU sets the actual page table entry bit
      (10). To ensure that kernel modifications to the page table do not
      inadvertently revert a bit set by hardware updates, certain Stage 2
      software pte/pmd operations must be performed atomically.
      
      The main user of the AF bit is the kvm_age_hva() mechanism. The
      kvm_age_hva_handler() function performs a "test and clear young" action
      on the pte/pmd. This needs to be atomic in respect of automatic hardware
      updates of the AF bit. Since the AF bit is in the same position for both
      Stage 1 and Stage 2, the patch reuses the existing
      ptep_test_and_clear_young() functionality if
      __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG is defined. Otherwise, the
      existing pte_young/pte_mkold mechanism is preserved.
      
      The kvm_set_s2pte_readonly() (and the corresponding pmd equivalent) have
      to perform atomic modifications in order to avoid a race with updates of
      the AF bit. The arm64 implementation has been re-written using
      exclusives.
      
      Currently, kvm_set_s2pte_writable() (and pmd equivalent) take a pointer
      argument and modify the pte/pmd in place. However, these functions are
      only used on local variables rather than actual page table entries, so
      it makes more sense to follow the pte_mkwrite() approach for stage 1
      attributes. The change to kvm_s2pte_mkwrite() makes it clear that these
      functions do not modify the actual page table entries.
      
      The (pte|pmd)_mkyoung() uses on Stage 2 entries (setting the AF bit
      explicitly) do not need to be modified since hardware updates of the
      dirty status are not supported by KVM, so there is no possibility of
      losing such information.
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      06485053
  4. 09 5月, 2016 8 次提交
  5. 04 5月, 2016 2 次提交
  6. 03 5月, 2016 1 次提交
  7. 29 4月, 2016 2 次提交
  8. 21 4月, 2016 14 次提交