1. 29 7月, 2014 1 次提交
  2. 28 7月, 2014 2 次提交
    • A
      KVM: Allow KVM_CHECK_EXTENSION on the vm fd · 92b591a4
      Alexander Graf 提交于
      The KVM_CHECK_EXTENSION is only available on the kvm fd today. Unfortunately
      on PPC some of the capabilities change depending on the way a VM was created.
      
      So instead we need a way to expose capabilities as VM ioctl, so that we can
      see which VM type we're using (HV or PR). To enable this, add the
      KVM_CHECK_EXTENSION ioctl to our vm ioctl portfolio.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
      92b591a4
    • P
      KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling · 699a0ea0
      Paul Mackerras 提交于
      This provides a way for userspace controls which sPAPR hcalls get
      handled in the kernel.  Each hcall can be individually enabled or
      disabled for in-kernel handling, except for H_RTAS.  The exception
      for H_RTAS is because userspace can already control whether
      individual RTAS functions are handled in-kernel or not via the
      KVM_PPC_RTAS_DEFINE_TOKEN ioctl, and because the numeric value for
      H_RTAS is out of the normal sequence of hcall numbers.
      
      Hcalls are enabled or disabled using the KVM_ENABLE_CAP ioctl for the
      KVM_CAP_PPC_ENABLE_HCALL capability on the file descriptor for the VM.
      The args field of the struct kvm_enable_cap specifies the hcall number
      in args[0] and the enable/disable flag in args[1]; 0 means disable
      in-kernel handling (so that the hcall will always cause an exit to
      userspace) and 1 means enable.  Enabling or disabling in-kernel
      handling of an hcall is effective across the whole VM.
      
      The ability for KVM_ENABLE_CAP to be used on a VM file descriptor
      on PowerPC is new, added by this commit.  The KVM_CAP_ENABLE_CAP_VM
      capability advertises that this ability exists.
      
      When a VM is created, an initial set of hcalls are enabled for
      in-kernel handling.  The set that is enabled is the set that have
      an in-kernel implementation at this point.  Any new hcall
      implementations from this point onwards should not be added to the
      default set without a good reason.
      
      No distinction is made between real-mode and virtual-mode hcall
      implementations; the one setting controls them both.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      699a0ea0
  3. 10 7月, 2014 2 次提交
  4. 30 5月, 2014 1 次提交
    • A
      KVM: PPC: Add CAP to indicate hcall fixes · f2e91042
      Alexander Graf 提交于
      We worked around some nasty KVM magic page hcall breakages:
      
        1) NX bit not honored, so ignore NX when we detect it
        2) LE guests swizzle hypercall instruction
      
      Without these fixes in place, there's no way it would make sense to expose kvm
      hypercalls to a guest. Chances are immensely high it would trip over and break.
      
      So add a new CAP that gives user space a hint that we have workarounds for the
      bugs above in place. It can use those as hint to disable PV hypercalls when
      the guest CPU is anything POWER7 or higher and the host does not have fixes
      in place.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      f2e91042
  5. 06 5月, 2014 1 次提交
  6. 30 4月, 2014 2 次提交
  7. 22 4月, 2014 1 次提交
  8. 18 4月, 2014 2 次提交
    • M
      KVM: VMX: speed up wildcard MMIO EVENTFD · 68c3b4d1
      Michael S. Tsirkin 提交于
      With KVM, MMIO is much slower than PIO, due to the need to
      do page walk and emulation. But with EPT, it does not have to be: we
      know the address from the VMCS so if the address is unique, we can look
      up the eventfd directly, bypassing emulation.
      
      Unfortunately, this only works if userspace does not need to match on
      access length and data.  The implementation adds a separate FAST_MMIO
      bus internally. This serves two purposes:
          - minimize overhead for old userspace that does not use eventfd with lengtth = 0
          - minimize disruption in other code (since we don't know the length,
            devices on the MMIO bus only get a valid address in write, this
            way we don't need to touch all devices to teach them to handle
            an invalid length)
      
      At the moment, this optimization only has effect for EPT on x86.
      
      It will be possible to speed up MMIO for NPT and MMU using the same
      idea in the future.
      
      With this patch applied, on VMX MMIO EVENTFD is essentially as fast as PIO.
      I was unable to detect any measureable slowdown to non-eventfd MMIO.
      
      Making MMIO faster is important for the upcoming virtio 1.0 which
      includes an MMIO signalling capability.
      
      The idea was suggested by Peter Anvin.  Lots of thanks to Gleb for
      pre-review and suggestions.
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      68c3b4d1
    • M
      KVM: support any-length wildcard ioeventfd · f848a5a8
      Michael S. Tsirkin 提交于
      It is sometimes benefitial to ignore IO size, and only match on address.
      In hindsight this would have been a better default than matching length
      when KVM_IOEVENTFD_FLAG_DATAMATCH is not set, In particular, this kind
      of access can be optimized on VMX: there no need to do page lookups.
      This can currently be done with many ioeventfds but in a suboptimal way.
      
      However we can't change kernel/userspace ABI without risk of breaking
      some applications.
      Use len = 0 to mean "ignore length for matching" in a more optimal way.
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      f848a5a8
  9. 21 3月, 2014 2 次提交
  10. 13 3月, 2014 1 次提交
    • G
      kvm: x86: ignore ioapic polarity · 100943c5
      Gabriel L. Somlo 提交于
      Both QEMU and KVM have already accumulated a significant number of
      optimizations based on the hard-coded assumption that ioapic polarity
      will always use the ActiveHigh convention, where the logical and
      physical states of level-triggered irq lines always match (i.e.,
      active(asserted) == high == 1, inactive == low == 0). QEMU guests
      are expected to follow directions given via ACPI and configure the
      ioapic with polarity 0 (ActiveHigh). However, even when misbehaving
      guests (e.g. OS X <= 10.9) set the ioapic polarity to 1 (ActiveLow),
      QEMU will still use the ActiveHigh signaling convention when
      interfacing with KVM.
      
      This patch modifies KVM to completely ignore ioapic polarity as set by
      the guest OS, enabling misbehaving guests to work alongside those which
      comply with the ActiveHigh polarity specified by QEMU's ACPI tables.
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NGabriel L. Somlo <somlo@cmu.edu>
      [Move documentation to KVM_IRQ_LINE, add ia64. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      100943c5
  11. 30 1月, 2014 3 次提交
  12. 17 1月, 2014 1 次提交
    • V
      add support for Hyper-V reference time counter · e984097b
      Vadim Rozenfeld 提交于
      Signed-off: Peter Lieven <pl@kamp.de>
      Signed-off: Gleb Natapov
      Signed-off: Vadim Rozenfeld <vrozenfe@redhat.com>
      
      After some consideration I decided to submit only Hyper-V reference
      counters support this time. I will submit iTSC support as a separate
      patch as soon as it is ready.
      
      v1 -> v2
      1. mark TSC page dirty as suggested by
          Eric Northup <digitaleric@google.com> and Gleb
      2. disable local irq when calling get_kernel_ns,
          as it was done by Peter Lieven <pl@amp.de>
      3. move check for TSC page enable from second patch
          to this one.
      
      v3 -> v4
          Get rid of ref counter offset.
      
      v4 -> v5
          replace __copy_to_user with kvm_write_guest
          when updateing iTSC page.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      e984097b
  13. 22 12月, 2013 1 次提交
    • C
      KVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC · 7330672b
      Christoffer Dall 提交于
      Support creating the ARM VGIC device through the KVM_CREATE_DEVICE
      ioctl, which can then later be leveraged to use the
      KVM_{GET/SET}_DEVICE_ATTR, which is useful both for setting addresses in
      a more generic API than the ARM-specific one and is useful for
      save/restore of VGIC state.
      
      Adds KVM_CAP_DEVICE_CTRL to ARM capabilities.
      
      Note that we change the check for creating a VGIC from bailing out if
      any VCPUs were created, to bailing out if any VCPUs were ever run.  This
      is an important distinction that shouldn't break anything, but allows
      creating the VGIC after the VCPUs have been created.
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      7330672b
  14. 31 10月, 2013 2 次提交
    • A
      kvm: Add VFIO device · ec53500f
      Alex Williamson 提交于
      So far we've succeeded at making KVM and VFIO mostly unaware of each
      other, but areas are cropping up where a connection beyond eventfds
      and irqfds needs to be made.  This patch introduces a KVM-VFIO device
      that is meant to be a gateway for such interaction.  The user creates
      the device and can add and remove VFIO groups to it via file
      descriptors.  When a group is added, KVM verifies the group is valid
      and gets a reference to it via the VFIO external user interface.
      Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ec53500f
    • B
      kvm: Add KVM_GET_EMULATED_CPUID · 9c15bb1d
      Borislav Petkov 提交于
      Add a kvm ioctl which states which system functionality kvm emulates.
      The format used is that of CPUID and we return the corresponding CPUID
      bits set for which we do emulate functionality.
      
      Make sure ->padding is being passed on clean from userspace so that we
      can use it for something in the future, after the ioctl gets cast in
      stone.
      
      s/kvm_dev_ioctl_get_supported_cpuid/kvm_dev_ioctl_get_cpuid/ while at
      it.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9c15bb1d
  15. 18 10月, 2013 1 次提交
  16. 03 10月, 2013 1 次提交
  17. 26 8月, 2013 1 次提交
  18. 12 6月, 2013 1 次提交
  19. 11 6月, 2013 1 次提交
  20. 07 6月, 2013 1 次提交
  21. 02 5月, 2013 1 次提交
    • P
      KVM: PPC: Book3S: Add API for in-kernel XICS emulation · 5975a2e0
      Paul Mackerras 提交于
      This adds the API for userspace to instantiate an XICS device in a VM
      and connect VCPUs to it.  The API consists of a new device type for
      the KVM_CREATE_DEVICE ioctl, a new capability KVM_CAP_IRQ_XICS, which
      functions similarly to KVM_CAP_IRQ_MPIC, and the KVM_IRQ_LINE ioctl,
      which is used to assert and deassert interrupt inputs of the XICS.
      
      The XICS device has one attribute group, KVM_DEV_XICS_GRP_SOURCES.
      Each attribute within this group corresponds to the state of one
      interrupt source.  The attribute number is the same as the interrupt
      source number.
      
      This does not support irq routing or irqfd yet.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      5975a2e0
  22. 28 4月, 2013 1 次提交
  23. 27 4月, 2013 5 次提交
    • M
      KVM: PPC: Book3S: Add infrastructure to implement kernel-side RTAS calls · 8e591cb7
      Michael Ellerman 提交于
      For pseries machine emulation, in order to move the interrupt
      controller code to the kernel, we need to intercept some RTAS
      calls in the kernel itself.  This adds an infrastructure to allow
      in-kernel handlers to be registered for RTAS services by name.
      A new ioctl, KVM_PPC_RTAS_DEFINE_TOKEN, then allows userspace to
      associate token values with those service names.  Then, when the
      guest requests an RTAS service with one of those token values, it
      will be handled by the relevant in-kernel handler rather than being
      passed up to userspace as at present.
      Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: fix warning]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      8e591cb7
    • S
      kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC · eb1e4f43
      Scott Wood 提交于
      Enabling this capability connects the vcpu to the designated in-kernel
      MPIC.  Using explicit connections between vcpus and irqchips allows
      for flexibility, but the main benefit at the moment is that it
      simplifies the code -- KVM doesn't need vm-global state to remember
      which MPIC object is associated with this vm, and it doesn't need to
      care about ordering between irqchip creation and vcpu creation.
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      [agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      eb1e4f43
    • S
      kvm/ppc/mpic: in-kernel MPIC emulation · 5df554ad
      Scott Wood 提交于
      Hook the MPIC code up to the KVM interfaces, add locking, etc.
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      [agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      5df554ad
    • S
      kvm: add device control API · 852b6d57
      Scott Wood 提交于
      Currently, devices that are emulated inside KVM are configured in a
      hardcoded manner based on an assumption that any given architecture
      only has one way to do it.  If there's any need to access device state,
      it is done through inflexible one-purpose-only IOCTLs (e.g.
      KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
      cumbersome and depletes a limited numberspace.
      
      This API provides a mechanism to instantiate a device of a certain
      type, returning an ID that can be used to set/get attributes of the
      device.  Attributes may include configuration parameters (e.g.
      register base address), device state, operational commands, etc.  It
      is similar to the ONE_REG API, except that it acts on devices rather
      than vcpus.
      
      Both device types and individual attributes can be tested without having
      to create the device or get/set the attribute, without the need for
      separately managing enumerated capabilities.
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      852b6d57
    • A
      KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing · 948a902c
      Alexander Graf 提交于
      We have a capability enquire system that allows user space to ask kvm
      whether a feature is available.
      
      The point behind this system is that we can have different kernel
      configurations with different capabilities and user space can adjust
      accordingly.
      
      Because features can always be non existent, we can drop any #ifdefs
      on CAP defines that could be used generically, like the irq routing
      bits. These can be easily reused for non-IOAPIC systems as well.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      948a902c
  24. 06 3月, 2013 1 次提交
  25. 12 2月, 2013 1 次提交
  26. 24 1月, 2013 3 次提交