1. 24 1月, 2013 3 次提交
    • C
      KVM: ARM: User space API for getting/setting co-proc registers · 1138245c
      Christoffer Dall 提交于
      The following three ioctls are implemented:
       -  KVM_GET_REG_LIST
       -  KVM_GET_ONE_REG
       -  KVM_SET_ONE_REG
      
      Now we have a table for all the cp15 registers, we can drive a generic
      API.
      
      The register IDs carry the following encoding:
      
      ARM registers are mapped using the lower 32 bits.  The upper 16 of that
      is the register group type, or coprocessor number:
      
      ARM 32-bit CP15 registers have the following id bit patterns:
        0x4002 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>
      
      ARM 64-bit CP15 registers have the following id bit patterns:
        0x4003 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>
      
      For futureproofing, we need to tell QEMU about the CP15 registers the
      host lets the guest access.
      
      It will need this information to restore a current guest on a future
      CPU or perhaps a future KVM which allow some of these to be changed.
      
      We use a separate table for these, as they're only for the userspace API.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NChristoffer Dall <c.dall@virtualopensystems.com>
      1138245c
    • C
      KVM: ARM: Inject IRQs and FIQs from userspace · 86ce8535
      Christoffer Dall 提交于
      All interrupt injection is now based on the VM ioctl KVM_IRQ_LINE.  This
      works semantically well for the GIC as we in fact raise/lower a line on
      a machine component (the gic).  The IOCTL uses the follwing struct.
      
      struct kvm_irq_level {
      	union {
      		__u32 irq;     /* GSI */
      		__s32 status;  /* not used for KVM_IRQ_LEVEL */
      	};
      	__u32 level;           /* 0 or 1 */
      };
      
      ARM can signal an interrupt either at the CPU level, or at the in-kernel irqchip
      (GIC), and for in-kernel irqchip can tell the GIC to use PPIs designated for
      specific cpus.  The irq field is interpreted like this:
      
        bits:  | 31 ... 24 | 23  ... 16 | 15    ...    0 |
        field: | irq_type  | vcpu_index |   irq_number   |
      
      The irq_type field has the following values:
      - irq_type[0]: out-of-kernel GIC: irq_number 0 is IRQ, irq_number 1 is FIQ
      - irq_type[1]: in-kernel GIC: SPI, irq_number between 32 and 1019 (incl.)
                     (the vcpu_index field is ignored)
      - irq_type[2]: in-kernel GIC: PPI, irq_number between 16 and 31 (incl.)
      
      The irq_number thus corresponds to the irq ID in as in the GICv2 specs.
      
      This is documented in Documentation/kvm/api.txt.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NChristoffer Dall <c.dall@virtualopensystems.com>
      86ce8535
    • C
      KVM: ARM: Initial skeleton to compile KVM support · 749cf76c
      Christoffer Dall 提交于
      Targets KVM support for Cortex A-15 processors.
      
      Contains all the framework components, make files, header files, some
      tracing functionality, and basic user space API.
      
      Only supported core is Cortex-A15 for now.
      
      Most functionality is in arch/arm/kvm/* or arch/arm/include/asm/kvm_*.h.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <c.dall@virtualopensystems.com>
      749cf76c
  2. 06 12月, 2012 2 次提交
    • M
      KVM: PPC: booke: Get/set guest EPCR register using ONE_REG interface · 352df1de
      Mihai Caraman 提交于
      Implement ONE_REG interface for EPCR register adding KVM_REG_PPC_EPCR to
      the list of ONE_REG PPC supported registers.
      Signed-off-by: NMihai Caraman <mihai.caraman@freescale.com>
      [agraf: remove HV dependency, use get/put_user]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      352df1de
    • P
      KVM: PPC: Book3S HV: Provide a method for userspace to read and write the HPT · a2932923
      Paul Mackerras 提交于
      A new ioctl, KVM_PPC_GET_HTAB_FD, returns a file descriptor.  Reads on
      this fd return the contents of the HPT (hashed page table), writes
      create and/or remove entries in the HPT.  There is a new capability,
      KVM_CAP_PPC_HTAB_FD, to indicate the presence of the ioctl.  The ioctl
      takes an argument structure with the index of the first HPT entry to
      read out and a set of flags.  The flags indicate whether the user is
      intending to read or write the HPT, and whether to return all entries
      or only the "bolted" entries (those with the bolted bit, 0x10, set in
      the first doubleword).
      
      This is intended for use in implementing qemu's savevm/loadvm and for
      live migration.  Therefore, on reads, the first pass returns information
      about all HPTEs (or all bolted HPTEs).  When the first pass reaches the
      end of the HPT, it returns from the read.  Subsequent reads only return
      information about HPTEs that have changed since they were last read.
      A read that finds no changed HPTEs in the HPT following where the last
      read finished will return 0 bytes.
      
      The format of the data provides a simple run-length compression of the
      invalid entries.  Each block of data starts with a header that indicates
      the index (position in the HPT, which is just an array), the number of
      valid entries starting at that index (may be zero), and the number of
      invalid entries following those valid entries.  The valid entries, 16
      bytes each, follow the header.  The invalid entries are not explicitly
      represented.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: fix documentation]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      a2932923
  3. 30 10月, 2012 1 次提交
  4. 11 10月, 2012 1 次提交
  5. 06 10月, 2012 5 次提交
    • P
      KVM: PPC: Book3S HV: Provide a way for userspace to get/set per-vCPU areas · 55b665b0
      Paul Mackerras 提交于
      The PAPR paravirtualization interface lets guests register three
      different types of per-vCPU buffer areas in its memory for communication
      with the hypervisor.  These are called virtual processor areas (VPAs).
      Currently the hypercalls to register and unregister VPAs are handled
      by KVM in the kernel, and userspace has no way to know about or save
      and restore these registrations across a migration.
      
      This adds "register" codes for these three areas that userspace can
      use with the KVM_GET/SET_ONE_REG ioctls to see what addresses have
      been registered, and to register or unregister them.  This will be
      needed for guest hibernation and migration, and is also needed so
      that userspace can unregister them on reset (otherwise we corrupt
      guest memory after reboot by writing to the VPAs registered by the
      previous kernel).
      
      The "register" for the VPA is a 64-bit value containing the address,
      since the length of the VPA is fixed.  The "registers" for the SLB
      shadow buffer and dispatch trace log (DTL) are 128 bits long,
      consisting of the guest physical address in the high (first) 64 bits
      and the length in the low 64 bits.
      
      This also fixes a bug where we were calling init_vpa unconditionally,
      leading to an oops when unregistering the VPA.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      55b665b0
    • P
      KVM: PPC: Book3S: Get/set guest FP regs using the GET/SET_ONE_REG interface · a8bd19ef
      Paul Mackerras 提交于
      This enables userspace to get and set all the guest floating-point
      state using the KVM_[GS]ET_ONE_REG ioctls.  The floating-point state
      includes all of the traditional floating-point registers and the
      FPSCR (floating point status/control register), all the VMX/Altivec
      vector registers and the VSCR (vector status/control register), and
      on POWER7, the vector-scalar registers (note that each FP register
      is the high-order half of the corresponding VSR).
      
      Most of these are implemented in common Book 3S code, except for VSX
      on POWER7.  Because HV and PR differ in how they store the FP and VSX
      registers on POWER7, the code for these cases is not common.  On POWER7,
      the FP registers are the upper halves of the VSX registers vsr0 - vsr31.
      PR KVM stores vsr0 - vsr31 in two halves, with the upper halves in the
      arch.fpr[] array and the lower halves in the arch.vsr[] array, whereas
      HV KVM on POWER7 stores the whole VSX register in arch.vsr[].
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: fix whitespace, vsx compilation]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      a8bd19ef
    • P
      KVM: PPC: Book3S: Get/set guest SPRs using the GET/SET_ONE_REG interface · a136a8bd
      Paul Mackerras 提交于
      This enables userspace to get and set various SPRs (special-purpose
      registers) using the KVM_[GS]ET_ONE_REG ioctls.  With this, userspace
      can get and set all the SPRs that are part of the guest state, either
      through the KVM_[GS]ET_REGS ioctls, the KVM_[GS]ET_SREGS ioctls, or
      the KVM_[GS]ET_ONE_REG ioctls.
      
      The SPRs that are added here are:
      
      - DABR:  Data address breakpoint register
      - DSCR:  Data stream control register
      - PURR:  Processor utilization of resources register
      - SPURR: Scaled PURR
      - DAR:   Data address register
      - DSISR: Data storage interrupt status register
      - AMR:   Authority mask register
      - UAMOR: User authority mask override register
      - MMCR0, MMCR1, MMCRA: Performance monitor unit control registers
      - PMC1..PMC8: Performance monitor unit counter registers
      
      In order to reduce code duplication between PR and HV KVM code, this
      moves the kvm_vcpu_ioctl_[gs]et_one_reg functions into book3s.c and
      centralizes the copying between user and kernel space there.  The
      registers that are handled differently between PR and HV, and those
      that exist only in one flavor, are handled in kvmppc_[gs]et_one_reg()
      functions that are specific to each flavor.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: minimal style fixes]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      a136a8bd
    • B
      Document IACx/DACx registers access using ONE_REG API · 2e232702
      Bharat Bhushan 提交于
      Patch to access the debug registers (IACx/DACx) using ONE_REG api
      was sent earlier. But that missed the respective documentation.
      
      Also corrected the index number referencing in section 4.69
      Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      2e232702
    • L
      KVM: PPC: Add support for ePAPR idle hcall in host kernel · 9202e076
      Liu Yu-B13201 提交于
      And add a new flag definition in kvm_ppc_pvinfo to indicate
      whether the host supports the EV_IDLE hcall.
      Signed-off-by: NLiu Yu <yu.liu@freescale.com>
      [stuart.yoder@freescale.com: cleanup,fixes for conditions allowing idle]
      Signed-off-by: NStuart Yoder <stuart.yoder@freescale.com>
      [agraf: fix typo]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      9202e076
  6. 23 9月, 2012 1 次提交
    • A
      KVM: Add resampling irqfds for level triggered interrupts · 7a84428a
      Alex Williamson 提交于
      To emulate level triggered interrupts, add a resample option to
      KVM_IRQFD.  When specified, a new resamplefd is provided that notifies
      the user when the irqchip has been resampled by the VM.  This may, for
      instance, indicate an EOI.  Also in this mode, posting of an interrupt
      through an irqfd only asserts the interrupt.  On resampling, the
      interrupt is automatically de-asserted prior to user notification.
      This enables level triggered interrupts to be posted and re-enabled
      from vfio with no userspace intervention.
      
      All resampling irqfds can make use of a single irq source ID, so we
      reserve a new one for this interface.
      Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      7a84428a
  7. 09 9月, 2012 1 次提交
  8. 22 8月, 2012 1 次提交
  9. 03 7月, 2012 1 次提交
  10. 30 5月, 2012 1 次提交
    • P
      KVM: PPC: Book3S HV: Make the guest hash table size configurable · 32fad281
      Paul Mackerras 提交于
      This adds a new ioctl to enable userspace to control the size of the guest
      hashed page table (HPT) and to clear it out when resetting the guest.
      The KVM_PPC_ALLOCATE_HTAB ioctl is a VM ioctl and takes as its parameter
      a pointer to a u32 containing the desired order of the HPT (log base 2
      of the size in bytes), which is updated on successful return to the
      actual order of the HPT which was allocated.
      
      There must be no vcpus running at the time of this ioctl.  To enforce
      this, we now keep a count of the number of vcpus running in
      kvm->arch.vcpus_running.
      
      If the ioctl is called when a HPT has already been allocated, we don't
      reallocate the HPT but just clear it out.  We first clear the
      kvm->arch.rma_setup_done flag, which has two effects: (a) since we hold
      the kvm->lock mutex, it will prevent any vcpus from starting to run until
      we're done, and (b) it means that the first vcpu to run after we're done
      will re-establish the VRMA if necessary.
      
      If userspace doesn't call this ioctl before running the first vcpu, the
      kernel will allocate a default-sized HPT at that point.  We do it then
      rather than when creating the VM, as the code did previously, so that
      userspace has a chance to do the ioctl if it wants.
      
      When allocating the HPT, we can allocate either from the kernel page
      allocator, or from the preallocated pool.  If userspace is asking for
      a different size from the preallocated HPTs, we first try to allocate
      using the kernel page allocator.  Then we try to allocate from the
      preallocated pool, and then if that fails, we try allocating decreasing
      sizes from the kernel page allocator, down to the minimum size allowed
      (256kB).  Note that the kernel page allocator limits allocations to
      1 << CONFIG_FORCE_MAX_ZONEORDER pages, which by default corresponds to
      16MB (on 64-bit powerpc, at least).
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: fix module compilation]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      32fad281
  11. 06 5月, 2012 1 次提交
  12. 28 4月, 2012 3 次提交
  13. 24 4月, 2012 1 次提交
  14. 08 4月, 2012 1 次提交
  15. 08 3月, 2012 1 次提交
  16. 05 3月, 2012 9 次提交
  17. 27 12月, 2011 1 次提交
  18. 26 12月, 2011 2 次提交
    • J
      KVM: Don't automatically expose the TSC deadline timer in cpuid · 4d25a066
      Jan Kiszka 提交于
      Unlike all of the other cpuid bits, the TSC deadline timer bit is set
      unconditionally, regardless of what userspace wants.
      
      This is broken in several ways:
       - if userspace doesn't use KVM_CREATE_IRQCHIP, and doesn't emulate the TSC
         deadline timer feature, a guest that uses the feature will break
       - live migration to older host kernels that don't support the TSC deadline
         timer will cause the feature to be pulled from under the guest's feet;
         breaking it
       - guests that are broken wrt the feature will fail.
      
      Fix by not enabling the feature automatically; instead report it to userspace.
      Because the feature depends on KVM_CREATE_IRQCHIP, which we cannot guarantee
      will be called, we expose it via a KVM_CAP_TSC_DEADLINE_TIMER and not
      KVM_GET_SUPPORTED_CPUID.
      
      Fixes the Illumos guest kernel, which uses the TSC deadline timer feature.
      
      [avi: add the KVM_CAP + documentation]
      Reported-by: NAlexey Zaytsev <alexey.zaytsev@gmail.com>
      Tested-by: NAlexey Zaytsev <alexey.zaytsev@gmail.com>
      Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      4d25a066
    • A
      KVM: Device assignment permission checks · 3d27e23b
      Alex Williamson 提交于
      Only allow KVM device assignment to attach to devices which:
      
       - Are not bridges
       - Have BAR resources (assume others are special devices)
       - The user has permissions to use
      
      Assigning a bridge is a configuration error, it's not supported, and
      typically doesn't result in the behavior the user is expecting anyway.
      Devices without BAR resources are typically chipset components that
      also don't have host drivers.  We don't want users to hold such devices
      captive or cause system problems by fencing them off into an iommu
      domain.  We determine "permission to use" by testing whether the user
      has access to the PCI sysfs resource files.  By default a normal user
      will not have access to these files, so it provides a good indication
      that an administration agent has granted the user access to the device.
      
      [Yang Bai: add missing #include]
      [avi: fix comment style]
      Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>
      Signed-off-by: NYang Bai <hamo.by@gmail.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      3d27e23b
  19. 25 12月, 2011 1 次提交
  20. 26 9月, 2011 3 次提交