1. 12 7月, 2011 7 次提交
    • D
      KVM: PPC: Accelerate H_PUT_TCE by implementing it in real mode · 54738c09
      David Gibson 提交于
      This improves I/O performance for guests using the PAPR
      paravirtualization interface by making the H_PUT_TCE hcall faster, by
      implementing it in real mode.  H_PUT_TCE is used for updating virtual
      IOMMU tables, and is used both for virtual I/O and for real I/O in the
      PAPR interface.
      
      Since this moves the IOMMU tables into the kernel, we define a new
      KVM_CREATE_SPAPR_TCE ioctl to allow qemu to create the tables.  The
      ioctl returns a file descriptor which can be used to mmap the newly
      created table.  The qemu driver models use them in the same way as
      userspace managed tables, but they can be updated directly by the
      guest with a real-mode H_PUT_TCE implementation, reducing the number
      of host/guest context switches during guest IO.
      
      There are certain circumstances where it is useful for userland qemu
      to write to the TCE table even if the kernel H_PUT_TCE path is used
      most of the time.  Specifically, allowing this will avoid awkwardness
      when we need to reset the table.  More importantly, we will in the
      future need to write the table in order to restore its state after a
      checkpoint resume or migration.
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      54738c09
    • P
      KVM: PPC: Add support for Book3S processors in hypervisor mode · de56a948
      Paul Mackerras 提交于
      This adds support for KVM running on 64-bit Book 3S processors,
      specifically POWER7, in hypervisor mode.  Using hypervisor mode means
      that the guest can use the processor's supervisor mode.  That means
      that the guest can execute privileged instructions and access privileged
      registers itself without trapping to the host.  This gives excellent
      performance, but does mean that KVM cannot emulate a processor
      architecture other than the one that the hardware implements.
      
      This code assumes that the guest is running paravirtualized using the
      PAPR (Power Architecture Platform Requirements) interface, which is the
      interface that IBM's PowerVM hypervisor uses.  That means that existing
      Linux distributions that run on IBM pSeries machines will also run
      under KVM without modification.  In order to communicate the PAPR
      hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code
      to include/linux/kvm.h.
      
      Currently the choice between book3s_hv support and book3s_pr support
      (i.e. the existing code, which runs the guest in user mode) has to be
      made at kernel configuration time, so a given kernel binary can only
      do one or the other.
      
      This new book3s_hv code doesn't support MMIO emulation at present.
      Since we are running paravirtualized guests, this isn't a serious
      restriction.
      
      With the guest running in supervisor mode, most exceptions go straight
      to the guest.  We will never get data or instruction storage or segment
      interrupts, alignment interrupts, decrementer interrupts, program
      interrupts, single-step interrupts, etc., coming to the hypervisor from
      the guest.  Therefore this introduces a new KVMTEST_NONHV macro for the
      exception entry path so that we don't have to do the KVM test on entry
      to those exception handlers.
      
      We do however get hypervisor decrementer, hypervisor data storage,
      hypervisor instruction storage, and hypervisor emulation assist
      interrupts, so we have to handle those.
      
      In hypervisor mode, real-mode accesses can access all of RAM, not just
      a limited amount.  Therefore we put all the guest state in the vcpu.arch
      and use the shadow_vcpu in the PACA only for temporary scratch space.
      We allocate the vcpu with kzalloc rather than vzalloc, and we don't use
      anything in the kvmppc_vcpu_book3s struct, so we don't allocate it.
      We don't have a shared page with the guest, but we still need a
      kvm_vcpu_arch_shared struct to store the values of various registers,
      so we include one in the vcpu_arch struct.
      
      The POWER7 processor has a restriction that all threads in a core have
      to be in the same partition.  MMU-on kernel code counts as a partition
      (partition 0), so we have to do a partition switch on every entry to and
      exit from the guest.  At present we require the host and guest to run
      in single-thread mode because of this hardware restriction.
      
      This code allocates a hashed page table for the guest and initializes
      it with HPTEs for the guest's Virtual Real Memory Area (VRMA).  We
      require that the guest memory is allocated using 16MB huge pages, in
      order to simplify the low-level memory management.  This also means that
      we can get away without tracking paging activity in the host for now,
      since huge pages can't be paged or swapped.
      
      This also adds a few new exports needed by the book3s_hv code.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      de56a948
    • J
      KVM: Fix KVM_ASSIGN_SET_MSIX_ENTRY documentation · 58f0964e
      Jan Kiszka 提交于
      The documented behavior did not match the implemented one (which also
      never changed).
      Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      58f0964e
    • J
      KVM: Clarify KVM_ASSIGN_PCI_DEVICE documentation · 91e3d71d
      Jan Kiszka 提交于
      Neither host_irq nor the guest_msi struct are used anymore today.
      Tag the former, drop the latter to avoid confusion.
      Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      91e3d71d
    • J
      KVM: Fixup documentation section numbering · 7f4382e8
      Jan Kiszka 提交于
      Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      7f4382e8
    • S
      KVM: Document KVM_IOEVENTFD · 55399a02
      Sasha Levin 提交于
      Document KVM_IOEVENTFD that can be used to receive
      notifications of PIO/MMIO events without triggering
      an exit.
      Signed-off-by: NSasha Levin <levinsasha928@gmail.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      55399a02
    • A
      KVM: Document KVM_GET_LAPIC, KVM_SET_LAPIC ioctl · e7677933
      Avi Kivity 提交于
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      e7677933
  2. 07 5月, 2011 1 次提交
  3. 15 2月, 2011 1 次提交
  4. 12 1月, 2011 1 次提交
  5. 02 11月, 2010 1 次提交
  6. 24 10月, 2010 3 次提交
  7. 02 8月, 2010 2 次提交
  8. 01 8月, 2010 5 次提交
  9. 17 5月, 2010 6 次提交
  10. 25 4月, 2010 3 次提交
  11. 01 3月, 2010 1 次提交
  12. 27 12月, 2009 1 次提交
  13. 03 12月, 2009 3 次提交
    • J
      KVM: x86: Add KVM_GET/SET_VCPU_EVENTS · 3cfc3092
      Jan Kiszka 提交于
      This new IOCTL exports all yet user-invisible states related to
      exceptions, interrupts, and NMIs. Together with appropriate user space
      changes, this fixes sporadic problems of vmsave/restore, live migration
      and system reset.
      
      [avi: future-proof abi by adding a flags field]
      Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      3cfc3092
    • G
      KVM: allow userspace to adjust kvmclock offset · afbcf7ab
      Glauber Costa 提交于
      When we migrate a kvm guest that uses pvclock between two hosts, we may
      suffer a large skew. This is because there can be significant differences
      between the monotonic clock of the hosts involved. When a new host with
      a much larger monotonic time starts running the guest, the view of time
      will be significantly impacted.
      
      Situation is much worse when we do the opposite, and migrate to a host with
      a smaller monotonic clock.
      
      This proposed ioctl will allow userspace to inform us what is the monotonic
      clock value in the source host, so we can keep the time skew short, and
      more importantly, never goes backwards. Userspace may also need to trigger
      the current data, since from the first migration onwards, it won't be
      reflected by a simple call to clock_gettime() anymore.
      
      [marcelo: future-proof abi with a flags field]
      [jan: fix KVM_GET_CLOCK by clearing flags field instead of checking it]
      Signed-off-by: NGlauber Costa <glommer@redhat.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      afbcf7ab
    • E
      KVM: Xen PV-on-HVM guest support · ffde22ac
      Ed Swierk 提交于
      Support for Xen PV-on-HVM guests can be implemented almost entirely in
      userspace, except for handling one annoying MSR that maps a Xen
      hypercall blob into guest address space.
      
      A generic mechanism to delegate MSR writes to userspace seems overkill
      and risks encouraging similar MSR abuse in the future.  Thus this patch
      adds special support for the Xen HVM MSR.
      
      I implemented a new ioctl, KVM_XEN_HVM_CONFIG, that lets userspace tell
      KVM which MSR the guest will write to, as well as the starting address
      and size of the hypercall blobs (one each for 32-bit and 64-bit) that
      userspace has loaded from files.  When the guest writes to the MSR, KVM
      copies one page of the blob from userspace to the guest.
      
      I've tested this patch with a hacked-up version of Gerd's userspace
      code, booting a number of guests (CentOS 5.3 i386 and x86_64, and
      FreeBSD 8.0-RC1 amd64) and exercising PV network and block devices.
      
      [jan: fix i386 build warning]
      [avi: future proof abi with a flags field]
      Signed-off-by: NEd Swierk <eswierk@aristanetworks.com>
      Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      ffde22ac
  14. 10 9月, 2009 2 次提交