1. 16 Jun, 2016 1 commit
    • MIPS: KVM: Add KScratch registers · 05108709
      James Hogan authored
      Allow up to 6 KVM guest KScratch registers to be enabled and accessed
      via the KVM guest register API and from the guest itself (the fallback
      reading and writing of commpage registers is sufficient for KScratch
      registers to work as expected).
      
      User mode can expose the registers by setting the appropriate bits of
      the guest Config4.KScrExist field. KScratch registers that aren't usable
      won't be writeable via the KVM ioctl API.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
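
The relationship between Config4.KScrExist and the usable registers can be sketched as follows. This is only an illustration under stated assumptions: that KScrExist occupies Config4 bits 23:16 with one bit per select of CP0 register 31, and that the six registers in question are selects 2 through 7. The helper names are hypothetical and are not part of the KVM API.

```python
# Sketch: decode which KScratch registers a guest Config4 value exposes.
# Assumptions (not stated in the commit message): KScrExist is Config4
# bits 23:16, one bit per select of CP0 register 31, and only selects
# 2..7 can be enabled, giving the six registers mentioned above.
KSCREXIST_SHIFT = 16
KSCREXIST_MASK = 0xFF << KSCREXIST_SHIFT
USABLE_SELECTS = range(2, 8)  # six candidate KScratch registers


def usable_kscratch(config4):
    """Return the CP0 register 31 selects enabled in Config4.KScrExist."""
    field = (config4 & KSCREXIST_MASK) >> KSCREXIST_SHIFT
    return [sel for sel in USABLE_SELECTS if field & (1 << sel)]


def is_writable(config4, sel):
    """Model the rule that unusable KScratch registers are not writable."""
    return sel in usable_kscratch(config4)


# User mode exposes selects 2 and 3 only.
config4 = 0b00001100 << KSCREXIST_SHIFT
print(usable_kscratch(config4))   # [2, 3]
print(is_writable(config4, 4))    # False
```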
  2. 14 Jun, 2016 1 commit
  3. 10 Jun, 2016 3 commits
    • KVM: s390: provide CMMA attributes only if available · f9cbd9b0
      David Hildenbrand authored
      Let's not provide the device attribute for cmma enabling and clearing
      if the hardware doesn't support it.
      
      This also helps get rid of the undocumented return value "-EINVAL"
      in case CMMA is not available when trying to enable it.
      
      Also properly document the meaning of -EINVAL for CMMA clearing.
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
    • KVM: s390: interface to query and configure cpu subfunctions · 0a763c78
      David Hildenbrand authored
      We have certain instructions that indicate available subfunctions via
      a query subfunction (crypto functions and ptff), or via a test bit
      function (plo).
      
      By exposing these "subfunction blocks" to user space, we allow user space
      to
      1) query available subfunctions and make sure subfunctions won't get lost
         during migration - e.g. properly indicate them via a CPU model
      2) change the subfunctions to be reported to the guest (even adding
         unavailable ones)
      
      This mechanism works just like the way we indicate the stfl(e) list to
      user space.
      
      This way, user space could even emulate some subfunctions in QEMU in the
      future. If this is ever applicable, we have to make sure later on, that
      unsupported subfunctions result in an intercept to QEMU.
      
      Please note that support to indicate them to the guest is still missing
      and requires hardware support. Usually, the IBC already takes care of these
      subfunctions for migration safety. QEMU should make sure to always set
      these bits properly according to the machine generation to be emulated.
      
      Available subfunctions are only valid in combination with STFLE bits
      retrieved via KVM_S390_VM_CPU_MACHINE and enabled via
      KVM_S390_VM_CPU_PROCESSOR. If the applicable bits are available, the
      indicated subfunctions are guaranteed to be correct.
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
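
The migration concern in point 1 above amounts to a subset check between two subfunction blocks. A minimal sketch, assuming each block is an opaque byte string in which a set bit means "subfunction available"; the function names are hypothetical and the bit numbering inside a block does not matter for this check.

```python
# Sketch: check that every subfunction the guest was told about is also
# available on the migration target. Blocks are treated as opaque bitmasks;
# the block layout is an assumption made for illustration.
def missing_subfunctions(guest_block, host_block):
    """Return the bits set in guest_block but absent from host_block."""
    if len(guest_block) != len(host_block):
        raise ValueError("subfunction blocks must have the same length")
    return bytes(g & ~h & 0xFF for g, h in zip(guest_block, host_block))


def migration_safe(guest_block, host_block):
    """True if no guest-visible subfunction would get lost on the target."""
    return not any(missing_subfunctions(guest_block, host_block))


guest = bytes([0xF0, 0x01])
target = bytes([0xF0, 0x00])
print(migration_safe(guest, target))  # False: one bit of byte 1 is missing
```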
    • KVM: s390: interface to query and configure cpu features · 15c9705f
      David Hildenbrand authored
      For now, we only have an interface to query and configure facilities
      indicated via STFL(E). However, we also have features indicated via
      SCLP, that have to be indicated to the guest by user space and usually
      require KVM support.
      
      This patch allows user space to query and configure available cpu features
      for the guest.
      
      Please note that disabling a feature doesn't necessarily mean that it is
      completely disabled (e.g. ESOP is mostly handled by the SIE). We will try
      our best to disable it.
      
      Most features (e.g. SCLP) can't be forwarded directly, as most of them need
      support in KVM in addition to hardware support. As we later on want to
      turn these features in KVM explicitly on/off (to simulate different
      behavior), we have to filter all features provided by the hardware and
      make them configurable.
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
  4. 12 May, 2016 1 commit
    • kvm: introduce KVM_MAX_VCPU_ID · 0b1b1dfd
      Greg Kurz authored
      The KVM_MAX_VCPUS define provides the maximum number of vCPUs per guest, and
      also the upper limit for vCPU ids. This is okay for all archs except PowerPC
      which can have higher ids, depending on the cpu/core/thread topology. In the
      worst case (single threaded guest, host with 8 threads per core), it limits
      the maximum number of vCPUS to KVM_MAX_VCPUS / 8.
      
      This patch separates the vCPU numbering from the total number of vCPUs, with
      the introduction of KVM_MAX_VCPU_ID, as the maximal valid value for vCPU ids
      plus one.
      
      The corresponding KVM_CAP_MAX_VCPU_ID allows userspace to validate vCPU ids
      before passing them to KVM_CREATE_VCPU.
      
      This patch only implements KVM_MAX_VCPU_ID with a specific value for PowerPC.
      Other archs continue to return KVM_MAX_VCPUS instead.
      Suggested-by: Radim Krcmar <rkrcmar@redhat.com>
      Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
      Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
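
The worst case described above can be worked through with a small model. It assumes that a PowerPC vCPU id is derived as `core * host_threads_per_core + thread`, which is an illustration of the topology-dependent numbering rather than something the commit message spells out. The limit of 1024 is a hypothetical value for KVM_MAX_VCPUS.

```python
# Sketch: why topology-dependent vCPU ids cap the vCPU count when ids
# share an upper bound with the number of vCPUs.
# Assumption: ids are spaced by the host's threads-per-core.
def vcpu_id(cpu_index, guest_threads, host_threads):
    """Map the n-th guest vCPU to its id for a given core/thread topology."""
    core, thread = divmod(cpu_index, guest_threads)
    return core * host_threads + thread


def creatable_vcpus(id_limit, guest_threads, host_threads):
    """Count vCPUs whose ids stay below id_limit (the old shared bound)."""
    count = 0
    while vcpu_id(count, guest_threads, host_threads) < id_limit:
        count += 1
    return count


def id_is_valid(vcpu_id_value, max_vcpu_id):
    """KVM_MAX_VCPU_ID is the maximal valid id plus one."""
    return 0 <= vcpu_id_value < max_vcpu_id


HYPOTHETICAL_MAX_VCPUS = 1024
# Single-threaded guest on a host with 8 threads per core: ids 0, 8, 16, ...
print(creatable_vcpus(HYPOTHETICAL_MAX_VCPUS, 1, 8))  # 128, i.e. 1024 / 8
```

With a separate id bound, userspace can query the corresponding capability and reject an id up front instead of having KVM_CREATE_VCPU fail.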
  5. 09 May, 2016 1 commit
  6. 25 Apr, 2016 1 commit
  7. 20 Apr, 2016 2 commits
  8. 10 Mar, 2016 1 commit
    • KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo · 844a5fe2
      Paolo Bonzini authored
      Yes, all of these are needed. :) This is admittedly a bit odd, but
      kvm-unit-tests access.flat tests this if you run it with "-cpu host"
      and of course ept=0.
      
      KVM runs the guest with CR0.WP=1, so it must handle supervisor writes
      specially when pte.u=1/pte.w=0/CR0.WP=0.  Such writes cause a fault
      when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0.
      When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and
      restarts execution.  This will still cause a user write to fault, while
      supervisor writes will succeed.  User reads will fault spuriously now,
      and KVM will then flip U and W again in the SPTE (U=1, W=0).  User reads
      will be enabled and supervisor writes disabled, going back to the
      original situation where supervisor writes fault spuriously.
      
      When SMEP is in effect, however, U=0 will enable kernel execution of
      this page.  To avoid this, KVM also sets NX=1 in the shadow PTE together
      with U=0.  If the guest has not enabled NX, the result is a continuous
      stream of page faults due to the NX bit being reserved.
      
      The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER
      switch.  (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry
      control, so they do not use user-return notifiers for EFER---if they did,
      EFER.NX would be forced to the same value as the host).
      
      There is another bug in the reserved bit check, which I've split to a
      separate patch for easier application to stable kernels.
      
      Cc: stable@vger.kernel.org
      Cc: Andy Lutomirski <luto@amacapital.net>
      Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Fixes: f6577a5f
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
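
The U/W flipping described above can be followed with a toy model of a single shadow PTE. This is a deliberately simplified sketch, not KVM's MMU code: it models only the U, W and NX bits, assumes the guest PTE itself is not NX, and uses hypothetical names throughout.

```python
# Toy model of one shadow PTE for a guest page with pte.u=1, pte.w=0 and
# guest CR0.WP=0, while the host runs the guest with CR0.WP=1.
from dataclasses import dataclass


@dataclass
class Spte:
    user: bool = True       # U bit
    writable: bool = False  # W bit
    nx: bool = False        # NX bit


def hardware_allows(spte, user_mode, write):
    """Permission check as enforced with CR0.WP=1."""
    if user_mode and not spte.user:
        return False
    if write and not spte.writable:
        return False
    return True


def access(spte, user_mode, write, smep=True):
    """Return 'ok', 'spurious' (SPTE flipped, retry) or 'guest-fault'."""
    if hardware_allows(spte, user_mode, write):
        return "ok"
    if user_mode and write:
        return "guest-fault"  # pte.w=0, so user writes really must fault
    if write:
        # Supervisor write: allow it, hide the page from user mode, and
        # set NX so that SMEP cannot be bypassed through U=0.
        spte.user, spte.writable, spte.nx = False, True, smep
    else:
        # User read: flip back to the original permissions.
        spte.user, spte.writable, spte.nx = True, False, False
    return "spurious"


def nx_is_reserved(spte, efer_nx):
    """With EFER.NX=0 a set NX bit is reserved and faults on every access."""
    return spte.nx and not efer_nx


spte = Spte()
print(access(spte, user_mode=False, write=True))  # spurious
print(access(spte, user_mode=False, write=True))  # ok
print(nx_is_reserved(spte, efer_nx=False))        # True: the fault loop
print(nx_is_reserved(spte, efer_nx=True))         # False: with EFER.NX forced on
```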
  9. 04 Mar, 2016 1 commit
  10. 03 Mar, 2016 1 commit
  11. 02 Mar, 2016 1 commit
    • KVM: PPC: Add support for 64bit TCE windows · 58ded420
      Alexey Kardashevskiy authored
      The existing KVM_CREATE_SPAPR_TCE only supports 32bit windows which is not
      enough for directly mapped windows as the guest can get more than 4GB.
      
      This adds KVM_CREATE_SPAPR_TCE_64 ioctl and advertises it
      via KVM_CAP_SPAPR_TCE_64 capability. The table size is checked against
      the locked memory limit.
      
      Since 64bit windows are to support Dynamic DMA windows (DDW), let's add
      @bus_offset and @page_shift which are also required by DDW.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
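
To see why the table size matters for large windows, the arithmetic can be sketched as below. It assumes one 8-byte TCE per I/O page, and it compares only the raw table size against a limit; the kernel's real accounting may differ. All names are hypothetical.

```python
# Sketch: TCE table size for a DMA window of a given size and page shift.
# Assumption: one 64-bit (8-byte) TCE entry per I/O page.
TCE_ENTRY_BYTES = 8


def tce_table_bytes(window_size, page_shift):
    """Bytes needed for the TCE table covering window_size bytes."""
    if window_size % (1 << page_shift):
        raise ValueError("window size must be a multiple of the page size")
    return (window_size >> page_shift) * TCE_ENTRY_BYTES


def fits_locked_limit(window_size, page_shift, memlock_limit):
    """Simplified version of the locked-memory check on the table size."""
    return tce_table_bytes(window_size, page_shift) <= memlock_limit


GiB = 1 << 30
MiB = 1 << 20
# A 64 GiB window does not fit a 32-bit size field at all.
print((64 * GiB) > 0xFFFFFFFF)                  # True
print(tce_table_bytes(64 * GiB, 16) // MiB)     # 8 (64 KiB pages)
print(tce_table_bytes(64 * GiB, 12) // MiB)     # 128 (4 KiB pages)
```

The page shift therefore has a large effect on whether a window passes the locked memory check.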
  12. 01 Mar, 2016 3 commits
  13. 17 Feb, 2016 1 commit
    • kvm/x86: Hyper-V VMBus hypercall userspace exit · 83326e43
      Andrey Smetanin authored
      The patch implements KVM_EXIT_HYPERV userspace exit
      functionality for Hyper-V VMBus hypercalls:
      HV_X64_HCALL_POST_MESSAGE, HV_X64_HCALL_SIGNAL_EVENT.
      
      Changes v3:
      * use vcpu->arch.complete_userspace_io to set up the hypercall
      result
      
      Changes v2:
      * use KVM_EXIT_HYPERV for hypercalls
      Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
      Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
      CC: Gleb Natapov <gleb@kernel.org>
      CC: Paolo Bonzini <pbonzini@redhat.com>
      CC: Joerg Roedel <joro@8bytes.org>
      CC: "K. Y. Srinivasan" <kys@microsoft.com>
      CC: Haiyang Zhang <haiyangz@microsoft.com>
      CC: Roman Kagan <rkagan@virtuozzo.com>
      CC: Denis V. Lunev <den@openvz.org>
      CC: qemu-devel@nongnu.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  14. 16 Feb, 2016 1 commit
    • KVM: PPC: Add support for multiple-TCE hcalls · d3695aa4
      Alexey Kardashevskiy authored
      This adds real and virtual mode handlers for the H_PUT_TCE_INDIRECT and
      H_STUFF_TCE hypercalls for user space emulated devices such as IBMVIO
      devices or emulated PCI. These calls allow adding multiple entries
      (up to 512) into the TCE table in one call which saves time on
      transition between kernel and user space.
      
      The current implementation of kvmppc_h_stuff_tce() allows it to be
      executed in both real and virtual modes so there is one helper.
      The kvmppc_rm_h_put_tce_indirect() needs to translate the guest address
      to the host address and since the translation is different, there are
      2 helpers - one for each mode.
      
      This implements the KVM_CAP_PPC_MULTITCE capability. When present,
      the kernel will try handling H_PUT_TCE_INDIRECT and H_STUFF_TCE if these
      are enabled by the userspace via KVM_CAP_PPC_ENABLE_HCALL.
      If they can not be handled by the kernel, they are passed on to
      the user space. The user space still has to have an implementation
      for these.
      
      Both HV and PR-style KVM are supported.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
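
The saving from batching up to 512 entries per call can be estimated with a short sketch. It only counts hypercalls and does not model the handlers themselves; the function names are hypothetical.

```python
# Sketch: number of hypercalls needed to map a range of I/O pages with
# and without the multiple-TCE hcalls (up to 512 entries per call).
MAX_TCES_PER_CALL = 512


def hcalls_needed(num_pages, batched):
    """One call per entry when unbatched, otherwise 512 entries per call."""
    if not batched:
        return num_pages
    return -(-num_pages // MAX_TCES_PER_CALL)  # ceiling division


def split_into_batches(tces):
    """Split a list of TCE values into chunks of at most 512 entries."""
    return [tces[i:i + MAX_TCES_PER_CALL]
            for i in range(0, len(tces), MAX_TCES_PER_CALL)]


pages = (1 << 30) >> 12  # 1 GiB mapped with 4 KiB I/O pages
print(hcalls_needed(pages, batched=False))  # 262144
print(hcalls_needed(pages, batched=True))   # 512
```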
  15. 10 Feb, 2016 3 commits
  16. 26 Jan, 2016 1 commit
  17. 16 Dec, 2015 1 commit
  18. 26 Nov, 2015 3 commits
    • KVM: x86: MMU: Consolidate BUG_ON checks for reverse-mapped sptes · 77fbbbd2
      Takuya Yoshikawa authored
      At some call sites of rmap_get_first() and rmap_get_next(), BUG_ON is
      placed right after the call to detect unrelated sptes which must not be
      found in the reverse-mapping list.
      
      Move this check into rmap_get_first/next() so that all call sites, not
      just the users of the for_each_rmap_spte() macro, will be checked the
      same way.
      
      One thing to keep in mind is that kvm_mmu_unlink_parents() also uses
      rmap_get_first() to handle parent sptes.  The change will not break it
      because parent sptes are present, at least until drop_parent_pte()
      actually unlinks them, and not mmio-sptes.
      Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm/x86: Hyper-V kvm exit · db397571
      Andrey Smetanin authored
      A new vcpu exit is introduced to notify the userspace of the
      changes in Hyper-V SynIC configuration triggered by guest writing to the
      corresponding MSRs.
      
      Changes v4:
      * exit into userspace only if guest writes into SynIC MSR's
      
      Changes v3:
      * added KVM_EXIT_HYPERV types and structs notes into docs
      Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
      Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      CC: Gleb Natapov <gleb@kernel.org>
      CC: Paolo Bonzini <pbonzini@redhat.com>
      CC: Roman Kagan <rkagan@virtuozzo.com>
      CC: Denis V. Lunev <den@openvz.org>
      CC: qemu-devel@nongnu.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm/x86: Hyper-V synthetic interrupt controller · 5c919412
      Andrey Smetanin authored
      SynIC (synthetic interrupt controller) is a lapic extension,
      which is controlled via MSRs and maintains for each vCPU
       - 16 synthetic interrupt "lines" (SINT's); each can be configured to
         trigger a specific interrupt vector optionally with auto-EOI
         semantics
       - a message page in the guest memory with 16 256-byte per-SINT message
         slots
       - an event flag page in the guest memory with 16 2048-bit per-SINT
         event flag areas
      
      The host triggers a SINT whenever it delivers a new message to the
      corresponding slot or flips an event flag bit in the corresponding area.
      The guest informs the host that it can try delivering a message by
      explicitly asserting EOI in lapic or writing to End-Of-Message (EOM)
      MSR.
      
      The userspace (qemu) triggers interrupts and receives EOM notifications
      via irqfd with resampler; for that, a GSI is allocated for each
      configured SINT, and irq_routing api is extended to support GSI-SINT
      mapping.
      
      Changes v4:
      * added activation of SynIC by vcpu KVM_ENABLE_CAP
      * added per SynIC active flag
      * added deactivation of APICv upon SynIC activation
      
      Changes v3:
      * added KVM_CAP_HYPERV_SYNIC and KVM_IRQ_ROUTING_HV_SINT notes into
      docs
      
      Changes v2:
      * do not use posted interrupts for Hyper-V SynIC AutoEOI vectors
      * add Hyper-V SynIC vectors into EOI exit bitmap
      * Hyper-V SynIC SINT MSR write logic simplified
      Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
      Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      CC: Gleb Natapov <gleb@kernel.org>
      CC: Paolo Bonzini <pbonzini@redhat.com>
      CC: Roman Kagan <rkagan@virtuozzo.com>
      CC: Denis V. Lunev <den@openvz.org>
      CC: qemu-devel@nongnu.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
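
The per-vCPU SynIC pages described above both come out to 4096 bytes, which the following sketch makes explicit. The sizes are taken from the commit message; the helper names are hypothetical, and the bit order within a byte (least significant bit first) is an assumption.

```python
# Sketch: locate a SINT's message slot and event flag inside the SynIC
# message page and event flag page.
SINT_COUNT = 16
MESSAGE_SLOT_BYTES = 256
EVENT_FLAG_BITS = 2048

MESSAGE_PAGE_BYTES = SINT_COUNT * MESSAGE_SLOT_BYTES        # 4096
EVENT_FLAG_PAGE_BYTES = SINT_COUNT * EVENT_FLAG_BITS // 8   # 4096


def message_slot_offset(sint):
    """Byte offset of a SINT's message slot within the message page."""
    if not 0 <= sint < SINT_COUNT:
        raise ValueError("SINT index out of range")
    return sint * MESSAGE_SLOT_BYTES


def event_flag_position(sint, flag):
    """(byte offset, bit in byte) of one event flag in the flag page."""
    if not 0 <= sint < SINT_COUNT or not 0 <= flag < EVENT_FLAG_BITS:
        raise ValueError("SINT or flag index out of range")
    bit = sint * EVENT_FLAG_BITS + flag
    return bit // 8, bit % 8  # assumption: LSB-first bit numbering


print(MESSAGE_PAGE_BYTES, EVENT_FLAG_PAGE_BYTES)  # 4096 4096
print(message_slot_offset(15))                    # 3840
print(event_flag_position(1, 0))                  # (256, 0)
```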
  19. 23 Oct, 2015 2 commits
  20. 12 Oct, 2015 1 commit
  21. 01 Oct, 2015 6 commits
    • KVM: Update Posted-Interrupts Descriptor when vCPU is blocked · bf9f6ac8
      Feng Wu authored
      This patch updates the Posted-Interrupts Descriptor when vCPU
      is blocked.
      
      pre-block:
      - Add the vCPU to the blocked per-CPU list
      - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR
      
      post-block:
      - Remove the vCPU from the per-CPU list
      Signed-off-by: Feng Wu <feng.wu@intel.com>
      [Concentrate invocation of pre/post-block hooks to vcpu_block. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: add capability for any-length ioeventfds · e9ea5069
      Jason Wang authored
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Add support for local interrupt requests from userspace · 1c1a9ce9
      Steve Rutherford authored
      In order to enable userspace PIC support, the userspace PIC needs to
      be able to inject local interrupts even when the APICs are in the
      kernel.
      
      KVM_INTERRUPT now supports sending local interrupts to an APIC when
      APICs are in the kernel.
      
      The ready_for_interrupt_request flag is now only set when the CPU/APIC
      will immediately accept and inject an interrupt (i.e. APIC has not
      masked the PIC).
      
      When the PIC wishes to initiate an INTA cycle with, say, CPU0, it
      kicks CPU0 out of the guest, and rendezvouses with CPU0 once it arrives
      in userspace.
      
      When the CPU/APIC unmasks the PIC, a KVM_EXIT_IRQ_WINDOW_OPEN is
      triggered, so that userspace has a chance to inject a PIC interrupt
      if it had been pending.
      
      Overall, this design can lead to a small number of spurious userspace
      rendezvous, in particular whenever the PIC transitions from low to
      high while it is masked and whenever the PIC becomes unmasked while
      it is low.
      
      Note: this does not buffer more than one local interrupt in the
      kernel, so the VMM needs to enter the guest in order to complete
      interrupt injection before injecting an additional interrupt.
      
      Compiles for x86.
      
      Can pass the KVM Unit Tests.
      Signed-off-by: Steve Rutherford <srutherford@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Add EOI exit bitmap inference · b053b2ae
      Steve Rutherford authored
      In order to support a userspace IOAPIC interacting with an in kernel
      APIC, the EOI exit bitmaps need to be configurable.
      
      If the IOAPIC is in userspace (i.e. the irqchip has been split), the
      EOI exit bitmaps will be set whenever the GSI Routes are configured.
      In particular, the low MSI routes are reservable for userspace
      IOAPICs. For these MSI routes, the EOI Exit bit corresponding to the
      destination vector of the route will be set for the destination VCPU.
      
      The intention is for the userspace IOAPICs to use the reservable MSI
      routes to inject interrupts into the guest.
      
      This is a slight abuse of the notion of an MSI Route, given that MSIs
      classically bypass the IOAPIC. It might be worthwhile to add an
      additional route type to improve clarity.
      
      Compile tested for Intel x86.
      Signed-off-by: Steve Rutherford <srutherford@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
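
The inference step described above can be sketched as a pass over the configured routes. This is a simplified model: a route is a hypothetical `(gsi, dest_vcpu, vector)` tuple, each bitmap is a Python integer with one bit per vector, and the number of reserved routes is passed in as a plain parameter.

```python
# Sketch: derive per-vCPU EOI exit bitmaps from the reserved low MSI routes.
# The route representation and all names are assumptions for illustration.
from collections import defaultdict


def infer_eoi_exit_bitmaps(msi_routes, reserved_routes):
    """Set the EOI exit bit for each reserved route's destination vector."""
    bitmaps = defaultdict(int)
    for gsi, dest_vcpu, vector in msi_routes:
        if not 0 <= vector < 256:
            raise ValueError("interrupt vector out of range")
        if gsi < reserved_routes:  # only routes reserved for the IOAPIC
            bitmaps[dest_vcpu] |= 1 << vector
    return dict(bitmaps)


routes = [
    (0, 0, 0x30),   # reserved route, delivered to vCPU 0
    (1, 1, 0x31),   # reserved route, delivered to vCPU 1
    (30, 0, 0x50),  # ordinary MSI route, no EOI exit needed
]
bitmaps = infer_eoi_exit_bitmaps(routes, reserved_routes=24)
print(sorted(bitmaps))                 # [0, 1]
print(bool(bitmaps[0] >> 0x30 & 1))    # True
print(bool(bitmaps[0] >> 0x50 & 1))    # False
```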
    • KVM: x86: Add KVM exit for IOAPIC EOIs · 7543a635
      Steve Rutherford authored
      Adds KVM_EXIT_IOAPIC_EOI which allows the kernel to EOI
      level-triggered IOAPIC interrupts.
      
      Uses a per VCPU exit bitmap to decide whether or not the IOAPIC needs
      to be informed (which is identical to the EOI_EXIT_BITMAP field used
      by modern x86 processors, but can also be used to elide kvm IOAPIC EOI
      exits on older processors).
      
      [Note: A prototype using ResampleFDs found that decoupling the EOI
      from the VCPU's thread made it possible for the VCPU to not see a
      recent EOI after reentering the guest. This does not match real
      hardware.]
      
      Compile tested for Intel x86.
      Signed-off-by: Steve Rutherford <srutherford@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
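
The decision made on each EOI can be modeled in a few lines. This is a sketch of the idea only: the bitmap is a plain integer with one bit per vector, and the returned dictionary is a hypothetical stand-in for the exit information handed to userspace.

```python
# Sketch: decide on EOI whether the vCPU must exit so that a userspace
# IOAPIC learns about it. The return shape is illustrative only.
def handle_eoi(eoi_exit_bitmap, vector):
    """Return exit information if the vector's bit is set, else None."""
    if not 0 <= vector < 256:
        raise ValueError("interrupt vector out of range")
    if (eoi_exit_bitmap >> vector) & 1:
        return {"exit_reason": "KVM_EXIT_IOAPIC_EOI", "vector": vector}
    return None  # EOI handled entirely in the kernel, no exit


bitmap = 1 << 0x30  # only vector 0x30 belongs to the userspace IOAPIC
print(handle_eoi(bitmap, 0x30))
print(handle_eoi(bitmap, 0x31))  # None
```

Because the exit happens on the vCPU's own thread, userspace handles the EOI before the vCPU reenters the guest, which avoids the ordering problem noted for the ResampleFD prototype.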
    • KVM: x86: Split the APIC from the rest of IRQCHIP. · 49df6397
      Steve Rutherford authored
      First patch in a series which enables the relocation of the
      PIC/IOAPIC to userspace.
      
      Adds capability KVM_CAP_SPLIT_IRQCHIP.
      
      KVM_CAP_SPLIT_IRQCHIP enables the construction of LAPICs without the
      rest of the irqchip.
      
      Compile tested for x86.
      Signed-off-by: Steve Rutherford <srutherford@google.com>
      Suggested-by: Andrew Honig <ahonig@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  22. 23 Jul, 2015 1 commit
  23. 21 Jul, 2015 3 commits