1. 16 2月, 2016 1 次提交
  2. 16 1月, 2016 1 次提交
    • D
      kvm: rename pfn_t to kvm_pfn_t · ba049e93
      Dan Williams 提交于
      To date, we have implemented two I/O usage models for persistent memory,
      PMEM (a persistent "ram disk") and DAX (mmap persistent memory into
      userspace).  This series adds a third, DAX-GUP, that allows DAX mappings
      to be the target of direct-i/o.  It allows userspace to coordinate
      DMA/RDMA from/to persistent memory.
      
      The implementation leverages the ZONE_DEVICE mm-zone that went into
      4.3-rc1 (also discussed at kernel summit) to flag pages that are owned
      and dynamically mapped by a device driver.  The pmem driver, after
      mapping a persistent memory range into the system memmap via
      devm_memremap_pages(), arranges for DAX to distinguish pfn-only versus
      page-backed pmem-pfns via flags in the new pfn_t type.
      
      The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn, flags the
      resulting pte(s) inserted into the process page tables with a new
      _PAGE_DEVMAP flag.  Later, when get_user_pages() is walking ptes it keys
      off _PAGE_DEVMAP to pin the device hosting the page range active.
      Finally, get_page() and put_page() are modified to take references
      against the device driver established page mapping.
      
      Finally, this need for "struct page" for persistent memory requires
      memory capacity to store the memmap array.  Given the memmap array for a
      large pool of persistent may exhaust available DRAM introduce a
      mechanism to allocate the memmap from persistent memory.  The new
      "struct vmem_altmap *" parameter to devm_memremap_pages() enables
      arch_add_memory() to use reserved pmem capacity rather than the page
      allocator.
      
      This patch (of 18):
      
      The core has developed a need for a "pfn_t" type [1].  Move the existing
      pfn_t in KVM to kvm_pfn_t [2].
      
      [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
      [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.htmlSigned-off-by: NDan Williams <dan.j.williams@intel.com>
      Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ba049e93
  3. 28 5月, 2015 1 次提交
  4. 26 5月, 2015 1 次提交
  5. 21 4月, 2015 1 次提交
    • M
      KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation. · e928e9cb
      Michael Ellerman 提交于
      Some PowerNV systems include a hardware random-number generator.
      This HWRNG is present on POWER7+ and POWER8 chips and is capable of
      generating one 64-bit random number every microsecond.  The random
      numbers are produced by sampling a set of 64 unstable high-frequency
      oscillators and are almost completely entropic.
      
      PAPR defines an H_RANDOM hypercall which guests can use to obtain one
      64-bit random sample from the HWRNG.  This adds a real-mode
      implementation of the H_RANDOM hypercall.  This hypercall was
      implemented in real mode because the latency of reading the HWRNG is
      generally small compared to the latency of a guest exit and entry for
      all the threads in the same virtual core.
      
      Userspace can detect the presence of the HWRNG and the H_RANDOM
      implementation by querying the KVM_CAP_PPC_HWRNG capability.  The
      H_RANDOM hypercall implementation will only be invoked when the guest
      does an H_RANDOM hypercall if userspace first enables the in-kernel
      H_RANDOM implementation using the KVM_CAP_PPC_ENABLE_HCALL capability.
      Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      e928e9cb
  6. 17 12月, 2014 1 次提交
    • P
      KVM: PPC: Book3S HV: Remove code for PPC970 processors · c17b98cf
      Paul Mackerras 提交于
      This removes the code that was added to enable HV KVM to work
      on PPC970 processors.  The PPC970 is an old CPU that doesn't
      support virtualizing guest memory.  Removing PPC970 support also
      lets us remove the code for allocating and managing contiguous
      real-mode areas, the code for the !kvm->arch.using_mmu_notifiers
      case, the code for pinning pages of guest memory when first
      accessed and keeping track of which pages have been pinned, and
      the code for handling H_ENTER hypercalls in virtual mode.
      
      Book3S HV KVM is now supported only on POWER7 and POWER8 processors.
      The KVM_CAP_PPC_RMA capability now always returns 0.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      c17b98cf
  7. 24 9月, 2014 1 次提交
    • A
      kvm: Fix page ageing bugs · 57128468
      Andres Lagar-Cavilla 提交于
      1. We were calling clear_flush_young_notify in unmap_one, but we are
      within an mmu notifier invalidate range scope. The spte exists no more
      (due to range_start) and the accessed bit info has already been
      propagated (due to kvm_pfn_set_accessed). Simply call
      clear_flush_young.
      
      2. We clear_flush_young on a primary MMU PMD, but this may be mapped
      as a collection of PTEs by the secondary MMU (e.g. during log-dirty).
      This required expanding the interface of the clear_flush_young mmu
      notifier, so a lot of code has been trivially touched.
      
      3. In the absence of shadow_accessed_mask (e.g. EPT A bit), we emulate
      the access bit by blowing the spte. This requires proper synchronizing
      with MMU notifier consumers, like every other removal of spte's does.
      Signed-off-by: NAndres Lagar-Cavilla <andreslc@google.com>
      Acked-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      57128468
  8. 22 9月, 2014 3 次提交
  9. 30 7月, 2014 1 次提交
  10. 29 7月, 2014 3 次提交
  11. 28 7月, 2014 7 次提交
  12. 30 5月, 2014 1 次提交
    • A
      KVM: PPC: Make shared struct aka magic page guest endian · 5deb8e7a
      Alexander Graf 提交于
      The shared (magic) page is a data structure that contains often used
      supervisor privileged SPRs accessible via memory to the user to reduce
      the number of exits we have to take to read/write them.
      
      When we actually share this structure with the guest we have to maintain
      it in guest endianness, because some of the patch tricks only work with
      native endian load/store operations.
      
      Since we only share the structure with either host or guest in little
      endian on book3s_64 pr mode, we don't have to worry about booke or book3s hv.
      
      For booke, the shared struct stays big endian. For book3s_64 hv we maintain
      the struct in host native endian, since it never gets shared with the guest.
      
      For book3s_64 pr we introduce a variable that tells us which endianness the
      shared struct is in and route every access to it through helper inline
      functions that evaluate this variable.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      5deb8e7a
  13. 28 5月, 2014 1 次提交
  14. 26 3月, 2014 1 次提交
  15. 27 1月, 2014 2 次提交
    • S
      kvm/ppc: IRQ disabling cleanup · 6c85f52b
      Scott Wood 提交于
      Simplify the handling of lazy EE by going directly from fully-enabled
      to hard-disabled.  This replaces the lazy_irq_pending() check
      (including its misplaced kvm_guest_exit() call).
      
      As suggested by Tiejun Chen, move the interrupt disabling into
      kvmppc_prepare_to_enter() rather than have each caller do it.  Also
      move the IRQ enabling on heavyweight exit into
      kvmppc_prepare_to_enter().
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      6c85f52b
    • C
      KVM: PPC: Book3S: MMIO emulation support for little endian guests · 73601775
      Cédric Le Goater 提交于
      MMIO emulation reads the last instruction executed by the guest
      and then emulates. If the guest is running in Little Endian order,
      or more generally in a different endian order of the host, the
      instruction needs to be byte-swapped before being emulated.
      
      This patch adds a helper routine which tests the endian order of
      the host and the guest in order to decide whether a byteswap is
      needed or not. It is then used to byteswap the last instruction
      of the guest in the endian order of the host before MMIO emulation
      is performed.
      
      Finally, kvmppc_handle_load() of kvmppc_handle_store() are modified
      to reverse the endianness of the MMIO if required.
      Signed-off-by: NCédric Le Goater <clg@fr.ibm.com>
      [agraf: add booke handling]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      73601775
  16. 18 10月, 2013 2 次提交
  17. 17 10月, 2013 4 次提交
  18. 11 7月, 2013 1 次提交
  19. 08 7月, 2013 2 次提交
  20. 02 5月, 2013 1 次提交
    • P
      KVM: PPC: Book3S: Add API for in-kernel XICS emulation · 5975a2e0
      Paul Mackerras 提交于
      This adds the API for userspace to instantiate an XICS device in a VM
      and connect VCPUs to it.  The API consists of a new device type for
      the KVM_CREATE_DEVICE ioctl, a new capability KVM_CAP_IRQ_XICS, which
      functions similarly to KVM_CAP_IRQ_MPIC, and the KVM_IRQ_LINE ioctl,
      which is used to assert and deassert interrupt inputs of the XICS.
      
      The XICS device has one attribute group, KVM_DEV_XICS_GRP_SOURCES.
      Each attribute within this group corresponds to the state of one
      interrupt source.  The attribute number is the same as the interrupt
      source number.
      
      This does not support irq routing or irqfd yet.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      5975a2e0
  21. 27 4月, 2013 4 次提交
    • P
      KVM: PPC: Book3S: Facilities to save/restore XICS presentation ctrler state · 8b78645c
      Paul Mackerras 提交于
      This adds the ability for userspace to save and restore the state
      of the XICS interrupt presentation controllers (ICPs) via the
      KVM_GET/SET_ONE_REG interface.  Since there is one ICP per vcpu, we
      simply define a new 64-bit register in the ONE_REG space for the ICP
      state.  The state includes the CPU priority setting, the pending IPI
      priority, and the priority and source number of any pending external
      interrupt.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      8b78645c
    • P
      KVM: PPC: Book3S: Add support for ibm,int-on/off RTAS calls · d19bd862
      Paul Mackerras 提交于
      This adds support for the ibm,int-on and ibm,int-off RTAS calls to the
      in-kernel XICS emulation and corrects the handling of the saved
      priority by the ibm,set-xive RTAS call.  With this, ibm,int-off sets
      the specified interrupt's priority in its saved_priority field and
      sets the priority to 0xff (the least favoured value).  ibm,int-on
      restores the saved_priority to the priority field, and ibm,set-xive
      sets both the priority and the saved_priority to the specified
      priority value.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      d19bd862
    • B
      KVM: PPC: Book3S HV: Speed up wakeups of CPUs on HV KVM · 54695c30
      Benjamin Herrenschmidt 提交于
      Currently, we wake up a CPU by sending a host IPI with
      smp_send_reschedule() to thread 0 of that core, which will take all
      threads out of the guest, and cause them to re-evaluate their
      interrupt status on the way back in.
      
      This adds a mechanism to differentiate real host IPIs from IPIs sent
      by KVM for guest threads to poke each other, in order to target the
      guest threads precisely when possible and avoid that global switch of
      the core to host state.
      
      We then use this new facility in the in-kernel XICS code.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      54695c30
    • B
      KVM: PPC: Book3S: Add kernel emulation for the XICS interrupt controller · bc5ad3f3
      Benjamin Herrenschmidt 提交于
      This adds in-kernel emulation of the XICS (eXternal Interrupt
      Controller Specification) interrupt controller specified by PAPR, for
      both HV and PR KVM guests.
      
      The XICS emulation supports up to 1048560 interrupt sources.
      Interrupt source numbers below 16 are reserved; 0 is used to mean no
      interrupt and 2 is used for IPIs.  Internally these are represented in
      blocks of 1024, called ICS (interrupt controller source) entities, but
      that is not visible to userspace.
      
      Each vcpu gets one ICP (interrupt controller presentation) entity,
      used to store the per-vcpu state such as vcpu priority, pending
      interrupt state, IPI request, etc.
      
      This does not include any API or any way to connect vcpus to their
      ICP state; that will be added in later patches.
      
      This is based on an initial implementation by Michael Ellerman
      <michael@ellerman.id.au> reworked by Benjamin Herrenschmidt and
      Paul Mackerras.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: fix typo, add dependency on !KVM_MPIC]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      bc5ad3f3