1. 28 7月, 2014 7 次提交
    • M
      KVM: PPC: Allow kvmppc_get_last_inst() to fail · 51f04726
      Mihai Caraman 提交于
      On book3e, guest last instruction is read on the exit path using load
      external pid (lwepx) dedicated instruction. This load operation may fail
      due to TLB eviction and execute-but-not-read entries.
      
      This patch lay down the path for an alternative solution to read the guest
      last instruction, by allowing kvmppc_get_lat_inst() function to fail.
      Architecture specific implmentations of kvmppc_load_last_inst() may read
      last guest instruction and instruct the emulation layer to re-execute the
      guest in case of failure.
      
      Make kvmppc_get_last_inst() definition common between architectures.
      Signed-off-by: NMihai Caraman <mihai.caraman@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      51f04726
    • A
      KVM: PPC: Book3S: Make magic page properly 4k mappable · 89b68c96
      Alexander Graf 提交于
      The magic page is defined as a 4k page of per-vCPU data that is shared
      between the guest and the host to accelerate accesses to privileged
      registers.
      
      However, when the host is using 64k page size granularity we weren't quite
      as strict about that rule anymore. Instead, we partially treated all of the
      upper 64k as magic page and mapped only the uppermost 4k with the actual
      magic contents.
      
      This works well enough for Linux which doesn't use any memory in kernel
      space in the upper 64k, but Mac OS X got upset. So this patch makes magic
      page actually stay in a 4k range even on 64k page size hosts.
      
      This patch fixes magic page usage with Mac OS X (using MOL) on 64k PAGE_SIZE
      hosts for me.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      89b68c96
    • A
      KVM: PPC: Book3S: Add hack for split real mode · c01e3f66
      Alexander Graf 提交于
      Today we handle split real mode by mapping both instruction and data faults
      into a special virtual address space that only exists during the split mode
      phase.
      
      This is good enough to catch 32bit Linux guests that use split real mode for
      copy_from/to_user. In this case we're always prefixed with 0xc0000000 for our
      instruction pointer and can map the user space process freely below there.
      
      However, that approach fails when we're running KVM inside of KVM. Here the 1st
      level last_inst reader may well be in the same virtual page as a 2nd level
      interrupt handler.
      
      It also fails when running Mac OS X guests. Here we have a 4G/4G split, so a
      kernel copy_from/to_user implementation can easily overlap with user space
      addresses.
      
      The architecturally correct way to fix this would be to implement an instruction
      interpreter in KVM that kicks in whenever we go into split real mode. This
      interpreter however would not receive a great amount of testing and be a lot of
      bloat for a reasonably isolated corner case.
      
      So I went back to the drawing board and tried to come up with a way to make
      split real mode work with a single flat address space. And then I realized that
      we could get away with the same trick that makes it work for Linux:
      
      Whenever we see an instruction address during split real mode that may collide,
      we just move it higher up the virtual address space to a place that hopefully
      does not collide (keep your fingers crossed!).
      
      That approach does work surprisingly well. I am able to successfully run
      Mac OS X guests with KVM and QEMU (no split real mode hacks like MOL) when I
      apply a tiny timing probe hack to QEMU. I'd say this is a win over even more
      broken split real mode :).
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      c01e3f66
    • A
      KVM: PPC: Deflect page write faults properly in kvmppc_st · 17824b5a
      Alexander Graf 提交于
      When we have a page that we're not allowed to write to, xlate() will already
      tell us -EPERM on lookup of that page. With the code as is we change it into
      a "page missing" error which a guest may get confused about. Instead, just
      tell the caller about the -EPERM directly.
      
      This fixes Mac OS X guests when run with DCBZ32 emulation.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      17824b5a
    • P
      KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled · ae2113a4
      Paul Mackerras 提交于
      This adds code to check that when the KVM_CAP_PPC_ENABLE_HCALL
      capability is used to enable or disable in-kernel handling of an
      hcall, that the hcall is actually implemented by the kernel.
      If not an EINVAL error is returned.
      
      This also checks the default-enabled list of hcalls and prints a
      warning if any hcall there is not actually implemented.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      ae2113a4
    • A
      KVM: PPC: BOOK3S: PR: Emulate instruction counter · 06da28e7
      Aneesh Kumar K.V 提交于
      Writing to IC is not allowed in the privileged mode.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      06da28e7
    • A
      KVM: PPC: BOOK3S: PR: Emulate virtual timebase register · 8f42ab27
      Aneesh Kumar K.V 提交于
      virtual time base register is a per VM, per cpu register that needs
      to be saved and restored on vm exit and entry. Writing to VTB is not
      allowed in the privileged mode.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      [agraf: fix compile error]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      8f42ab27
  2. 30 5月, 2014 4 次提交
    • A
      KVM: PPC: Book3S PR: Expose EBB registers · 2e23f544
      Alexander Graf 提交于
      POWER8 introduces a new facility called the "Event Based Branch" facility.
      It contains of a few registers that indicate where a guest should branch to
      when a defined event occurs and it's in PR mode.
      
      We don't want to really enable EBB as it will create a big mess with !PR guest
      mode while hardware is in PR and we don't really emulate the PMU anyway.
      
      So instead, let's just leave it at emulation of all its registers.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      2e23f544
    • A
      KVM: PPC: Book3S PR: Expose TAR facility to guest · e14e7a1e
      Alexander Graf 提交于
      POWER8 implements a new register called TAR. This register has to be
      enabled in FSCR and then from KVM's point of view is mere storage.
      
      This patch enables the guest to use TAR.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      e14e7a1e
    • A
      KVM: PPC: Book3S PR: Handle Facility interrupt and FSCR · 616dff86
      Alexander Graf 提交于
      POWER8 introduced a new interrupt type called "Facility unavailable interrupt"
      which contains its status message in a new register called FSCR.
      
      Handle these exits and try to emulate instructions for unhandled facilities.
      Follow-on patches enable KVM to expose specific facilities into the guest.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      616dff86
    • A
      KVM: PPC: Make shared struct aka magic page guest endian · 5deb8e7a
      Alexander Graf 提交于
      The shared (magic) page is a data structure that contains often used
      supervisor privileged SPRs accessible via memory to the user to reduce
      the number of exits we have to take to read/write them.
      
      When we actually share this structure with the guest we have to maintain
      it in guest endianness, because some of the patch tricks only work with
      native endian load/store operations.
      
      Since we only share the structure with either host or guest in little
      endian on book3s_64 pr mode, we don't have to worry about booke or book3s hv.
      
      For booke, the shared struct stays big endian. For book3s_64 hv we maintain
      the struct in host native endian, since it never gets shared with the guest.
      
      For book3s_64 pr we introduce a variable that tells us which endianness the
      shared struct is in and route every access to it through helper inline
      functions that evaluate this variable.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      5deb8e7a
  3. 28 4月, 2014 1 次提交
  4. 09 1月, 2014 2 次提交
  5. 18 10月, 2013 2 次提交
  6. 17 10月, 2013 6 次提交
    • A
      kvm: Add struct kvm arg to memslot APIs · 5587027c
      Aneesh Kumar K.V 提交于
      We will use that in the later patch to find the kvm ops handler
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      5587027c
    • A
      kvm: powerpc: book3s: Support building HV and PR KVM as module · 2ba9f0d8
      Aneesh Kumar K.V 提交于
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      [agraf: squash in compile fix]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      2ba9f0d8
    • A
      kvm: powerpc: book3s: Add is_hv_enabled to kvmppc_ops · 699cc876
      Aneesh Kumar K.V 提交于
      This help us to identify whether we are running with hypervisor mode KVM
      enabled. The change is needed so that we can have both HV and PR kvm
      enabled in the same kernel.
      
      If both HV and PR KVM are included, interrupts come in to the HV version
      of the kvmppc_interrupt code, which then jumps to the PR handler,
      renamed to kvmppc_interrupt_pr, if the guest is a PR guest.
      
      Allowing both PR and HV in the same kernel required some changes to
      kvm_dev_ioctl_check_extension(), since the values returned now can't
      be selected with #ifdefs as much as previously. We look at is_hv_enabled
      to return the right value when checking for capabilities.For capabilities that
      are only provided by HV KVM, we return the HV value only if
      is_hv_enabled is true. For capabilities provided by PR KVM but not HV,
      we return the PR value only if is_hv_enabled is false.
      
      NOTE: in later patch we replace is_hv_enabled with a static inline
      function comparing kvm_ppc_ops
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      699cc876
    • A
      kvm: powerpc: Add kvmppc_ops callback · 3a167bea
      Aneesh Kumar K.V 提交于
      This patch add a new callback kvmppc_ops. This will help us in enabling
      both HV and PR KVM together in the same kernel. The actual change to
      enable them together is done in the later patch in the series.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      [agraf: squash in booke changes]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      3a167bea
    • P
      KVM: PPC: Book3S PR: Better handling of host-side read-only pages · 93b159b4
      Paul Mackerras 提交于
      Currently we request write access to all pages that get mapped into the
      guest, even if the guest is only loading from the page.  This reduces
      the effectiveness of KSM because it means that we unshare every page we
      access.  Also, we always set the changed (C) bit in the guest HPTE if
      it allows writing, even for a guest load.
      
      This fixes both these problems.  We pass an 'iswrite' flag to the
      mmu.xlate() functions and to kvmppc_mmu_map_page() to indicate whether
      the access is a load or a store.  The mmu.xlate() functions now only
      set C for stores.  kvmppc_gfn_to_pfn() now calls gfn_to_pfn_prot()
      instead of gfn_to_pfn() so that it can indicate whether we need write
      access to the page, and get back a 'writable' flag to indicate whether
      the page is writable or not.  If that 'writable' flag is clear, we then
      make the host HPTE read-only even if the guest HPTE allowed writing.
      
      This means that we can get a protection fault when the guest writes to a
      page that it has mapped read-write but which is read-only on the host
      side (perhaps due to KSM having merged the page).  Thus we now call
      kvmppc_handle_pagefault() for protection faults as well as HPTE not found
      faults.  In kvmppc_handle_pagefault(), if the access was allowed by the
      guest HPTE and we thus need to install a new host HPTE, we then need to
      remove the old host HPTE if there is one.  This is done with a new
      function, kvmppc_mmu_unmap_page(), which uses kvmppc_mmu_pte_vflush() to
      find and remove the old host HPTE.
      
      Since the memslot-related functions require the KVM SRCU read lock to
      be held, this adds srcu_read_lock/unlock pairs around the calls to
      kvmppc_handle_pagefault().
      
      Finally, this changes kvmppc_mmu_book3s_32_xlate_pte() to not ignore
      guest HPTEs that don't permit access, and to return -EPERM for accesses
      that are not permitted by the page protections.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      93b159b4
    • P
      KVM: PPC: Book3S: Add GET/SET_ONE_REG interface for VRSAVE · c0867fd5
      Paul Mackerras 提交于
      The VRSAVE register value for a vcpu is accessible through the
      GET/SET_SREGS interface for Book E processors, but not for Book 3S
      processors.  In order to make this accessible for Book 3S processors,
      this adds a new register identifier for GET/SET_ONE_REG, and adds
      the code to implement it.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      c0867fd5
  7. 27 4月, 2013 3 次提交
    • P
      KVM: PPC: Book3S: Facilities to save/restore XICS presentation ctrler state · 8b78645c
      Paul Mackerras 提交于
      This adds the ability for userspace to save and restore the state
      of the XICS interrupt presentation controllers (ICPs) via the
      KVM_GET/SET_ONE_REG interface.  Since there is one ICP per vcpu, we
      simply define a new 64-bit register in the ONE_REG space for the ICP
      state.  The state includes the CPU priority setting, the pending IPI
      priority, and the priority and source number of any pending external
      interrupt.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      8b78645c
    • B
      KVM: PPC: Book3S: Add kernel emulation for the XICS interrupt controller · bc5ad3f3
      Benjamin Herrenschmidt 提交于
      This adds in-kernel emulation of the XICS (eXternal Interrupt
      Controller Specification) interrupt controller specified by PAPR, for
      both HV and PR KVM guests.
      
      The XICS emulation supports up to 1048560 interrupt sources.
      Interrupt source numbers below 16 are reserved; 0 is used to mean no
      interrupt and 2 is used for IPIs.  Internally these are represented in
      blocks of 1024, called ICS (interrupt controller source) entities, but
      that is not visible to userspace.
      
      Each vcpu gets one ICP (interrupt controller presentation) entity,
      used to store the per-vcpu state such as vcpu priority, pending
      interrupt state, IPI request, etc.
      
      This does not include any API or any way to connect vcpus to their
      ICP state; that will be added in later patches.
      
      This is based on an initial implementation by Michael Ellerman
      <michael@ellerman.id.au> reworked by Benjamin Herrenschmidt and
      Paul Mackerras.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: fix typo, add dependency on !KVM_MPIC]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      bc5ad3f3
    • B
      KVM: PPC: debug stub interface parameter defined · 092d62ee
      Bharat Bhushan 提交于
      This patch defines the interface parameter for KVM_SET_GUEST_DEBUG
      ioctl support. Follow up patches will use this for setting up
      hardware breakpoints, watchpoints and software breakpoints.
      
      Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below.
      This is because I am not sure what is required for book3s. So this ioctl
      behaviour will not change for book3s.
      Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      092d62ee
  8. 17 4月, 2013 1 次提交
  9. 22 3月, 2013 1 次提交
  10. 06 10月, 2012 3 次提交
    • P
      KVM: PPC: Book3S: Get/set guest FP regs using the GET/SET_ONE_REG interface · a8bd19ef
      Paul Mackerras 提交于
      This enables userspace to get and set all the guest floating-point
      state using the KVM_[GS]ET_ONE_REG ioctls.  The floating-point state
      includes all of the traditional floating-point registers and the
      FPSCR (floating point status/control register), all the VMX/Altivec
      vector registers and the VSCR (vector status/control register), and
      on POWER7, the vector-scalar registers (note that each FP register
      is the high-order half of the corresponding VSR).
      
      Most of these are implemented in common Book 3S code, except for VSX
      on POWER7.  Because HV and PR differ in how they store the FP and VSX
      registers on POWER7, the code for these cases is not common.  On POWER7,
      the FP registers are the upper halves of the VSX registers vsr0 - vsr31.
      PR KVM stores vsr0 - vsr31 in two halves, with the upper halves in the
      arch.fpr[] array and the lower halves in the arch.vsr[] array, whereas
      HV KVM on POWER7 stores the whole VSX register in arch.vsr[].
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: fix whitespace, vsx compilation]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      a8bd19ef
    • P
      KVM: PPC: Book3S: Get/set guest SPRs using the GET/SET_ONE_REG interface · a136a8bd
      Paul Mackerras 提交于
      This enables userspace to get and set various SPRs (special-purpose
      registers) using the KVM_[GS]ET_ONE_REG ioctls.  With this, userspace
      can get and set all the SPRs that are part of the guest state, either
      through the KVM_[GS]ET_REGS ioctls, the KVM_[GS]ET_SREGS ioctls, or
      the KVM_[GS]ET_ONE_REG ioctls.
      
      The SPRs that are added here are:
      
      - DABR:  Data address breakpoint register
      - DSCR:  Data stream control register
      - PURR:  Processor utilization of resources register
      - SPURR: Scaled PURR
      - DAR:   Data address register
      - DSISR: Data storage interrupt status register
      - AMR:   Authority mask register
      - UAMOR: User authority mask override register
      - MMCR0, MMCR1, MMCRA: Performance monitor unit control registers
      - PMC1..PMC8: Performance monitor unit counter registers
      
      In order to reduce code duplication between PR and HV KVM code, this
      moves the kvm_vcpu_ioctl_[gs]et_one_reg functions into book3s.c and
      centralizes the copying between user and kernel space there.  The
      registers that are handled differently between PR and HV, and those
      that exist only in one flavor, are handled in kvmppc_[gs]et_one_reg()
      functions that are specific to each flavor.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: minimal style fixes]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      a136a8bd
    • B
      KVM: PPC: booke: Add watchdog emulation · f61c94bb
      Bharat Bhushan 提交于
      This patch adds the watchdog emulation in KVM. The watchdog
      emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl.
      The kernel timer are used for watchdog emulation and emulates
      h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU
      if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how
      it is configured.
      Signed-off-by: NLiu Yu <yu.liu@freescale.com>
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      [bharat.bhushan@freescale.com: reworked patch]
      Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com>
      [agraf: adjust to new request framework]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      f61c94bb
  11. 08 4月, 2012 2 次提交
  12. 05 3月, 2012 4 次提交
  13. 27 12月, 2011 1 次提交
  14. 01 11月, 2011 1 次提交
  15. 12 7月, 2011 2 次提交
    • P
      KVM: PPC: Deliver program interrupts right away instead of queueing them · 3cf658b6
      Paul Mackerras 提交于
      Doing so means that we don't have to save the flags anywhere and gets
      rid of the last reference to to_book3s(vcpu) in arch/powerpc/kvm/book3s.c.
      
      Doing so is OK because a program interrupt won't be generated at the
      same time as any other synchronous interrupt.  If a program interrupt
      and an asynchronous interrupt (external or decrementer) are generated
      at the same time, the program interrupt will be delivered, which is
      correct because it has a higher priority, and then the asynchronous
      interrupt will be masked.
      
      We don't ever generate system reset or machine check interrupts to the
      guest, but if we did, then we would need to make sure they got delivered
      rather than the program interrupt.  The current code would be wrong in
      this situation anyway since it would deliver the program interrupt as
      well as the reset/machine check interrupt.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      3cf658b6
    • P
      KVM: PPC: Split out code from book3s.c into book3s_pr.c · f05ed4d5
      Paul Mackerras 提交于
      In preparation for adding code to enable KVM to use hypervisor mode
      on 64-bit Book 3S processors, this splits book3s.c into two files,
      book3s.c and book3s_pr.c, where book3s_pr.c contains the code that is
      specific to running the guest in problem state (user mode) and book3s.c
      contains code which should apply to all Book 3S processors.
      
      In doing this, we abstract some details, namely the interrupt offset,
      updating the interrupt pending flag, and detecting if the guest is
      in a critical section.  These are all things that will be different
      when we use hypervisor mode.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      f05ed4d5