1. 08 4月, 2012 1 次提交
    • S
      KVM: PPC: booke: category E.HV (GS-mode) support · d30f6e48
      Scott Wood 提交于
      Chips such as e500mc that implement category E.HV in Power ISA 2.06
      provide hardware virtualization features, including a new MSR mode for
      guest state.  The guest OS can perform many operations without trapping
      into the hypervisor, including transitions to and from guest userspace.
      
      Since we can use SRR1[GS] to reliably tell whether an exception came from
      guest state, instead of messing around with IVPR, we use DO_KVM similarly
      to book3s.
      
      Current issues include:
       - Machine checks from guest state are not routed to the host handler.
       - The guest can cause a host oops by executing an emulated instruction
         in a page that lacks read permission.  Existing e500/4xx support has
         the same problem.
      
      Includes work by Ashish Kalra <Ashish.Kalra@freescale.com>,
      Varun Sethi <Varun.Sethi@freescale.com>, and
      Liu Yu <yu.liu@freescale.com>.
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      [agraf: remove pt_regs usage]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      d30f6e48
  2. 05 3月, 2012 2 次提交
    • S
      KVM: PPC: booke: Improve timer register emulation · dfd4d47e
      Scott Wood 提交于
      Decrementers are now properly driven by TCR/TSR, and the guest
      has full read/write access to these registers.
      
      The decrementer keeps ticking (and setting the TSR bit) regardless of
      whether the interrupts are enabled with TCR.
      
      The decrementer stops at zero, rather than going negative.
      
      Decrementers (and FITs, once implemented) are delivered as
      level-triggered interrupts -- dequeued when the TSR bit is cleared, not
      on delivery.
      Signed-off-by: NLiu Yu <yu.liu@freescale.com>
      [scottwood@freescale.com: significant changes]
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      dfd4d47e
    • S
      KVM: PPC: Paravirtualize SPRG4-7, ESR, PIR, MASn · b5904972
      Scott Wood 提交于
      This allows additional registers to be accessed by the guest
      in PR-mode KVM without trapping.
      
      SPRG4-7 are readable from userspace.  On booke, KVM will sync
      these registers when it enters the guest, so that accesses from
      guest userspace will work.  The guest kernel, OTOH, must consistently
      use either the real registers or the shared area between exits.  This
      also applies to the already-paravirted SPRG3.
      
      On non-booke, it's not clear to what extent SPRG4-7 are supported
      (they're not architected for book3s, but exist on at least some classic
      chips).  They are copied in the get/set regs ioctls, but I do not see any
      non-booke emulation.  I also do not see any syncing with real registers
      (in PR-mode) including the user-readable SPRG3.  This patch should not
      make that situation any worse.
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      b5904972
  3. 24 10月, 2010 3 次提交
    • A
      KVM: PPC: Convert SRR0 and SRR1 to shared page · de7906c3
      Alexander Graf 提交于
      The SRR0 and SRR1 registers contain cached values of the PC and MSR
      respectively. They get written to by the hypervisor when an interrupt
      occurs or directly by the kernel. They are also used to tell the rfi(d)
      instruction where to jump to.
      
      Because it only gets touched on defined events that, it's very simple to
      share with the guest. Hypervisor and guest both have full r/w access.
      
      This patch converts all users of the current field to the shared page.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      de7906c3
    • A
      KVM: PPC: Convert DAR to shared page. · 5e030186
      Alexander Graf 提交于
      The DAR register contains the address a data page fault occured at. This
      register behaves pretty much like a simple data storage register that gets
      written to on data faults. There is no hypervisor interaction required on
      read or write.
      
      This patch converts all users of the current field to the shared page.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      5e030186
    • A
      KVM: PPC: Convert MSR to shared page · 666e7252
      Alexander Graf 提交于
      One of the most obvious registers to share with the guest directly is the
      MSR. The MSR contains the "interrupts enabled" flag which the guest has to
      toggle in critical sections.
      
      So in order to bring the overhead of interrupt en- and disabling down, let's
      put msr into the shared page. Keep in mind that even though you can fully read
      its contents, writing to it doesn't always update all state. There are a few
      safe fields that don't require hypervisor interaction. See the documentation
      for a list of MSR bits that are safe to be set from inside the guest.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      666e7252
  4. 01 3月, 2010 1 次提交
    • A
      KVM: PPC: Use accessor functions for GPR access · 8e5b26b5
      Alexander Graf 提交于
      All code in PPC KVM currently accesses gprs in the vcpu struct directly.
      
      While there's nothing wrong with that wrt the current way gprs are stored
      and loaded, it doesn't suffice for the PACA acceleration that will follow
      in this patchset.
      
      So let's just create little wrapper inline functions that we call whenever
      a GPR needs to be read from or written to. The compiled code shouldn't really
      change at all for now.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      8e5b26b5
  5. 24 3月, 2009 2 次提交
  6. 31 12月, 2008 5 次提交
    • H
      KVM: ppc: mostly cosmetic updates to the exit timing accounting code · 7b701591
      Hollis Blanchard 提交于
      The only significant changes were to kvmppc_exit_timing_write() and
      kvmppc_exit_timing_show(), both of which were dramatically simplified.
      Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      7b701591
    • H
      KVM: ppc: Implement in-kernel exit timing statistics · 73e75b41
      Hollis Blanchard 提交于
      Existing KVM statistics are either just counters (kvm_stat) reported for
      KVM generally or trace based aproaches like kvm_trace.
      For KVM on powerpc we had the need to track the timings of the different exit
      types. While this could be achieved parsing data created with a kvm_trace
      extension this adds too much overhead (at least on embedded PowerPC) slowing
      down the workloads we wanted to measure.
      
      Therefore this patch adds a in-kernel exit timing statistic to the powerpc kvm
      code. These statistic is available per vm&vcpu under the kvm debugfs directory.
      As this statistic is low, but still some overhead it can be enabled via a
      .config entry and should be off by default.
      
      Since this patch touched all powerpc kvm_stat code anyway this code is now
      merged and simplified together with the exit timing statistic code (still
      working with exit timing disabled in .config).
      Signed-off-by: NChristian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
      Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      73e75b41
    • H
      KVM: ppc: fix userspace mapping invalidation on context switch · fe4e771d
      Hollis Blanchard 提交于
      We used to defer invalidating userspace TLB entries until jumping out of the
      kernel. This was causing MMU weirdness most easily triggered by using a pipe in
      the guest, e.g. "dmesg | tail". I believe the problem was that after the guest
      kernel changed the PID (part of context switch), the old process's mappings
      were still present, and so copy_to_user() on the "return to new process" path
      ended up using stale mappings.
      
      Testing with large pages (64K) exposed the problem, probably because with 4K
      pages, pressure on the TLB faulted all process A's mappings out before the
      guest kernel could insert any for process B.
      Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      fe4e771d
    • H
      KVM: ppc: optimize irq delivery path · d4cf3892
      Hollis Blanchard 提交于
      In kvmppc_deliver_interrupt is just one case left in the switch and it is a
      rare one (less than 8%) when looking at the exit numbers. Therefore we can
      at least drop the switch/case and if an if. I inserted an unlikely too, but
      that's open for discussion.
      
      In kvmppc_can_deliver_interrupt all frequent cases are in the default case.
      I know compilers are smart but we can make it easier for them. By writing
      down all options and removing the default case combined with the fact that
      ithe values are constants 0..15 should allow the compiler to write an easy
      jump table.
      Modifying kvmppc_can_deliver_interrupt pointed me to the fact that gcc seems
      to be unable to reduce priority_exception[x] to a build time constant.
      Therefore I changed the usage of the translation arrays in the interrupt
      delivery path completely. It is now using priority without translation to irq
      on the full irq delivery path.
      To be able to do that ivpr regs are stored by their priority now.
      
      Additionally the decision made in kvmppc_can_deliver_interrupt is already
      sufficient to get the value of interrupt_msr_mask[x]. Therefore we can replace
      the 16x4byte array used here with a single 4byte variable (might still be one
      miss, but the chance to find this in cache should be better than the right
      entry of the whole array).
      Signed-off-by: NChristian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
      Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      d4cf3892
    • H
      KVM: ppc: refactor instruction emulation into generic and core-specific pieces · 75f74f0d
      Hollis Blanchard 提交于
      Cores provide 3 emulation hooks, implemented for example in the new
      4xx_emulate.c:
      kvmppc_core_emulate_op
      kvmppc_core_emulate_mtspr
      kvmppc_core_emulate_mfspr
      
      Strictly speaking the last two aren't necessary, but provide for more
      informative error reporting ("unknown SPR").
      
      Long term I'd like to have instruction decoding autogenerated from tables of
      opcodes, and that way we could aggregate universal, Book E, and core-specific
      instructions more easily and without redundant switch statements.
      Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      75f74f0d