1. 21 Apr, 2015 11 commits
    •
      KVM: PPC: Book3S HV: Accumulate timing information for real-mode code · b6c295df
      Paul Mackerras committed
      This reads the timebase at various points in the real-mode guest
      entry/exit code and uses that to accumulate total, minimum and
      maximum time spent in those parts of the code.  Currently these
      times are accumulated per vcpu in 5 parts of the code:
      
      * rm_entry - time taken from the start of kvmppc_hv_entry() until
        just before entering the guest.
      * rm_intr - time from when we take a hypervisor interrupt in the
        guest until we either re-enter the guest or decide to exit to the
        host.  This includes time spent handling hcalls in real mode.
      * rm_exit - time from when we decide to exit the guest until the
        return from kvmppc_hv_entry().
      * guest - time spent in the guest
      * cede - time spent napping in real mode due to an H_CEDE hcall
        while other threads in the same vcore are active.
      
      These times are exposed in debugfs in a directory per vcpu that
      contains a file called "timings".  This file contains one line for
      each of the 5 timings above, with the name followed by a colon and
      4 numbers, which are the count (number of times the code has been
      executed), the total time, the minimum time, and the maximum time,
      all in nanoseconds.
      
      The overhead of the extra code amounts to about 30ns for an hcall that
      is handled in real mode (e.g. H_SET_DABR), which is about 25%.  Since
      production environments may not wish to incur this overhead, the new
      code is conditional on a new config symbol,
      CONFIG_KVM_BOOK3S_HV_EXIT_TIMING.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
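The count/total/min/max accumulation and the "timings" file layout described in commit b6c295df can be illustrated with a small userspace sketch. This is not the kernel code: the class and function names (`TimingStat`, `parse_timings`, `mean_ns`) are hypothetical, and the parser assumes only the line layout stated in the commit message (a name, a colon, then four integers).

```python
from dataclasses import dataclass


@dataclass
class TimingStat:
    """One accumulator: count, total, minimum and maximum, in nanoseconds."""
    count: int = 0
    total: int = 0
    minimum: int = 0
    maximum: int = 0

    def add(self, ns: int) -> None:
        # The first sample initialises the minimum; later samples only lower it.
        if self.count == 0 or ns < self.minimum:
            self.minimum = ns
        if ns > self.maximum:
            self.maximum = ns
        self.count += 1
        self.total += ns


def parse_timings(text: str) -> dict:
    """Parse lines of the assumed form 'name: count total min max'."""
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, rest = line.partition(":")
        count, total, minimum, maximum = (int(x) for x in rest.split())
        stats[name.strip()] = TimingStat(count, total, minimum, maximum)
    return stats


def mean_ns(stat: TimingStat) -> float:
    """Average time per execution, or 0.0 if the code never ran."""
    return stat.total / stat.count if stat.count else 0.0
```

Dividing the total by the count gives the mean time per execution, which the file itself does not report.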
    •
      KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT · e23a808b
      Paul Mackerras committed
      This creates a debugfs directory for each HV guest (assuming debugfs
      is enabled in the kernel config), and within that directory, a file
      by which the contents of the guest's HPT (hashed page table) can be
      read.  The directory is named vmnnnn, where nnnn is the PID of the
      process that created the guest.  The file is named "htab".  This is
      intended to help in debugging problems in the host's management
      of guest memory.
      
      The contents of the file consist of a series of lines like this:
      
        3f48 4000d032bf003505 0000000bd7ff1196 00000003b5c71196
      
      The first field is the index of the entry in the HPT, the second and
      third are the HPT entry, so the third field contains the real page
      number that is mapped by the entry if the entry's valid bit is set.
      The fourth field is the guest's view of the second doubleword of the
      entry, so it contains the guest physical address.  (The format of the
      second through fourth fields is described in the Power ISA and also
      in arch/powerpc/include/asm/mmu-hash64.h.)
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
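A line of the "htab" file described in commit e23a808b can be split into its four hexadecimal fields with a short userspace sketch. The names `HtabLine` and `parse_htab_line` are hypothetical. The sketch also assumes that the valid bit is the least significant bit of the first HPTE doubleword; check that against the Power ISA or mmu-hash64.h before relying on it. It does not decode the real page number or the guest physical address, because that depends on the page-size encoding.

```python
from typing import NamedTuple

# Assumption: the HPTE valid bit is the lowest bit of the first doubleword.
HPTE_V_VALID = 0x1


class HtabLine(NamedTuple):
    index: int    # index of the entry in the HPT
    hpte_v: int   # first doubleword of the HPT entry
    hpte_r: int   # second doubleword (host view, holds the real page number)
    guest_r: int  # guest's view of the second doubleword

    @property
    def valid(self) -> bool:
        return bool(self.hpte_v & HPTE_V_VALID)


def parse_htab_line(line: str) -> HtabLine:
    """Parse one 'index v r guest_r' line; every field is hexadecimal."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    return HtabLine(*(int(f, 16) for f in fields))


entry = parse_htab_line("3f48 4000d032bf003505 0000000bd7ff1196 00000003b5c71196")
```

Here `entry` holds the sample line from the commit message, so `entry.index` is `0x3f48`.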
    •
      KVM: PPC: Book3S HV: Add ICP real mode counters · 6e0365b7
      Suresh Warrier committed
      Add two counters to count how often we generate real-mode ICS resend
      and reject events. The counters provide some performance statistics
      that could be used in the future to consider if the real mode functions
      need further optimizing. The counters are displayed as part of ICS and
      ICP state provided by /sys/kernel/debug/powerpc/kvm* for each VM.
      
      Also added two counters that count (approximately) how many times we
      don't find an ICP or ICS we're looking for. These are not currently
      exposed through sysfs, but can be useful when debugging crashes.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    •
      KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode · b0221556
      Suresh Warrier committed
      Interrupt-based hypercalls return H_TOO_HARD to inform KVM that it needs
      to switch to the host to complete the rest of the hypercall function in
      virtual mode. This patch ports the virtual mode ICS/ICP reject and resend
      functions to be runnable in hypervisor real mode, thus avoiding the need
      to switch to the host to execute these functions in virtual mode. However,
      the hypercalls continue to return H_TOO_HARD for vcpu_wakeup and notify
      events - these events cannot be done in real mode and they will still need
      a switch to host virtual mode.
      
      There are sufficient differences between the real mode code and the
      virtual mode code for the ICS/ICP resend and reject functions that
      for now the code has been duplicated instead of sharing common code.
      In the future, we can look at creating common functions.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    •
      KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock · 34cb7954
      Suresh Warrier committed
      Replaces the ICS mutex lock with a spin lock since we will be porting
      these routines to real mode. Note that we need to disable interrupts
      before we take the lock in anticipation of the fact that on the guest
      side, we are running in the context of a hard irq and interrupts are
      disabled (EE bit off) when the lock is acquired. Again, because we
      will be acquiring the lock in hypervisor real mode, we need to use
      an arch_spinlock_t instead of a normal spinlock here as we want to
      avoid running any lockdep code (which may not be safe to execute in
      real mode).
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    •
      KVM: PPC: Book3S HV: Add guest->host real mode completion counters · 878610fe
      Suresh E. Warrier committed
      Add counters to track the number of times we switch from guest real mode
      to host virtual mode during an interrupt-related hypercall because the
      hypercall requires actions that cannot be completed in real mode. This
      will help when making optimizations that reduce guest-host transitions.
      
      It is safe to use an ordinary increment rather than an atomic operation
      because there is one ICP per virtual CPU and kvmppc_xics_rm_complete()
      only works on the ICP for the current VCPU.
      
      The counters are displayed as part of ICS and ICP state provided by
      /sys/kernel/debug/powerpc/kvm* for each VM.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    •
      KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte · a4bd6eb0
      Aneesh Kumar K.V committed
      This adds helper routines for locking and unlocking HPTEs, and uses
      them in the rest of the code.  We don't change any locking rules in
      this patch.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    •
      KVM: PPC: Book3S HV: Remove RMA-related variables from code · 31037eca
      Aneesh Kumar K.V committed
      We don't support real-mode areas now that 970 support is removed.
      Remove the remaining details of rma from the code.  Also rename
      rma_setup_done to hpte_setup_done to better reflect the changes.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    •
      KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation. · e928e9cb
      Michael Ellerman committed
      Some PowerNV systems include a hardware random-number generator.
      This HWRNG is present on POWER7+ and POWER8 chips and is capable of
      generating one 64-bit random number every microsecond.  The random
      numbers are produced by sampling a set of 64 unstable high-frequency
      oscillators and are almost completely entropic.
      
      PAPR defines an H_RANDOM hypercall which guests can use to obtain one
      64-bit random sample from the HWRNG.  This adds a real-mode
      implementation of the H_RANDOM hypercall.  This hypercall was
      implemented in real mode because the latency of reading the HWRNG is
      generally small compared to the latency of a guest exit and entry for
      all the threads in the same virtual core.
      
      Userspace can detect the presence of the HWRNG and the H_RANDOM
      implementation by querying the KVM_CAP_PPC_HWRNG capability.  The
      H_RANDOM hypercall implementation will only be invoked when the guest
      does an H_RANDOM hypercall if userspace first enables the in-kernel
      H_RANDOM implementation using the KVM_CAP_PPC_ENABLE_HCALL capability.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    •
      kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM · 99342cf8
      David Gibson committed
      On POWER, storage caching is usually configured via the MMU - attributes
      such as cache-inhibited are stored in the TLB and the hashed page table.
      
      This makes correctly performing cache inhibited IO accesses awkward when
      the MMU is turned off (real mode).  Some CPU models provide special
      registers to control the cache attributes of real mode loads and stores, but
      this is not at all consistent.  This is a problem in particular for SLOF,
      the firmware used on KVM guests, which runs entirely in real mode, but
      which needs to do IO to load the kernel.
      
      To simplify this, qemu implements two special hypercalls, H_LOGICAL_CI_LOAD
      and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to
      a logical address (aka guest physical address).  SLOF uses these for IO.
      
      However, because these are implemented within qemu, not the host kernel,
      these bypass any IO devices emulated within KVM itself.  The simplest way
      to see this problem is to attempt to boot a KVM guest from a virtio-blk
      device with iothread / dataplane enabled.  The iothread code relies on an
      in-kernel implementation of the virtio queue notification, which is not
      triggered by the IO hcalls, and so the guest will stall in SLOF unable to
      load the guest OS.
      
      This patch addresses this by providing in-kernel implementations of the
      2 hypercalls, which correctly scan the KVM IO bus.  Any access to an
      address not handled by the KVM IO bus will cause a VM exit, hitting the
      qemu implementation as before.
      
      Note that a userspace change is also required, in order to enable these
      new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      [agraf: fix compilation]
      Signed-off-by: Alexander Graf <agraf@suse.de>
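The dispatch order described in commit 99342cf8 (try the in-kernel KVM IO bus first, fall back to the qemu implementation otherwise) can be modelled with a toy sketch. This is a conceptual model only, not the kernel's code or API: `IoBus`, `Outcome`, and `h_logical_ci_load` are hypothetical names, and the read handlers stand in for in-kernel emulated devices.

```python
from enum import Enum


class Outcome(Enum):
    HANDLED_IN_KERNEL = "handled in kernel"
    EXIT_TO_USERSPACE = "exit to userspace"


class IoBus:
    """Toy stand-in for the KVM IO bus: a list of address ranges with handlers."""

    def __init__(self):
        self._regions = []  # (start, length, read_fn) tuples

    def register(self, start, length, read_fn):
        self._regions.append((start, length, read_fn))

    def read(self, addr, size):
        # Return the device's value, or None if no device claims the access.
        for start, length, read_fn in self._regions:
            if start <= addr and addr + size <= start + length:
                return read_fn(addr - start, size)
        return None


def h_logical_ci_load(bus, addr, size):
    """Model of the load hcall: scan the bus, else defer to userspace."""
    value = bus.read(addr, size)
    if value is None:
        # Not an in-kernel device: the VM exits and userspace handles it.
        return Outcome.EXIT_TO_USERSPACE, None
    return Outcome.HANDLED_IN_KERNEL, value
```

The point of the ordering is that devices emulated inside KVM see the access, while everything else still reaches the userspace implementation as before.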
    •
      powerpc: Export __spin_yield · ae75116e
      Suresh E. Warrier committed
      Export __spin_yield so that the arch_spin_unlock() function can
      be invoked from a module. This will be required for modules where
      we want to take a lock that is also acquired in hypervisor
      real mode. Because we want to avoid running any lockdep code
      (which may not be safe in real mode), this lock needs to be
      an arch_spinlock_t instead of a normal spinlock.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
  2. 14 Apr, 2015 1 commit
  3. 11 Apr, 2015 5 commits
  4. 10 Apr, 2015 4 commits
  5. 09 Apr, 2015 13 commits
  6. 08 Apr, 2015 6 commits