1. 14 6月, 2016 11 次提交
    • J
      MIPS: KVM: Arrayify struct kvm_mips_tlb::tlb_lo* · 9fbfb06a
      James Hogan 提交于
      The values of the EntryLo0 and EntryLo1 registers for a TLB entry are
      stored in separate members of struct kvm_mips_tlb called tlb_lo0 and
      tlb_lo1 respectively. To allow future code which needs to manipulate
      arbitrary EntryLo data in the TLB entry to be simpler and less
      conditional, replace these members with an array of two elements.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9fbfb06a
    • J
      MIPS: KVM: Restore host EBase from ebase variable · 878edf01
      James Hogan 提交于
      The host kernel's exception vector base address is currently saved in
      the VCPU structure at creation time, and restored on a guest exit.
      However it doesn't change and can already be easily accessed from the
      'ebase' variable (arch/mips/kernel/traps.c), so drop the host_ebase
      member of kvm_vcpu_arch, export the 'ebase' variable to modules and load
      from there instead.
      
      This does result in a single extra instruction (lui) on the guest exit
      path, but simplifies the code a bit and removes the redundant storage of
      the host exception base address.
      
      Credit for the idea goes to Cavium's VZ KVM implementation.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      878edf01
    • J
      MIPS: KVM: Drop unused hpa0/hpa1 args from function · 26ee17ff
      James Hogan 提交于
      The function kvm_mips_handle_mapped_seg_tlb_fault() has two completely
      unused pointer arguments, hpa0 and hpa1, for which all users always pass
      NULL.
      
      Drop these two arguments and update the callers.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      26ee17ff
    • J
      MIPS: KVM: Don't indirect KVM functions · 9befad23
      James Hogan 提交于
      Several KVM module functions are indirected so that they can be accessed
      from tlb.c which is statically built into the kernel. This is no longer
      necessary as the relevant bits of code have moved into mmu.c which is
      part of the KVM module, so drop the indirections.
      
      Note: is_error_pfn() is defined inline in kvm_host.h, so didn't actually
      require the KVM module to be loaded for it to work anyway.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9befad23
    • J
      MIPS: KVM: Move non-TLB handling code out of tlb.c · 403015b3
      James Hogan 提交于
      Various functions in tlb.c perform higher level MMU handling, but don't
      strictly need to be statically built into the kernel as they don't
      directly manipulate TLB entries. Move these functions out into a
      separate mmu.c which will be built into the KVM kernel module. This
      allows them to directly reference KVM functions in the KVM kernel module
      in future.
      
      Module exports of these functions have been removed, since they aren't
      needed outside of KVM.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      403015b3
    • J
      MIPS: KVM: Make various Cause variables 32-bit · 31cf7498
      James Hogan 提交于
      The CP0 Cause register is passed around in KVM quite a bit, often as an
      unsigned long, even though it is always 32-bits long.
      
      Resize it to u32 throughout MIPS KVM.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      31cf7498
    • J
      MIPS: KVM: Convert headers to kernel sized types · bdb7ed86
      James Hogan 提交于
      Convert the MIPS kvm_host.h structs, function declaration prototypes and
      associated definition prototypes to use standard kernel sized types
      (e.g. u32) instead of inttypes.h style ones (e.g. uint32_t).
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      bdb7ed86
    • J
      MIPS: KVM: Drop unused host_cp0_entryhi · e4e94c0f
      James Hogan 提交于
      The host EntryHi in the KVM VCPU context is virtually unused. It gets
      stored on exceptions, but only ever used in a kvm_debug() when a TLB
      miss occurs.
      
      Drop it entirely, removing that information from the kvm_debug output.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      e4e94c0f
    • J
      MIPS: KVM: Drop unused guest_inst from kvm_vcpu_arch · d40dd9e8
      James Hogan 提交于
      The MIPS kvm_vcpu_arch::guest_inst isn't used, so drop it from the
      struct and drop its asm-offsets definition.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d40dd9e8
    • J
      MIPS: KVM: Include bit 31 in segment matches · 7f5a1ddc
      James Hogan 提交于
      When faulting guest addresses are matched against guest segments with
      the KVM_GUEST_KSEGX() macro, change the mask to 0xe0000000 so as to
      include bit 31.
      
      This is mainly for safety's sake, as it prevents a rogue BadVAddr in the
      host kseg2/kseg3 segments (e.g. 0xC*******) after a TLB exception from
      matching the guest kseg0 segment (e.g. 0x4*******), triggering an
      internal KVM error instead of allowing the corresponding guest kseg0
      page to be mapped into the host vmalloc space.
      
      Such a rogue BadVAddr was observed to happen with the host MIPS kernel
      running under QEMU with KVM built as a module, due to a not entirely
      transparent optimisation in the QEMU TLB handling. This has already been
      worked around properly in a previous commit.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      7f5a1ddc
    • J
      MIPS: KVM: Fix modular KVM under QEMU · 797179bc
      James Hogan 提交于
      Copy __kvm_mips_vcpu_run() into unmapped memory, so that we can never
      get a TLB refill exception in it when KVM is built as a module.
      
      This was observed to happen with the host MIPS kernel running under
      QEMU, due to a not entirely transparent optimisation in the QEMU TLB
      handling where TLB entries replaced with TLBWR are copied to a separate
      part of the TLB array. Code in those pages continue to be executable,
      but those mappings persist only until the next ASID switch, even if they
      are marked global.
      
      An ASID switch happens in __kvm_mips_vcpu_run() at exception level after
      switching to the guest exception base. Subsequent TLB mapped kernel
      instructions just prior to switching to the guest trigger a TLB refill
      exception, which enters the guest exception handlers without updating
      EPC. This appears as a guest triggered TLB refill on a host kernel
      mapped (host KSeg2) address, which is not handled correctly as user
      (guest) mode accesses to kernel (host) segments always generate address
      error exceptions.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 3.10.x-
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      797179bc
  2. 13 5月, 2016 2 次提交
    • C
      KVM: halt_polling: provide a way to qualify wakeups during poll · 3491caf2
      Christian Borntraeger 提交于
      Some wakeups should not be considered a sucessful poll. For example on
      s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
      would be considered runnable - letting all vCPUs poll all the time for
      transactional like workload, even if one vCPU would be enough.
      This can result in huge CPU usage for large guests.
      This patch lets architectures provide a way to qualify wakeups if they
      should be considered a good/bad wakeups in regard to polls.
      
      For s390 the implementation will fence of halt polling for anything but
      known good, single vCPU events. The s390 implementation for floating
      interrupts does a wakeup for one vCPU, but the interrupt will be delivered
      by whatever CPU checks first for a pending interrupt. We prefer the
      woken up CPU by marking the poll of this CPU as "good" poll.
      This code will also mark several other wakeup reasons like IPI or
      expired timers as "good". This will of course also mark some events as
      not sucessful. As  KVM on z runs always as a 2nd level hypervisor,
      we prefer to not poll, unless we are really sure, though.
      
      This patch successfully limits the CPU usage for cases like uperf 1byte
      transactional ping pong workload or wakeup heavy workload like OLTP
      while still providing a proper speedup.
      
      This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
      wakeups that are considered not good for polling.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
      Cc: David Matlack <dmatlack@google.com>
      Cc: Wanpeng Li <kernellwp@gmail.com>
      [Rename config symbol. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3491caf2
    • P
      MIPS: KVM: Abstract guest ASID mask · ca64c2be
      Paul Burton 提交于
      In preparation for supporting varied widths of ASID mask in the kernel
      in general, switch KVM's guest ASIDs to a new KVM_ENTRYHI_ASID
      definition based on the 8-bit MIPS_ENTRYHI_ASID instead of ASID_MASK.
      
      It could potentially be used to support extended guest ASIDs in the
      future.
      Signed-off-by: NPaul Burton <paul.burton@imgtec.com>
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/13207/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      ca64c2be
  3. 10 5月, 2016 1 次提交
    • J
      MIPS: KVM: Fix timer IRQ race when writing CP0_Compare · b45bacd2
      James Hogan 提交于
      Writing CP0_Compare clears the timer interrupt pending bit
      (CP0_Cause.TI), but this wasn't being done atomically. If a timer
      interrupt raced with the write of the guest CP0_Compare, the timer
      interrupt could end up being pending even though the new CP0_Compare is
      nowhere near CP0_Count.
      
      We were already updating the hrtimer expiry with
      kvm_mips_update_hrtimer(), which used both kvm_mips_freeze_hrtimer() and
      kvm_mips_resume_hrtimer(). Close the race window by expanding out
      kvm_mips_update_hrtimer(), and clearing CP0_Cause.TI and setting
      CP0_Compare between the freeze and resume. Since the pending timer
      interrupt should not be cleared when CP0_Compare is written via the KVM
      user API, an ack argument is added to distinguish the source of the
      write.
      
      Fixes: e30492bb ("MIPS: KVM: Rewrite count/compare timer emulation")
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.16.x-
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      b45bacd2
  4. 24 1月, 2016 5 次提交
  5. 16 1月, 2016 1 次提交
    • D
      kvm: rename pfn_t to kvm_pfn_t · ba049e93
      Dan Williams 提交于
      To date, we have implemented two I/O usage models for persistent memory,
      PMEM (a persistent "ram disk") and DAX (mmap persistent memory into
      userspace).  This series adds a third, DAX-GUP, that allows DAX mappings
      to be the target of direct-i/o.  It allows userspace to coordinate
      DMA/RDMA from/to persistent memory.
      
      The implementation leverages the ZONE_DEVICE mm-zone that went into
      4.3-rc1 (also discussed at kernel summit) to flag pages that are owned
      and dynamically mapped by a device driver.  The pmem driver, after
      mapping a persistent memory range into the system memmap via
      devm_memremap_pages(), arranges for DAX to distinguish pfn-only versus
      page-backed pmem-pfns via flags in the new pfn_t type.
      
      The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn, flags the
      resulting pte(s) inserted into the process page tables with a new
      _PAGE_DEVMAP flag.  Later, when get_user_pages() is walking ptes it keys
      off _PAGE_DEVMAP to pin the device hosting the page range active.
      Finally, get_page() and put_page() are modified to take references
      against the device driver established page mapping.
      
      Finally, this need for "struct page" for persistent memory requires
      memory capacity to store the memmap array.  Given the memmap array for a
      large pool of persistent may exhaust available DRAM introduce a
      mechanism to allocate the memmap from persistent memory.  The new
      "struct vmem_altmap *" parameter to devm_memremap_pages() enables
      arch_add_memory() to use reserved pmem capacity rather than the page
      allocator.
      
      This patch (of 18):
      
      The core has developed a need for a "pfn_t" type [1].  Move the existing
      pfn_t in KVM to kvm_pfn_t [2].
      
      [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
      [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.htmlSigned-off-by: NDan Williams <dan.j.williams@intel.com>
      Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ba049e93
  6. 23 10月, 2015 1 次提交
  7. 25 9月, 2015 1 次提交
  8. 16 9月, 2015 1 次提交
    • P
      KVM: add halt_attempted_poll to VCPU stats · 62bea5bf
      Paolo Bonzini 提交于
      This new statistic can help diagnosing VCPUs that, for any reason,
      trigger bad behavior of halt_poll_ns autotuning.
      
      For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
      like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
      10+20+40+80+160+320+480 = 1110 microseconds out of every
      479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then
      is consuming about 30% more CPU than it would use without
      polling.  This would show as an abnormally high number of
      attempted polling compared to the successful polls.
      
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com<
      Reviewed-by: NDavid Matlack <dmatlack@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      62bea5bf
  9. 26 5月, 2015 1 次提交
  10. 28 3月, 2015 11 次提交
    • J
      MIPS: KVM: Add MSA exception handling · c2537ed9
      James Hogan 提交于
      Add guest exception handling for MIPS SIMD Architecture (MSA) floating
      point exceptions and MSA disabled exceptions.
      
      MSA floating point exceptions from the guest need passing to the guest
      kernel, so for these a guest MSAFPE is emulated.
      
      MSA disabled exceptions are normally handled by passing a reserved
      instruction exception to the guest (because no guest MSA was supported),
      but the hypervisor can now handle them if the guest has MSA by passing
      an MSA disabled exception to the guest, or if the guest has MSA enabled
      by transparently restoring the guest MSA context and enabling MSA and
      the FPU.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      c2537ed9
    • J
      MIPS: KVM: Add base guest MSA support · 539cb89f
      James Hogan 提交于
      Add base code for supporting the MIPS SIMD Architecture (MSA) in MIPS
      KVM guests. MSA cannot yet be enabled in the guest, we're just laying
      the groundwork.
      
      As with the FPU, whether the guest's MSA context is loaded is stored in
      another bit in the fpu_inuse vcpu member. This allows MSA to be disabled
      when the guest disables it, but keeping the MSA context loaded so it
      doesn't have to be reloaded if the guest re-enables it.
      
      New assembly code is added for saving and restoring the MSA context,
      restoring only the upper half of the MSA context (for if the FPU context
      is already loaded) and for saving/clearing and restoring MSACSR (which
      can itself cause an MSA FP exception depending on the value). The MSACSR
      is restored before returning to the guest if MSA is already enabled, and
      the existing FP exception die notifier is extended to catch the possible
      MSA FP exception and step over the ctcmsa instruction.
      
      The helper function kvm_own_msa() is added to enable MSA and restore
      the MSA context if it isn't already loaded, which will be used in a
      later patch when the guest attempts to use MSA for the first time and
      triggers an MSA disabled exception.
      
      The existing FPU helpers are extended to handle MSA. kvm_lose_fpu()
      saves the full MSA context if it is loaded (which includes the FPU
      context) and both kvm_lose_fpu() and kvm_drop_fpu() disable MSA.
      
      kvm_own_fpu() also needs to lose any MSA context if FR=0, since there
      would be a risk of getting reserved instruction exceptions if CU1 is
      enabled and we later try and save the MSA context. We shouldn't usually
      hit this case since it will be handled when emulating CU1 changes,
      however there's nothing to stop the guest modifying the Status register
      directly via the comm page, which will cause this case to get hit.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      539cb89f
    • J
      MIPS: KVM: Add FP exception handling · 1c0cd66a
      James Hogan 提交于
      Add guest exception handling for floating point exceptions and
      coprocessor 1 unusable exceptions.
      
      Floating point exceptions from the guest need passing to the guest
      kernel, so for these a guest FPE is emulated.
      
      Also, coprocessor 1 unusable exceptions are normally passed straight
      through to the guest (because no guest FPU was supported), but the
      hypervisor can now handle them if the guest has its FPU enabled by
      restoring the guest FPU context and enabling the FPU.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      1c0cd66a
    • J
      MIPS: KVM: Add base guest FPU support · 98e91b84
      James Hogan 提交于
      Add base code for supporting FPU in MIPS KVM guests. The FPU cannot yet
      be enabled in the guest, we're just laying the groundwork.
      
      Whether the guest's FPU context is loaded is stored in a bit in the
      fpu_inuse vcpu member. This allows the FPU to be disabled when the guest
      disables it, but keeping the FPU context loaded so it doesn't have to be
      reloaded if the guest re-enables it.
      
      An fpu_enabled vcpu member stores whether userland has enabled the FPU
      capability (which will be wired up in a later patch).
      
      New assembly code is added for saving and restoring the FPU context, and
      for saving/clearing and restoring FCSR (which can itself cause an FP
      exception depending on the value). The FCSR is restored before returning
      to the guest if the FPU is already enabled, and a die notifier is
      registered to catch the possible FP exception and step over the ctc1
      instruction.
      
      The helper function kvm_lose_fpu() is added to save FPU context and
      disable the FPU, which is used when saving hardware state before a
      context switch or KVM exit (the vcpu_get_regs() callback).
      
      The helper function kvm_own_fpu() is added to enable the FPU and restore
      the FPU context if it isn't already loaded, which will be used in a
      later patch when the guest attempts to use the FPU for the first time
      and triggers a co-processor unusable exception.
      
      The helper function kvm_drop_fpu() is added to discard the FPU context
      and disable the FPU, which will be used in a later patch when the FPU
      state will become architecturally UNPREDICTABLE (change of FR mode) to
      force a reload of [stale] context in the new FR mode.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      98e91b84
    • J
      MIPS: KVM: Add vcpu_get_regs/vcpu_set_regs callback · b86ecb37
      James Hogan 提交于
      Add a vcpu_get_regs() and vcpu_set_regs() callbacks for loading and
      restoring context which may be in hardware registers. This may include
      floating point and MIPS SIMD Architecture (MSA) state which may be
      accessed directly by the guest (but restored lazily by the hypervisor),
      and also dedicated guest registers as provided by the VZ ASE.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      b86ecb37
    • J
      MIPS: KVM: Add Config4/5 and writing of Config registers · c771607a
      James Hogan 提交于
      Add Config4 and Config5 co-processor 0 registers, and add capability to
      write the Config1, Config3, Config4, and Config5 registers using the KVM
      API.
      
      Only supported bits can be written, to minimise the chances of the guest
      being given a configuration from e.g. QEMU that is inconsistent with
      that being emulated, and as such the handling is in trap_emul.c as it
      may need to be different for VZ. Currently the only modification
      permitted is to make Config4 and Config5 exist via the M bits, but other
      bits will be added for FPU and MSA support in future patches.
      
      Care should be taken by userland not to change bits without fully
      handling the possible extra state that may then exist and which the
      guest may begin to use and depend on.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      c771607a
    • J
      MIPS: KVM: Simplify default guest Config registers · 2211ee81
      James Hogan 提交于
      Various semi-used definitions exist in kvm_host.h for the default guest
      config registers. Remove them and use the appropriate values directly
      when initialising the Config registers.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      2211ee81
    • J
      MIPS: KVM: Clean up register definitions a little · 7bd4acec
      James Hogan 提交于
      Clean up KVM_GET_ONE_REG / KVM_SET_ONE_REG register definitions for
      MIPS, to prepare for adding a new group for FPU & MSA vector registers.
      
      Definitions are added for common bits in each group of registers, e.g.
      KVM_REG_MIPS_CP0 = KVM_REG_MIPS | 0x10000, for the coprocessor 0
      registers.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      7bd4acec
    • J
      MIPS: KVM: Implement PRid CP0 register access · 1068eaaf
      James Hogan 提交于
      Implement access to the guest Processor Identification CP0 register
      using the KVM_GET_ONE_REG and KVM_SET_ONE_REG ioctls. This allows the
      owning process to modify and read back the value that is exposed to the
      guest in this register.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      1068eaaf
    • J
      MIPS: KVM: Handle TRAP exceptions from guest kernel · 0a560427
      James Hogan 提交于
      Trap instructions are used by Linux to implement BUG_ON(), however KVM
      doesn't pass trap exceptions on to the guest if they occur in guest
      kernel mode, instead triggering an internal error "Exception Code: 13,
      not yet handled". The guest kernel then doesn't get a chance to print
      the usual BUG message and stack trace.
      
      Implement handling of the trap exception so that it gets passed to the
      guest and the user is left with a more useful log message.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      0a560427
    • J
      MIPS: KVM: Handle MSA Disabled exceptions from guest · 98119ad5
      James Hogan 提交于
      Guest user mode can generate a guest MSA Disabled exception on an MSA
      capable core by simply trying to execute an MSA instruction. Since this
      exception is unknown to KVM it will be passed on to the guest kernel.
      However guest Linux kernels prior to v3.15 do not set up an exception
      handler for the MSA Disabled exception as they don't support any MSA
      capable cores. This results in a guest OS panic.
      
      Since an older processor ID may be being emulated, and MSA support is
      not advertised to the guest, the correct behaviour is to generate a
      Reserved Instruction exception in the guest kernel so it can send the
      guest process an illegal instruction signal (SIGILL), as would happen
      with a non-MSA-capable core.
      
      Fix this as minimally as reasonably possible by preventing
      kvm_mips_check_privilege() from relaying MSA Disabled exceptions from
      guest user mode to the guest kernel, and handling the MSA Disabled
      exception by emulating a Reserved Instruction exception in the guest,
      via a new handle_msa_disabled() KVM callback.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # v3.15+
      98119ad5
  11. 06 2月, 2015 1 次提交
    • P
      kvm: add halt_poll_ns module parameter · f7819512
      Paolo Bonzini 提交于
      This patch introduces a new module parameter for the KVM module; when it
      is present, KVM attempts a bit of polling on every HLT before scheduling
      itself out via kvm_vcpu_block.
      
      This parameter helps a lot for latency-bound workloads---in particular
      I tested it with O_DSYNC writes with a battery-backed disk in the host.
      In this case, writes are fast (because the data doesn't have to go all
      the way to the platters) but they cannot be merged by either the host or
      the guest.  KVM's performance here is usually around 30% of bare metal,
      or 50% if you use cache=directsync or cache=writethrough (these
      parameters avoid that the guest sends pointless flush requests, and
      at the same time they are not slow because of the battery-backed cache).
      The bad performance happens because on every halt the host CPU decides
      to halt itself too.  When the interrupt comes, the vCPU thread is then
      migrated to a new physical CPU, and in general the latency is horrible
      because the vCPU thread has to be scheduled back in.
      
      With this patch performance reaches 60-65% of bare metal and, more
      important, 99% of what you get if you use idle=poll in the guest.  This
      means that the tunable gets rid of this particular bottleneck, and more
      work can be done to improve performance in the kernel or QEMU.
      
      Of course there is some price to pay; every time an otherwise idle vCPUs
      is interrupted by an interrupt, it will poll unnecessarily and thus
      impose a little load on the host.  The above results were obtained with
      a mostly random value of the parameter (500000), and the load was around
      1.5-2.5% CPU usage on one of the host's core for each idle guest vCPU.
      
      The patch also adds a new stat, /sys/kernel/debug/kvm/halt_successful_poll,
      that can be used to tune the parameter.  It counts how many HLT
      instructions received an interrupt during the polling period; each
      successful poll avoids that Linux schedules the VCPU thread out and back
      in, and may also avoid a likely trip to C1 and back for the physical CPU.
      
      While the VM is idle, a Linux 4 VCPU VM halts around 10 times per second.
      Of these halts, almost all are failed polls.  During the benchmark,
      instead, basically all halts end within the polling period, except a more
      or less constant stream of 50 per second coming from vCPUs that are not
      running the benchmark.  The wasted time is thus very low.  Things may
      be slightly different for Windows VMs, which have a ~10 ms timer tick.
      
      The effect is also visible on Marcelo's recently-introduced latency
      test for the TSC deadline timer.  Though of course a non-RT kernel has
      awful latency bounds, the latency of the timer is around 8000-10000 clock
      cycles compared to 20000-120000 without setting halt_poll_ns.  For the TSC
      deadline timer, thus, the effect is both a smaller average latency and
      a smaller variance.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f7819512
  12. 29 8月, 2014 3 次提交
  13. 30 6月, 2014 1 次提交