1. 17 May, 2012 1 commit
    • KVM: MMU: Don't use RCU for lockless shadow walking · c142786c
      Committed by Avi Kivity
      Using RCU for lockless shadow walking can increase the amount of memory
      in use by the system, since RCU grace periods are unpredictable.  We also
      have an unconditional write to a shared variable (reader_counter), which
      isn't good for scaling.
      
      Replace that with a scheme similar to x86's get_user_pages_fast(): disable
      interrupts during lockless shadow walk to force the freer
      (kvm_mmu_commit_zap_page()) to wait for the TLB flush IPI to find the
      processor with interrupts enabled.
      
      We also add a new vcpu->mode, READING_SHADOW_PAGE_TABLES, to prevent
      kvm_flush_remote_tlbs() from avoiding the IPI.
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
  2. 01 May, 2012 1 commit
  3. 24 Apr, 2012 1 commit
  4. 20 Apr, 2012 1 commit
    • KVM: Fix page-crossing MMIO · f78146b0
      Committed by Avi Kivity
      MMIOs that are split across a page boundary are currently broken - the
      code does not expect to be aborted by the exit to userspace for the
      first MMIO fragment.
      
      This patch fixes the problem by generalizing the current code for handling
      16-byte MMIOs to handle a number of "fragments", and changes the MMIO
      code to create those fragments.
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
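The fragment idea described in this commit can be illustrated with a small user-space sketch. It is not the kernel's code: `struct mmio_fragment`, `split_mmio`, `PAGE_SIZE` and `MAX_FRAGMENTS` are hypothetical names chosen for the example, and a 4096-byte page is assumed. It only shows how one access that crosses a page boundary could be cut into per-page pieces that are then handled one at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE     4096u  /* assumed page size for this sketch */
#define MAX_FRAGMENTS 2      /* an access of at most PAGE_SIZE bytes crosses one boundary */

/* Hypothetical fragment descriptor; not the kernel's actual structure. */
struct mmio_fragment {
    uint64_t gpa;  /* guest physical address where the fragment starts */
    size_t   len;  /* number of bytes in the fragment */
};

/*
 * Split the access [gpa, gpa + len) at page boundaries.
 * Returns the number of fragments written to out[].
 */
int split_mmio(uint64_t gpa, size_t len, struct mmio_fragment out[MAX_FRAGMENTS])
{
    int n = 0;

    assert(len > 0 && len <= PAGE_SIZE);
    while (len > 0) {
        /* Bytes left before the end of the current page. */
        size_t room = PAGE_SIZE - (size_t)(gpa & (PAGE_SIZE - 1));
        size_t now = len < room ? len : room;

        assert(n < MAX_FRAGMENTS);
        out[n].gpa = gpa;
        out[n].len = now;
        n++;

        gpa += now;
        len -= now;
    }
    return n;
}
```

For instance, an 8-byte access at 0x1ffc would come out as two 4-byte fragments, one ending the first page and one starting the next, so each can complete (including an exit to userspace) before the following one begins.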
  5. 12 Apr, 2012 1 commit
    • KVM: unmap pages from the iommu when slots are removed · 32f6daad
      Committed by Alex Williamson
      We've been adding new mappings, but not destroying old mappings.
      This can lead to a page leak as pages are pinned using
      get_user_pages, but only unpinned with put_page if they still
      exist in the memslots list on vm shutdown.  A memslot that is
      destroyed while an iommu domain is enabled for the guest will
      therefore result in an elevated page reference count that is
      never cleared.
      
      Additionally, without this fix, the iommu is only programmed
      with the first translation for a gpa.  This can result in
      peer-to-peer errors if a mapping is destroyed and replaced by a
      new mapping at the same gpa as the iommu will still be pointing
      to the original, pinned memory address.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
  6. 08 Apr, 2012 5 commits
  7. 20 Mar, 2012 1 commit
  8. 08 Mar, 2012 4 commits
  9. 05 Mar, 2012 5 commits
  10. 27 Dec, 2011 11 commits
  11. 26 Sep, 2011 4 commits
    • KVM: Fix simultaneous NMIs · 7460fb4a
      Committed by Avi Kivity
      If simultaneous NMIs happen, we're supposed to queue the second
      and next (collapsing them), but currently we sometimes collapse
      the second into the first.
      
      Fix by using a counter for pending NMIs instead of a bool; since
      the counter limit depends on whether the processor is currently
      in an NMI handler, which can only be checked in vcpu context
      (via the NMI mask), we add a new KVM_REQ_NMI to request recalculation
      of the counter.
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
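The counter scheme in this commit can be sketched in user space as follows. This is a simplified illustration, not the kernel's code: `struct vcpu_nmi`, `raise_nmi` and `process_nmi` are hypothetical names, and the concrete limits (1 while an NMI handler is running, 2 otherwise) are assumptions of the sketch. The point is that NMIs can be raised from any context by bumping an atomic counter, while the clamp to the limit happens only in vcpu context, where the NMI mask can be read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-vcpu NMI state for this sketch. */
struct vcpu_nmi {
    atomic_uint queued;     /* NMIs raised from any context, not yet accounted */
    unsigned    pending;    /* NMIs awaiting injection; vcpu context only */
    bool        nmi_masked; /* true while the guest is inside an NMI handler */
};

/* May be called from any context; real code would also request recalculation. */
void raise_nmi(struct vcpu_nmi *v)
{
    assert(v);
    atomic_fetch_add(&v->queued, 1);
}

/* Runs in vcpu context: fold queued NMIs into pending and collapse the excess. */
void process_nmi(struct vcpu_nmi *v)
{
    /* Assumed limits: one NMI may wait while a handler runs, two otherwise. */
    unsigned limit = v->nmi_masked ? 1 : 2;

    assert(v);
    v->pending += atomic_exchange(&v->queued, 0);
    if (v->pending > limit)
        v->pending = limit;
}
```

With a plain bool, a second NMI arriving before the first was injected would be lost; with the counter, it survives up to the limit and anything beyond that is collapsed.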
    • KVM: Clean up and extend rate-limited output · bd80158a
      Committed by Jan Kiszka
      The use of printk_ratelimit is discouraged; replace it with
      pr*_ratelimited or __ratelimit. While at it, convert remaining
      guest-triggerable printks to rate-limited variants.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • KVM: Intelligent device lookup on I/O bus · 743eeb0b
      Committed by Sasha Levin
      Currently the method of dealing with an IO operation on a bus (PIO/MMIO)
      is to call the read or write callback for each device registered
      on the bus until we find a device which handles it.
      
      Since the number of devices on a bus can be significant due to ioeventfds
      and coalesced MMIO zones, this leads to a lot of overhead on each IO
      operation.
      
      Instead of registering devices, we now register ranges which point to
      a device. Lookup is done using an efficient bsearch instead of a linear
      search.
      
      A performance test compared the exit count per second with 200
      ioeventfds created on one byte while the guest continuously accessed a
      different byte (triggering usermode exits).
      Before the patch the guest achieved 259k exits per second; after the
      patch it achieves 274k exits per second.
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
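The range-based lookup described in this commit can be sketched with the C library's `qsort` and `bsearch`. This is a user-space illustration under stated assumptions, not the kernel's implementation: `struct io_range`, `range_cmp`, `bus_sort` and `bus_lookup` are hypothetical names, devices are represented by a plain integer id, and the registered ranges are assumed not to overlap.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bus entry: an address range that points at a device. */
struct io_range {
    uint64_t addr;    /* first address covered by the range */
    uint64_t len;     /* number of bytes covered */
    int      dev_id;  /* stand-in for a device pointer */
};

/* Orders disjoint ranges; two ranges that overlap compare as equal. */
int range_cmp(const void *pa, const void *pb)
{
    const struct io_range *a = pa, *b = pb;

    if (a->addr + a->len <= b->addr)
        return -1;
    if (a->addr >= b->addr + b->len)
        return 1;
    return 0;
}

/* Keep the table sorted so that lookups can use binary search. */
void bus_sort(struct io_range *ranges, size_t n)
{
    assert(ranges != NULL || n == 0);
    qsort(ranges, n, sizeof(*ranges), range_cmp);
}

/* Return the range containing addr, or NULL if no device claims it. */
const struct io_range *bus_lookup(const struct io_range *ranges, size_t n,
                                  uint64_t addr)
{
    struct io_range key = { .addr = addr, .len = 1, .dev_id = -1 };

    assert(ranges != NULL || n == 0);
    return bsearch(&key, ranges, n, sizeof(*ranges), range_cmp);
}
```

Because the key is itself a one-byte range, the same comparator serves both sorting and searching, and a lookup costs O(log n) comparisons instead of one callback per registered device.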
    • KVM: Make coalesced mmio use a device per zone · 2b3c246a
      Committed by Sasha Levin
      This patch changes coalesced mmio to create one mmio device per
      zone instead of handling all zones in one device.
      
      Doing so enables us to take advantage of existing locking and prevents
      a race condition between coalesced mmio registration/unregistration
      and lookups.
      Suggested-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      2b3c246a
  12. 24 Jul, 2011 1 commit
  13. 14 Jul, 2011 1 commit
    • KVM: Steal time implementation · c9aaa895
      Committed by Glauber Costa
      To implement steal time, we need the hypervisor to pass the guest
      information about how much time was spent running other processes
      outside the VM, while the vcpu had meaningful work to do - halt
      time does not count.
      
      This information is acquired through the run_delay field of the
      delayacct/schedstats infrastructure, which counts time spent in a
      runqueue but not running.
      
      Steal time is per-cpu information, so the traditional MSR-based
      infrastructure is used. A new msr, KVM_MSR_STEAL_TIME, holds the
      memory area address containing information about steal time.
      
      This patch contains the hypervisor part of the steal time infrastructure,
      and can be backported independently of the guest portion.
      
      [avi, yongjie: export delayacct_on, to avoid build failures in some configs]
      Signed-off-by: Glauber Costa <glommer@redhat.com>
      Tested-by: Eric B Munson <emunson@mgebm.net>
      CC: Rik van Riel <riel@redhat.com>
      CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      CC: Peter Zijlstra <peterz@infradead.org>
      CC: Anthony Liguori <aliguori@us.ibm.com>
      Signed-off-by: Yongjie Ren <yongjie.ren@intel.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
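The accounting described in this commit amounts to tracking how much a monotonically growing run_delay value has advanced since it was last sampled. A minimal user-space sketch of that bookkeeping follows; `struct steal_state`, `steal_init` and `account_steal` are hypothetical names, and the run_delay samples are passed in as plain nanosecond values rather than read from the scheduler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-vcpu bookkeeping for this sketch. */
struct steal_state {
    uint64_t last_run_delay; /* run_delay sample taken at the previous update */
    uint64_t steal_ns;       /* accumulated steal time reported to the guest */
};

/* Record the baseline so that earlier runqueue waiting is not counted. */
void steal_init(struct steal_state *s, uint64_t run_delay_now)
{
    s->last_run_delay = run_delay_now;
    s->steal_ns = 0;
}

/* Add the time spent runnable-but-not-running since the previous sample. */
void account_steal(struct steal_state *s, uint64_t run_delay_now)
{
    /* run_delay is assumed to be monotonically non-decreasing. */
    assert(run_delay_now >= s->last_run_delay);
    s->steal_ns += run_delay_now - s->last_run_delay;
    s->last_run_delay = run_delay_now;
}
```

In this sketch the accumulated `steal_ns` plays the role of the value the hypervisor would publish in the per-cpu memory area whose address the guest registers through the MSR.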
  14. 12 Jul, 2011 1 commit
  15. 22 May, 2011 2 commits