1. 03 Nov, 2014 1 commit
  2. 24 Oct, 2014 1 commit
    • kvm: vfio: fix unregister kvm_device_ops of vfio · 571ee1b6
      Wanpeng Li committed
      After commit 80ce1639 (KVM: VFIO: register kvm_device_ops dynamically),
      kvm_device_ops of vfio can be registered dynamically. Commit 3c3c29fd
      (kvm-vfio: do not use module_init) moved the dynamic registration so that
      it is invoked by kvm_init, in order to fix broken unloading of the kvm
      module. However, kvm_device_ops of vfio is not unregistered when the
      kvm-intel module is removed, which leads to a device type collision
      detection warning when the kvm-intel module is inserted again.
      
          WARNING: CPU: 1 PID: 10358 at /root/cathy/kvm/arch/x86/kvm/../../../virt/kvm/kvm_main.c:3289 kvm_init+0x234/0x282 [kvm]()
          Modules linked in: kvm_intel(O+) kvm(O) nfsv3 nfs_acl auth_rpcgss oid_registry nfsv4 dns_resolver nfs fscache lockd sunrpc pci_stub bridge stp llc autofs4 8021q cpufreq_ondemand ipv6 joydev microcode pcspkr igb i2c_algo_bit ehci_pci ehci_hcd e1000e i2c_i801 ixgbe ptp pps_core hwmon mdio tpm_tis tpm ipmi_si ipmi_msghandler acpi_cpufreq isci libsas scsi_transport_sas button dm_mirror dm_region_hash dm_log dm_mod [last unloaded: kvm_intel]
          CPU: 1 PID: 10358 Comm: insmod Tainted: G        W  O   3.17.0-rc1 #2
          Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
           0000000000000cd9 ffff880ff08cfd18 ffffffff814a61d9 0000000000000cd9
           0000000000000000 ffff880ff08cfd58 ffffffff810417b7 ffff880ff08cfd48
           ffffffffa045bcac ffffffffa049c420 0000000000000040 00000000000000ff
          Call Trace:
           [<ffffffff814a61d9>] dump_stack+0x49/0x60
           [<ffffffff810417b7>] warn_slowpath_common+0x7c/0x96
           [<ffffffffa045bcac>] ? kvm_init+0x234/0x282 [kvm]
           [<ffffffff810417e6>] warn_slowpath_null+0x15/0x17
           [<ffffffffa045bcac>] kvm_init+0x234/0x282 [kvm]
           [<ffffffffa016e995>] vmx_init+0x1bf/0x42a [kvm_intel]
           [<ffffffffa016e7d6>] ? vmx_check_processor_compat+0x64/0x64 [kvm_intel]
           [<ffffffff810002ab>] do_one_initcall+0xe3/0x170
           [<ffffffff811168a9>] ? __vunmap+0xad/0xb8
           [<ffffffff8109c58f>] do_init_module+0x2b/0x174
           [<ffffffff8109d414>] load_module+0x43e/0x569
           [<ffffffff8109c6d8>] ? do_init_module+0x174/0x174
           [<ffffffff8109c75a>] ? copy_module_from_user+0x39/0x82
           [<ffffffff8109b7dd>] ? module_sect_show+0x20/0x20
           [<ffffffff8109d65f>] SyS_init_module+0x54/0x81
           [<ffffffff814a9a12>] system_call_fastpath+0x16/0x1b
          ---[ end trace 0626f4a3ddea56f3 ]---
      
      The bug can be reproduced by:
      
          rmmod kvm_intel.ko
          insmod kvm_intel.ko
      
      without removing and reinserting kvm.ko.

      This patch fixes the bug by unregistering kvm_device_ops of vfio when the
      kvm-intel module is removed.
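
      The collision described above can be modelled in user space. The sketch
      below is a simplified illustration, not the kernel code: the table, the
      type numbers, and the names register_device_ops, unregister_device_ops,
      module_load and module_unload are all hypothetical. It shows why a
      second load fails when the previous unload left the slot occupied, and
      why clearing the slot on unload avoids that.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a device-ops table; not the real KVM code. */
struct device_ops { const char *name; };

enum { DEV_TYPE_VFIO = 4, DEV_TYPE_MAX = 8 };   /* illustrative values */

static const struct device_ops *ops_table[DEV_TYPE_MAX];

/* Register ops for a type; a slot that is already taken is a collision. */
static int register_device_ops(const struct device_ops *ops, unsigned type)
{
    if (type >= DEV_TYPE_MAX)
        return -ENOSPC;
    if (ops_table[type] != NULL)
        return -EEXIST;
    ops_table[type] = ops;
    return 0;
}

/* Clear the slot so a later registration of the same type succeeds. */
static void unregister_device_ops(unsigned type)
{
    if (type < DEV_TYPE_MAX)
        ops_table[type] = NULL;
}

static const struct device_ops vfio_ops = { "kvm-vfio" };

/* Model of module load: registers the vfio ops and returns the result. */
static int module_load(void)
{
    return register_device_ops(&vfio_ops, DEV_TYPE_VFIO);
}

/* Model of module unload; 'fixed' selects whether the ops are unregistered. */
static void module_unload(int fixed)
{
    if (fixed)
        unregister_device_ops(DEV_TYPE_VFIO);
}
```

      Calling module_load(), then module_unload(0), then module_load() again
      returns -EEXIST on the second load in this model; with module_unload(1)
      the reload succeeds.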
      Reported-by: Liu Rongrong <rongrongx.liu@intel.com>
      Fixes: 3c3c29fd
      Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  3. 26 Sep, 2014 1 commit
  4. 24 Sep, 2014 6 commits
  5. 22 Sep, 2014 1 commit
  6. 17 Sep, 2014 4 commits
  7. 14 Sep, 2014 1 commit
  8. 05 Sep, 2014 3 commits
  9. 03 Sep, 2014 2 commits
    • kvm: fix potentially corrupt mmio cache · ee3d1570
      David Matlack committed
      vcpu exits and memslot mutations can run concurrently as long as the
      vcpu does not acquire the slots mutex. Thus it is theoretically possible
      for memslots to change underneath a vcpu that is handling an exit.
      
      If we increment the memslot generation number again after
      synchronize_srcu_expedited(), vcpus can safely cache memslot generation
      without maintaining a single rcu_dereference through an entire vm exit.
      And much of the x86/kvm code does not maintain a single rcu_dereference
      of the current memslots during each exit.
      
      We can prevent the following case:
      
         vcpu (CPU 0)                             | thread (CPU 1)
      --------------------------------------------+--------------------------
      1  vm exit                                  |
      2  srcu_read_unlock(&kvm->srcu)             |
      3  decide to cache something based on       |
           old memslots                           |
      4                                           | change memslots
                                                  | (increments generation)
      5                                           | synchronize_srcu(&kvm->srcu);
      6  retrieve generation # from new memslots  |
      7  tag cache with new memslot generation    |
      8  srcu_read_unlock(&kvm->srcu)             |
      ...                                         |
         <action based on cache occurs even       |
          though the caching decision was based   |
          on the old memslots>                    |
      ...                                         |
         <action *continues* to occur until next  |
          memslot generation change, which may    |
          be never>                               |
                                                  |
      
      By incrementing the generation after synchronizing with kvm->srcu readers,
      we ensure that the generation retrieved in (6) will become invalid soon
      after (8).
      
      Keeping the existing increment is not strictly necessary, but we
      do keep it and just move it for consistency from update_memslots to
      install_new_memslots.  It invalidates old cached MMIOs immediately,
      instead of having to wait for the end of synchronize_srcu_expedited,
      which makes the code more clearly correct in case CPU 1 is preempted
      right after synchronize_srcu() returns.
      
      To avoid halving the generation space in SPTEs, always presume that the
      low bit of the generation is zero when reconstructing a generation number
      out of an SPTE.  This effectively disables MMIO caching in SPTEs during
      the call to synchronize_srcu_expedited.  Using the low bit this way is
      somewhat like a seqcount---where the protected thing is a cache, and
      instead of retrying we can simply punt if we observe the low bit to be 1.
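
      The low-bit scheme can be sketched in user space as follows. This is a
      simplified model under stated assumptions, not the kernel
      implementation: the single global counter and the names mmio_cache,
      update_begin, update_end, cache_fill and cache_hit are hypothetical,
      and update_end stands for the second increment that happens after the
      wait for readers. An odd generation means an update is in progress, and
      a tag rebuilt with a zero low bit can never match it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: even generation = stable, odd = update in progress. */
static uint64_t generation;

struct mmio_cache {
    uint64_t gen_tag;   /* generation stored with the low bit dropped */
    bool valid;
};

/* Writer, first increment: new memslots are published, update in progress. */
static void update_begin(void)
{
    generation += 1;
}

/* Writer, second increment: done after waiting for readers to finish. */
static void update_end(void)
{
    generation += 1;
}

/* Tag the cache with the current generation; the low bit is not stored. */
static void cache_fill(struct mmio_cache *c)
{
    c->gen_tag = generation >> 1;
    c->valid = true;
}

/*
 * Rebuild the generation assuming a zero low bit and compare it with the
 * current one. While the current generation is odd, nothing can match, so
 * the reader simply treats the cache as a miss instead of retrying.
 */
static bool cache_hit(const struct mmio_cache *c)
{
    uint64_t reconstructed = c->gen_tag << 1;
    return c->valid && reconstructed == generation;
}
```

      In this model an entry filled while the generation is odd is tagged
      with the previous even value, so it is already stale once update_end
      has run. Only the generation bits above the low bit need to be stored,
      which is how the scheme avoids halving the generation space.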
      
      Cc: stable@vger.kernel.org
      Signed-off-by: David Matlack <dmatlack@google.com>
      Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Reviewed-by: David Matlack <dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: do not bias the generation number in kvm_current_mmio_generation · 00f034a1
      Paolo Bonzini committed
      The next patch will give a meaning (a la seqcount) to the low bit of the
      generation number.  Ensure that it matches between kvm->memslots->generation
      and kvm_current_mmio_generation().
      
      Cc: stable@vger.kernel.org
      Reviewed-by: David Matlack <dmatlack@google.com>
      Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  10. 29 Aug, 2014 2 commits
  11. 28 Aug, 2014 1 commit
  12. 22 Aug, 2014 1 commit
  13. 21 Aug, 2014 1 commit
  14. 06 Aug, 2014 1 commit
    • KVM: Move more code under CONFIG_HAVE_KVM_IRQFD · c77dcacb
      Paolo Bonzini committed
      Commit e4d57e1e (KVM: Move irq notifier implementation into
      eventfd.c, 2014-06-30) included the irq notifier code unconditionally
      in eventfd.c, while it was under CONFIG_HAVE_KVM_IRQCHIP before.
      
      Similarly, commit 297e2105 (KVM: Give IRQFD its own separate enabling
      Kconfig option, 2014-06-30) moved code from CONFIG_HAVE_IRQ_ROUTING
      to CONFIG_HAVE_KVM_IRQFD but forgot to move the pieces that used to be
      under CONFIG_HAVE_KVM_IRQCHIP.
      
      Together, this broke compilation without CONFIG_KVM_XICS.  Fix by adding
      or changing the #ifdefs so that they point at CONFIG_HAVE_KVM_IRQFD.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  15. 05 Aug, 2014 1 commit
  16. 28 Jul, 2014 2 commits
  17. 05 Jun, 2014 1 commit
  18. 03 Jun, 2014 1 commit
  19. 05 May, 2014 1 commit
    • kvm/irqchip: Speed up KVM_SET_GSI_ROUTING · 719d93cd
      Christian Borntraeger committed
      When starting lots of dataplane devices, bootup takes very long on
      Christian's s390 with irqfd patches. With larger setups he is even
      able to trigger timeouts in some components.  It turns out that the
      KVM_SET_GSI_ROUTING ioctl takes very long (strace claims up to 0.1 sec)
      with multiple CPUs.  This is caused by synchronize_rcu and
      the HZ=100 of s390.  By changing the code to use a private srcu we can
      speed things up.  This patch reduces the boot time till mounting root
      from 8 to 2 seconds on my s390 guest with 100 disks.
      
      Uses of hlist_for_each_entry_rcu, hlist_add_head_rcu, hlist_del_init_rcu
      are fine because they do not have lockdep checks (hlist_for_each_entry_rcu
      uses rcu_dereference_raw rather than rcu_dereference, and write-sides
      do not do rcu lockdep at all).
      
      Note that we're hardly relying on the "sleepable" part of srcu.  We just
      want SRCU's faster detection of grace periods.
      
      Testing was done by Andrew Theurer using netperf tests STREAM, MAERTS
      and RR.  The difference between results "before" and "after" the patch
      has mean -0.2% and standard deviation 0.6%.  Using a paired t-test on the
      data points says that there is a 2.5% probability that the patch is the
      cause of the performance difference (rather than a random fluctuation).
      
      (Restricting the t-test to RR, which is the most likely to be affected,
      changes the numbers to respectively -0.3% mean, 0.7% stdev, and 8%
      probability that the numbers actually say something about the patch.
      The probability increases mostly because there are fewer data points).
      
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> # s390
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  20. 24 Apr, 2014 1 commit
  21. 22 Apr, 2014 1 commit
  22. 18 Apr, 2014 1 commit
    • KVM: VMX: speed up wildcard MMIO EVENTFD · 68c3b4d1
      Michael S. Tsirkin committed
      With KVM, MMIO is much slower than PIO, due to the need to
      do a page walk and emulation. But with EPT, it does not have to be: we
      know the address from the VMCS, so if the address is unique, we can look
      up the eventfd directly, bypassing emulation.
      
      Unfortunately, this only works if userspace does not need to match on
      access length and data.  The implementation adds a separate FAST_MMIO
      bus internally. This serves two purposes:
          - minimize overhead for old userspace that does not use eventfd with length = 0
          - minimize disruption in other code (since we don't know the length,
            devices on the MMIO bus only get a valid address in write, this
            way we don't need to touch all devices to teach them to handle
            an invalid length)
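
      The idea of a separate, address-only bus can be sketched in user space
      as follows. This is an illustrative model, not the KVM implementation:
      fast_entry, fast_bus_register and fast_bus_lookup are hypothetical
      names, the eventfd is represented by a plain integer id, and the sketch
      assumes each address is registered at most once. A miss on the fast bus
      means the access falls back to the normal (slower) emulation path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of a "fast" bus keyed by guest-physical address only. */
struct fast_entry {
    uint64_t addr;
    int eventfd_id;     /* stand-in for the eventfd to signal */
};

#define FAST_BUS_MAX 16
static struct fast_entry fast_bus[FAST_BUS_MAX];
static size_t fast_bus_len;

/* Order entries by address so that bsearch can be used for lookups. */
static int cmp_entry(const void *a, const void *b)
{
    const struct fast_entry *x = a, *y = b;
    return (x->addr > y->addr) - (x->addr < y->addr);
}

/* Register a wildcard-length (length 0) eventfd and keep the bus sorted. */
static int fast_bus_register(uint64_t addr, int eventfd_id)
{
    if (fast_bus_len == FAST_BUS_MAX)
        return -1;
    fast_bus[fast_bus_len++] = (struct fast_entry){ addr, eventfd_id };
    qsort(fast_bus, fast_bus_len, sizeof(fast_bus[0]), cmp_entry);
    return 0;
}

/* Return the eventfd id for addr, or -1 to fall back to full emulation. */
static int fast_bus_lookup(uint64_t addr)
{
    struct fast_entry key = { addr, 0 };
    struct fast_entry *e = bsearch(&key, fast_bus, fast_bus_len,
                                   sizeof(fast_bus[0]), cmp_entry);
    return e ? e->eventfd_id : -1;
}
```

      Because the lookup matches on the address alone, it needs neither the
      access length nor the written data, which is why only eventfds
      registered with length 0 can live on such a bus.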
      
      At the moment, this optimization only has effect for EPT on x86.
      
      It will be possible to speed up MMIO for NPT and MMU using the same
      idea in the future.
      
      With this patch applied, on VMX MMIO EVENTFD is essentially as fast as PIO.
      I was unable to detect any measurable slowdown to non-eventfd MMIO.
      
      Making MMIO faster is important for the upcoming virtio 1.0 which
      includes an MMIO signalling capability.
      
      The idea was suggested by Peter Anvin.  Lots of thanks to Gleb for
      pre-review and suggestions.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
  23. 27 Feb, 2014 1 commit
  24. 18 Feb, 2014 1 commit
  25. 14 Feb, 2014 1 commit
  26. 30 Jan, 2014 1 commit
  27. 15 Jan, 2014 1 commit