- 01 Nov, 2011 2 commits
-
-
Committed by Paul Gortmaker
This file has things like module_param_named() and MODULE_PARM_DESC(), so it needs the full module.h header present. Without it, you'll get:

  CC arch/x86/kvm/../../../virt/kvm/iommu.o
  virt/kvm/iommu.c:37: error: expected ‘)’ before ‘bool’
  virt/kvm/iommu.c:39: error: expected ‘)’ before string constant
  make[3]: *** [arch/x86/kvm/../../../virt/kvm/iommu.o] Error 1
  make[2]: *** [arch/x86/kvm] Error 2

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
-
Committed by Paul Gortmaker
This was coming in via an implicit module.h (and its sub-includes) before, but we'll be cleaning that up shortly. Call out the stat.h include requirement in advance.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
-
- 21 Oct, 2011 2 commits
-
-
Committed by Joerg Roedel
With per-bus iommu_ops the iommu_found function needs to work on a bus_type too. This patch adds a bus_type parameter to that function and converts all call sites. The function is also renamed to iommu_present because it now checks whether an iommu is present for a given bus and no longer checks for a global iommu.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
Committed by Joerg Roedel
This is necessary to store a pointer to the bus-specific iommu_ops in the iommu-domain structure. It will be used later to call into bus-specific iommu-ops.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
- 26 Sep, 2011 6 commits
-
-
Committed by Jan Kiszka
The threaded IRQ handler for MSI-X has almost nothing in common with the INTx/MSI handler. Move its code into a dedicated handler.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Jan Kiszka
We only perform work in kvm_assigned_dev_ack_irq if the guest IRQ is of INTx type. This patch completely avoids the callback invocation in non-INTx cases by registering the IRQ ack notifier only for INTx.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Jan Kiszka
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Sasha Levin
Currently the method of dealing with an IO operation on a bus (PIO/MMIO) is to call the read or write callback for each device registered on the bus until we find a device which handles it. Since the number of devices on a bus can be significant due to ioeventfds and coalesced MMIO zones, this leads to a lot of overhead on each IO operation.

Instead of registering devices, we now register ranges which point to a device. Lookup is done using an efficient bsearch instead of a linear search.

A performance test was conducted by comparing the exit count per second with 200 ioeventfds created on one byte while the guest continuously tries to access a different byte (triggering usermode exits). Before the patch the guest achieved 259k exits per second; after the patch the guest does 274k exits per second.

Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
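The range lookup described in this commit can be illustrated with a small userspace sketch. It is not the kernel code: the struct names, the `io_bus_find()` helper, and the demo table are hypothetical, and it assumes the registered ranges are sorted by address and do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device registered on an I/O bus. */
struct io_device {
    const char *name;
};

/* One registered range: [addr, addr + len) is handled by dev. */
struct io_range {
    uint64_t addr;
    uint32_t len;
    struct io_device *dev;
};

/*
 * Comparator for bsearch(). The key describes the access being looked up;
 * it compares equal to an element when the access lies fully inside it.
 */
static int io_range_cmp(const void *key, const void *elem)
{
    const struct io_range *k = key;
    const struct io_range *r = elem;

    if (k->addr < r->addr)
        return -1;
    if (k->addr + k->len > r->addr + r->len)
        return 1;
    return 0;
}

/* Find the device handling [addr, addr + len), or NULL if none does. */
static struct io_device *io_bus_find(const struct io_range *ranges, size_t n,
                                     uint64_t addr, uint32_t len)
{
    struct io_range key = { .addr = addr, .len = len, .dev = NULL };
    const struct io_range *hit;

    assert(ranges != NULL || n == 0);
    hit = bsearch(&key, ranges, n, sizeof(*ranges), io_range_cmp);
    return hit ? hit->dev : NULL;
}

/* Demo table (assumed values), sorted by start address. */
static struct io_device demo_rtc = { "rtc" };
static struct io_device demo_serial = { "serial" };
static const struct io_range demo_ranges[] = {
    { 0x70, 2, &demo_rtc },
    { 0x3f8, 8, &demo_serial },
};
```

With this table, `io_bus_find(demo_ranges, 2, 0x3f9, 1)` resolves to the serial device in O(log n) comparisons, while an address that no range covers yields NULL without invoking any device callback.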
-
Committed by Sasha Levin
This patch changes coalesced mmio to create one mmio device per zone instead of handling all zones in one device. Doing so enables us to take advantage of existing locking and prevents a race condition between coalesced mmio registration/unregistration and lookups.
Suggested-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Sasha Levin
Move the check for available entries to within the spinlock. This allows working with a larger number of VCPUs and reduces premature exits when using a large number of VCPUs.
Cc: Avi Kivity <avi@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 24 Sep, 2011 1 commit
-
-
Committed by Greg Rose
Device drivers that create and destroy SR-IOV virtual functions via calls to pci_enable_sriov() and pci_disable_sriov() can cause catastrophic failures if they attempt to destroy VFs while they are assigned to guest virtual machines. By adding a flag for use by the KVM module to indicate that a device is assigned, a device driver can check that flag, avoid destroying VFs while they are assigned, and so avoid system failures.
CC: Ian Campbell <ijc@hellion.org.uk>
CC: Konrad Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 24 Jul, 2011 3 commits
-
-
Committed by Alex Williamson
IOMMU interrupt remapping support provides a further layer of isolation for device assignment by preventing arbitrary interrupt block DMA writes by a malicious guest from reaching the host. By default, we should require that the platform provides interrupt remapping support, with an opt-in mechanism for existing behavior. Both AMD IOMMU and Intel VT-d2 hardware support interrupt remapping; however, we currently only have software support on the Intel side. Users wishing to re-enable device assignment when interrupt remapping is not supported on the platform can use the "allow_unsafe_assigned_interrupts=1" module option.
[avi: break long lines]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
The idea is from Avi:

| We could cache the result of a miss in an spte by using a reserved bit, and
| checking the page fault error code (or seeing if we get an ept violation or
| ept misconfiguration), so if we get repeated mmio on a page, we don't need to
| search the slot list/tree.
| (https://lkml.org/lkml/2011/2/22/221)

When the page fault is caused by mmio, we cache the info in the shadow page table and also set the reserved bits in the shadow page table, so if the mmio occurs again, we can quickly identify it and emulate it directly.

Searching for an mmio gfn in memslots is heavy since we need to walk all memslots. This feature reduces that cost and also avoids walking the guest page table for soft mmu.

[jan: fix operator precedence issue]
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
If the page fault is caused by mmio, the gfn cannot be found in memslots and 'bad_pfn' is returned on the gfn_to_hva path, so we can use 'bad_pfn' to identify the mmio page fault. To clarify the meaning of an mmio pfn, we return the fault page instead of the bad page when the gfn is not allowed to be prefetched.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
- 12 Jul, 2011 5 commits
-
-
Committed by Gleb Natapov
Introduce the kvm_read_guest_cached() function in addition to the write one we already have.
[ by glauber: export function signature in kvm header ]
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Glauber Costa <glommer@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Tested-by: Eric Munson <emunson@mgebm.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Jan Kiszka
KVM_MAX_MSIX_PER_DEV implies that up to that many MSI-X entries can be requested. But so far the kernel rejected a request for exactly that upper limit.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
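The off-by-one described here can be shown with a minimal sketch. The function names and the limit value of 256 are assumptions for illustration only, not the kernel's actual code; the point is the difference between an exclusive and an inclusive upper bound.

```c
#include <assert.h>
#include <errno.h>

/* Assumed limit for illustration; stands in for KVM_MAX_MSIX_PER_DEV. */
#define MAX_MSIX_PER_DEV 256u

static_assert(MAX_MSIX_PER_DEV > 0, "limit must allow at least one entry");

/* Old behaviour (sketch): a request for exactly the maximum is rejected. */
static int check_msix_entries_old(unsigned int nr)
{
    if (nr == 0 || nr >= MAX_MSIX_PER_DEV)
        return -EINVAL;
    return 0;
}

/* Fixed behaviour (sketch): the maximum itself is a valid request. */
static int check_msix_entries_fixed(unsigned int nr)
{
    if (nr == 0 || nr > MAX_MSIX_PER_DEV)
        return -EINVAL;
    return 0;
}
```

Under these assumptions, a request for 256 entries fails the old check but passes the fixed one, while 0 and 257 are rejected by both.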
-
Committed by Alexander Graf
KVM has an ioctl to define which signal mask should be used while running inside VCPU_RUN. At least for big endian systems, this mask is different on 32-bit and 64-bit systems (though the size is identical). Add a compat wrapper that converts the mask to whatever the kernel accepts, allowing 32-bit kvm user space to set signal masks. This patch fixes qemu with --enable-io-thread on ppc64 hosts when running 32-bit user land.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Jan Kiszka
So far kvm_arch_vcpu_setup is responsible for freeing the vcpu struct if it fails. Move this confusing responsibility back into the hands of kvm_vm_ioctl_create_vcpu. Only kvm_arch_vcpu_setup of x86 is affected; on all other archs it cannot fail.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
Simply use __copy_to_user/__clear_user to write to the guest page, since we have already verified the user address when the memslot is set.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 06 Jun, 2011 1 commit
-
-
Committed by Mike Waychison
It doesn't make sense to ever see a half-initialized kvm structure on mmu notifier callbacks. Previously, 85722cda changed the ordering to ensure that the mmu_lock was initialized before mmu notifier registration, but there is still a race where the mmu notifier could come in and try accessing other portions of struct kvm before they are initialized. Solve this by moving the mmu notifier registration to occur after the structure is completely initialized.
Google-Bug-Id: 452199
Signed-off-by: Mike Waychison <mikew@google.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
- 26 May, 2011 1 commit
-
-
Committed by Heiko Carstens
fa3d315a "KVM: Validate userspace_addr of memslot when registered" introduced this new warning on s390:

  kvm_main.c: In function '__kvm_set_memory_region':
  kvm_main.c:654:7: warning: passing argument 1 of '__access_ok' makes pointer from integer without a cast
  arch/s390/include/asm/uaccess.h:53:19: note: expected 'const void *' but argument is of type '__u64'

Add the missing cast to get rid of it again...
Cc: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
- 22 May, 2011 4 commits
-
-
Committed by OGAWA Hirofumi
As the following shows, mmu_notifier can be called immediately after registering. So kvm has to initialize kvm->mmu_lock before it.

  BUG: spinlock bad magic on CPU#0, kswapd0/342
   lock: ffff8800af8c4000, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
  Pid: 342, comm: kswapd0 Not tainted 2.6.39-rc5+ #1
  Call Trace:
   [<ffffffff8118ce61>] spin_bug+0x9c/0xa3
   [<ffffffff8118ce91>] do_raw_spin_lock+0x29/0x13c
   [<ffffffff81024923>] ? flush_tlb_others_ipi+0xaf/0xfd
   [<ffffffff812e22f3>] _raw_spin_lock+0x9/0xb
   [<ffffffffa0582325>] kvm_mmu_notifier_clear_flush_young+0x2c/0x66 [kvm]
   [<ffffffff810d3ff3>] __mmu_notifier_clear_flush_young+0x2b/0x57
   [<ffffffff810c8761>] page_referenced_one+0x88/0xea
   [<ffffffff810c89bf>] page_referenced+0x1fc/0x256
   [<ffffffff810b2771>] shrink_page_list+0x187/0x53a
   [<ffffffff810b2ed7>] shrink_inactive_list+0x1e0/0x33d
   [<ffffffff810acf95>] ? determine_dirtyable_memory+0x15/0x27
   [<ffffffff812e90ee>] ? call_function_single_interrupt+0xe/0x20
   [<ffffffff810b3356>] shrink_zone+0x322/0x3de
   [<ffffffff810a9587>] ? zone_watermark_ok_safe+0xe2/0xf1
   [<ffffffff810b3928>] kswapd+0x516/0x818
   [<ffffffff810b3412>] ? shrink_zone+0x3de/0x3de
   [<ffffffff81053d17>] kthread+0x7d/0x85
   [<ffffffff812e9394>] kernel_thread_helper+0x4/0x10
   [<ffffffff81053c9a>] ? __init_kthread_worker+0x37/0x37
   [<ffffffff812e9390>] ? gs_change+0xb/0xb

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Takuya Yoshikawa
This way, we can avoid checking the user space address many times when we read the guest memory. Although we could do the same for writes if we checked which slots are writable, we do not care about writes for now: reading the guest memory happens more often than writing.
[avi: change VERIFY_READ to VERIFY_WRITE]
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Liu Yuan
The ioapic_debug() call in ioapic_deliver() misnames one field it references. This patch corrects it.
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Alex Williamson
Store the device saved state so that we can reload the device back to the original state when it's unassigned. This has the benefit that the state survives across pci_reset_function() calls via the PCI sysfs reset interface while the VM is using the device.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
-
- 11 May, 2011 1 commit
-
-
Committed by Xiao Guangrong
We can get the memslot id from memslot->id directly.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
- 06 Apr, 2011 2 commits
-
-
Committed by Gleb Natapov
If asynchronous hva_to_pfn() is requested, call GUP with FOLL_NOWAIT to avoid sleeping on IO. The check for hwpoison is done at the same time; otherwise check_user_page_hwpoison() would call GUP again and put the vcpu to sleep.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Michael S. Tsirkin
irqfd in kvm used flush_work incorrectly: it assumed that work scheduled previously can't run after flush_work, but since kvm uses a non-reentrant workqueue (by means of schedule_work) we need flush_work_sync to get that guarantee.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Jean-Philippe Menil <jean-philippe.menil@univ-nantes.fr>
Tested-by: Jean-Philippe Menil <jean-philippe.menil@univ-nantes.fr>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
- 31 Mar, 2011 1 commit
-
-
Committed by Lucas De Marchi
Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 24 Mar, 2011 3 commits
-
-
Committed by Akinobu Mita
As preparation for removing ext2 non-atomic bit operations from asm/bitops.h, this converts ext2 non-atomic bit operations to little-endian bit operations.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Akinobu Mita
asm-generic/bitops/le.h is only intended to be included directly from asm-generic/bitops/ext2-non-atomic.h or asm-generic/bitops/minix-le.h, which implement generic ext2 or minix bit operations. This stops including asm-generic/bitops/le.h directly and uses ext2 non-atomic bit operations instead.

It seems odd to use ext2_set_bit() on kvm, but it will be replaced with __set_bit_le() after introducing little-endian bit operations for all architectures. This indirect step is necessary to maintain bisectability for some architectures which have their own little-endian bit operations.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Rafael J. Wysocki
KVM uses a sysdev class and a sysdev for executing kvm_suspend() after interrupts have been turned off on the boot CPU (during system suspend) and for executing kvm_resume() before turning on interrupts on the boot CPU (during system resume). However, since both of these functions ignore their arguments, the entire mechanism may be replaced with a struct syscore_ops object, which is simpler.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Avi Kivity <avi@redhat.com>
-
- 18 Mar, 2011 8 commits
-
-
Committed by Michael S. Tsirkin
The RCU use in kvm_irqfd_deassign is tricky: we have rcu_assign_pointer but no synchronize_rcu; synchronize_rcu is done by kvm_irq_routing_update, which we share a spinlock with. Fix up a comment in an attempt to make this clearer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Jan Kiszka
Code under this lock requires non-preemptibility. Ensure this also over -rt by converting it to a raw spinlock.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Rik van Riel
Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic slowdowns of certain workloads, we use yield_to to get another VCPU in the same KVM guest to run sooner. This seems to give a 10-15% speedup in certain workloads.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Rik van Riel
Keep track of which task is running a KVM vcpu. This helps us figure out later what task to wake up if we want to boost a vcpu that got preempted. Unfortunately there are no guarantees that the same task always keeps the same vcpu, so we can only track the task across a single "run" of the vcpu.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Huang Ying
is_hwpoison_address only checks whether the page table entry is hwpoisoned, regardless of the memory page mapped, while __get_user_pages checks both. QEMU will clear the poisoned page table entry (via unmap/map) to make it possible to allocate a new memory page for the virtual address across guest rebooting. But it is also possible that the underlying memory page is kept poisoned even after the corresponding page table entry is cleared, that is, a new memory page cannot be allocated. __get_user_pages can catch these situations.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Xiao Guangrong
Now we have 'vcpu->mode' to judge whether we need to send an IPI to other CPUs. This is very exact, so checking the request bit is needless, and we can drop the spinlock that went along with it.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
Currently we keep track of only two states: guest mode and host mode. This patch adds an "exiting guest mode" state that tells us that an IPI will happen soon, so unless we need to wait for the IPI, we can avoid it completely.

Also:
1: No need to read/write ->mode atomically in the vcpu's thread
2: Reorganize struct kvm_vcpu to put ->mode and ->requests in the same cache line explicitly

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
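The three-state idea can be sketched in userspace with C11 atomics. This is a hedged illustration only: the struct layout and the helper names are hypothetical, and `atomic_compare_exchange_strong()` stands in for whatever compare-and-exchange primitive the kernel code uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum vcpu_mode {
    OUTSIDE_GUEST_MODE,
    IN_GUEST_MODE,
    EXITING_GUEST_MODE,
};

/* Hypothetical, minimal stand-in for the per-vcpu state. */
struct vcpu {
    _Atomic int mode;
};

/*
 * Try to move IN_GUEST_MODE -> EXITING_GUEST_MODE and return the mode
 * that was observed before the attempt. If the vcpu was not in guest
 * mode, the mode is left unchanged.
 */
static int vcpu_exiting_guest_mode(struct vcpu *v)
{
    int expected = IN_GUEST_MODE;

    assert(v != NULL);
    /* On failure, expected is overwritten with the current mode. */
    atomic_compare_exchange_strong(&v->mode, &expected, EXITING_GUEST_MODE);
    return expected;
}

/*
 * A kicker only needs to send an IPI if it was the one that moved the
 * vcpu out of IN_GUEST_MODE. If the vcpu is already exiting, an IPI is
 * on its way; if it is outside guest mode, none is needed.
 */
static bool vcpu_kick_needs_ipi(struct vcpu *v)
{
    return vcpu_exiting_guest_mode(v) == IN_GUEST_MODE;
}
```

In this sketch, two back-to-back kicks of a vcpu that is in guest mode produce a single IPI: the first kick wins the exchange and the second sees EXITING_GUEST_MODE and skips the interrupt.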
-
Committed by Heiko Carstens
Get rid of this warning:

  CC arch/s390/kvm/../../../virt/kvm/kvm_main.o
  arch/s390/kvm/../../../virt/kvm/kvm_main.c:596:12: warning: 'kvm_create_dirty_bitmap' defined but not used

The only caller of the function is within a !CONFIG_S390 section, so add the same ifdef around kvm_create_dirty_bitmap() as well.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-