- 08 3月, 2012 3 次提交
-
-
由 Takuya Yoshikawa 提交于
Narrow down the controlled text inside the conditional so that it will include lpage_info and rmap stuff only. For this we change the way we check whether the slot is being created from "if (npages && !new.rmap)" to "if (npages && !old.npages)". We also stop checking if lpage_info is NULL when we create lpage_info because we do it from inside the slot creation code block. Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Takuya Yoshikawa 提交于
This makes it easy to make lpage_info architecture specific. Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Takuya Yoshikawa 提交于
This patch cleans up the code and removes the "(void)level;" warning suppressor. Note that we can also use this for PT_PAGE_TABLE_LEVEL to treat every level uniformly later. Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 05 3月, 2012 5 次提交
-
-
由 Paul Mackerras 提交于
This moves __gfn_to_memslot() and search_memslots() from kvm_main.c to kvm_host.h to reduce the code duplication caused by the need for non-modular code in arch/powerpc/kvm/book3s_hv_rm_mmu.c to call gfn_to_memslot() in real mode. Rather than putting gfn_to_memslot() itself in a header, which would lead to increased code size, this puts __gfn_to_memslot() in a header. Then, the non-modular uses of gfn_to_memslot() are changed to call __gfn_to_memslot() instead. This way there is only one place in the source code that needs to be changed should the gfn_to_memslot() implementation need to be modified. On powerpc, the Book3S HV style of KVM has code that is called from real mode which needs to call gfn_to_memslot() and thus needs this. (Module code is allocated in the vmalloc region, which can't be accessed in real mode.) With this, we can remove builtin_gfn_to_memslot() from book3s_hv_rm_mmu.c. Signed-off-by: NPaul Mackerras <paulus@samba.org> Acked-by: NAvi Kivity <avi@redhat.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Michael S. Tsirkin 提交于
find_index_from_host_irq returns 0 on error but callers assume < 0 on error. This should not matter much: an out of range irq should never happen since irq handler was registered with this irq #, and even if it does we get a spurious msix irq in guest and typically nothing terrible happens. Still, better to make it consistent. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Paul Mackerras 提交于
This adds an smp_wmb in kvm_mmu_notifier_invalidate_range_end() and an smp_rmb in mmu_notifier_retry() so that mmu_notifier_retry() will give the correct answer when called without kvm->mmu_lock being held. PowerPC Book3S HV KVM wants to use a bitlock per guest page rather than a single global spinlock in order to improve the scalability of updates to the guest MMU hashed page table, and so needs this. Signed-off-by: NPaul Mackerras <paulus@samba.org> Acked-by: NAvi Kivity <avi@redhat.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Carsten Otte 提交于
This patch exports the s390 SIE hardware control block to userspace via the mapping of the vcpu file descriptor. In order to do so, a new arch callback named kvm_arch_vcpu_fault is introduced for all architectures. It allows to map architecture specific pages. Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Carsten Otte 提交于
This patch introduces a new config option for user controlled kernel virtual machines. It introduces a parameter to KVM_CREATE_VM that allows to set bits that alter the capabilities of the newly created virtual machine. The parameter is passed to kvm_arch_init_vm for all architectures. The only valid modifier bit for now is KVM_VM_S390_UCONTROL. This requires CAP_SYS_ADMIN privileges and creates a user controlled virtual machine on s390 architectures. Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 01 2月, 2012 1 次提交
-
-
由 Takuya Yoshikawa 提交于
It is possible that the __set_bit() in mark_page_dirty() is called simultaneously on the same region of memory, which may result in only one bit being set, because some callers do not take mmu_lock before mark_page_dirty(). This problem is hard to produce because when we reach mark_page_dirty() beginning from, e.g., tdp_page_fault(), mmu_lock is being held during __direct_map(): making kvm-unit-tests' dirty log api test write to two pages concurrently was not useful for this reason. So we have confirmed that there can actually be race condition by checking if some callers really reach there without holding mmu_lock using spin_is_locked(): probably they were from kvm_write_guest_page(). To fix this race, this patch changes the bit operation to the atomic version: note that nr_dirty_pages also suffers from the race but we do not need exactly correct numbers for now. Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 13 1月, 2012 1 次提交
-
-
由 Rusty Russell 提交于
module_param(bool) used to counter-intuitively take an int. In fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy trick. It's time to remove the int/unsigned int option. For this version it'll simply give a warning, but it'll break next kernel version. Acked-by: NMauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
- 27 12月, 2011 14 次提交
-
-
由 Hamo 提交于
by checking the return value from kvm_init_debug, we can ensure that the entries under debugfs for KVM have been created correctly. Signed-off-by: NYang Bai <hamo.by@gmail.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Gleb Natapov 提交于
Drop bsp_vcpu pointer from kvm struct since its only use is incorrect anyway. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Sasha Levin 提交于
Switch to using memdup_user when possible. This makes code more smaller and compact, and prevents errors. Signed-off-by: NSasha Levin <levinsasha928@gmail.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Sasha Levin 提交于
Switch to kmemdup() in two places to shorten the code and avoid possible bugs. Signed-off-by: NSasha Levin <levinsasha928@gmail.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Julian Stecklina 提交于
This fixes byte accesses to IOAPIC_REG_SELECT as mandated by at least the ICH10 and Intel Series 5 chipset specs. It also makes ioapic_mmio_write consistent with ioapic_mmio_read, which also allows byte and word accesses. Signed-off-by: NJulian Stecklina <js@alien8.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
The operation of getting dirty log is frequent when framebuffer-based displays are used(for example, Xwindow), so, we introduce a mapping table to speed up id_to_memslot() Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Sort memslots base on its size and use line search to find it, so that the larger memslots have better fit The idea is from Avi Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Introduce id_to_memslot to get memslot by slot id Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Introduce kvm_for_each_memslot to walk all valid memslot Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Introduce update_memslots to update slot which will be update to kvm->memslots Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Introduce KVM_MEM_SLOTS_NUM macro to instead of KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Takuya Yoshikawa 提交于
Needed for the next patch which uses this number to decide how to write protect a slot. Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Thomas Meyer 提交于
Use kmemdup rather than duplicating its implementation The semantic patch that makes this change is available in scripts/coccinelle/api/memdup.cocci. More information about semantic patching is available at http://coccinelle.lip6.fr/Signed-off-by: NThomas Meyer <thomas@m3y3r.de> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Dan Carpenter 提交于
My testing version of Smatch complains that addr and len come from the user and they can wrap. The path is: -> kvm_vm_ioctl() -> kvm_vm_ioctl_unregister_coalesced_mmio() -> coalesced_mmio_in_range() I don't know what the implications are of wrapping here, but we may as well fix it, if only to silence the warning. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 26 12月, 2011 1 次提交
-
-
由 Alex Williamson 提交于
Only allow KVM device assignment to attach to devices which: - Are not bridges - Have BAR resources (assume others are special devices) - The user has permissions to use Assigning a bridge is a configuration error, it's not supported, and typically doesn't result in the behavior the user is expecting anyway. Devices without BAR resources are typically chipset components that also don't have host drivers. We don't want users to hold such devices captive or cause system problems by fencing them off into an iommu domain. We determine "permission to use" by testing whether the user has access to the PCI sysfs resource files. By default a normal user will not have access to these files, so it provides a good indication that an administration agent has granted the user access to the device. [Yang Bai: add missing #include] [avi: fix comment style] Signed-off-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NYang Bai <hamo.by@gmail.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 25 12月, 2011 1 次提交
-
-
由 Alex Williamson 提交于
This option has no users and it exposes a security hole that we can allow devices to be assigned without iommu protection. Make KVM_DEV_ASSIGN_ENABLE_IOMMU a mandatory option. Signed-off-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 10 11月, 2011 1 次提交
-
-
由 Ohad Ben-Cohen 提交于
When mapping a memory region, split it to page sizes as supported by the iommu hardware. Always prefer bigger pages, when possible, in order to reduce the TLB pressure. The logic to do that is now added to the IOMMU core, so neither the iommu drivers themselves nor users of the IOMMU API have to duplicate it. This allows a more lenient granularity of mappings; traditionally the IOMMU API took 'order' (of a page) as a mapping size, and directly let the low level iommu drivers handle the mapping, but now that the IOMMU core can split arbitrary memory regions into pages, we can remove this limitation, so users don't have to split those regions by themselves. Currently the supported page sizes are advertised once and they then remain static. That works well for OMAP and MSM but it would probably not fly well with intel's hardware, where the page size capabilities seem to have the potential to be different between several DMA remapping devices. register_iommu() currently sets a default pgsize behavior, so we can convert the IOMMU drivers in subsequent patches. After all the drivers are converted, the temporary default settings will be removed. Mainline users of the IOMMU API (kvm and omap-iovmm) are adopted to deal with bytes instead of page order. Many thanks to Joerg Roedel <Joerg.Roedel@amd.com> for significant review! Signed-off-by: NOhad Ben-Cohen <ohad@wizery.com> Cc: David Brown <davidb@codeaurora.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <Joerg.Roedel@amd.com> Cc: Stepan Moskovchenko <stepanm@codeaurora.org> Cc: KyongHo Cho <pullip.cho@samsung.com> Cc: Hiroshi DOYU <hdoyu@nvidia.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: kvm@vger.kernel.org Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
-
- 01 11月, 2011 2 次提交
-
-
由 Paul Gortmaker 提交于
This file has things like module_param_named() and MODULE_PARM_DESC() so it needs the full module.h header present. Without it, you'll get: CC arch/x86/kvm/../../../virt/kvm/iommu.o virt/kvm/iommu.c:37: error: expected ‘)’ before ‘bool’ virt/kvm/iommu.c:39: error: expected ‘)’ before string constant make[3]: *** [arch/x86/kvm/../../../virt/kvm/iommu.o] Error 1 make[2]: *** [arch/x86/kvm] Error 2 Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Paul Gortmaker 提交于
This was coming in via an implicit module.h (and its sub-includes) before, but we'll be cleaning that up shortly. Call out the stat.h include requirement in advance. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
- 21 10月, 2011 2 次提交
-
-
由 Joerg Roedel 提交于
With per-bus iommu_ops the iommu_found function needs to work on a bus_type too. This patch adds a bus_type parameter to that function and converts all call-places. The function is also renamed to iommu_present because the function now checks if an iommu is present for a given bus and does not check for a global iommu anymore. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
-
由 Joerg Roedel 提交于
This is necessary to store a pointer to the bus-specific iommu_ops in the iommu-domain structure. It will be used later to call into bus-specific iommu-ops. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
-
- 26 9月, 2011 6 次提交
-
-
由 Jan Kiszka 提交于
The threaded IRQ handler for MSI-X has almost nothing in common with the INTx/MSI handler. Move its code into a dedicated handler. Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Jan Kiszka 提交于
We only perform work in kvm_assigned_dev_ack_irq if the guest IRQ is of INTx type. This completely avoids the callback invocation in non-INTx cases by registering the IRQ ack notifier only for INTx. Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Jan Kiszka 提交于
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Sasha Levin 提交于
Currently the method of dealing with an IO operation on a bus (PIO/MMIO) is to call the read or write callback for each device registered on the bus until we find a device which handles it. Since the number of devices on a bus can be significant due to ioeventfds and coalesced MMIO zones, this leads to a lot of overhead on each IO operation. Instead of registering devices, we now register ranges which points to a device. Lookup is done using an efficient bsearch instead of a linear search. Performance test was conducted by comparing exit count per second with 200 ioeventfds created on one byte and the guest is trying to access a different byte continuously (triggering usermode exits). Before the patch the guest has achieved 259k exits per second, after the patch the guest does 274k exits per second. Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NSasha Levin <levinsasha928@gmail.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Sasha Levin 提交于
This patch changes coalesced mmio to create one mmio device per zone instead of handling all zones in one device. Doing so enables us to take advantage of existing locking and prevents a race condition between coalesced mmio registration/unregistration and lookups. Suggested-by: NAvi Kivity <avi@redhat.com> Signed-off-by: NSasha Levin <levinsasha928@gmail.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Sasha Levin 提交于
Move the check whether there are available entries to within the spinlock. This allows working with larger amount of VCPUs and reduces premature exits when using a large number of VCPUs. Cc: Avi Kivity <avi@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: NSasha Levin <levinsasha928@gmail.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 24 9月, 2011 1 次提交
-
-
由 Greg Rose 提交于
Device drivers that create and destroy SR-IOV virtual functions via calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic failures if they attempt to destroy VFs while they are assigned to guest virtual machines. By adding a flag for use by the KVM module to indicate that a device is assigned a device driver can check that flag and avoid destroying VFs while they are assigned and avoid system failures. CC: Ian Campbell <ijc@hellion.org.uk> CC: Konrad Wilk <konrad.wilk@oracle.com> Signed-off-by: NGreg Rose <gregory.v.rose@intel.com> Acked-by: NJesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 24 7月, 2011 2 次提交
-
-
由 Alex Williamson 提交于
IOMMU interrupt remapping support provides a further layer of isolation for device assignment by preventing arbitrary interrupt block DMA writes by a malicious guest from reaching the host. By default, we should require that the platform provides interrupt remapping support, with an opt-in mechanism for existing behavior. Both AMD IOMMU and Intel VT-d2 hardware support interrupt remapping, however we currently only have software support on the Intel side. Users wishing to re-enable device assignment when interrupt remapping is not supported on the platform can use the "allow_unsafe_assigned_interrupts=1" module option. [avi: break long lines] Signed-off-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
The idea is from Avi: | We could cache the result of a miss in an spte by using a reserved bit, and | checking the page fault error code (or seeing if we get an ept violation or | ept misconfiguration), so if we get repeated mmio on a page, we don't need to | search the slot list/tree. | (https://lkml.org/lkml/2011/2/22/221) When the page fault is caused by mmio, we cache the info in the shadow page table, and also set the reserved bits in the shadow page table, so if the mmio is caused again, we can quickly identify it and emulate it directly Searching mmio gfn in memslots is heavy since we need to walk all memeslots, it can be reduced by this feature, and also avoid walking guest page table for soft mmu. [jan: fix operator precedence issue] Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-