- 22 6月, 2016 1 次提交
-
-
由 Alexey Kardashevskiy 提交于
Every IOMMU has some granularity which MemoryRegionIOMMUOps::translate uses when translating, however this information is not available outside the translate context for various checks. This adds a get_min_page_size callback to MemoryRegionIOMMUOps and a wrapper for it so IOMMU users (such as VFIO) can know the minimum actual page size supported by an IOMMU. As IOMMU MR represents a guest IOMMU, this uses TARGET_PAGE_SIZE as fallback. This removes vfio_container_granularity() and uses new helper in memory_region_iommu_replay() when replaying IOMMU mappings on added IOMMU memory region. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Acked-by: NAlex Williamson <alex.williamson@redhat.com> [dwg: Removed an unnecessary calculation] Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
- 07 6月, 2016 3 次提交
-
-
由 Alexey Kardashevskiy 提交于
We are going to have multiple DMA windows at different offsets on a PCI bus. For the sake of migration, we will have as many TCE table objects pre-created as many windows supported. So we need a way to map windows dynamically onto a PCI bus when migration of a table is completed but at this stage a TCE table object does not have access to a PHB to ask it to map a DMA window backed by just migrated TCE table. This adds a "root" memory region (UINT64_MAX long) to the TCE object. This new region is mapped on a PCI bus with enabled overlapping as there will be one root MR per TCE table, each of them mapped at 0. The actual IOMMU memory region is a subregion of the root region and a TCE table enables/disables this subregion and maps it at the specific offset inside the root MR which is 1:1 mapping of a PCI address space. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Reviewed-by: NThomas Huth <thuth@redhat.com> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Alexey Kardashevskiy 提交于
The source guest could have reallocated the default TCE table and migrate bigger/smaller table. This adds reallocation in post_load() if the default table size is different on source and destination. This adds @bus_offset, @page_shift to the migration stream as a subsection so when DDW is added, migration to older machines will still be possible. As @bus_offset and @page_shift are not used yet, this makes no change in behavior. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Alexey Kardashevskiy 提交于
Currently TCE tables are created once at start and their sizes never change. We are going to change that by introducing a Dynamic DMA windows support where DMA configuration may change during the guest execution. This changes spapr_tce_new_table() to create an empty zero-size IOMMU memory region (IOMMU MR). Only LIOBN is assigned by the time of creation. It still will be called once at the owner object (VIO or PHB) creation. This introduces an "enabled" state for TCE table objects, some helper functions are added: - spapr_tce_table_enable() receives TCE table parameters, stores in sPAPRTCETable and allocates a guest view of the TCE table (in the user space or KVM) and sets the correct size on the IOMMU MR; - spapr_tce_table_disable() disposes the table and resets the IOMMU MR size; it is made public as the following DDW code will be using it. This changes the PHB reset handler to do the default DMA initialization instead of spapr_phb_realize(). This does not make differenct now but later with more than just one DMA window, we will have to remove them all and create the default one on a system reset. No visible change in behaviour is expected except the actual table will be reallocated every reset. We might optimize this later. The other way to implement this would be dynamically create/remove the TCE table QOM objects but this would make migration impossible as the migration code expects all QOM objects to exist at the receiver so we have to have TCE table objects created when migration begins. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
- 27 5月, 2016 2 次提交
-
-
由 Alexey Kardashevskiy 提交于
At the moment presence of vfio-pci devices on a bus affect the way the guest view table is allocated. If there is no vfio-pci on a PHB and the host kernel supports KVM acceleration of H_PUT_TCE, a table is allocated in KVM. However, if there is vfio-pci and we do yet not KVM acceleration for these, the table has to be allocated by the userspace. At the moment the table is allocated once at boot time but next patches will reallocate it. This moves kvmppc_create_spapr_tce/g_malloc0 and their counterparts to helpers. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Alexey Kardashevskiy 提交于
Since a788f227 "memory: Allow replay of IOMMU mapping notifications" when new VFIO listener is added, all existing IOMMU mappings are replayed. However there is a problem that the base address of an IOMMU memory region (IOMMU MR) is ignored which is not a problem for the existing user (which is pseries) with its default 32bit DMA window starting at 0 but it is if there is another DMA window. This stores the IOMMU's offset_within_address_space and adjusts the IOVA before calling vfio_dma_map/vfio_dma_unmap. As the IOMMU notifier expects IOVA offset rather than the absolute address, this also adjusts IOVA in sPAPR H_PUT_TCE handler before calling notifier(s). Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>
-
- 19 5月, 2016 1 次提交
-
-
由 Paolo Bonzini 提交于
Move the inclusion out of hw/hw.h, most files do not need it. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 29 1月, 2016 1 次提交
-
-
由 Peter Maydell 提交于
Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Message-id: 1453832250-766-6-git-send-email-peter.maydell@linaro.org
-
- 23 10月, 2015 2 次提交
-
-
由 David Gibson 提交于
Because of the way non-VFIO guest IOMMU operations are KVM accelerated, not all TCE tables (guest IOMMU contexts) can support VFIO devices. Currently, this is decided at creation time. To support hotplug of VFIO devices, we need to allow a TCE table which previously didn't allow VFIO devices to be switched so that it can. This patch adds an spapr_tce_set_need_vfio() function to do this, by reallocating the table in userspace if necessary. Currently this doesn't allow the KVM acceleration to be re-enabled if all the VFIO devices are removed. That's an optimization for another time. Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Reviewed-by: NLaurent Vivier <lvivier@redhat.com>
-
由 David Gibson 提交于
The vfio_accel parameter used when creating a new TCE table (guest IOMMU context) has a confusing name. What it really means is whether we need the TCE table created to be able to support VFIO devices. VFIO is relevant, because when available we use in-kernel acceleration of the TCE table, but that may not work with VFIO devices because updates to the table are handled in kernel, bypass qemu and so don't hit qemu's infrastructure for keeping the VFIO host IOMMU state in sync with the guest IOMMU state. Rename the parameter to "need_vfio" throughout. This is a cosmetic change, with no impact on the logic. Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Reviewed-by: NLaurent Vivier <lvivier@redhat.com>
-
- 07 7月, 2015 3 次提交
-
-
由 Greg Kurz 提交于
The fact that these enums have matching values is pure coincidence. We actually need to translate from the PAPR definition to the QEMU one. This patch doesn't fix any bug, it is only code cleanup. Suggested-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NGreg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Greg Kurz 提交于
The tce_list variable is not a TCE but the address to a TCE: we shouldn't clear permission bits as we do now. And this is dead code anyway since we check tce_list is 4K aligned a few lines above. This patch doesn't fix any bug, it is only code cleanup. Suggested-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NGreg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 David Gibson 提交于
The code for -machine pseries maintains a global sPAPREnvironment structure which keeps track of general state information about the guest platform. This predates the existence of the MachineState structure, but performs basically the same function. Now that we have the generic MachineState, fold sPAPREnvironment into sPAPRMachineState, the pseries specific subclass of MachineState. This is mostly a matter of search and replace, although a few places which relied on the global spapr variable are changed to find the structure via qdev_get_machine(). Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 04 6月, 2015 6 次提交
-
-
由 Thomas Huth 提交于
The check "liobn & 0xFFFFFFFF00000000ULL" in spapr_tce_find_by_liobn() is completely useless since liobn is only declared as an uint32_t parameter. Fix this by using target_ulong instead (this is what most of the callers of this function are using, too). Signed-off-by: NThomas Huth <thuth@redhat.com> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexey Kardashevskiy 提交于
Useful for debugging. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexey Kardashevskiy 提交于
At the moment spapr_tce_find_by_liobn() is used by H_PUT_TCE/... handlers to find an IOMMU by LIOBN. We are going to implement Dynamic DMA windows (DDW), new code will go to a new file and we will use spapr_tce_find_by_liobn() there too so let's make it public. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexey Kardashevskiy 提交于
This is to reduce VIO noise while debugging PCI DMA. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexey Kardashevskiy 提交于
PAPR is defined as big endian so TCEs need an adjustment so does this patch. This changes code to have ldq_be_phys() in one place. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexey Kardashevskiy 提交于
The existing KVM_CREATE_SPAPR_TCE ioctl only support 4G windows max as the window size parameter to the kernel ioctl() is 32-bit so there's no way of expressing a TCE window > 4GB. We are going to add huge DMA windows support so this will create small window and unexpectedly fail later. This disables KVM_CREATE_SPAPR_TCE for windows bigger that 4GB. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 09 3月, 2015 1 次提交
-
-
由 Alexey Kardashevskiy 提交于
Instead of tweaking a TCE table device by adding there a bypass flag, let's add an alias to RAM and IOMMU memory region, and enable/disable those according to the selected bypass mode. This way IOMMU memory region can have size of the actual window rather than ram_size which is essential for upcoming DDW support. This moves bypass logic to VIO layer and keeps @bypass flag in TCE table for migration compatibility only. This replaces spapr_tce_set_bypass() calls with explicit assignment to avoid confusion as the function could do something more that just syncing the @bypass flag. This adds a pointer to VIO device into the sPAPRTCETable struct to provide the sPAPRTCETable device a way to update bypass mode for the VIO device. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 17 2月, 2015 1 次提交
-
-
由 Paolo Bonzini 提交于
Note that even after this patch, most callers of address_space_* functions must still be under the big QEMU lock, otherwise the memory region returned by address_space_translate can disappear as soon as address_space_translate returns. This will be fixed in the next part of this series. Reviewed-by: NFam Zheng <famz@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 07 1月, 2015 1 次提交
-
-
由 David Gibson 提交于
spapr_tce_table_finalize() can SEGV if the object was not previously realized. In particular this can be triggered by running qemu-system-ppc -device spapr-tce-table,? The basic problem is that we have mismatched initialization versus finalization: spapr_tce_table_finalize() is attempting to undo things that are done in spapr_tce_table_realize(), not an instance_init function. Therefore, replace spapr_tce_table_finalize() with spapr_tce_table_unrealize(). Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Cc: qemu-stable@nongnu.org Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 29 8月, 2014 1 次提交
-
-
由 Le Tan 提交于
Add a bool variable is_write as a parameter to the translate function of MemoryRegionIOMMUOps to indicate the operation of the access. It can be used for correct fault reporting from within the callback. Change the interface of related functions. Signed-off-by: NLe Tan <tamlokveer@gmail.com> Reviewed-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 15 7月, 2014 1 次提交
-
-
由 Gavin Shan 提交于
The permission of TCE entry should exclude physical base address. Otherwise, unmapping TCE entry can be interpreted to mapping TCE entry wrongly for VFIO devices. Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 27 6月, 2014 1 次提交
-
-
由 Alexey Kardashevskiy 提交于
POWER KVM supports an KVM_CAP_SPAPR_TCE capability which allows allocating TCE tables in the host kernel memory and handle H_PUT_TCE requests targeted to specific LIOBN (logical bus number) right in the host without switching to QEMU. At the moment this is used for emulated devices only and the handler only puts TCE to the table. If the in-kernel H_PUT_TCE handler finds a LIOBN and corresponding table, it will put a TCE to the table and complete hypercall execution. The user space will not be notified. Upcoming VFIO support is going to use the same sPAPRTCETable device class so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE tables for VFIO are going to be allocated in the host as well. However VFIO operates with real IOMMU tables and simple copying of a TCE to the real hardware TCE table will not work as guest physical to host physical address translation is requited. So until the host kernel gets VFIO support for H_PUT_TCE, we better not to register VFIO's TCE in the host. This adds a place holder for KVM_CAP_SPAPR_TCE_VFIO capability. It is not in upstream yet and being discussed so now it is always false which means that in-kernel VFIO acceleration is not supported. This adds a bool @vfio_accel flag to the sPAPRTCETable device telling that sPAPRTCETable should not try allocating TCE table in the host kernel for VFIO. The flag is false now as at the moment there is no VFIO. This adds an vfio_accel parameter to spapr_tce_new_table(), the semantic is the same. Since there is only emulated PCI and VIO now, the flag is set to false. Upcoming VFIO support will set it to true. This is a preparation patch so no change in behaviour is expected Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 16 6月, 2014 8 次提交
-
-
由 Alexey Kardashevskiy 提交于
This adds @bus_offset into sPAPRTCETable to tell where TCE table starts from. It is set to 0 for emulated devices. Dynamic DMA windows will use other offset. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexey Kardashevskiy 提交于
At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR spec allows other page sizes and we are going to implement them, we need page size to be configrable. This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT with it where it is possible. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexey Kardashevskiy 提交于
This removes window_size as it is basically a copy of nb_table shifted by SPAPR_TCE_PAGE_SHIFT. As new dynamic DMA windows are going to support windows as big as the entire RAM and this number will be bigger that 32 capacity, we will have to do something about @window_size anyway and removal seems to be the right way to go. This removes dma_window_start/dma_window_size from sPAPRPHBState as they are no longer used. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexey Kardashevskiy 提交于
qdev_init_nofail() was replaced by object_property_set_bool("realized") all over the QEMU so do we. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexey Kardashevskiy 提交于
Currently the default DMA window is represented by a single MemoryRegion. However there can be more than just one window so we need a "root" memory region to be separated from the actual DMA window(s). This introduces a "root" IOMMU memory region and adds a subregion for the default DMA 32bit window. Following patches will add other subregion(s). This initializes a default DMA window subregion size to the guest RAM size as this window can be switched into "bypass" mode which implements direct DMA mapping. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexey Kardashevskiy 提交于
Currently only single TCE entry per request is supported (H_PUT_TCE). However PAPR+ specification allows multiple entry requests such as H_PUT_TCE_INDIRECT and H_STUFF_TCE. Having less transitions to the host kernel via ioctls, support of these calls can accelerate IOMMU operations. This implements H_STUFF_TCE and H_PUT_TCE_INDIRECT. This advertises "multi-tce" capability to the guest if the host kernel supports it (KVM_CAP_SPAPR_MULTITCE) or guest is running in TCG mode. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexey Kardashevskiy 提交于
SPAPR IOMMU is a bus-less device and therefore its only ID in migration stream is an instance id which is not reliable ID as it depends on the command line parameters order. Since libvirt may change the order, we need something better than that. This removes VMSD descriptor from the class definitiion and registers it with @liobn as an intance ID to let the destination side find the right device to receive migration data. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Juan Quintela 提交于
After previous Peter patch, they are redundant. This way we don't assign them except when needed. Once there, there were lots of case where the ".fields" indentation was wrong: .fields = (VMStateField []) { and .fields = (VMStateField []) { Change all the combinations to: .fields = (VMStateField[]){ The biggest problem (appart from aesthetics) was that checkpatch complained when we copy&pasted the code from one place to another. Signed-off-by: NJuan Quintela <quintela@redhat.com> Acked-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
-
- 08 5月, 2014 1 次提交
-
-
由 Stefan Weil 提交于
This fixes warnings from the static code analysis (smatch). Signed-off-by: NStefan Weil <sw@weilnetz.de> Signed-off-by: NMichael Tokarev <mjt@tls.msk.ru>
-
- 05 3月, 2014 1 次提交
-
-
由 Laurent Dufour 提交于
This patch introduces the hypervisor call H_GET_TCE which is basically the reverse of H_PUT_TCE, as defined in the Power Architecture Platform Requirements (PAPR). The hcall H_GET_TCE is required by the kdump kernel which is calling it to retrieve the TCE set up by the panicing kernel. Signed-off-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 02 9月, 2013 1 次提交
-
-
由 Alexey Kardashevskiy 提交于
This converts old style fprintf to traces. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> [agraf: change patch subject] Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 29 7月, 2013 1 次提交
-
-
由 Anthony Liguori 提交于
Model TCE tables as a device that's hooked up as a child object to the owner. Besides the code cleanup, we get a few nice benefits: 1) free actually works now (it was dead code before) 2) the TCE information is visible in the device tree 3) we can expose table information as properties such that if we change the window_size, we can use globals to keep migration working. Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAnthony Liguori <aliguori@us.ibm.com> Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Message-id: 1374175984-8930-6-git-send-email-aliguori@us.ibm.com [dwg: pseries: savevm support for PAPR TCE tables] Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> [alexey: ppc kvm: fix to compile] Signed-off-by: NAnthony Liguori <aliguori@us.ibm.com>
-
- 04 7月, 2013 2 次提交
-
-
由 Paolo Bonzini 提交于
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 20 6月, 2013 1 次提交
-
-
由 Paolo Bonzini 提交于
Fetch the root region from the sPAPRTCETable, and use it to build an AddressSpace and DMAContext. Now, everywhere we have a DMAContext we also have access to the corresponding AddressSpace (either because we create it just before the DMAContext, or because dma_context_memory's AddressSpace is trivially address_space_memory). Acked-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-