- 16 7月, 2018 1 次提交
-
-
由 Alexey Kardashevskiy 提交于
We are going to reuse multilevel TCE code for the userspace copy of the TCE table and since it is big endian, let's make the copy big endian too. Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Acked-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 18 5月, 2018 1 次提交
-
-
由 Simon Guo 提交于
Current regs are scattered at kvm_vcpu_arch structure and it will be more neat to organize them into pt_regs structure. Also it will enable reimplementation of MMIO emulation code with analyse_instr() later. Signed-off-by: NSimon Guo <wei.guo.simon@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 17 5月, 2018 2 次提交
-
-
由 Alexey Kardashevskiy 提交于
At the moment we only support in the host the IOMMU page sizes which the guest is aware of, which is 4KB/64KB/16MB. However P9 does not support 16MB IOMMU pages, 2MB and 1GB pages are supported instead. We can still emulate bigger guest pages (for example 16MB) with smaller host pages (4KB/64KB/2MB). This allows the physical IOMMU pages to use a page size smaller or equal than the guest visible IOMMU page size. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Alexey Kardashevskiy 提交于
The other TCE handlers use page shift from the guest visible TCE table (described by kvmppc_spapr_tce_iommu_table) so let's make H_STUFF_TCE handlers do the same thing. This should cause no behavioral change now but soon we will allow the iommu_table::it_page_shift being different from from the emulated table page size so this will play a role. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Acked-by: NBalbir Singh <bsingharora@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 19 3月, 2018 1 次提交
-
-
由 Paul Mackerras 提交于
Since commit fb1522e0 ("KVM: update to new mmu_notifier semantic v2", 2017-08-31), the MMU notifier code in KVM no longer calls the kvm_unmap_hva callback. This removes the PPC implementations of kvm_unmap_hva(). Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 17 8月, 2017 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
Add newer helpers to make the function usage simpler. It is always recommended to use find_current_mm_pte() for walking the page table. If we cannot use find_current_mm_pte(), it should be documented why the said usage of __find_linux_pte() is safe against a parallel THP split. For now we have KVM code using __find_linux_pte(). This is because kvm code ends up calling __find_linux_pte() in real mode with MSR_EE=0 but with PACA soft_enabled = 1. We may want to fix that later and make sure we keep the MSR_EE and PACA soft_enabled in sync. When we do that we can switch kvm to use find_linux_pte(). Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 12 5月, 2017 1 次提交
-
-
由 Paul Mackerras 提交于
POWER9 running a radix guest will take some hypervisor interrupts without going to real mode (turning off the MMU). This means that early hypercall handlers may now be called in virtual mode. Most of the handlers work just fine in both modes, but there are some that can crash the host if called in virtual mode, notably the TCE (IOMMU) hypercalls H_PUT_TCE, H_STUFF_TCE and H_PUT_TCE_INDIRECT. These already have both a real-mode and a virtual-mode version, so we arrange for the real-mode version to return H_TOO_HARD for radix guests, which will result in the virtual-mode version being called. The other hypercall which is sensitive to the MMU mode is H_RANDOM. It doesn't have a virtual-mode version, so this adds code to enable it to be called in either mode. An alternative solution was considered which would refuse to call any of the early hypercall handlers when doing a virtual-mode exit from a radix guest. However, the XICS-on-XIVE code depends on the XICS hypercalls being handled early even for virtual-mode exits, because the handlers need to be called before the XIVE vCPU state has been pulled off the hardware. Therefore that solution would have become quite invasive and complicated, and was rejected in favour of the simpler, though less elegant, solution presented here. Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Tested-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 20 4月, 2017 4 次提交
-
-
由 Alexey Kardashevskiy 提交于
This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT and H_STUFF_TCE requests targeted an IOMMU TCE table used for VFIO without passing them to user space which saves time on switching to user space and back. This adds H_PUT_TCE/H_PUT_TCE_INDIRECT/H_STUFF_TCE handlers to KVM. KVM tries to handle a TCE request in the real mode, if failed it passes the request to the virtual mode to complete the operation. If it a virtual mode handler fails, the request is passed to the user space; this is not expected to happen though. To avoid dealing with page use counters (which is tricky in real mode), this only accelerates SPAPR TCE IOMMU v2 clients which are required to pre-register the userspace memory. The very first TCE request will be handled in the VFIO SPAPR TCE driver anyway as the userspace view of the TCE table (iommu_table::it_userspace) is not allocated till the very first mapping happens and we cannot call vmalloc in real mode. If we fail to update a hardware IOMMU table unexpected reason, we just clear it and move on as there is nothing really we can do about it - for example, if we hot plug a VFIO device to a guest, existing TCE tables will be mirrored automatically to the hardware and there is no interface to report to the guest about possible failures. This adds new attribute - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE - to the VFIO KVM device. It takes a VFIO group fd and SPAPR TCE table fd and associates a physical IOMMU table with the SPAPR TCE table (which is a guest view of the hardware IOMMU table). The iommu_table object is cached and referenced so we do not have to look up for it in real mode. This does not implement the UNSET counterpart as there is no use for it - once the acceleration is enabled, the existing userspace won't disable it unless a VFIO container is destroyed; this adds necessary cleanup to the KVM_DEV_VFIO_GROUP_DEL handler. This advertises the new KVM_CAP_SPAPR_TCE_VFIO capability to the user space. This adds real mode version of WARN_ON_ONCE() as the generic version causes problems with rcu_sched. Since we testing what vmalloc_to_phys() returns in the code, this also adds a check for already existing vmalloc_to_phys() call in kvmppc_rm_h_put_tce_indirect(). This finally makes use of vfio_external_user_iommu_id() which was introduced quite some time ago and was considered for removal. Tests show that this patch increases transmission speed from 220MB/s to 750..1020MB/s on 10Gb network (Chelsea CXGB3 10Gb ethernet card). Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Acked-by: NAlex Williamson <alex.williamson@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Alexey Kardashevskiy 提交于
This reworks helpers for checking TCE update parameters in way they can be used in KVM. This should cause no behavioral change. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Acked-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Alexey Kardashevskiy 提交于
VFIO on sPAPR already implements guest memory pre-registration when the entire guest RAM gets pinned. This can be used to translate the physical address of a guest page containing the TCE list from H_PUT_TCE_INDIRECT. This makes use of the pre-registrered memory API to access TCE list pages in order to avoid unnecessary locking on the KVM memory reverse map as we know that all of guest memory is pinned and we have a flat array mapping GPA to HPA which makes it simpler and quicker to index into that array (even with looking up the kernel page tables in vmalloc_to_phys) than it is to find the memslot, lock the rmap entry, look up the user page tables, and unlock the rmap entry. Note that the rmap pointer is initialized to NULL where declared (not in this patch). If a requested chunk of memory has not been preregistered, this will fall back to non-preregistered case and lock rmap. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Alexey Kardashevskiy 提交于
The guest view TCE tables are per KVM anyway (not per VCPU) so pass kvm* there. This will be used in the following patches where we will be attaching VFIO containers to LIOBNs via ioctl() to KVM (rather than to VCPU). Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 01 12月, 2016 1 次提交
-
-
由 Paul Mackerras 提交于
This moves the prototypes for functions that are only called from assembler code out of asm/asm-prototypes.h into asm/kvm_ppc.h. The prototypes were added in commit ebe4535f ("KVM: PPC: Book3S HV: sparse: prototypes for functions called from assembler", 2016-10-10), but given that the functions are KVM functions, having them in a KVM header will be better for long-term maintenance. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 21 11月, 2016 2 次提交
-
-
由 Geliang Tang 提交于
Drop duplicate header asm/iommu.h from book3s_64_vio_hv.c. Signed-off-by: NGeliang Tang <geliangtang@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Daniel Axtens 提交于
A bunch of KVM functions are only called from assembler. Give them prototypes in asm-prototypes.h This reduces sparse warnings. Signed-off-by: NDaniel Axtens <dja@axtens.net> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 22 3月, 2016 1 次提交
-
-
由 Alexey Kardashevskiy 提交于
Upcoming in-kernel VFIO acceleration needs different handling in real and virtual modes which makes it hard to support both modes in the same handler. This creates a copy of kvmppc_rm_h_stuff_tce and kvmppc_rm_h_put_tce in addition to the existing kvmppc_rm_h_put_tce_indirect. This also fixes linker breakage when only PR KVM was selected (leaving HV KVM off): the kvmppc_h_put_tce/kvmppc_h_stuff_tce functions would not compile at all and the linked would fail. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 03 3月, 2016 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
No code changes. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 02 3月, 2016 2 次提交
-
-
由 Alexey Kardashevskiy 提交于
This enables userspace view of TCE tables to start from non-zero offset on a bus. This will be used for huge DMA windows. This only changes the internal structure, the user interface needs to change in order to use an offset. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Alexey Kardashevskiy 提交于
At the moment the kvmppc_spapr_tce_table struct can only describe 4GB windows and handle fixed size (4K) pages. Dynamic DMA windows support more so these limits need to be extended. This replaces window_size (in bytes, 4GB max) with page_shift (32bit) and size (64bit, in pages). This should cause no behavioural change as this is changing the internal structures only - the user interface still only allows one to create a 32-bit table with 4KiB pages at this stage. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 16 2月, 2016 5 次提交
-
-
由 Alexey Kardashevskiy 提交于
This adds real and virtual mode handlers for the H_PUT_TCE_INDIRECT and H_STUFF_TCE hypercalls for user space emulated devices such as IBMVIO devices or emulated PCI. These calls allow adding multiple entries (up to 512) into the TCE table in one call which saves time on transition between kernel and user space. The current implementation of kvmppc_h_stuff_tce() allows it to be executed in both real and virtual modes so there is one helper. The kvmppc_rm_h_put_tce_indirect() needs to translate the guest address to the host address and since the translation is different, there are 2 helpers - one for each mode. This implements the KVM_CAP_PPC_MULTITCE capability. When present, the kernel will try handling H_PUT_TCE_INDIRECT and H_STUFF_TCE if these are enabled by the userspace via KVM_CAP_PPC_ENABLE_HCALL. If they can not be handled by the kernel, they are passed on to the user space. The user space still has to have an implementation for these. Both HV and PR-syle KVM are supported. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Alexey Kardashevskiy 提交于
Upcoming multi-tce support (H_PUT_TCE_INDIRECT/H_STUFF_TCE hypercalls) will validate TCE (not to have unexpected bits) and IO address (to be within the DMA window boundaries). This introduces helpers to validate TCE and IO address. The helpers are exported as they compile into vmlinux (to work in realmode) and will be used later by KVM kernel module in virtual mode. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Alexey Kardashevskiy 提交于
SPAPR_TCE_SHIFT is used in few places only and since IOMMU_PAGE_SHIFT_4K can be easily used instead, remove SPAPR_TCE_SHIFT. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Alexey Kardashevskiy 提交于
At the moment only spapr_tce_tables updates are protected against races but not lookups. This fixes missing protection by using RCU for the list. As lookups also happen in real mode, this uses list_for_each_entry_lockless() (which is expected not to access any vmalloc'd memory). This converts release_spapr_tce_table() to a RCU scheduled handler. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Alexey Kardashevskiy 提交于
This reworks the existing H_PUT_TCE/H_GET_TCE handlers to have following patches applied nicer. This moves the ioba boundaries check to a helper and adds a check for least bits which have to be zeros. The patch is pretty mechanical (only check for least ioba bits is added) so no change in behaviour is expected. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 26 3月, 2014 1 次提交
-
-
由 Laurent Dufour 提交于
This introduces the H_GET_TCE hypervisor call, which is basically the reverse of H_PUT_TCE, as defined in the Power Architecture Platform Requirements (PAPR). The hcall H_GET_TCE is required by the kdump kernel, which uses it to retrieve TCEs set up by the previous (panicked) kernel. Signed-off-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 17 10月, 2013 1 次提交
-
-
由 Paul Mackerras 提交于
Since the code in book3s_64_vio_hv.c is called from real mode with HV KVM, and therefore has to be built into the main kernel binary, this makes it always built-in rather than part of the KVM module. It gets called from the KVM module by PR KVM, so this adds an EXPORT_SYMBOL_GPL(). Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 06 5月, 2012 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
There is nothing in the code for emulating TCE tables in the kernel that prevents it from working on "PR" KVM... other than ifdef's and location of the code. This and moves the bulk of the code there to a new file called book3s_64_vio.c. This speeds things up a bit on my G5. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> [agraf: fix for hv kvm, 32bit, whitespace] Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 12 7月, 2011 1 次提交
-
-
由 David Gibson 提交于
This improves I/O performance for guests using the PAPR paravirtualization interface by making the H_PUT_TCE hcall faster, by implementing it in real mode. H_PUT_TCE is used for updating virtual IOMMU tables, and is used both for virtual I/O and for real I/O in the PAPR interface. Since this moves the IOMMU tables into the kernel, we define a new KVM_CREATE_SPAPR_TCE ioctl to allow qemu to create the tables. The ioctl returns a file descriptor which can be used to mmap the newly created table. The qemu driver models use them in the same way as userspace managed tables, but they can be updated directly by the guest with a real-mode H_PUT_TCE implementation, reducing the number of host/guest context switches during guest IO. There are certain circumstances where it is useful for userland qemu to write to the TCE table even if the kernel H_PUT_TCE path is used most of the time. Specifically, allowing this will avoid awkwardness when we need to reset the table. More importantly, we will in the future need to write the table in order to restore its state after a checkpoint resume or migration. Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-