- 26 9月, 2011 2 次提交
-
-
由 Paul Mackerras 提交于
This makes arch/powerpc/kvm/book3s_rmhandlers.S and arch/powerpc/kvm/book3s_hv_rmhandlers.S be assembled as separate compilation units rather than having them #included in arch/powerpc/kernel/exceptions-64s.S. We no longer have any conditional branches between the exception prologs in exceptions-64s.S and the KVM handlers, so there is no need to keep their contents close together in the vmlinux image. In their current location, they are using up part of the limited space between the first-level interrupt handlers and the firmware NMI data area at offset 0x7000, and with some kernel configurations this area will overflow (e.g. allyesconfig), leading to an "attempt to .org backwards" error when compiling exceptions-64s.S. Moving them out requires that we add some #includes that the book3s_{,hv_}rmhandlers.S code was previously getting implicitly via exceptions-64s.S. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexander Graf 提交于
When running a PAPR guest, we need to handle a few hypercalls in kernel space, most prominently the page table invalidation (to sync the shadows). So this patch adds handling for a few PAPR hypercalls to PR mode KVM. I tried to share the code with HV mode, but it ended up being a lot easier this way around, as the two differ too much in those details. Signed-off-by: NAlexander Graf <agraf@suse.de> --- v1 -> v2: - whitespace fix
-
- 12 7月, 2011 5 次提交
-
-
由 Paul Mackerras 提交于
This adds infrastructure which will be needed to allow book3s_hv KVM to run on older POWER processors, including PPC970, which don't support the Virtual Real Mode Area (VRMA) facility, but only the Real Mode Offset (RMO) facility. These processors require a physically contiguous, aligned area of memory for each guest. When the guest does an access in real mode (MMU off), the address is compared against a limit value, and if it is lower, the address is ORed with an offset value (from the Real Mode Offset Register (RMOR)) and the result becomes the real address for the access. The size of the RMA has to be one of a set of supported values, which usually includes 64MB, 128MB, 256MB and some larger powers of 2. Since we are unlikely to be able to allocate 64MB or more of physically contiguous memory after the kernel has been running for a while, we allocate a pool of RMAs at boot time using the bootmem allocator. The size and number of the RMAs can be set using the kvm_rma_size=xx and kvm_rma_count=xx kernel command line options. KVM exports a new capability, KVM_CAP_PPC_RMA, to signal the availability of the pool of preallocated RMAs. The capability value is 1 if the processor can use an RMA but doesn't require one (because it supports the VRMA facility), or 2 if the processor requires an RMA for each guest. This adds a new ioctl, KVM_ALLOCATE_RMA, which allocates an RMA from the pool and returns a file descriptor which can be used to map the RMA. It also returns the size of the RMA in the argument structure. Having an RMA means we will get multiple KMV_SET_USER_MEMORY_REGION ioctl calls from userspace. To cope with this, we now preallocate the kvm->arch.ram_pginfo array when the VM is created with a size sufficient for up to 64GB of guest memory. Subsequently we will get rid of this array and use memory associated with each memslot instead. This moves most of the code that translates the user addresses into host pfns (page frame numbers) out of kvmppc_prepare_vrma up one level to kvmppc_core_prepare_memory_region. Also, instead of having to look up the VMA for each page in order to check the page size, we now check that the pages we get are compound pages of 16MB. However, if we are adding memory that is mapped to an RMA, we don't bother with calling get_user_pages_fast and instead just offset from the base pfn for the RMA. Typically the RMA gets added after vcpus are created, which makes it inconvenient to have the LPCR (logical partition control register) value in the vcpu->arch struct, since the LPCR controls whether the processor uses RMA or VRMA for the guest. This moves the LPCR value into the kvm->arch struct and arranges for the MER (mediated external request) bit, which is the only bit that varies between vcpus, to be set in assembly code when going into the guest if there is a pending external interrupt request. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 David Gibson 提交于
This improves I/O performance for guests using the PAPR paravirtualization interface by making the H_PUT_TCE hcall faster, by implementing it in real mode. H_PUT_TCE is used for updating virtual IOMMU tables, and is used both for virtual I/O and for real I/O in the PAPR interface. Since this moves the IOMMU tables into the kernel, we define a new KVM_CREATE_SPAPR_TCE ioctl to allow qemu to create the tables. The ioctl returns a file descriptor which can be used to mmap the newly created table. The qemu driver models use them in the same way as userspace managed tables, but they can be updated directly by the guest with a real-mode H_PUT_TCE implementation, reducing the number of host/guest context switches during guest IO. There are certain circumstances where it is useful for userland qemu to write to the TCE table even if the kernel H_PUT_TCE path is used most of the time. Specifically, allowing this will avoid awkwardness when we need to reset the table. More importantly, we will in the future need to write the table in order to restore its state after a checkpoint resume or migration. Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Paul Mackerras 提交于
This adds the infrastructure for handling PAPR hcalls in the kernel, either early in the guest exit path while we are still in real mode, or later once the MMU has been turned back on and we are in the full kernel context. The advantage of handling hcalls in real mode if possible is that we avoid two partition switches -- and this will become more important when we support SMT4 guests, since a partition switch means we have to pull all of the threads in the core out of the guest. The disadvantage is that we can only access the kernel linear mapping, not anything vmalloced or ioremapped, since the MMU is off. This also adds code to handle the following hcalls in real mode: H_ENTER Add an HPTE to the hashed page table H_REMOVE Remove an HPTE from the hashed page table H_READ Read HPTEs from the hashed page table H_PROTECT Change the protection bits in an HPTE H_BULK_REMOVE Remove up to 4 HPTEs from the hashed page table H_SET_DABR Set the data address breakpoint register Plus code to handle the following hcalls in the kernel: H_CEDE Idle the vcpu until an interrupt or H_PROD hcall arrives H_PROD Wake up a ceded vcpu H_REGISTER_VPA Register a virtual processor area (VPA) The code that runs in real mode has to be in the base kernel, not in the module, if KVM is compiled as a module. The real-mode code can only access the kernel linear mapping, not vmalloc or ioremap space. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Paul Mackerras 提交于
This adds support for KVM running on 64-bit Book 3S processors, specifically POWER7, in hypervisor mode. Using hypervisor mode means that the guest can use the processor's supervisor mode. That means that the guest can execute privileged instructions and access privileged registers itself without trapping to the host. This gives excellent performance, but does mean that KVM cannot emulate a processor architecture other than the one that the hardware implements. This code assumes that the guest is running paravirtualized using the PAPR (Power Architecture Platform Requirements) interface, which is the interface that IBM's PowerVM hypervisor uses. That means that existing Linux distributions that run on IBM pSeries machines will also run under KVM without modification. In order to communicate the PAPR hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code to include/linux/kvm.h. Currently the choice between book3s_hv support and book3s_pr support (i.e. the existing code, which runs the guest in user mode) has to be made at kernel configuration time, so a given kernel binary can only do one or the other. This new book3s_hv code doesn't support MMIO emulation at present. Since we are running paravirtualized guests, this isn't a serious restriction. With the guest running in supervisor mode, most exceptions go straight to the guest. We will never get data or instruction storage or segment interrupts, alignment interrupts, decrementer interrupts, program interrupts, single-step interrupts, etc., coming to the hypervisor from the guest. Therefore this introduces a new KVMTEST_NONHV macro for the exception entry path so that we don't have to do the KVM test on entry to those exception handlers. We do however get hypervisor decrementer, hypervisor data storage, hypervisor instruction storage, and hypervisor emulation assist interrupts, so we have to handle those. In hypervisor mode, real-mode accesses can access all of RAM, not just a limited amount. Therefore we put all the guest state in the vcpu.arch and use the shadow_vcpu in the PACA only for temporary scratch space. We allocate the vcpu with kzalloc rather than vzalloc, and we don't use anything in the kvmppc_vcpu_book3s struct, so we don't allocate it. We don't have a shared page with the guest, but we still need a kvm_vcpu_arch_shared struct to store the values of various registers, so we include one in the vcpu_arch struct. The POWER7 processor has a restriction that all threads in a core have to be in the same partition. MMU-on kernel code counts as a partition (partition 0), so we have to do a partition switch on every entry to and exit from the guest. At present we require the host and guest to run in single-thread mode because of this hardware restriction. This code allocates a hashed page table for the guest and initializes it with HPTEs for the guest's Virtual Real Memory Area (VRMA). We require that the guest memory is allocated using 16MB huge pages, in order to simplify the low-level memory management. This also means that we can get away without tracking paging activity in the host for now, since huge pages can't be paged or swapped. This also adds a few new exports needed by the book3s_hv code. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Paul Mackerras 提交于
In preparation for adding code to enable KVM to use hypervisor mode on 64-bit Book 3S processors, this splits book3s.c into two files, book3s.c and book3s_pr.c, where book3s_pr.c contains the code that is specific to running the guest in problem state (user mode) and book3s.c contains code which should apply to all Book 3S processors. In doing this, we abstract some details, namely the interrupt offset, updating the interrupt pending flag, and detecting if the guest is in a critical section. These are all things that will be different when we use hypervisor mode. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 13 10月, 2010 1 次提交
-
-
由 matt mooney 提交于
Replace EXTRA_CFLAGS with ccflags-y and EXTRA_AFLAGS with asflags-y. Signed-off-by: Nmatt mooney <mfm@muteddisk.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 01 8月, 2010 1 次提交
-
-
由 Alexander Graf 提交于
We just introduced generic functions to handle shadow pages on PPC. This patch makes the respective backends make use of them, getting rid of a lot of duplicate code along the way. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 17 5月, 2010 3 次提交
-
-
由 Alexander Graf 提交于
Now that we have all the bits and pieces in place, let's enable building of the Book3S_32 target. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
So far we had a lot of conditional code on CONFIG_KVM_BOOK3S_64_HANDLER. As we're moving towards common code between 32 and 64 bits, most of these ifdefs can be moved to a more generic term define, called CONFIG_KVM_BOOK3S_HANDLER. This patch adds the new generic config option and moves ifdefs over. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
We have quite some code that can be used by Book3S_32 and Book3S_64 alike, so let's call it "Book3S" instead of "Book3S_64", so we can later on use it from the 32 bit port too. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 25 4月, 2010 2 次提交
-
-
由 Alexander Graf 提交于
The one big thing about the Gekko is paired singles. Paired singles are an extension to the instruction set, that adds 32 single precision floating point registers (qprs), some SPRs to modify the behavior of paired singled operations and instructions to deal with qprs to the instruction set. Unfortunately, it also changes semantics of existing operations that affect single values in FPRs. In most cases they get mirrored to the coresponding QPR. Thanks to that we need to emulate all FPU operations and all the new paired single operations too. In order to achieve that, we use the just introduced FPU call helpers to call the real FPU whenever the guest wants to modify an FPR. Additionally we also fix up the QPR values along the way. That way we can execute paired single FPU operations without implementing a soft fpu. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
To emulate paired single instructions, we need to be able to call FPU operations from within the kernel. Since we don't want gcc to spill arbitrary FPU code everywhere, we tell it to use a soft fpu. Since we know we can really call the FPU in safe areas, let's also add some calls that we can later use to actually execute real world FPU operations on the host's FPU. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 05 11月, 2009 1 次提交
-
-
由 Alexander Graf 提交于
Now we have everything in place to be able to build KVM, so let's add it as config option and in the Makefile. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 10 9月, 2009 2 次提交
-
-
由 Marcelo Tosatti 提交于
Return EOPNOTSUPP for KVM_TRACE_ENABLE/PAUSE/DISABLE ioctls. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Marcelo Tosatti 提交于
[avi: make it build] [avi: fold trace-arch.h into trace.h] CC: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 16 6月, 2009 1 次提交
-
-
由 Michael Ellerman 提交于
Add the option to build the code under arch/powerpc with -Werror. The intention is to make it harder for people to inadvertantly introduce warnings in the arch/powerpc code. It needs to be configurable so that if a warning is introduced, people can easily work around it while it's being fixed. The option is a negative, ie. don't enable -Werror, so that it will be turned on for allyes and allmodconfig builds. The default is n, in the hope that developers will build with -Werror, that will probably lead to some build breaks, I am prepared to be flamed. It's not enabled for math-emu, which is a steaming pile of warnings. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 24 3月, 2009 2 次提交
-
-
由 Hollis Blanchard 提交于
Signed-off-by: NLiu Yu <yu.liu@freescale.com> Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Hollis Blanchard 提交于
The Book E code will be shared with e500. I've left PID in kvmppc_core_emulate_op() just so that we don't need to move kvmppc_set_pid() right now. Once we have the e500 implementation, we can probably share that too. Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 31 12月, 2008 4 次提交
-
-
由 Hollis Blanchard 提交于
Existing KVM statistics are either just counters (kvm_stat) reported for KVM generally or trace based aproaches like kvm_trace. For KVM on powerpc we had the need to track the timings of the different exit types. While this could be achieved parsing data created with a kvm_trace extension this adds too much overhead (at least on embedded PowerPC) slowing down the workloads we wanted to measure. Therefore this patch adds a in-kernel exit timing statistic to the powerpc kvm code. These statistic is available per vm&vcpu under the kvm debugfs directory. As this statistic is low, but still some overhead it can be enabled via a .config entry and should be off by default. Since this patch touched all powerpc kvm_stat code anyway this code is now merged and simplified together with the exit timing statistic code (still working with exit timing disabled in .config). Signed-off-by: NChristian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Hollis Blanchard 提交于
Cores provide 3 emulation hooks, implemented for example in the new 4xx_emulate.c: kvmppc_core_emulate_op kvmppc_core_emulate_mtspr kvmppc_core_emulate_mfspr Strictly speaking the last two aren't necessary, but provide for more informative error reporting ("unknown SPR"). Long term I'd like to have instruction decoding autogenerated from tables of opcodes, and that way we could aggregate universal, Book E, and core-specific instructions more easily and without redundant switch statements. Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Hollis Blanchard 提交于
This introduces a set of core-provided hooks. For 440, some of these are implemented by booke.c, with the rest in (the new) 44x.c. Note that these hooks are link-time, not run-time. Since it is not possible to build a single kernel for both e500 and 440 (for example), using function pointers would only add overhead. Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Hollis Blanchard 提交于
The division was somewhat artificial and cumbersome, and had no functional benefit anyways: we can only guests built for the real host processor. Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 15 10月, 2008 1 次提交
-
-
由 Jerone Young 提交于
This patch enables KVM_TRACE to build for PowerPC arch. This means just adding sections to Kconfig and Makefile. Signed-off-by: NJerone Young <jyoung5@us.ibm.com> Signed-off-by: NChristian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 20 7月, 2008 1 次提交
-
-
由 Laurent Vivier 提交于
This patch enables coalesced MMIO for powerpc architecture. It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO. It enables the compilation of coalesced_mmio.c. Signed-off-by: NLaurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 27 4月, 2008 1 次提交
-
-
由 Hollis Blanchard 提交于
This functionality is definitely experimental, but is capable of running unmodified PowerPC 440 Linux kernels as guests on a PowerPC 440 host. (Only tested with 440EP "Bamboo" guests so far, but with appropriate userspace support other SoC/board combinations should work.) See Documentation/powerpc/kvm_440.txt for technical details. [stephen: build fix] Signed-off-by: NHollis Blanchard <hollisb@us.ibm.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-