- 08 Sep 2016, 1 commit

Committed by Suraj Jitindar Singh

The struct kvmppc_vcore is a structure used to store various information about a virtual core for a KVM guest. The runnable_threads element of the struct provides a list of all of the currently runnable vcpus on the core (those in the KVMPPC_VCPU_RUNNABLE state). The previous implementation of this list was a linked list. The next patch requires that the list be able to be iterated over without holding the vcore lock.

Reimplement the runnable_threads list in the kvmppc_vcore struct as an array. Implement a function to iterate over valid entries in the array and update access sites accordingly.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
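A minimal sketch of what an array-backed runnable list with a hole-skipping iterator can look like; the stand-in struct, helper and macro names are assumptions based on the description above, not a quote of the kernel code.

```
struct kvm_vcpu;	/* opaque here; defined by KVM */

#define MAX_SMT_THREADS	8

struct vcore_sketch {		/* illustrative stand-in for kvmppc_vcore */
	int n_runnable;
	struct kvm_vcpu *runnable_threads[MAX_SMT_THREADS];
};

/* Return the next non-NULL entry after index *ip, updating *ip to its slot. */
static struct kvm_vcpu *next_runnable_thread(struct vcore_sketch *vc, int *ip)
{
	int i = *ip;

	while (++i < MAX_SMT_THREADS) {
		struct kvm_vcpu *vcpu = vc->runnable_threads[i];

		if (vcpu) {
			*ip = i;
			return vcpu;
		}
	}
	return NULL;
}

/* Walk every runnable vcpu; NULL slots are simply skipped, so no lock is
 * needed just to follow list pointers. */
#define for_each_runnable_thread(i, vcpu, vc) \
	for ((i) = -1; ((vcpu) = next_runnable_thread((vc), &(i))) != NULL; )
```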
-
- 25 Aug 2016, 1 commit

Committed by Paul Mackerras

As discussed recently on the kvm mailing list, David Gibson's intention in commit 178a7875 ("vfio: Enable VFIO device for powerpc", 2016-02-01) was to have the KVM VFIO device built in on all powerpc platforms. This patch adds the "select KVM_VFIO" statement that makes this happen.

Currently, arch/powerpc/kvm/Makefile doesn't include vfio.o for the 64-bit kvm module, because the list of objects doesn't use the $(common-objs-y) list. The reason it doesn't is that we don't necessarily want coalesced_mmio.o or emulate.o (for example if HV KVM is the only target), and common-objs-y includes both. Since this is confusing, this patch adjusts the definitions so that we now use $(common-objs-y) in the list for the 64-bit kvm.ko module, emulate.o is removed from common-objs-y and added in the places that need it, and the inclusion of coalesced_mmio.o now depends on CONFIG_KVM_MMIO.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 19 Aug 2016, 2 commits

Committed by Paul Mackerras

It doesn't make sense to create irqfds for a VM that doesn't have in-kernel interrupt controller emulation. There is an existing interface for architecture code to tell the irqfd code whether or not any interrupt controller has been initialized, called kvm_arch_intc_initialized(), so let's implement that for powerpc.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
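A hedged sketch of what this hook can look like on powerpc (kernel context assumed); the fields tested, kvm->arch.mpic and kvm->arch.xics, are inferred from context rather than quoted from the patch.

```
/* Report whether this VM has any in-kernel interrupt controller. */
bool kvm_arch_intc_initialized(struct kvm *kvm)
{
#ifdef CONFIG_KVM_MPIC
	if (kvm->arch.mpic)
		return true;
#endif
#ifdef CONFIG_KVM_XICS
	if (kvm->arch.xics)
		return true;
#endif
	return false;
}
```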
-
Committed by Paul Mackerras

It turns out that if userspace creates a pseries-type VM without in-kernel XICS (interrupt controller) emulation, and then connects an eventfd to the VM as an irqfd, and the eventfd gets signalled, the code will try to deliver an interrupt via the non-existent XICS object and crash the host kernel with a NULL pointer dereference. To fix this, we check for the presence of the XICS object before trying to deliver the interrupt, and return with an error if it is absent.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
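The shape of the guard, as a minimal sketch; the function name below is hypothetical and only the early NULL check mirrors the fix described above.

```
/* Hypothetical irqfd delivery path for the XICS: bail out early if the
 * in-kernel XICS was never created, instead of dereferencing NULL. */
static int xics_deliver_irqfd(struct kvm *kvm, u32 irq, int level)
{
	struct kvmppc_xics *xics = kvm->arch.xics;

	if (!xics)
		return -ENODEV;		/* no in-kernel interrupt controller */

	return ics_deliver_irq(xics, irq, level);
}
```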
-
- 12 Aug 2016, 2 commits

Committed by Christoffer Dall

KVM devices were manipulating list data structures without any form of synchronization, and some implementations of the create operations also suffered from a lack of synchronization. Now that we've split the xics create operation into create and init, we can hold the kvm->lock mutex while calling the create operation and while manipulating the devices list.

The error path in the generic code gets slightly ugly because we have to take the mutex again and delete the device from the list, but holding the mutex during anon_inode_getfd or releasing/locking the mutex in the common non-error path seemed wrong.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Committed by Christoffer Dall

As we are about to hold the kvm->lock during the create operation on KVM devices, we should move the call to xics_debugfs_init into its own function, since holding a mutex over extended amounts of time might not be a good idea.

Introduce an init operation on the kvm_device_ops struct which cannot fail, and call it, if configured, after the device has been created.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
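A sketch of the shape this gives the device ops; the struct below is an abridged stand-in and the caller-side helper is an assumption based on the description.

```
struct kvm_device_ops_sketch {		/* abridged stand-in for kvm_device_ops */
	const char *name;
	int (*create)(struct kvm_device *dev, u32 type);
	/*
	 * Optional hook, run after create has succeeded and outside any
	 * long-held lock; used for things like debugfs setup. Cannot fail.
	 */
	void (*init)(struct kvm_device *dev);
	void (*destroy)(struct kvm_device *dev);
	/* ... attribute get/set/has ops elided ... */
};

/* Caller side, once the device exists and kvm->lock has been dropped: */
static void finish_device_create(struct kvm_device *dev,
				 const struct kvm_device_ops_sketch *ops)
{
	if (ops->init)
		ops->init(dev);
}
```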
-
- 02 Aug 2016, 1 commit

Committed by Sam Bobroff

Introduce a new KVM capability, KVM_CAP_PPC_HTM, that can be queried to determine whether a PowerPC KVM guest should use HTM (Hardware Transactional Memory). This will be used by QEMU to populate the pa-features bits in the guest's device tree.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
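From userspace the capability is just another KVM_CHECK_EXTENSION query; a minimal, hedged example (it assumes a kernel header new enough to define KVM_CAP_PPC_HTM).

```
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Returns non-zero if the guest may be told (via pa-features) that HTM is usable. */
static int guest_can_use_htm(int vm_fd)
{
	int ret = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_HTM);

	return ret > 0;		/* <= 0: unsupported or pre-capability kernel */
}
```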
-
- 28 Jul 2016, 2 commits

Committed by Paul Mackerras

It turns out that if the guest does a H_CEDE while the CPU is in a transactional state, and the H_CEDE does a nap, and the nap loses the architected state of the CPU (which it is allowed to do), then we lose the checkpointed state of the virtual CPU. In addition, the transactional-memory state recorded in the MSR gets reset back to non-transactional, and when we try to return to the guest, we take a TM bad thing type of program interrupt because we are trying to transition from non-transactional to transactional with a hrfid instruction, which is not permitted.

The result of the program interrupt occurring at that point is that the host CPU will hang in an infinite loop with interrupts disabled. Thus this is a denial of service vulnerability in the host which can be triggered by any guest (and, depending on the guest kernel, it can potentially be triggered by unprivileged userspace in the guest). This vulnerability has been assigned the ID CVE-2016-5412.

To fix this, we save the TM state before napping and restore it on exit from the nap, when handling a H_CEDE in real mode. The case where H_CEDE exits to host virtual mode is already OK (as are other hcalls which exit to host virtual mode) because the exit path saves the TM state.

Cc: stable@vger.kernel.org # v3.15+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Committed by Paul Mackerras

This moves the transactional memory state save and restore sequences out of the guest entry/exit paths into separate procedures, so that these sequences can be used in going into and out of nap in a subsequent patch. The only code changes here are (a) saving and restoring LR on the stack, since these new procedures get called with a bl instruction, (b) explicitly saving r1 into the PACA instead of assuming that HSTATE_HOST_R1(r13) is already set, and (c) removing an unnecessary and redundant setting of MSR[TM] that should have been removed by commit 9d4d0bdd9e0a ("KVM: PPC: Book3S HV: Add transactional memory support", 2013-09-24) but wasn't.

Cc: stable@vger.kernel.org # v3.15+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 21 Jul 2016, 2 commits

Committed by Benjamin Herrenschmidt

Moving probe_machine() to after mmu init will cause the ppc_md fields related to hash table management to be overwritten. Since we have essentially disconnected the machine type from the hash backend ops, finish the job by moving them to a different structure. The only callback that didn't quite fit is update_partition_table, since it is not specific to hash, so I moved it to a standalone variable for now. We can revisit later if needed.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Fix ppc64e build failure in kexec]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Committed by Benjamin Herrenschmidt

The various calls to establish exception endianness and AIL are now done from a single point, using already established CPU and FW feature bits to decide what to do.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 19 Jul 2016, 1 commit

Committed by Arnd Bergmann

There are very few files that need to add an -I$(obj) gcc flag for the preprocessor or the assembler. For C files, we always add these for both the objtree and srctree, but for the other ones we require the Makefile to add them, and Kbuild then adds the flag for both trees.

As preparation for changing the meaning of the -I$(obj) directive to refer only to the srctree, this changes the two instances in arch/x86 to use an explicit $(objtree) prefix where needed; otherwise we won't find the headers any more, as reported by the kbuild 0day builder:

    arch/x86/realmode/rm/realmode.lds.S:75:20: fatal error: pasyms.h: No such file or directory

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michal Marek <mmarek@suse.com>
-
- 15 Jul 2016, 1 commit

Committed by Shreyas B. Prabhu

Functions like power7_wakeup_loss, power7_wakeup_noloss and power7_wakeup_tb_loss are used by POWER7 and POWER8 hardware. They can also be used by POWER9. Hence rename these functions to hardware-agnostic names.

Suggested-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 14 Jul 2016, 2 commits

Committed by Daniel Axtens

kvmppc_h_put_tce_indirect labels a u64 pointer as __user. It also labelled the u64 where get_user puts the result as __user. This isn't a pointer and so doesn't need to be labelled __user. Split the u64 value definition onto a new line to make it clear that it doesn't get the annotation.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Committed by Radim Krčmář

Arch-specific code will use it.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 01 Jul 2016, 1 commit

Committed by Paolo Bonzini

Use the functions from context_tracking.h directly.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 20 Jun 2016, 3 commits

Committed by Mahesh Salgaonkar

When a guest is assigned to a core, the host Timebase (TB) is converted into guest TB by adding the guest timebase offset before entering the guest. During guest exit, the guest TB is restored to host TB. This means that under certain conditions (guest migration) host TB and guest TB can differ.

When we get an HMI for TB-related issues, the opal HMI handler tries to fix the errors and restore the correct host TB value. With no guest running there are no issues, but with a guest running on the core we run into TB corruption. If we get an HMI while in the guest, the current HMI handler invokes the opal hmi handler before forcing the guest to exit. The guest exit path subtracts the guest TB offset from the current TB value, which may have already been restored to the host value by the opal hmi handler. This leads to incorrect host and guest TB values.

With split-core, things become more complex: the TB also gets split and each subcore gets its own TB register. When a hmi handler fixes a TB error and restores the TB value, it affects the TB values of all sibling subcores on the same core. On TB errors all the threads in the core get an HMI. With the existing code, the individual threads call the opal hmi handler independently, which can easily throw the TB out of sync if we have guests running on subcores. Hence we need to coordinate with all the threads before making the opal hmi handler call, followed by a TB resync.

This patch introduces a sibling subcore state structure (shared by all threads in the core) in the paca which holds information about whether sibling subcores are in guest mode or host mode. An array in_guest[] of size MAX_SUBCORE_PER_CORE=4 is used to maintain the state of each subcore; the subcore id is used as the index into the in_guest[] array. Only the primary thread entering/exiting the guest is responsible for setting/unsetting its designated array element.

On a TB error, we get an HMI interrupt on every thread of the core. Upon HMI, this patch now forces the guest to vacate the core/subcore. The primary thread from each subcore then turns off its respective bit in the above bitmap during the guest exit path, just after the guest->host partition switch is complete. All other threads that have just exited the guest OR were already in the host wait until all the subcores clear their respective bits. Once all the subcores have turned off their respective bits, all threads make the call to the opal hmi handler.

It is not necessary for the opal hmi handler to resync the TB value for every HMI interrupt; it does so only for HMIs caused by TB errors and does not touch the TB otherwise. Hence, to make things simpler, the primary thread calls TB resync explicitly, once per core, immediately after the opal hmi handler, instead of subtracting the guest offset from the TB. The TB resync call restores the TB to the host value, so we can be sure about the TB state.

One of the primary threads exiting the guest takes up the responsibility of calling TB resync. It uses one of the top bits (bit 63) of the subcore state flags bitmap to make the decision: the first primary thread (among the subcores) that manages to set the bit does the TB resync. All other threads wait until the TB resync is complete and then proceed.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
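A sketch of the shared per-core state the commit describes; the names follow the commit text, and the exact kernel layout may differ.

```
#define MAX_SUBCORE_PER_CORE	4
#define CORE_TB_RESYNC_REQ_BIT	63	/* flags bit claimed by the one thread that resyncs the TB */

/* One instance per core, pointed to from each thread's paca. */
struct sibling_subcore_state {
	unsigned long	flags;				/* coordination bitmap (bit 63 as above) */
	u8		in_guest[MAX_SUBCORE_PER_CORE];	/* indexed by subcore id: 1 = in guest */
};
```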
-
Committed by Thomas Huth

vcpu->arch.shadow_srr1 only contains usable values for injecting a program exception into the guest if we entered the function kvmppc_handle_exit_pr() with exit_nr == BOOK3S_INTERRUPT_PROGRAM. In other cases, the shadow_srr1 bits are zero. Since we want to pass an illegal-instruction program check to the guest, set "flags" to SRR1_PROGILL for these other cases.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Committed by Thomas Huth

If kvmppc_handle_exit_pr() calls kvmppc_emulate_instruction() to emulate one instruction (in the BOOK3S_INTERRUPT_H_EMUL_ASSIST case), it calls kvmppc_core_queue_program() afterwards if kvmppc_emulate_instruction() returned EMULATE_FAIL, so the guest gets a program interrupt for the illegal opcode. However, kvmppc_emulate_instruction() also tried to inject a program exception for this already, so the program interrupt gets injected twice and the return address in srr0 gets destroyed.

All other callers of kvmppc_emulate_instruction() also inject a program interrupt, and since the callers have the right knowledge about the srr1 flags that should be used, it is the function kvmppc_emulate_instruction() that should _not_ inject program interrupts, so remove the kvmppc_core_queue_program() here.

This fixes the issue discovered by Laurent Vivier with kvm-unit-tests, where the logs were filled with these messages when the test tried to execute an illegal instruction:

    Couldn't emulate instruction 0x00000000 (op 0 xop 0)
    kvmppc_handle_exit_pr: emulation at 700 failed (00000000)

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 14 Jun 2016, 1 commit

Committed by Michael Ellerman

We're approaching 20 locations where we need to check for ELF ABI v2. That's fine, except the logic is a bit awkward, because we have to check that _CALL_ELF is defined and then what its value is. So check it once in asm/types.h and define PPC64_ELF_ABI_v2 when ELF ABI v2 is detected.

We also have a few places where what we're really trying to check is that we are using the 64-bit v1 ABI, i.e. function descriptors. So also add a #define for that, which simplifies several checks.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
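A sketch of the one-time detection this describes; treat it as illustrative rather than the exact hunk added to asm/types.h.

```
#ifdef __powerpc64__
#if defined(_CALL_ELF) && _CALL_ELF == 2
#define PPC64_ELF_ABI_v2
#else
#define PPC64_ELF_ABI_v1	/* 64-bit v1 ABI, i.e. function descriptors */
#endif
#endif /* __powerpc64__ */

/* Call sites then collapse from the awkward two-step test to: */
#ifdef PPC64_ELF_ABI_v2
	/* ELFv2-only code */
#endif
```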
-
- 13 May 2016, 1 commit

Committed by Christian Borntraeger

Some wakeups should not be considered a successful poll. For example, on s390 I/O interrupts are usually floating, which means that _ALL_ CPUs would be considered runnable - letting all vCPUs poll all the time for transactional-like workloads, even if one vCPU would be enough. This can result in huge CPU usage for large guests.

This patch lets architectures provide a way to qualify wakeups as good or bad with regard to polling. For s390, the implementation will fence off halt polling for anything but known-good, single-vCPU events. The s390 implementation for floating interrupts does a wakeup for one vCPU, but the interrupt will be delivered by whatever CPU checks first for a pending interrupt. We prefer the woken-up CPU by marking the poll of this CPU as a "good" poll. This code will also mark several other wakeup reasons like IPI or expired timers as "good", and will of course also mark some events as not successful. As KVM on z always runs as a 2nd-level hypervisor, we prefer not to poll unless we are really sure, though.

This patch successfully limits the CPU usage for cases like a uperf 1-byte transactional ping-pong workload or wakeup-heavy workloads like OLTP, while still providing a proper speedup. It also introduces a new vcpu stat, "halt_poll_no_tuning", that counts wakeups that are considered not good for polling.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
Cc: David Matlack <dmatlack@google.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
[Rename config symbol. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 12 May 2016, 1 commit

Committed by Paul Mackerras

Commit c9a5ecca ("kvm/eventfd: add arch-specific set_irq", 2015-10-16) added the possibility for architecture-specific code to handle the generation of virtual interrupts in atomic context where possible, without having to schedule a work function. Since we can easily generate virtual interrupts on XICS without having to do anything worse than take a spinlock, we define a kvm_arch_set_irq_inatomic() for XICS. We also remove kvm_set_msi() since it is not used any more.

The one slightly tricky thing is that with the new interface, we don't get told whether the interrupt is an MSI (or other edge-sensitive interrupt) or level-sensitive. The difference as far as interrupt generation is concerned is that for LSIs we have to set the asserted flag so the interrupt will continue to fire until it is explicitly cleared. In fact the XICS code gets told which interrupts are LSIs by userspace when it configures the interrupt via the KVM_DEV_XICS_GRP_SOURCES attribute group on the XICS device. To store this information, we add a new "lsi" field to struct ics_irq_state. With that we can also do a better job of returning accurate values when reading the attribute group.

Signed-off-by: Paul Mackerras <paulus@samba.org>
-
- 11 May 2016, 4 commits

Committed by Gavin Shan

When CONFIG_KVM_XICS is enabled, CPU_UP_PREPARE and the other macros for CPU states in linux/cpu.h are needed by arch/powerpc/kvm/book3s_hv.c. Otherwise, a build error like the one below is seen:

    gwshan@gwshan:~/sandbox/l$ make arch/powerpc/kvm/book3s_hv.o
      :
      CC      arch/powerpc/kvm/book3s_hv.o
    arch/powerpc/kvm/book3s_hv.c: In function ‘kvmppc_cpu_notify’:
    arch/powerpc/kvm/book3s_hv.c:3072:7: error: ‘CPU_UP_PREPARE’ undeclared (first use in this function)

This fixes the issue introduced by commit <6f3bb809> ("KVM: PPC: Book3S HV: kvmppc_host_rm_ops - handle offlining CPUs").

Fixes: 6f3bb809
Cc: stable@vger.kernel.org # v4.6
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
-
Committed by Paul Mackerras

When the guest does a sign-extending load instruction (such as lha or lwa) to an emulated MMIO location, it results in a call to kvmppc_handle_loads() in the host. That function sets the vcpu->arch.mmio_sign_extend flag and calls kvmppc_handle_load() to do the rest of the work. However, kvmppc_handle_load() sets the mmio_sign_extend flag to 0 unconditionally, so the sign extension never gets done.

To fix this, we rename kvmppc_handle_load() to __kvmppc_handle_load() and add an explicit parameter to indicate whether sign extension is required. kvmppc_handle_load() and kvmppc_handle_loads() then become 1-line functions that just call __kvmppc_handle_load() with the extra parameter.

Reported-by: Bin Lu <lblulb@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
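A sketch of the resulting wrapper split; the parameter list follows the usual kvmppc MMIO helpers but may be abridged, and the body of the common helper is elided.

```
static int __kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
				unsigned int rt, unsigned int bytes,
				int is_default_endian, int sign_extend)
{
	/* ... common emulated-MMIO load setup ... */
	vcpu->arch.mmio_sign_extend = sign_extend;	/* no longer forced to 0 */
	/* ... */
	return EMULATE_DO_MMIO;
}

int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
		       unsigned int rt, unsigned int bytes, int is_default_endian)
{
	return __kvmppc_handle_load(run, vcpu, rt, bytes, is_default_endian, 0);
}

/* Same, but with sign extension (lha, lwa, ...). */
int kvmppc_handle_loads(struct kvm_run *run, struct kvm_vcpu *vcpu,
			unsigned int rt, unsigned int bytes, int is_default_endian)
{
	return __kvmppc_handle_load(run, vcpu, rt, bytes, is_default_endian, 1);
}
```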
-
Committed by Alexey Kardashevskiy

When XICS_DBG is enabled, gcc produces format errors. This fixes the format strings to match the types of the values passed.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
-
Committed by Laurent Vivier

Until now, when we connect gdb to the QEMU gdb-server, single-step mode is not handled. This patch adds it, for kvm-pr only: if KVM_GUESTDBG_SINGLESTEP is set, we enable the single-step trace bit in the MSR (MSR_SE) just before __kvmppc_vcpu_run(), and disable it just after. In kvmppc_handle_exit_pr(), instead of routing the interrupt to the guest, we return to the host with the KVM_EXIT_DEBUG exit reason.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
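A minimal sketch of the MSR_SE toggling described above; the helper names and their exact placement are assumptions, only the mechanism follows the text.

```
/* Before entering the guest: turn on single-step if requested by the debugger. */
static void pr_setup_singlestep(struct kvm_vcpu *vcpu)
{
	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
		kvmppc_set_msr(vcpu, kvmppc_get_msr(vcpu) | MSR_SE);
}

/* After __kvmppc_vcpu_run() returns: clear it again so the guest never sees it. */
static void pr_clear_singlestep(struct kvm_vcpu *vcpu)
{
	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
		kvmppc_set_msr(vcpu, kvmppc_get_msr(vcpu) & ~MSR_SE);
}
```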
-
- 01 May 2016, 2 commits

Committed by Aneesh Kumar K.V

PowerISA 3.0 adds a partition table indexed by LPID. The partition table allows us to specify the MMU model that will be used for guest and host translation.

This patch adds support with the SLB-based hash model (UPRT = 0). What is required with this model is to support the new hash page table entry format and also to set up the partition table such that we use the hash table for address translation. We don't have segment table support yet.

In order to make sure we don't load the KVM module on Power9 (since we don't have KVM support yet), this patch also disables KVM on Power9.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Committed by Aneesh Kumar K.V

PowerISA 3.0 introduces two pte bits with the following meaning for radix:

    00 -> Normal Memory
    01 -> Strong Access Order (SAO)
    10 -> Non-idempotent I/O (cache inhibited and guarded)
    11 -> Tolerant I/O (cache inhibited)

We drop the existing WIMG bits in the Linux page table in favour of the above constants. We lose _PAGE_WRITETHRU with this conversion; we only use writethru via pgprot_cached_wthru(), which is used by fbdev/controlfb.c (the Apple control display) and also PPC32. With respect to _PAGE_COHERENCE, we have been marking hptes always coherent for some time now; htab_convert_pte_flags() always added HPTE_R_M.

NOTE: The KVM changes need closer review.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
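The encoding above maps naturally onto two adjacent pte bits; a hedged sketch (the macro names mirror the description, the bit positions are illustrative).

```
/* Two-bit memory-attribute encoding replacing the old WIMG bits. */
#define _PAGE_SAO		0x00010	/* 01: strong access order */
#define _PAGE_NON_IDEMPOTENT	0x00020	/* 10: non-idempotent I/O (CI + guarded) */
#define _PAGE_TOLERANT		0x00030	/* 11: tolerant I/O (cache inhibited) */
/* 00 (neither bit set): normal, cacheable memory */
```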
-
- 22 Mar 2016, 3 commits

Committed by Paolo Bonzini

Build on 32-bit PPC fails with the following error:

     int kvm_vfio_ops_init(void)
         ^
    In file included from arch/powerpc/kvm/../../../virt/kvm/vfio.c:21:0:
    arch/powerpc/kvm/../../../virt/kvm/vfio.h:8:90: note: previous definition of ‘kvm_vfio_ops_init’ was here
    arch/powerpc/kvm/../../../virt/kvm/vfio.c:292:6: error: redefinition of ‘kvm_vfio_ops_exit’
     void kvm_vfio_ops_exit(void)
          ^
    In file included from arch/powerpc/kvm/../../../virt/kvm/vfio.c:21:0:
    arch/powerpc/kvm/../../../virt/kvm/vfio.h:12:91: note: previous definition of ‘kvm_vfio_ops_exit’ was here
    scripts/Makefile.build:258: recipe for target arch/powerpc/kvm/../../../virt/kvm/vfio.o failed
    make[3]: *** [arch/powerpc/kvm/../../../virt/kvm/vfio.o] Error 1

Check whether CONFIG_KVM_VFIO is set before including vfio.o in the build.

Reported-by: Pranith Kumar <bobby.prani@gmail.com>
Tested-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Lan Tianyu

The barrier also orders the write to mode before any reads to the page tables are done, so update the comment.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Alexey Kardashevskiy

Upcoming in-kernel VFIO acceleration needs different handling in real and virtual modes, which makes it hard to support both modes in the same handler. This creates a copy of kvmppc_rm_h_stuff_tce and kvmppc_rm_h_put_tce in addition to the existing kvmppc_rm_h_put_tce_indirect.

This also fixes linker breakage when only PR KVM was selected (leaving HV KVM off): the kvmppc_h_put_tce/kvmppc_h_stuff_tce functions would not compile at all and the link would fail.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 08 Mar 2016, 1 commit

Committed by Paul Mackerras

Thomas Huth discovered that a guest could cause a hard hang of a host CPU by setting the Instruction Authority Mask Register (IAMR) to a suitable value. It turns out that this is because, when the code was added to context-switch the new special-purpose registers (SPRs) that were added in POWER8, we forgot to add code to ensure that they were restored to a sane value on guest exit. This adds code to set those registers, where a bad value could compromise the execution of the host kernel, to a suitable neutral value on guest exit.

Cc: stable@vger.kernel.org # v3.14+
Fixes: b005255e
Reported-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
-
- 03 Mar 2016, 1 commit

Committed by Aneesh Kumar K.V

No code changes.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 02 Mar 2016, 3 commits

Committed by Alexey Kardashevskiy

The existing KVM_CREATE_SPAPR_TCE only supports 32-bit windows, which is not enough for directly mapped windows as the guest can get more than 4GB.

This adds a KVM_CREATE_SPAPR_TCE_64 ioctl and advertises it via the KVM_CAP_SPAPR_TCE_64 capability. The table size is checked against the locked memory limit.

Since 64-bit windows are to support Dynamic DMA windows (DDW), let's add @bus_offset and @page_shift, which are also required by DDW.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
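For reference, the shape of the 64-bit argument structure this implies, as a sketch; the authoritative layout lives in the uapi kvm.h header.

```
/* Argument for the KVM_CREATE_SPAPR_TCE_64 ioctl (sketch). */
struct kvm_create_spapr_tce_64 {
	__u64	liobn;
	__u32	page_shift;	/* IOMMU page size, e.g. 12 for 4KiB, 16 for 64KiB */
	__u32	flags;
	__u64	offset;		/* bus offset of the window, in pages */
	__u64	size;		/* window size, in pages */
};

/* Userspace would first confirm KVM_CAP_SPAPR_TCE_64, then:
 *	ioctl(vm_fd, KVM_CREATE_SPAPR_TCE_64, &args);
 */
```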
-
Committed by Alexey Kardashevskiy

This enables the userspace view of TCE tables to start from a non-zero offset on a bus. This will be used for huge DMA windows.

This only changes the internal structure; the user interface needs to change in order to use an offset.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
-
Committed by Alexey Kardashevskiy

At the moment the kvmppc_spapr_tce_table struct can only describe 4GB windows and handle fixed-size (4K) pages. Dynamic DMA windows support more, so these limits need to be extended.

This replaces window_size (in bytes, 4GB max) with page_shift (32-bit) and size (64-bit, in pages). This should cause no behavioural change, as this is changing the internal structures only - the user interface still only allows one to create a 32-bit table with 4KiB pages at this stage.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
-
- 01 Mar 2016, 1 commit

Committed by Adam Buchbinder

Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 29 Feb 2016, 3 commits

Committed by Suresh E. Warrier

Redirecting the wakeup of a VCPU from the H_IPI hypercall to a core running in the host is usually a good idea; most workloads seemed to benefit. However, in one heavily interrupt-driven SMT1 workload, some regression was observed. This patch adds a kvm_hv module parameter called h_ipi_redirect to control this feature. The default value for this tunable is 1, that is, the feature is enabled.

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
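A sketch of what such a module parameter typically looks like; the declaration below is illustrative and not necessarily the exact lines added to kvm_hv.

```
/* Tunable: redirect H_IPI wakeups to a free host core (1 = enabled, default). */
static int h_ipi_redirect = 1;
module_param(h_ipi_redirect, int, 0644);
MODULE_PARM_DESC(h_ipi_redirect, "Redirect H_IPI wakeup of a VCPU to a free host core");
```

At runtime such a parameter would then be adjustable via /sys/module/kvm_hv/parameters/h_ipi_redirect.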
-
Committed by Suresh E. Warrier

This patch adds support to real-mode KVM to search for a core running in the host partition and send it an IPI message with the VCPU to be woken. This avoids having to switch to the host partition to complete an H_IPI hypercall when the VCPU which is the target of the H_IPI is not loaded (is not running in the guest).

The patch also includes support in the IPI handler running in the host to do the wakeup by calling kvmppc_xics_ipi_action for the PPC_MSG_RM_HOST_ACTION message.

When a guest is being destroyed, we need to ensure that there are no pending IPIs waiting to wake up a VCPU before we free the VCPUs of the guest. This is accomplished by:
- Forcing a PPC_MSG_CALL_FUNCTION IPI to be completed by all CPUs before freeing any VCPUs in kvm_arch_destroy_vm().
- Executing any PPC_MSG_RM_HOST_ACTION messages before any other PPC_MSG_CALL_FUNCTION messages.

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
-
Committed by Suresh Warrier

This patch adds support for the kick-VCPU operation to kvmppc_host_rm_ops. The kvmppc_xics_ipi_action() function provides the function to be invoked for a host-side operation when poked by real-mode KVM. This is initiated by KVM by sending an IPI to any free host core.

KVM real mode must set the rm_action to XICS_RM_KICK_VCPU and rm_data to point to the VCPU to be woken up before sending the IPI. Note that we have allocated one kvmppc_host_rm_core structure per core; the above values need to be set in the structure corresponding to the core to which the IPI will be sent.

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
-