- 30 10月, 2012 4 次提交
-
-
由 Paul Mackerras 提交于
If a thread in a virtual core becomes runnable while other threads in the same virtual core are already running in the guest, it is possible for the latecomer to join the others on the core without first pulling them all out of the guest. Currently this only happens rarely, when a vcpu is first started. This fixes some bugs and omissions in the code in this case. First, we need to check for VPA updates for the latecomer and make a DTL entry for it. Secondly, if it comes along while the master vcpu is doing a VPA update, we don't need to do anything since the master will pick it up in kvmppc_run_core. To handle this correctly we introduce a new vcore state, VCORE_STARTING. Thirdly, there is a race because we currently clear the hardware thread's hwthread_req before waiting to see it get to nap. A latecomer thread could have its hwthread_req cleared before it gets to test it, and therefore never increment the nap_count, leading to messages about wait_for_nap timeouts. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Paul Mackerras 提交于
There were a few places where we were traversing the list of runnable threads in a virtual core, i.e. vc->runnable_threads, without holding the vcore spinlock. This extends the places where we hold the vcore spinlock to cover everywhere that we traverse that list. Since we possibly need to sleep inside kvmppc_book3s_hv_page_fault, this moves the call of it from kvmppc_handle_exit out to kvmppc_vcpu_run, where we don't hold the vcore lock. In kvmppc_vcore_blocked, we don't actually need to check whether all vcpus are ceded and don't have any pending exceptions, since the caller has already done that. The caller (kvmppc_run_vcpu) wasn't actually checking for pending exceptions, so we add that. The change of if to while in kvmppc_run_vcpu is to make sure that we never call kvmppc_remove_runnable() when the vcore state is RUNNING or EXITING. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Paul Mackerras 提交于
When a Book3S HV KVM guest is running, we need the host to be in single-thread mode, that is, all of the cores (or at least all of the cores where the KVM guest could run) to be running only one active hardware thread. This is because of the hardware restriction in POWER processors that all of the hardware threads in the core must be in the same logical partition. Complying with this restriction is much easier if, from the host kernel's point of view, only one hardware thread is active. This adds two hooks in the SMP hotplug code to allow the KVM code to make sure that secondary threads (i.e. hardware threads other than thread 0) cannot come online while any KVM guest exists. The KVM code still has to check that any core where it runs a guest has the secondary threads offline, but having done that check it can now be sure that they will not come online while the guest is running. Signed-off-by: NPaul Mackerras <paulus@samba.org> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexander Graf 提交于
The new uapi framework splits kernel internal and user space exported bits of header files more cleanly. Adjust the ePAPR header accordingly. Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 06 10月, 2012 18 次提交
-
-
由 Paul Mackerras 提交于
The PAPR paravirtualization interface lets guests register three different types of per-vCPU buffer areas in its memory for communication with the hypervisor. These are called virtual processor areas (VPAs). Currently the hypercalls to register and unregister VPAs are handled by KVM in the kernel, and userspace has no way to know about or save and restore these registrations across a migration. This adds "register" codes for these three areas that userspace can use with the KVM_GET/SET_ONE_REG ioctls to see what addresses have been registered, and to register or unregister them. This will be needed for guest hibernation and migration, and is also needed so that userspace can unregister them on reset (otherwise we corrupt guest memory after reboot by writing to the VPAs registered by the previous kernel). The "register" for the VPA is a 64-bit value containing the address, since the length of the VPA is fixed. The "registers" for the SLB shadow buffer and dispatch trace log (DTL) are 128 bits long, consisting of the guest physical address in the high (first) 64 bits and the length in the low 64 bits. This also fixes a bug where we were calling init_vpa unconditionally, leading to an oops when unregistering the VPA. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Paul Mackerras 提交于
This enables userspace to get and set all the guest floating-point state using the KVM_[GS]ET_ONE_REG ioctls. The floating-point state includes all of the traditional floating-point registers and the FPSCR (floating point status/control register), all the VMX/Altivec vector registers and the VSCR (vector status/control register), and on POWER7, the vector-scalar registers (note that each FP register is the high-order half of the corresponding VSR). Most of these are implemented in common Book 3S code, except for VSX on POWER7. Because HV and PR differ in how they store the FP and VSX registers on POWER7, the code for these cases is not common. On POWER7, the FP registers are the upper halves of the VSX registers vsr0 - vsr31. PR KVM stores vsr0 - vsr31 in two halves, with the upper halves in the arch.fpr[] array and the lower halves in the arch.vsr[] array, whereas HV KVM on POWER7 stores the whole VSX register in arch.vsr[]. Signed-off-by: NPaul Mackerras <paulus@samba.org> [agraf: fix whitespace, vsx compilation] Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Paul Mackerras 提交于
This enables userspace to get and set various SPRs (special-purpose registers) using the KVM_[GS]ET_ONE_REG ioctls. With this, userspace can get and set all the SPRs that are part of the guest state, either through the KVM_[GS]ET_REGS ioctls, the KVM_[GS]ET_SREGS ioctls, or the KVM_[GS]ET_ONE_REG ioctls. The SPRs that are added here are: - DABR: Data address breakpoint register - DSCR: Data stream control register - PURR: Processor utilization of resources register - SPURR: Scaled PURR - DAR: Data address register - DSISR: Data storage interrupt status register - AMR: Authority mask register - UAMOR: User authority mask override register - MMCR0, MMCR1, MMCRA: Performance monitor unit control registers - PMC1..PMC8: Performance monitor unit counter registers In order to reduce code duplication between PR and HV KVM code, this moves the kvm_vcpu_ioctl_[gs]et_one_reg functions into book3s.c and centralizes the copying between user and kernel space there. The registers that are handled differently between PR and HV, and those that exist only in one flavor, are handled in kvmppc_[gs]et_one_reg() functions that are specific to each flavor. Signed-off-by: NPaul Mackerras <paulus@samba.org> [agraf: minimal style fixes] Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Paul Mackerras 提交于
This adds an implementation of kvm_arch_flush_shadow_memslot for Book3S HV, and arranges for kvmppc_core_commit_memory_region to flush the dirty log when modifying an existing slot. With this, we can handle deletion and modification of memory slots. kvm_arch_flush_shadow_memslot calls kvmppc_core_flush_memslot, which on Book3S HV now traverses the reverse map chains to remove any HPT (hashed page table) entries referring to pages in the memslot. This gets called by generic code whenever deleting a memslot or changing the guest physical address for a memslot. We flush the dirty log in kvmppc_core_commit_memory_region for consistency with what x86 does. We only need to flush when an existing memslot is being modified, because for a new memslot the rmap array (which stores the dirty bits) is all zero, meaning that every page is considered clean already, and when deleting a memslot we obviously don't care about the dirty bits any more. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Paul Mackerras 提交于
Now that we have an architecture-specific field in the kvm_memory_slot structure, we can use it to store the array of page physical addresses that we need for Book3S HV KVM on PPC970 processors. This reduces the size of struct kvm_arch for Book3S HV, and also reduces the size of struct kvm_arch_memory_slot for other PPC KVM variants since the fields in it are now only compiled in for Book3S HV. This necessitates making the kvm_arch_create_memslot and kvm_arch_free_memslot operations specific to each PPC KVM variant. That in turn means that we now don't allocate the rmap arrays on Book3S PR and Book E. Since we now unpin pages and free the slot_phys array in kvmppc_core_free_memslot, we no longer need to do it in kvmppc_core_destroy_vm, since the generic code takes care to free all the memslots when destroying a VM. We now need the new memslot to be passed in to kvmppc_core_prepare_memory_region, since we need to initialize its arch.slot_phys member on Book3S HV. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Mihai Caraman 提交于
The current form of DO_KVM macro restricts its use to one call per input parameter set. This is caused by kvmppc_resume_\intno\()_\srr1 symbol definition. Duplicate calls of DO_KVM are required by distinct implementations of exeption handlers which are delegated at runtime. Use a rare label number to avoid conflicts with the calling contexts. Signed-off-by: NMihai Caraman <mihai.caraman@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Bharat Bhushan 提交于
IAC/DAC are defined as 32 bit while they are 64 bit wide. So ONE_REG interface is added to set/get them. Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Bharat Bhushan 提交于
This patch adds the watchdog emulation in KVM. The watchdog emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl. The kernel timer are used for watchdog emulation and emulates h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how it is configured. Signed-off-by: NLiu Yu <yu.liu@freescale.com> Signed-off-by: NScott Wood <scottwood@freescale.com> [bharat.bhushan@freescale.com: reworked patch] Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> [agraf: adjust to new request framework] Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexander Graf 提交于
Requests may want to tell us that we need to go back into host state, so add a return value for the checks. Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexander Graf 提交于
Today, we disable preemption while inside guest context, because we need to expose to the world that we are not in a preemptible context. However, during that time we already have interrupts disabled, which would indicate that we are in a non-preemptible context. The reason the checks for irqs_disabled() fail for us though is that we manually control hard IRQs and ignore all the lazy EE framework. Let's stop doing that. Instead, let's always use lazy EE to indicate when we want to disable IRQs, but do a special final switch that gets us into EE disabled, but soft enabled state. That way when we get back out of guest state, we are immediately ready to process interrupts. This simplifies the code drastically and reduces the time that we appear as preempt disabled. Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexander Graf 提交于
Now that we have very simple MMU Notifier support for e500 in place, also add the same simple support to book3s. It gets us one step closer to actual fast support. Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexander Graf 提交于
We need to do the same things when preparing to enter a guest for booke and book3s_pr cores. Fold the generic code into a generic function that both call. Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Alexander Graf 提交于
The e500 target has lived without mmu notifiers ever since it got introduced, but fails for the user space check on them with hugetlbfs. So in order to get that one working, implement mmu notifiers in a reasonably dumb fashion and be happy. On embedded hardware, we almost never end up with mmu notifier calls, since most people don't overcommit. Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Liu Yu-B13201 提交于
Signed-off-by: NLiu Yu <yu.liu@freescale.com> Signed-off-by: NStuart Yoder <stuart.yoder@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Liu Yu-B13201 提交于
Signed-off-by: NLiu Yu <yu.liu@freescale.com> [varun: 64-bit changes] Signed-off-by: NVarun Sethi <Varun.Sethi@freescale.com> Signed-off-by: NStuart Yoder <stuart.yoder@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Liu Yu-B13201 提交于
And add a new flag definition in kvm_ppc_pvinfo to indicate whether the host supports the EV_IDLE hcall. Signed-off-by: NLiu Yu <yu.liu@freescale.com> [stuart.yoder@freescale.com: cleanup,fixes for conditions allowing idle] Signed-off-by: NStuart Yoder <stuart.yoder@freescale.com> [agraf: fix typo] Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Stuart Yoder 提交于
Signed-off-by: NStuart Yoder <stuart.yoder@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Stuart Yoder 提交于
Signed-off-by: NStuart Yoder <stuart.yoder@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 03 10月, 2012 2 次提交
-
-
由 Catalin Marinas 提交于
This function is used by sparc, powerpc and arm64 for compat support. The patch adds a generic implementation which calls do_sendfile() directly and avoids set_fs(). The sparc architecture has wrappers for the sign extensions while powerpc relies on the compiler to do the this. The patch adds wrappers for powerpc to handle the u32->int type conversion. compat_sys_sendfile64() can be replaced by a sys_sendfile() call since compat_loff_t has the same size as off_t on a 64-bit system. On powerpc, the patch also changes the 64-bit sendfile call from sys_sendile64 to sys_sendfile. Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Acked-by: NDavid S. Miller <davem@davemloft.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 David Howells 提交于
Convert #include "..." to #include <path/...> in kernel system headers. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: NDave Jones <davej@redhat.com>
-
- 12 9月, 2012 1 次提交
-
-
由 Gavin Shan 提交于
This patch implements pcibios_window_alignment() so powerpc platforms can force P2P bridge windows to be at larger alignments than the PCI spec requires. [bhelgaas: changelog] Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 05 9月, 2012 1 次提交
-
-
由 Paul Mackerras 提交于
The CPU hotplug code for the powernv platform currently only puts offline CPUs into nap mode if the powersave_nap variable is set. However, HV-style KVM on this platform requires secondary CPU threads to be offline and in nap mode. Since we know nap mode works just fine on all POWER7 machines, and the only machines that support the powernv platform are POWER7 machines, this changes the code to always put offline CPUs into nap mode, regardless of powersave_nap. Powersave_nap still controls whether or not CPUs go into nap mode when idle, as before. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 24 8月, 2012 2 次提交
-
-
由 Michael Neuling 提交于
Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Scott Wood 提交于
Add several #includes that mpic_msgr relies on being pulled implicitly, which only happens on certain configs. Signed-off-by: NScott Wood <scottwood@freescale.com> Cc: Meador Inge <meador_inge@mentor.com> Cc: Jia Hongtao <B38951@freescale.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 20 8月, 2012 1 次提交
-
-
由 Frederic Weisbecker 提交于
The archs that implement virtual cputime accounting all flush the cputime of a task when it gets descheduled and sometimes set up some ground initialization for the next task to account its cputime. These archs all put their own hooks in their context switch callbacks and handle the off-case themselves. Consolidate this by creating a new account_switch_vtime() callback called in generic code right after a context switch and that these archs must implement to flush the prev task cputime and initialize the next task cputime related state. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
-
- 16 8月, 2012 1 次提交
-
-
由 Alexander Graf 提交于
When we map a page that wasn't icache cleared before, do so when first mapping it in KVM using the same information bits as the Linux mapping logic. That way we are 100% sure that any page we map does not have stale entries in the icache. Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 06 8月, 2012 1 次提交
-
-
由 Takuya Yoshikawa 提交于
Two reasons: - x86 can integrate rmap and rmap_pde and remove heuristics in __gfn_to_rmap(). - Some architectures do not need rmap. Since rmap is one of the most memory consuming stuff in KVM, ppc'd better restrict the allocation to Book3S HV. Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 31 7月, 2012 1 次提交
-
-
由 Will Deacon 提交于
Rather than #define the options manually in the architecture code, add Kconfig options for them and select them there instead. This also allows us to select the compat IPC version parsing automatically for platforms using the old compat IPC interface. Reported-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 7月, 2012 1 次提交
-
-
由 Marek Szyprowski 提交于
Commit 9adc5374 ('common: dma-mapping: introduce mmap method') added a generic method for implementing mmap user call to dma_map_ops structure. This patch converts ARM and PowerPC architectures (the only providers of dma_mmap_coherent/dma_mmap_writecombine calls) to use this generic dma_map_ops based call and adds a generic cross architecture definition for dma_mmap_attrs, dma_mmap_coherent, dma_mmap_writecombine functions. The generic mmap virt_to_page-based fallback implementation is provided for architectures which don't provide their own implementation for mmap method. Signed-off-by: NMarek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: NKyungmin Park <kyungmin.park@samsung.com>
-
- 24 7月, 2012 1 次提交
-
-
由 Cong Wang 提交于
Signed-off-by: NCong Wang <amwang@redhat.com>
-
- 19 7月, 2012 1 次提交
-
-
由 Takuya Yoshikawa 提交于
When we tested KVM under memory pressure, with THP enabled on the host, we noticed that MMU notifier took a long time to invalidate huge pages. Since the invalidation was done with mmu_lock held, it not only wasted the CPU but also made the host harder to respond. This patch mitigates this by using kvm_handle_hva_range(). Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Cc: Alexander Graf <agraf@suse.de> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 11 7月, 2012 5 次提交
-
-
由 Bharat Bhushan 提交于
Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Haren Myneni 提交于
Some power systems do not have legacy ISA devices. So, /dev/port is not a valid interface on these systems. User level tools such as kbdrate is trying to access the device using this interface which is causing the system crash. This patch will fix this issue by not creating this interface on these powerpc systems. Signed-off-by: NHaren Myneni <haren@us.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Tiejun Chen 提交于
Add "memory" attribute in inline assembly language as a compiler barrier to make sure 4.6.x GCC don't reorder mfmsr(). Signed-off-by: NTiejun Chen <tiejun.chen@windriver.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> CC: stable@vger.kernel.org
-
由 Anton Blanchard 提交于
We have a request for a fast method of getting CPU and NUMA node IDs from userspace. This patch implements a getcpu VDSO function, similar to x86. Ben suggested we use SPRG3 which is userspace readable. SPRG3 can be modified by a KVM guest, so we save the SPRG3 value in the paca and restore it when transitioning from the guest to the host. I have a glibc patch that implements sched_getcpu on top of this. Testing on a POWER7: baseline: 538 cycles vdso: 30 cycles Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Stuart Yoder 提交于
Signed-off-by: NStuart Yoder <stuart.yoder@freescale.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-