- 05 3月, 2012 24 次提交
-
-
由 Scott Wood 提交于
The associativity, not just total size, can differ from the host hardware. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
As Scott put it: > If we get a signal after the check, we want to be sure that we don't > receive the reschedule IPI until after we're in the guest, so that it > will cause another signal check. we need to have interrupts disabled from the point we do signal_check() all the way until we actually enter the guest. This patch fixes potential signal loss races. Reported-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
Our vcpu kick implementation differs a bit from x86 which resulted in us not disabling preemption during the kick. Get it a bit closer to what x86 does. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
When running the 64-bit Book3s PR code without CONFIG_PREEMPT_NONE, we were doing a few things wrong, most notably access to PACA fields without making sure that the pointers stay stable accross the access (preempt_disable()). This patch moves to_svcpu towards a get/put model which allows us to disable preemption while accessing the shadow vcpu fields in the PACA. That way we can run preemptible and everyone's happy! Reported-by: NJörg Sommer <joerg@alea.gnuu.de> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
Somewhere during merges we ended up from local_irq_enable() foo(); local_irq_disable() to always keeping irqs enabled during that part. However, we now have the following code: foo(); local_irq_disable() which disables interrupts without the surrounding code enabling them again! So let's remove that disable and be happy. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
When entering the guest, we want to make sure we're not getting preempted away, so let's disable preemption on entry, but enable it again while handling guest exits. Reported-by: NJörg Sommer <joerg@alea.gnuu.de> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
Decrementers are now properly driven by TCR/TSR, and the guest has full read/write access to these registers. The decrementer keeps ticking (and setting the TSR bit) regardless of whether the interrupts are enabled with TCR. The decrementer stops at zero, rather than going negative. Decrementers (and FITs, once implemented) are delivered as level-triggered interrupts -- dequeued when the TSR bit is cleared, not on delivery. Signed-off-by: NLiu Yu <yu.liu@freescale.com> [scottwood@freescale.com: significant changes] Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
This allows additional registers to be accessed by the guest in PR-mode KVM without trapping. SPRG4-7 are readable from userspace. On booke, KVM will sync these registers when it enters the guest, so that accesses from guest userspace will work. The guest kernel, OTOH, must consistently use either the real registers or the shared area between exits. This also applies to the already-paravirted SPRG3. On non-booke, it's not clear to what extent SPRG4-7 are supported (they're not architected for book3s, but exist on at least some classic chips). They are copied in the get/set regs ioctls, but I do not see any non-booke emulation. I also do not see any syncing with real registers (in PR-mode) including the user-readable SPRG3. This patch should not make that situation any worse. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
int_pending was only being lowered if a bit in pending_exceptions was cleared during exception delivery -- but for interrupts, we clear it during IACK/TSR emulation. This caused paravirt for enabling MSR[EE] to be ineffective. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
This prevents us from inappropriately blocking in a KVM_SET_REGS ioctl -- the MSR[WE] will take effect when the guest is next entered. It also causes SRR1[WE] to be set when we enter the guest's interrupt handler, which is what e500 hardware is documented to do. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
This function should be called with interrupts disabled, to avoid a race where an exception is delivered after we check, but the resched kick is received before we disable interrupts (and thus doesn't actually trigger the exit code that would recheck exceptions). booke already does this properly in the lightweight exit case, but not on initial entry. For now, move the call of prepare_to_enter into subarch-specific code so that booke can do the right thing here. Ideally book3s would do the same thing, but I'm having a hard time seeing where it does any interrupt disabling of this sort (plus it has several additional call sites), so I'm deferring the book3s fix to someone more familiar with that code. book3s behavior should be unchanged by this patch. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
This function also updates paravirt int_pending, so rename it to be more obvious that this is a collection of checks run prior to (re)entering a guest. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
Currently we check prior to returning from a lightweight exit, but not prior to initial entry. book3s already does a similar test. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Bharat Bhushan 提交于
As per specification the decrementer interrupt not happen when DEC is written with 0. Also when DEC is zero, no decrementer running. So we should not start hrtimer for decrementer when DEC = 0. Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Bharat Bhushan 提交于
kvmppc_emulate_dec() uses dec_nsec of type unsigned long and does below calculation: dec_nsec = vcpu->arch.dec; dec_nsec *= 1000; This will truncate if DEC value "vcpu->arch.dec" is greater than 0xffff_ffff/1000. For example : For tb_ticks_per_usec = 4a, we can not set decrementer more than ~58ms. Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Acked-by: NLiu Yu <yu.liu@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
With hugetlbfs support emerging on e500, we should also support KVM backing its guest memory by it. This patch adds support for hugetlbfs into the e500 shadow mmu code. Signed-off-by: NAlexander Graf <agraf@suse.de> Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
The hardcoded behavior prevents proper SMP support. user space shall specify the vcpu's PIR as the vcpu id. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
It should contain the way, not the absolute TLB0 index. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
This implements a shared-memory API for giving host userspace access to the guest's TLB. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
Split out the portions of tlbe_priv that should be associated with host entries into tlbe_ref. Base victim selection on the number of hardware entries, not guest entries. For TLB1, where one guest entry can be mapped by multiple host entries, we use the host tlbe_ref for tracking page references. For the guest TLB0 entries, we still track it with gtlb_priv, to avoid having to retranslate if the entry is evicted from the host TLB but not the guest TLB. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
The only place it makes sense to call this function already needs to have preemption disabled. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
Delay allocation of the shadow pid until we're ready to disable preemption and write the entry. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Carsten Otte 提交于
This patch exports the s390 SIE hardware control block to userspace via the mapping of the vcpu file descriptor. In order to do so, a new arch callback named kvm_arch_vcpu_fault is introduced for all architectures. It allows to map architecture specific pages. Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Carsten Otte 提交于
This patch introduces a new config option for user controlled kernel virtual machines. It introduces a parameter to KVM_CREATE_VM that allows to set bits that alter the capabilities of the newly created virtual machine. The parameter is passed to kvm_arch_init_vm for all architectures. The only valid modifier bit for now is KVM_VM_S390_UCONTROL. This requires CAP_SYS_ADMIN privileges and creates a user controlled virtual machine on s390 architectures. Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 27 12月, 2011 2 次提交
-
-
由 Nishanth Aravamudan 提交于
kvm_rma_init() is only called at boot-time, by setup_arch, which is also __init. Signed-off-by: NNishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Xiao Guangrong 提交于
Introduce id_to_memslot to get memslot by slot id Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 26 12月, 2011 3 次提交
-
-
由 Scott Wood 提交于
This is required for THIS_MODULE. We recently stopped acquiring it via some other header. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Michael Neuling 提交于
Currently kvmppc_start_thread() tries to wake other SMT threads via xics_wake_cpu(). Unfortunately xics_wake_cpu only exists when CONFIG_SMP=Y so when compiling with CONFIG_SMP=N we get: arch/powerpc/kvm/built-in.o: In function `.kvmppc_start_thread': book3s_hv.c:(.text+0xa1e0): undefined reference to `.xics_wake_cpu' The following should be fine since kvmppc_start_thread() shouldn't called to start non-zero threads when SMP=N since threads_per_core=1. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Andreas Schwab 提交于
kvmppc_h_pr is only available if CONFIG_KVM_BOOK3S_64_PR. Signed-off-by: NAndreas Schwab <schwab@linux-m68k.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 08 12月, 2011 1 次提交
-
-
由 Paul Mackerras 提交于
This fixes a problem where a CPU thread coming out of nap mode can think it has valid values in the nonvolatile GPRs (r14 - r31) as saved away in power7_idle, but in fact the values have been trashed because the thread was used for KVM in the mean time. The result is that the thread crashes because code that called power7_idle (e.g., pnv_smp_cpu_kill_self()) goes to use values in registers that have been trashed. The bit field in SRR1 that tells whether state was lost only reflects the most recent nap, which may not have been the nap instruction in power7_idle. So we need an extra PACA field to indicate that state has been lost even if SRR1 indicates that the most recent nap didn't lose state. We clear this field when saving the state in power7_idle, we set it to a non-zero value when we use the thread for KVM, and we test it in power7_wakeup_noloss. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 17 11月, 2011 1 次提交
-
-
由 Alexander Graf 提交于
This reverts commit a15bd354. It exceeded the padding on the SREGS struct, rendering the ABI backwards-incompatible. Conflicts: arch/powerpc/kvm/powerpc.c include/linux/kvm.h Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 16 11月, 2011 1 次提交
-
-
由 Michael Neuling 提交于
If you build with KVM and UP it fails with the following due to a missing include. /arch/powerpc/kvm/book3s_hv.c: In function 'do_h_register_vpa': arch/powerpc/kvm/book3s_hv.c:156:10: error: 'H_PARAMETER' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:156:10: note: each undeclared identifier is reported only once for each function it appears in arch/powerpc/kvm/book3s_hv.c:192:12: error: 'H_RESOURCE' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:222:9: error: 'H_SUCCESS' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c: In function 'kvmppc_pseries_do_hcall': arch/powerpc/kvm/book3s_hv.c:228:30: error: 'H_SUCCESS' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:232:7: error: 'H_CEDE' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:234:7: error: 'H_PROD' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:238:10: error: 'H_PARAMETER' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:250:7: error: 'H_CONFER' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:252:7: error: 'H_REGISTER_VPA' undeclared (first use in this function) make[2]: *** [arch/powerpc/kvm/book3s_hv.o] Error 1 Signed-off-by: NMichael Neuling <mikey@neuling.org> cc: stable@kernel.org (3.1 only) Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 08 11月, 2011 1 次提交
-
-
由 Nishanth Aravamudan 提交于
Fix KVM build for older toolchains (found with .powerpc64-unknown-linux-gnu-gcc (crosstool-NG-1.8.1) 4.3.2): AS arch/powerpc/kvm/book3s_hv_rmhandlers.o arch/powerpc/kvm/book3s_hv_rmhandlers.S: Assembler messages: arch/powerpc/kvm/book3s_hv_rmhandlers.S:1388: Error: Unrecognized opcode: `popcntw' make[1]: *** [arch/powerpc/kvm/book3s_hv_rmhandlers.o] Error 1 make: *** [_module_arch/powerpc/kvm] Error 2 Signed-off-by: NNishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 01 11月, 2011 4 次提交
-
-
由 Paul Gortmaker 提交于
None of the files touched here are modules, and they are not exporting any symbols either -- so there is no need to be including the module.h. Builds of all the files remains successful. Even kernel/module.c does not need to include it, since it includes linux/moduleloader.h instead. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Paul Gortmaker 提交于
All these files were including module.h just for the basic EXPORT_SYMBOL infrastructure. We can shift them off to the export.h header which is a way smaller footprint and thus realize some compile time gains. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Paul Gortmaker 提交于
Fix failures in powerpc associated with the previously allowed implicit module.h presence that now lead to things like this: arch/powerpc/mm/mmu_context_hash32.c:76:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL' arch/powerpc/mm/tlb_hash32.c:48:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL' arch/powerpc/kernel/pci_32.c:51:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL' arch/powerpc/kernel/iomap.c:36:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL' arch/powerpc/platforms/44x/canyonlands.c:126:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL' arch/powerpc/kvm/44x.c:168:59: error: 'THIS_MODULE' undeclared (first use in this function) [with several contibutions from Stephen Rothwell <sfr@canb.auug.org.au>] Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Paul Gortmaker 提交于
With module.h being implicitly everywhere via device.h, the absence of explicitly including something for EXPORT_SYMBOL went unnoticed. Since we are heading to fix things up and clean module.h from the device.h file, we need to explicitly include these files now. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
- 26 9月, 2011 3 次提交
-
-
由 Paul Mackerras 提交于
With a KVM guest operating in SMT4 mode (i.e. 4 hardware threads per core), whenever a CPU goes idle, we have to pull all the other hardware threads in the core out of the guest, because the H_CEDE hcall is handled in the kernel. This is inefficient. This adds code to book3s_hv_rmhandlers.S to handle the H_CEDE hcall in real mode. When a guest vcpu does an H_CEDE hcall, we now only exit to the kernel if all the other vcpus in the same core are also idle. Otherwise we mark this vcpu as napping, save state that could be lost in nap mode (mainly GPRs and FPRs), and execute the nap instruction. When the thread wakes up, because of a decrementer or external interrupt, we come back in at kvm_start_guest (from the system reset interrupt vector), find the `napping' flag set in the paca, and go to the resume path. This has some other ramifications. First, when starting a core, we now start all the threads, both those that are immediately runnable and those that are idle. This is so that we don't have to pull all the threads out of the guest when an idle thread gets a decrementer interrupt and wants to start running. In fact the idle threads will all start with the H_CEDE hcall returning; being idle they will just do another H_CEDE immediately and go to nap mode. This required some changes to kvmppc_run_core() and kvmppc_run_vcpu(). These functions have been restructured to make them simpler and clearer. We introduce a level of indirection in the wait queue that gets woken when external and decrementer interrupts get generated for a vcpu, so that we can have the 4 vcpus in a vcore using the same wait queue. We need this because the 4 vcpus are being handled by one thread. Secondly, when we need to exit from the guest to the kernel, we now have to generate an IPI for any napping threads, because an HDEC interrupt doesn't wake up a napping thread. Thirdly, we now need to be able to handle virtual external interrupts and decrementer interrupts becoming pending while a thread is napping, and deliver those interrupts to the guest when the thread wakes. This is done in kvmppc_cede_reentry, just before fast_guest_return. Finally, since we are not using the generic kvm_vcpu_block for book3s_hv, and hence not calling kvm_arch_vcpu_runnable, we can remove the #ifdef from kvm_arch_vcpu_runnable. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Paul Mackerras 提交于
This simplifies the way that the book3s_pr makes the transition to real mode when entering the guest. We now call kvmppc_entry_trampoline (renamed from kvmppc_rmcall) in the base kernel using a normal function call instead of doing an indirect call through a pointer in the vcpu. If kvm is a module, the module loader takes care of generating a trampoline as it does for other calls to functions outside the module. kvmppc_entry_trampoline then disables interrupts and jumps to kvmppc_handler_trampoline_enter in real mode using an rfi[d]. That then uses the link register as the address to return to (potentially in module space) when the guest exits. This also simplifies the way that we call the Linux interrupt handler when we exit the guest due to an external, decrementer or performance monitor interrupt. Instead of turning on the MMU, then deciding that we need to call the Linux handler and turning the MMU back off again, we now go straight to the handler at the point where we would turn the MMU on. The handler will then return to the virtual-mode code (potentially in the module). Along the way, this moves the setting and clearing of the HID5 DCBZ32 bit into real-mode interrupts-off code, and also makes sure that we clear the MSR[RI] bit before loading values into SRR0/1. The net result is that we no longer need any code addresses to be stored in vcpu->arch. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Paul Mackerras 提交于
This makes arch/powerpc/kvm/book3s_rmhandlers.S and arch/powerpc/kvm/book3s_hv_rmhandlers.S be assembled as separate compilation units rather than having them #included in arch/powerpc/kernel/exceptions-64s.S. We no longer have any conditional branches between the exception prologs in exceptions-64s.S and the KVM handlers, so there is no need to keep their contents close together in the vmlinux image. In their current location, they are using up part of the limited space between the first-level interrupt handlers and the firmware NMI data area at offset 0x7000, and with some kernel configurations this area will overflow (e.g. allyesconfig), leading to an "attempt to .org backwards" error when compiling exceptions-64s.S. Moving them out requires that we add some #includes that the book3s_{,hv_}rmhandlers.S code was previously getting implicitly via exceptions-64s.S. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-