- 17 6月, 2013 4 次提交
-
-
由 Christian Borntraeger 提交于
Lets use the common waitqueue for kvm cpus on s390. By itself it is just a cleanup, but it should also improve the accuracy of diag 0x44 which is implemented via kvm_vcpu_on_spin. kvm_vcpu_on_spin has an explicit check for waiting on the waitqueue to optimize the yielding. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Michael Mueller 提交于
cleanup of arch specific code to use common code provided vcpu slab cache instead of kzalloc() provided memory Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Christian Borntraeger 提交于
This patch enables kvm to give large pages to the guest. The heavy lifting is done by the hardware, the host only has to take care of the PFMF instruction, which is also part of EDAT-1. We also support the non-quiescing key setting facility if the host supports it, to behave similar to the interpretation of sske. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Christian Borntraeger 提交于
From time to time we need to set the guest storage key. Lets provide a helper function that handles the changes with all the right locking and checking. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 12 6月, 2013 1 次提交
-
-
由 Marcelo Tosatti 提交于
Its possible that idivl overflows (due to large delta stored in usdiff, valid scenario). Create an exception handler to catch the overflow exception (division by zero is protected by vcpu->arch.virtual_tsc_khz check), and interpret it accordingly (delta is larger than USEC_PER_SEC). Fixes https://bugzilla.redhat.com/show_bug.cgi?id=969644Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 05 6月, 2013 11 次提交
-
-
由 Gleb Natapov 提交于
Quote Gleb's mail: | why don't we check for sp->role.invalid in | kvm_mmu_prepare_zap_page before calling kvm_reload_remote_mmus()? and | Actually we can add check for is_obsolete_sp() there too since | kvm_mmu_invalidate_all_pages() already calls kvm_reload_remote_mmus() | after incrementing mmu_valid_gen. [ Xiao: add some comments and the check of is_obsolete_sp() ] Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
As Marcelo pointed out that | "(retention of large number of pages while zapping) | can be fatal, it can lead to OOM and host crash" We introduce a list, kvm->arch.zapped_obsolete_pages, to link all the pages which are deleted from the mmu cache but not actually freed. When page reclaiming is needed, we always zap this kind of pages first. Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
kvm_zap_obsolete_pages uses lock-break technique to zap pages, it will flush tlb every time when it does lock-break We can reload mmu on all vcpus after updating the generation number so that the obsolete pages are not used on any vcpus, after that we do not need to flush tlb when obsolete pages are zapped It will do kvm_mmu_prepare_zap_page many times and use one kvm_mmu_commit_zap_page to collapse tlb flush, the side-effects is that causes obsolete pages unlinked from active_list but leave on hash-list, so we add the comment around the hash list walker Note: kvm_mmu_commit_zap_page is still needed before free the pages since other vcpus may be doing locklessly shadow page walking Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
Zap at lease 10 pages before releasing mmu-lock to reduce the overload caused by requiring lock After the patch, kvm_zap_obsolete_pages can forward progress anyway, so update the comments [ It improves the case 0.6% ~ 1% that do kernel building meanwhile read PCI ROM. ] Note: i am not sure that "10" is the best speculative value, i just guessed that '10' can make vcpu do not spend long time on kvm_zap_obsolete_pages and do not cause mmu-lock too hungry. Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
The obsolete page will be zapped soon, do not reuse it to reduce future page fault Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
It is good for debug and development Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
Show sp->mmu_valid_gen Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
Replace kvm_mmu_zap_all by kvm_mmu_invalidate_zap_all_pages Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
The current kvm_mmu_zap_all is really slow - it is holding mmu-lock to walk and zap all shadow pages one by one, also it need to zap all guest page's rmap and all shadow page's parent spte list. Particularly, things become worse if guest uses more memory or vcpus. It is not good for scalability In this patch, we introduce a faster way to invalidate all shadow pages. KVM maintains a global mmu invalid generation-number which is stored in kvm->arch.mmu_valid_gen and every shadow page stores the current global generation-number into sp->mmu_valid_gen when it is created When KVM need zap all shadow pages sptes, it just simply increase the global generation-number then reload root shadow pages on all vcpus. Vcpu will create a new shadow page table according to current kvm's generation-number. It ensures the old pages are not used any more. Then the obsolete pages (sp->mmu_valid_gen != kvm->arch.mmu_valid_gen) are zapped by using lock-break technique Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
It is the responsibility of kvm_mmu_zap_all that keeps the consistent of mmu and tlbs. And it is also unnecessary after zap all mmio sptes since no mmio spte exists on root shadow page and it can not be cached into tlb Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
Quote Gleb's mail: | Back then kvm->lock protected memslot access so code like: | | mutex_lock(&vcpu->kvm->lock); | kvm_mmu_zap_all(vcpu->kvm); | mutex_unlock(&vcpu->kvm->lock); | | which is what 7aa81cc0 does was enough to guaranty that no vcpu will | run while code is patched. This is no longer the case and | mutex_lock(&vcpu->kvm->lock); is gone from that code path long time ago, | so now kvm_mmu_zap_all() there is useless and the code is incorrect. So we drop it and it will be fixed later Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 04 6月, 2013 1 次提交
-
-
由 Amos Kong 提交于
We can easily reach the 1000 limit by start VM with a couple hundred I/O devices (multifunction=on). The hardcode limit already been adjusted 3 times (6 ~ 200 ~ 300 ~ 1000). In userspace, we already have maximum file descriptor to limit ioeventfd count. But kvm_io_bus devices also are used for pit, pic, ioapic, coalesced_mmio. They couldn't be limited by maximum file descriptor. Currently only ioeventfds take too much kvm_io_bus devices, so just exclude it from counting kvm_io_range limit. Also fixed one indent issue in kvm_host.h Signed-off-by: NAmos Kong <akong@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 03 6月, 2013 1 次提交
-
-
由 Cornelia Huck 提交于
Providing a "devname:kvm" module alias enables automatic loading of the kvm module when /dev/kvm is opened. Signed-off-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 21 5月, 2013 16 次提交
-
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi.kivity@gmail.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi.kivity@gmail.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Avi Kivity 提交于
Since DIV and IDIV can generate exceptions, we need an additional output parameter indicating whether an execption has occured. To avoid increasing register pressure on i386, we use %rsi, which is already allocated for the fastop code pointer. Gleb: added comment about fop usage as exception indication. Signed-off-by: NAvi Kivity <avi.kivity@gmail.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi.kivity@gmail.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Avi Kivity 提交于
This makes OpAccHi useful. Signed-off-by: NAvi Kivity <avi.kivity@gmail.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi.kivity@gmail.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Avi Kivity 提交于
Single-operand MUL and DIV access an extended accumulator: AX for byte instructions, and DX:AX, EDX:EAX, or RDX:RAX for larger-sized instructions. Add support for fetching the extended accumulator. In order not to change things too much, RDX is loaded into Src2, which is already loaded by fastop(). This avoids increasing register pressure on i386. Gleb: disable src writeback for ByteOp div/mul. Signed-off-by: NAvi Kivity <avi.kivity@gmail.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Avi Kivity 提交于
Some instructions write back the source operand, not just the destination. Add support for doing this via the decode flags. Gleb: add BUG_ON() to prevent source to be memory operand. Signed-off-by: NAvi Kivity <avi.kivity@gmail.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Christian Borntraeger 提交于
On heavy paging load some guest cpus started to loop in gmap_ipte_notify. This was visible as stalled cpus inside the guest. The gmap_ipte_notifier tries to map a user page and then made sure that the pte is valid and writable. Turns out that with the software change bit tracking the pte can become read-only (and only software writable) if the page is clean. Since we loop in this code, the page would stay clean and, therefore, be never writable again. Let us just use fixup_user_fault, that guarantees to call handle_mm_fault. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Martin Schwidefsky 提交于
Do not automatically restart the sie instruction in entry64.S after an interrupt, return to the caller with a reason code instead. That allows to deal with RCU and other conditions in C code. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Christian Borntraeger 提交于
The guest prefix pages must be mapped writeable all the time while SIE is running, otherwise the guest might see random behaviour. (pinned at the pte level) Turns out that mlocking is not enough, the page table entry (not the page) might change or become r/o. This patch uses the gmap notifiers to kick guest cpus out of SIE. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Christian Borntraeger 提交于
Lets provide functions to prevent KVM from reentering SIE and to kick cpus out of SIE. We cannot use the common kvm_vcpu_kick code, since we need to kick out guests in places that hold architecture specific locks (e.g. pgste lock) which might be necessary on the other cpus - so no waiting possible. So lets provide a bit in a private field of the sie control block that acts as a gate keeper, after we claimed we are in SIE. Please note that we do not reuse prog0c, since we want to access that bit without atomic ops. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Christian Borntraeger 提交于
Lets track in a private bit if the sie control block is active. We want to track this as closely as possible, so we also have to instrument the interrupt and program check handler. Lets use the existing HANDLE_SIE_INTERCEPT macro. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Martin Schwidefsky 提交于
The RCP byte is a part of the PGSTE value, the existing RCP_xxx names are inaccurate. As the defines describe bits and pieces of the PGSTE, the names should start with PGSTE_. The KVM_UR_BIT and KVM_UC_BIT are part of the PGSTE as well, give them better names as well. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Martin Schwidefsky 提交于
The PSW can wrap if the guest has been running in the 24 bit or 31 bit addressing mode. Use __rewind_psw to find the correct address. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Christian Borntraeger 提交于
Dont use the same bit as user referenced. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 19 5月, 2013 2 次提交
-
-
由 Marc Zyngier 提交于
As requested by the KVM maintainers, remove the addprefix used to refer to the main KVM code from the arch code, and replace it with a KVM variable that does the same thing. Tested-by: NChristian Borntraeger <borntraeger@de.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Christoffer Dall <cdall@cs.columbia.edu> Acked-by: NXiantao Zhang <xiantao.zhang@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Alexander Graf <agraf@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Marc Zyngier 提交于
As KVM/arm64 is looming on the horizon, it makes sense to move some of the common code to a single location in order to reduce duplication. The code could live anywhere. Actually, most of KVM is already built with a bunch of ugly ../../.. hacks in the various Makefiles, so we're not exactly talking about style here. But maybe it is time to start moving into a less ugly direction. The include files must be in a "public" location, as they are accessed from non-KVM files (arch/arm/kernel/asm-offsets.c). For this purpose, introduce two new locations: - virt/kvm/arm/ : x86 and ia64 already share the ioapic code in virt/kvm, so this could be seen as a (very ugly) precedent. - include/kvm/ : there is already an include/xen, and while the intent is slightly different, this seems as good a location as any Eventually, we should probably have independant Makefiles at every levels (just like everywhere else in the kernel), but this is just the first step. Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 16 5月, 2013 2 次提交
-
-
由 Gleb Natapov 提交于
Do locking around each case separately instead of having one lock and two unlocks. Move root_hpa assignment out of the lock. Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Marcelo Tosatti 提交于
kvmclock updates which are isolated to a given vcpu, such as vcpu->cpu migration, should not allow system_timestamp from the rest of the vcpus to remain static. Otherwise ntp frequency correction applies to one vcpu's system_timestamp but not the others. So in those cases, request a kvmclock update for all vcpus. The worst case for a remote vcpu to update its kvmclock is then bounded by maximum nohz sleep latency. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 14 5月, 2013 1 次提交
-
-
由 Jan Kiszka 提交于
Since the arrival of posted interrupt support we can no longer guarantee that coalesced IRQs are always reported to the IRQ source. Moreover, accumulated APIC timer events could cause a busy loop when a VCPU should rather be halted. The consensus is to remove coalesced tracking from the LAPIC. Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Acked-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 12 5月, 2013 1 次提交
-
-
由 Takuya Yoshikawa 提交于
No need to open-code this function. Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-