- 01 8月, 2010 16 次提交
-
-
由 Gui Jianfeng 提交于
Since we have is_writable_pte(), make use of it. Signed-off-by: NGui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Lai Jiangshan 提交于
In Document/kvm/mmu.txt: gfn: Either the guest page table containing the translations shadowed by this page, or the base page frame for linear translations. See role.direct. But in __direct_map(), the base gfn calculation is incorrect, it does not calculate correctly when level=3 or 4. Fix by using PT64_LVL_ADDR_MASK() which accounts for all levels correctly. Reported-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Lai Jiangshan 提交于
When sp->role.direct is set, sp->gfns does not contain any essential information, leaf sptes reachable from this sp are for a continuous guest physical memory range (a linear range). So sp->gfns[i] (if it was set) equals to sp->gfn + i. (PT_PAGE_TABLE_LEVEL) Obviously, it is not essential information, we can calculate it when need. It means we don't need sp->gfns when sp->role.direct=1, Thus we can save one page usage for every kvm_mmu_page. Note: Access to sp->gfns must be wrapped by kvm_mmu_page_get_gfn() or kvm_mmu_page_set_gfn(). It is only exposed in FNAME(sync_page). Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Allow more page become asynchronous at getting sp time, if need create new shadow page for gfn but it not allow unsync(level > 1), we should unsync all gfn's unsync page Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
In current code, shadow page can become asynchronous only if one shadow page for a gfn, this rule is too strict, in fact, we can let all last mapping page(i.e, it's the pte page) become unsync, and sync them at invlpg or flush tlb time. This patch allow more page become asynchronous at gfn mapping time Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Two cases maybe happen in kvm_mmu_get_page() function: - one case is, the goal sp is already in cache, if the sp is unsync, we only need update it to assure this mapping is valid, but not mark it sync and not write-protect sp->gfn since it not broke unsync rule(one shadow page for a gfn) - another case is, the goal sp not existed, we need create a new sp for gfn, i.e, gfn (may)has another shadow page, to keep unsync rule, we should sync(mark sync and write-protect) gfn's unsync shadow page. After enabling multiple unsync shadows, we sync those shadow pages only when the new sp not allow to become unsync(also for the unsyc rule, the new rule is: allow all pte page become unsync) Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Split kvm_sync_page() into kvm_sync_page() and kvm_sync_page_transient() to clarify the code address Avi's suggestion kvm_sync_page_transient() function only update shadow page but not mark it sync and not write protect sp->gfn. it will be used by later patch Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Remove rmap before clear spte otherwise it will trigger BUG_ON() in some functions such as rmap_write_protect(). Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Xiao Guangrong 提交于
Use kmem_cache_free to free objects allocated by kmem_cache_alloc. Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Sheng Yang 提交于
mmu.free() already set root_hpa to INVALID_PAGE, no need to do it again in the destory_kvm_mmu(). kvm_x86_ops->set_cr4() and set_efer() already assign cr4/efer to vcpu->arch.cr4/efer, no need to do it again later. Signed-off-by: NSheng Yang <sheng@linux.intel.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Marcelo Tosatti 提交于
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Avi Kivity 提交于
We drop the mmu lock between freeing memory and allocating the roots; this allows some other vcpu to sneak in and allocate memory. While the race is benign (resulting only in temporary overallocation, not oom) it is simple and easy to fix by moving the freeing close to the allocation. Signed-off-by: NAvi Kivity <avi@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Gleb Natapov 提交于
Do not kill VM when instruction emulation fails. Inject #UD and report failure to userspace instead. Userspace may choose to reenter guest if vcpu is in userspace (cpl == 3) in which case guest OS will kill offending process and continue running. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Gui Jianfeng 提交于
Currently, kvm_mmu_zap_page() returning the number of freed children sp. This might confuse the caller, because caller don't know the actual freed number. Let's make kvm_mmu_zap_page() return the number of pages it actually freed. Signed-off-by: NGui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Huang Ying 提交于
In common cases, guest SRAO MCE will cause corresponding poisoned page be un-mapped and SIGBUS be sent to QEMU-KVM, then QEMU-KVM will relay the MCE to guest OS. But it is reported that if the poisoned page is accessed in guest after unmapping and before MCE is relayed to guest OS, userspace will be killed. The reason is as follows. Because poisoned page has been un-mapped, guest access will cause guest exit and kvm_mmu_page_fault will be called. kvm_mmu_page_fault can not get the poisoned page for fault address, so kernel and user space MMIO processing is tried in turn. In user MMIO processing, poisoned page is accessed again, then userspace is killed by force_sig_info. To fix the bug, kvm_mmu_page_fault send HWPOISON signal to QEMU-KVM and do not try kernel and user space MMIO processing for poisoned page. [xiao: fix warning introduced by avi] Reported-by: NMax Asbock <masbock@linux.vnet.ibm.com> Signed-off-by: NHuang Ying <ying.huang@intel.com> Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 19 7月, 2010 1 次提交
-
-
由 Dave Chinner 提交于
The current shrinker implementation requires the registered callback to have global state to work from. This makes it difficult to shrink caches that are not global (e.g. per-filesystem caches). Pass the shrinker structure to the callback so that users can embed the shrinker structure in the context the shrinker needs to operate on and get back to it in the callback via container_of(). Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 13 7月, 2010 1 次提交
-
-
由 Xiao Guangrong 提交于
After remove a rmap, we should flush all vcpu's tlb Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 09 6月, 2010 2 次提交
-
-
由 Avi Kivity 提交于
If cr0.wp=0, we have to allow the guest kernel access to a page with pte.w=0. We do that by setting spte.w=1, since the host cr0.wp must remain set so the host can write protect pages. Once we allow write access, we must remove user access otherwise we mistakenly allow the user to write the page. Reviewed-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Marcelo Tosatti 提交于
Always invalidate spte and flush TLBs when changing page size, to make sure different sized translations for the same address are never cached in a CPU's TLB. Currently the only case where this occurs is when a non-leaf spte pointer is overwritten by a leaf, large spte entry. This can happen after dirty logging is disabled on a memslot, for example. Noticed by Andrea. KVM-Stable-Tag Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 19 5月, 2010 5 次提交
-
-
由 Avi Kivity 提交于
When cr0.wp=0, we may shadow a gpte having u/s=1 and r/w=0 with an spte having u/s=0 and r/w=1. This allows excessive access if the guest sets cr0.wp=1 and accesses through this spte. Fix by making cr0.wp part of the base role; we'll have different sptes for the two cases and the problem disappears. Signed-off-by: NAvi Kivity <avi@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Avi Kivity 提交于
On svm, kvm_read_pdptr() may require reading guest memory, which can sleep. Push the spinlock into mmu_alloc_roots(), and only take it after we've read the pdptr. Tested-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Move unsync/sync tracepoints to the proper place, it's good for us to obtain unsync page live time Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Gui Jianfeng 提交于
kvm_mmu_remove_one_alloc_mmu_page() assumes kvm_mmu_zap_page() only reclaims only one sp, but that's not the case. This will cause mmu shrinker returns a wrong number. This patch fix the counting error. Signed-off-by: NGui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Eric Northup 提交于
For TDP mode, avoid creating multiple page table roots for the single guest-to-host physical address map by fixing the inputs used for the shadow page table hash in mmu_alloc_roots(). Signed-off-by: NEric Northup <digitaleric@google.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 17 5月, 2010 15 次提交
-
-
由 Wei Yongjun 提交于
Since gfn is not changed in the for loop, we do not need to call gfn_to_memslot_unaliased() under the loop, and it is safe to move it out. Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Gui Jianfeng 提交于
Nobody use gva_to_page() anymore, get rid of it. Signed-off-by: NGui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Gui Jianfeng 提交于
Remove unused varialbe in rmap_next() Signed-off-by: NGui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Lai Jiangshan 提交于
The RCU/SRCU API have already changed for proving RCU usage. I got the following dmesg when PROVE_RCU=y because we used incorrect API. This patch coverts rcu_deference() to srcu_dereference() or family API. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- arch/x86/kvm/mmu.c:3020 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by qemu-system-x86/8550: #0: (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa011a6ac>] kvm_set_memory_region+0x29/0x50 [kvm] #1: (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<ffffffffa012262d>] kvm_arch_commit_memory_region+0xa6/0xe2 [kvm] stack backtrace: Pid: 8550, comm: qemu-system-x86 Not tainted 2.6.34-rc4-tip-01028-g939eab1 #27 Call Trace: [<ffffffff8106c59e>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffffa012f6c1>] kvm_mmu_calculate_mmu_pages+0x44/0x7d [kvm] [<ffffffffa012263e>] kvm_arch_commit_memory_region+0xb7/0xe2 [kvm] [<ffffffffa011a5d7>] __kvm_set_memory_region+0x636/0x6e2 [kvm] [<ffffffffa011a6ba>] kvm_set_memory_region+0x37/0x50 [kvm] [<ffffffffa015e956>] vmx_set_tss_addr+0x46/0x5a [kvm_intel] [<ffffffffa0126592>] kvm_arch_vm_ioctl+0x17a/0xcf8 [kvm] [<ffffffff810a8692>] ? unlock_page+0x27/0x2c [<ffffffff810bf879>] ? __do_fault+0x3a9/0x3e1 [<ffffffffa011b12f>] kvm_vm_ioctl+0x364/0x38d [kvm] [<ffffffff81060cfa>] ? up_read+0x23/0x3d [<ffffffff810f3587>] vfs_ioctl+0x32/0xa6 [<ffffffff810f3b19>] do_vfs_ioctl+0x495/0x4db [<ffffffff810e6b2f>] ? fget_light+0xc2/0x241 [<ffffffff810e416c>] ? do_sys_open+0x104/0x116 [<ffffffff81382d6d>] ? retint_swapgs+0xe/0x13 [<ffffffff810f3ba6>] sys_ioctl+0x47/0x6a [<ffffffff810021db>] system_call_fastpath+0x16/0x1b Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Quote from Avi: |Just change the assignment to a 'goto restart;' please, |I don't like playing with list_for_each internals. Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Xiao Guangrong 提交于
'vcpu' is unused, remove it Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Xiao Guangrong 提交于
Remove 'struct kvm_unsync_walk' since it's not used. Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Avi Kivity 提交于
There is no real distinction between glevels=3 and glevels=4; both have exactly the same format and the code is treated exactly the same way. Drop role.glevels and replace is with role.cr4_pae (which is meaningful). This simplifies the code a bit. As a side effect, it allows sharing shadow page tables between pae and longmode guest page tables at the same guest page. Signed-off-by: NAvi Kivity <avi@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Xiao Guangrong 提交于
kvm_mmu_page.oos_link is not used, so remove it Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Xiao Guangrong 提交于
This patch does: - 'sp' parameter in inspect_spte_fn() is not used, so remove it - fix 'kvm' and 'slots' is not defined in count_rmaps() - fix a bug in inspect_spte_has_rmap() Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Avi Kivity 提交于
Direct maps are linear translations for a section of memory, used for real mode or with large pages. As such, they are independent of the guest levels. Teach the mmu about this by making page->role.glevels = 0 for direct maps. This allows direct maps to be shared among real mode and the various paging modes. Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
- Check reserved bits only if CR4.PAE=1 or CR4.PSE=1 when guest #PF occurs - Fix a typo in reset_rsvds_bits_mask() Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Avi Kivity 提交于
Commit fb341f57 removed the pte prefetch on guest invlpg, citing guest races. However, the SDM is adamant that prefetch is allowed: "The processor may create entries in paging-structure caches for translations required for prefetches and for accesses that are a result of speculative execution that would never actually occur in the executed code path." And, in fact, there was a race in the prefetch code: we picked up the pte without the mmu lock held, so an older invlpg could install the pte over a newer invlpg. Reinstate the prefetch logic, but this time note whether another invlpg has executed using a counter. If a race occured, do not install the pte. Signed-off-by: NAvi Kivity <avi@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Avi Kivity 提交于
kvm_mmu_pte_write() reads guest ptes in two different occasions, both to allow a 32-bit pae guest to update a pte with 4-byte writes. Consolidate these into a single read, which also allows us to consolidate another read from an invlpg speculating a gpte into the shadow page table. Signed-off-by: NAvi Kivity <avi@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Minchan Kim 提交于
The prep_new_page() in page allocator calls set_page_private(page, 0). So we don't need to reinitialize private of page. Signed-off-by: NMinchan Kim <minchan.kim@gmail.com> Cc: Avi Kivity<avi@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-