- 24 10月, 2010 35 次提交
-
-
由 Joerg Roedel 提交于
This patch adds code to initialize the Nested Nested Paging MMU context when the L1 guest executes a VMRUN instruction and has nested paging enabled in its VMCB. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
With Nested Paging emulation the NX state between the two MMU contexts may differ. To make sure that always the right fault error code is recorded this patch moves the NX state into struct kvm_mmu so that the code can distinguish between L1 and L2 NX state. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
Currently the KVM softmmu implementation can not shadow a 32 bit legacy or PAE page table with a long mode page table. This is a required feature for nested paging emulation because the nested page table must alway be in host format. So this patch implements the missing pieces to allow long mode page tables for page table types. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This patch factors out the direct-mapping paths of the mmu_alloc_roots function into a seperate function. This makes it a lot easier to avoid all the unnecessary checks done in the shadow path which may break when running direct. In fact, this patch already fixes a problem when running PAE guests on a PAE shadow page table. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This function is implemented to load the pdptr pointers of the currently running guest (l1 or l2 guest). Therefore it takes care about the current paging mode and can read pdptrs out of l2 guest physical memory. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This patch introduces the init_kvm_nested_mmu() function which is used to re-initialize the nested mmu when the l2 guest changes its paging mode. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This patch adds the functions to do a nested l2_gva to l1_gpa page table walk. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This patch introduces the walk_mmu pointer which points to the mmu-context currently used for gva_to_gpa translations. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This patch introduces a struct with two new fields in vcpu_arch for x86: * fault.address * fault.error_code This will be used to correctly propagate page faults back into the guest when we could have either an ordinary page fault or a nested page fault. In the case of a nested page fault the fault-address is different from the original address that should be walked. So we need to keep track about the real fault-address. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This patch changes is_rsvd_bits_set() function prototype to take only a kvm_mmu context instead of a full vcpu. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
Some logic of the init_kvm_softmmu function is required to build the Nested Nested Paging context. So factor the required logic into a seperate function and export it. Also make the whole init path suitable for more than one mmu context. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This patch introduces an inject_page_fault function pointer into struct kvm_mmu which will be used to inject a page fault. This will be used later when Nested Nested Paging is implemented. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This function pointer in the MMU context is required to implement Nested Nested Paging. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This patch introduces a special set_tdp_cr3 function pointer in kvm_x86_ops which is only used for tpd enabled mmu contexts. This allows to remove some hacks from svm code. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This is necessary to implement Nested Nested Paging. As a side effect this allows some cleanups in the SVM nested paging code. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This patch changes the tdp_enabled flag from its global meaning to the mmu-context and renames it to direct_map there. This is necessary for Nested SVM with emulation of Nested Paging where we need an extra MMU context to shadow the Nested Nested Page Table. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
This patch fixes 32 bit legacy paging with NPT enabled. The mmu_check_root call on the top-level of the loop causes root_gfn to take values (in the tdp_enabled path) which are outside of guest memory. So the mmu_check_root call fails at some point in the loop interation causing the guest to tiple-fault. This patch changes the mmu_check_root calls to the places where they are really necessary. As a side-effect it introduces a check for the root of a pae page table too. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Xiao Guangrong 提交于
Move the audit code from arch/x86/kvm/mmu.c to arch/x86/kvm/mmu_audit.c Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Add a r/w module parameter named 'mmu_audit', it can control audit enable/disable: enable: echo 1 > /sys/module/kvm/parameters/mmu_audit disable: echo 0 > /sys/module/kvm/parameters/mmu_audit This patch not change the logic Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Nothing is checked in count_rmaps(), so remove it Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
There is a bugs in this function, we call gfn_to_pfn() and kvm_mmu_gva_to_gpa_read() in atomic context(kvm_mmu_audit() is called under the spinlock(mmu_lock)'s protection). This patch fix it by: - introduce gfn_to_pfn_atomic instead of gfn_to_pfn - get the mapping gfn from kvm_mmu_page_get_gfn() And it adds 'notrap' ptes check in unsync/direct sps Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
The audit code reports some sp not write protected in current code, it's just the bug in audit_write_protection(), since: - the invalid sp not need write protected - using uninitialize local variable('gfn') - call kvm_mmu_audit() out of mmu_lock's protection Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
The read-only spte also has reverse mapping, so fix the code to check them, also modify the function name to fit its doing Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
fix: arch/x86/kvm/mmu.c: In function ‘kvm_mmu_unprotect_page’: arch/x86/kvm/mmu.c:1741: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 3 has type ‘gfn_t’ arch/x86/kvm/mmu.c:1745: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 3 has type ‘gfn_t’ arch/x86/kvm/mmu.c: In function ‘mmu_unshadow’: arch/x86/kvm/mmu.c:1761: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 3 has type ‘gfn_t’ arch/x86/kvm/mmu.c: In function ‘set_spte’: arch/x86/kvm/mmu.c:2005: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 3 has type ‘gfn_t’ arch/x86/kvm/mmu.c: In function ‘mmu_set_spte’: arch/x86/kvm/mmu.c:2033: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 7 has type ‘gfn_t’ Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Support prefetch ptes when intercept guest #PF, avoid to #PF by later access If we meet any failure in the prefetch path, we will exit it and not try other ptes to avoid become heavy path Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Wei Yongjun 提交于
commit ad05c88266b4cce1c820928ce8a0fb7690912ba1 (KVM: create aggregate kvm_total_used_mmu_pages value) introduce percpu counter kvm_total_used_mmu_pages but never destroy it, this may cause oops when rmmod & modprobe. Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com> Acked-by: NTim Pepper <lnxninja@linux.vnet.ibm.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Xiaotian Feng 提交于
Latest kvm mmu_shrink code rework makes kernel changes kvm->arch.n_used_mmu_pages/ kvm->arch.n_max_mmu_pages at kvm_mmu_free_page/kvm_mmu_alloc_page, which is called by kvm_mmu_commit_zap_page. So the kvm->arch.n_used_mmu_pages or kvm_mmu_available_pages(vcpu->kvm) is unchanged after kvm_mmu_prepare_zap_page(), This caused kvm_mmu_change_mmu_pages/__kvm_mmu_free_some_pages loops forever. Moving kvm_mmu_commit_zap_page would make the while loop performs as normal. Reported-by: NAvi Kivity <avi@redhat.com> Signed-off-by: NXiaotian Feng <dfeng@redhat.com> Tested-by: NAvi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Tim Pepper <lnxninja@linux.vnet.ibm.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Dave Hansen 提交于
Of slab shrinkers, the VM code says: * Note that 'shrink' will be passed nr_to_scan == 0 when the VM is * querying the cache size, so a fastpath for that case is appropriate. and it *means* it. Look at how it calls the shrinkers: nr_before = (*shrinker->shrink)(0, gfp_mask); shrink_ret = (*shrinker->shrink)(this_scan, gfp_mask); So, if you do anything stupid in your shrinker, the VM will doubly punish you. The mmu_shrink() function takes the global kvm_lock, then acquires every VM's kvm->mmu_lock in sequence. If we have 100 VMs, then we're going to take 101 locks. We do it twice, so each call takes 202 locks. If we're under memory pressure, we can have each cpu trying to do this. It can get really hairy, and we've seen lock spinning in mmu_shrink() be the dominant entry in profiles. This is guaranteed to optimize at least half of those lock aquisitions away. It removes the need to take any of the locks when simply trying to count objects. A 'percpu_counter' can be a large object, but we only have one of these for the entire system. There are not any better alternatives at the moment, especially ones that handle CPU hotplug. Signed-off-by: NDave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: NTim Pepper <lnxninja@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Dave Hansen 提交于
Doing this makes the code much more readable. That's borne out by the fact that this patch removes code. "used" also happens to be the number that we need to return back to the slab code when our shrinker gets called. Keeping this value as opposed to free makes the next patch simpler. So, 'struct kvm' is kzalloc()'d. 'struct kvm_arch' is a structure member (and not a pointer) of 'struct kvm'. That means they start out zeroed. I _think_ they get initialized properly by kvm_mmu_change_mmu_pages(). But, that only happens via kvm ioctls. Another benefit of storing 'used' intead of 'free' is that the values are consistent from the moment the structure is allocated: no negative "used" value. Signed-off-by: NDave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: NTim Pepper <lnxninja@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Dave Hansen 提交于
arch.n_alloc_mmu_pages is a poor choice of name. This value truly means, "the number of pages which _may_ be allocated". But, reading the name, "n_alloc_mmu_pages" implies "the number of allocated mmu pages", which is dead wrong. It's really the high watermark, so let's give it a name to match: nr_max_mmu_pages. This change will make the next few patches much more obvious and easy to read. Signed-off-by: NDave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: NTim Pepper <lnxninja@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Dave Hansen 提交于
"free" is a poor name for this value. In this context, it means, "the number of mmu pages which this kvm instance should be able to allocate." But "free" implies much more that the objects are there and ready for use. "available" is a much better description, especially when you see how it is calculated. In this patch, we abstract its use into a function. We'll soon replace the function's contents by calculating the value in a different way. All of the reads of n_free_mmu_pages are taken care of in this patch. The modification sites will be handled in a patch later in the series. Signed-off-by: NDave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: NTim Pepper <lnxninja@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Mark page dirty only when this page is really written, it's more exacter, and also can fix dirty page marking in speculation path Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Introduce spte_has_volatile_bits() function to judge whether spte bits will miss, it's more readable and can help us to cleanup code later Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
It's a small cleanup that using using kvm_set_pfn_accessed() instead of mark_page_accessed() Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
After commit 53383eaad08d, the '*spte' has updated before call rmap_remove()(in most case it's 'shadow_trap_nonpresent_pte'), so remove this information from error message Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 07 8月, 2010 1 次提交
-
-
由 H. Peter Anvin 提交于
KVM ended up having to put a pretty ugly wrapper around set_64bit() in order to get the type right. Now set_64bit() takes the expected u64 type, and this wrapper can be cleaned up. Signed-off-by: NH. Peter Anvin <hpa@zytor.com> Cc: Avi Kivity <avi@redhat.com> LKML-Reference: <4C5C4E7A.8040603@kernel.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 02 8月, 2010 4 次提交
-
-
由 Xiao Guangrong 提交于
Sometimes, atomically set spte is not needed, this patch call __xchg_spte() more smartly Note: if the old mapping's access bit is already set, we no need atomic operation since the access bit is not lost Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
Introduce set_spte_track_bits() to cleanup current code Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
If the old mapping is not present, the spte.a is not lost, so no need atomic operation to set it Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
In sync-page path, if spte.writable is changed, it will lose page dirty tracking, for example: assume spte.writable = 0 in a unsync-page, when it's synced, it map spte to writable(that is spte.writable = 1), later guest write spte.gfn, it means spte.gfn is dirty, then guest changed this mapping to read-only, after it's synced, spte.writable = 0 So, when host release the spte, it detect spte.writable = 0 and not mark page dirty Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-