- 10 6月, 2009 1 次提交
-
-
由 Joerg Roedel 提交于
There is no reason to update the shadow pte here because the guest pte is only changed to dirty state. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 24 3月, 2009 5 次提交
-
-
由 Andrea Arcangeli 提交于
When kvm emulates an invlpg instruction, it can drop a shadow pte, but leaves the guest tlbs intact. This can cause memory corruption when swapping out. Without this the other cpu can still write to a freed host physical page. tlb smp flush must happen if rmap_remove is called always before mmu_lock is released because the VM will take the mmu_lock before it can finally add the page to the freelist after swapout. mmu notifier makes it safe to flush the tlb after freeing the page (otherwise it would never be safe) so we can do a single flush for multiple sptes invalidated. Cc: stable@kernel.org Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Acked-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Joerg Roedel 提交于
In the paging_fetch function rmap_remove is called after setting a large pte to non-present. This causes rmap_remove to not drop the reference to the large page. The result is a memory leak of that page. Cc: stable@kernel.org Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Acked-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Avi Kivity 提交于
This actually describes what is going on, rather than alerting the reader that something strange is going on. Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Avi Kivity 提交于
Effectively reverting to the pre walk_shadow() version -- but now with the reusable for_each(). Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 31 12月, 2008 3 次提交
-
-
由 Marcelo Tosatti 提交于
The invlpg and sync walkers lack knowledge of large host sptes, descending to non-existant pagetable level. Stop at directory level in such case. Fixes SMP Windows XP with hugepages. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Marcelo Tosatti 提交于
If the guest executes invlpg, peek into the pagetable and attempt to prepopulate the shadow entry. Also stop dirty fault updates from interfering with the fork detector. 2% improvement on RHEL3/AIM7. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Marcelo Tosatti 提交于
Skip syncing global pages on cr3 switch (but not on cr4/cr0). This is important for Linux 32-bit guests with PAE, where the kmap page is marked as global. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 26 11月, 2008 1 次提交
-
-
由 Marcelo Tosatti 提交于
It is possible for a shadow page to have a parent link pointing to a freed page. When zapping a high level table, kvm_mmu_page_unlink_children fails to remove the parent_pte link. For that to happen, the child must be unreachable via the shadow tree, which can happen in shadow_walk_entry if the guest pte was modified in between walk() and fetch(). Remove the parent pte reference in such case. Possible cause for oops in bug #2217430. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 15 10月, 2008 8 次提交
-
-
由 Marcelo Tosatti 提交于
Allow guest pagetables to go out of sync. Instead of emulating write accesses to guest pagetables, or unshadowing them, we un-write-protect the page table and allow the guest to modify it at will. We rely on invlpg executions to synchronize individual ptes, and will synchronize the entire pagetable on tlb flushes. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Marcelo Tosatti 提交于
With pages out of sync invlpg needs to be trapped. For now simply nuke the entry. Untested on AMD. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Marcelo Tosatti 提交于
Examine guest pagetable and bring the shadow back in sync. Caller is responsible for local TLB flush before re-entering guest mode. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Marcelo Tosatti 提交于
It is necessary to flush all TLB's when a large spte entry is overwritten with a normal page directory pointer. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Marcelo Tosatti 提交于
Convert gfn_to_pfn to use get_user_pages_fast, which can do lockless pagetable lookups on x86. Kernel compilation on 4-way guest is 3.7% faster on VMX. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Sheng Yang 提交于
EPT is 4 level by default in 32pae(48 bits), but the addr parameter of kvm_shadow_walk->entry() only accept unsigned long as virtual address, which is 32bit in 32pae. This result in SHADOW_PT_INDEX() overflow when try to fetch level 4 index. Fix it by extend kvm_shadow_walk->entry() to accept 64bit addr in parameter. Signed-off-by: NSheng Yang <sheng.yang@intel.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
It is not specific to the paging mode, so can be made global (and reusable). Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 25 8月, 2008 1 次提交
-
-
由 Avi Kivity 提交于
The shadow code assigns a pte directly in one place, which is nonatomic on i386 can can cause random memory references. Fix by using an atomic setter. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 29 7月, 2008 1 次提交
-
-
由 Andrea Arcangeli 提交于
Synchronize changes to host virtual addresses which are part of a KVM memory slot to the KVM shadow mmu. This allows pte operations like swapping, page migration, and madvise() to transparently work with KVM. Signed-off-by: NAndrea Arcangeli <andrea@qumranet.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 20 7月, 2008 1 次提交
-
-
由 Avi Kivity 提交于
Instead of reading each pte individually, read 256 bytes worth of ptes and batch process them. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 07 6月, 2008 1 次提交
-
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 27 4月, 2008 7 次提交
-
-
由 Anthony Liguori 提交于
This patch introduces a gfn_to_pfn() function and corresponding functions like kvm_release_pfn_dirty(). Using these new functions, we can modify the x86 MMU to no longer assume that it can always get a struct page for any given gfn. We don't want to eliminate gfn_to_page() entirely because a number of places assume they can do gfn_to_page() and then kmap() the results. When we support IO memory, gfn_to_page() will fail for IO pages although gfn_to_pfn() will succeed. This does not implement support for avoiding reference counting for reserved RAM or for IO memory. However, it should make those things pretty straight forward. Since we're only introducing new common symbols, I don't think it will break the non-x86 architectures but I haven't tested those. I've tested Intel, AMD, NPT, and hugetlbfs with Windows and Linux guests. [avi: fix overflow when shifting left pfns by adding casts] Signed-off-by: NAnthony Liguori <aliguori@us.ibm.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Marcelo Tosatti 提交于
Unify slots_lock acquision around vcpu_run(). This is simpler and less error-prone. Also fix some callsites that were not grabbing the lock properly. [avi: drop slots_lock while in guest mode to avoid holding the lock for indefinite periods] Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
If we populate a shadow pte due to a fault (and not speculatively due to a pte write) then we can set the accessed bit on it, as we know it will be set immediately on the next guest instruction. This saves a read-modify-write operation. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Harvey Harrison 提交于
__FUNCTION__ is gcc-specific, use __func__ Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Marcelo Tosatti 提交于
Create large pages mappings if the guest PTE's are marked as such and the underlying memory is hugetlbfs backed. If the largepage contains write-protected pages, a large pte is not used. Gives a consistent 2% improvement for data copies on ram mounted filesystem, without NPT/EPT. Anthony measures a 4% improvement on 4-way kernbench, with NPT. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Currently an mmio guest pte is encoded in the shadow pagetable as a not-present trapping pte, with the SHADOW_IO_MARK bit set. However nothing is ever done with this information, so maintaining it is a useless complication. This patch moves the check for mmio to before shadow ptes are instantiated, so the shadow code is never invoked for ptes that reference mmio. The code is simpler, and with future work, can be made to handle mmio concurrently. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Dong, Eddie 提交于
A guest partial guest pte write will leave shadow_trap_nonpresent_pte in spte, which generates a vmexit at the next guest access through that pte. This patch improves this by reading the full guest pte in advance and thus being able to update the spte and eliminate the vmexit. This helps pae guests which use two 32-bit writes to set a single 64-bit pte. [truncation fix by Eric] Signed-off-by: NYaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: NFeng (Eric) Liu <eric.e.liu@intel.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 04 3月, 2008 3 次提交
-
-
由 Avi Kivity 提交于
For improved concurrency, the guest walk is performed concurrently with other vcpus. This means that we need to revalidate the guest ptes once we have write-protected the guest page tables, at which point they can no longer be modified. The current code attempts to avoid this check if the shadow page table is not new, on the assumption that if it has existed before, the guest could not have modified the pte without the shadow lock. However the assumption is incorrect, as the racing vcpu could have modified the pte, then instantiated the shadow page, before our vcpu regains control: vcpu0 vcpu1 fault walk pte modify pte fault in same pagetable instantiate shadow page lookup shadow page conclude it is old instantiate spte based on stale guest pte We could do something clever with generation counters, but a test run by Marcelo suggests this is unnecessary and we can just do the revalidation unconditionally. The pte will be in the processor cache and the check can be quite fast. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Marcelo Tosatti 提交于
the cr3 variable is now inside the vcpu->arch structure. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Izik Eidus 提交于
This patch replaces the mmap_sem lock for the memory slots with a new kvm private lock, it is needed beacuse untill now there were cases where kvm accesses user memory while holding the mmap semaphore. Signed-off-by: NIzik Eidus <izike@qumranet.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 31 1月, 2008 7 次提交
-
-
由 Dong, Eddie 提交于
Remove the redundant level check when fetching shadow pte for present & non-present spte. Signed-off-by: NYaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
If some other cpu steals mmu pages between our check and an attempt to allocate, we can run out of mmu pages. Fix by moving the check into the same critical section as the allocation. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Marcelo Tosatti 提交于
Convert the synchronization of the shadow handling to a separate mmu_lock spinlock. Also guard fetch() by mmap_sem in read-mode to protect against alias and memslot changes. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Since gfn_to_page() is a sleeping function, and we want to make the core mmu spinlocked, we need to pass the page from the walker context (which can sleep) to the shadow context (which cannot). [marcelo: avoid recursive locking of mmap_sem] Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Marcelo Tosatti 提交于
In preparation for a mmu spinlock, add kvm_read_guest_atomic() and use it in fetch() and prefetch_page(). Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Marcelo Tosatti 提交于
Do not hold kvm->lock mutex across the entire pagefault code, only acquire it in places where it is necessary, such as mmu hash list, active list, rmap and parent pte handling. Allow concurrent guest walkers by switching walk_addr() to use mmap_sem in read-mode. And get rid of the lockless __gfn_to_page. [avi: move kvm_mmu_pte_write() locking inside the function] [avi: add locking for real mode] [avi: fix cmpxchg locking] Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
This paves the way for multiple architecture support. Note that while ioapic.c could potentially be shared with ia64, it is also moved. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 30 1月, 2008 1 次提交
-
-
由 Zhang Xiantao 提交于
Move all the architecture-specific fields in kvm_vcpu into a new struct kvm_vcpu_arch. Signed-off-by: NZhang Xiantao <xiantao.zhang@intel.com> Acked-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-