- 28 9月, 2020 9 次提交
-
-
由 Sean Christopherson 提交于
Detect spurious page faults, e.g. page faults that occur when multiple vCPUs simultaneously access a not-present page, and skip the SPTE write, prefetch, and stats update for spurious faults. Note, the performance benefits of skipping the write and prefetch are likely negligible, and the false positive stats adjustment is probably lost in the noise. The primary motivation is to play nice with TDX's SEPT in the long term. SEAMCALLs (to program SEPT entries) are quite costly, e.g. thousands of cycles, and a spurious SEPT update will result in a SEAMCALL error (which KVM will ideally treat as fatal). Reported-by: NKai Huang <kai.huang@intel.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923220425.18402-5-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Introduce RET_PF_FIXED and RET_PF_SPURIOUS to provide unique return values instead of overloading RET_PF_RETRY. In the short term, the unique values add clarity to the code and RET_PF_SPURIOUS will be used by set_spte() to avoid unnecessary work for spurious faults. In the long term, TDX will use RET_PF_FIXED to deterministically map memory during pre-boot. The page fault flow may bail early for benign reasons, e.g. if the mmu_notifier fires for an unrelated address. With only RET_PF_RETRY, it's impossible for the caller to distinguish between "cool, page is mapped" and "darn, need to try again", and thus cannot handle benign cases like the mmu_notifier retry. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923220425.18402-4-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Explicitly check for RET_PF_EMULATE instead of implicitly doing the same by checking for !RET_PF_RETRY (RET_PF_INVALID is handled earlier). This will adding new RET_PF_ types in future patches without breaking the emulation path. No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923220425.18402-3-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Exit to userspace with an error if the MMU is buggy and returns RET_PF_INVALID when servicing a page fault. This will allow a future patch to invert the emulation path, i.e. emulate only on RET_PF_EMULATE instead of emulating on anything but RET_PF_RETRY. This technically means that KVM will exit to userspace instead of emulating on RET_PF_INVALID, but practically speaking it's a nop as the MMU never returns RET_PF_INVALID. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923220425.18402-2-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ben Gardon 提交于
Recursively zap all to-be-orphaned children, unsynced or otherwise, when zapping a shadow page for a nested TDP MMU. KVM currently only zaps the unsynced child pages, but not the synced ones. This can create problems over time when running many nested guests because it leaves unlinked pages which will not be freed until the page quota is hit. With the default page quota of 20 shadow pages per 1000 guest pages, this looks like a memory leak and can degrade MMU performance. In a recent benchmark, substantial performance degradation was observed: An L1 guest was booted with 64G memory. 2G nested Windows guests were booted, 10 at a time for 20 iterations. (200 total boots) Windows was used in this benchmark because they touch all of their memory on startup. By the end of the benchmark, the nested guests were taking ~10% longer to boot. With this patch there is no degradation in boot time. Without this patch the benchmark ends with hundreds of thousands of stale EPT02 pages cluttering up rmaps and the page hash map. As a result, VM shutdown is also much slower: deleting memslot 0 was observed to take over a minute. With this patch it takes just a few miliseconds. Cc: Peter Shier <pshier@google.com> Signed-off-by: NBen Gardon <bgardon@google.com> Co-developed-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923221406.16297-3-sean.j.christopherson@intel.com> Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the logic that controls whether or not FNAME(invlpg) needs to flush fully into FNAME(invlpg) so that mmu_page_zap_pte() doesn't return a value. This allows a future patch to redefine the return semantics for mmu_page_zap_pte() so that it can recursively zap orphaned child shadow pages for nested TDP MMUs. No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923221406.16297-2-sean.j.christopherson@intel.com> Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
To make kvm_mmu_free_roots() a bit more readable, capture 'kvm' in a local variable instead of doing vcpu->kvm over and over (and over). No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923191204.8410-1-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Rename kvm_mmu_is_illegal_gpa() to kvm_vcpu_is_illegal_gpa() and move it to cpuid.h so that's it's colocated with cpuid_maxphyaddr(). The helper is not MMU specific and will gain a user that is completely unrelated to the MMU in a future patch. No functional change intended. Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200924194250.19137-5-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Replace the existing kvm_x86_ops.need_emulation_on_page_fault() with a more generic is_emulatable(), and unconditionally call the new function in x86_emulate_instruction(). KVM will use the generic hook to support multiple security related technologies that prevent emulation in one way or another. Similar to the existing AMD #NPF case where emulation of the current instruction is not possible due to lack of information, AMD's SEV-ES and Intel's SGX and TDX will introduce scenarios where emulation is impossible due to the guest's register state being inaccessible. And again similar to the existing #NPF case, emulation can be initiated by kvm_mmu_page_fault(), i.e. outside of the control of vendor-specific code. While the cause and architecturally visible behavior of the various cases are different, e.g. SGX will inject a #UD, AMD #NPF is a clean resume or complete shutdown, and SEV-ES and TDX "return" an error, the impact on the common emulation code is identical: KVM must stop emulation immediately and resume the guest. Query is_emulatable() in handle_ud() as well so that the force_emulation_prefix code doesn't incorrectly modify RIP before calling emulate_instruction() in the absurdly unlikely scenario that KVM encounters forced emulation in conjunction with "do not emulate". Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200915232702.15945-1-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 12 9月, 2020 1 次提交
-
-
由 Lai Jiangshan 提交于
When kvm_mmu_get_page() gets a page with unsynced children, the spt pagetable is unsynchronized with the guest pagetable. But the guest might not issue a "flush" operation on it when the pagetable entry is changed from zero or other cases. The hypervisor has the responsibility to synchronize the pagetables. KVM behaved as above for many years, But commit 8c8560b8 ("KVM: x86/mmu: Use KVM_REQ_TLB_FLUSH_CURRENT for MMU specific flushes") inadvertently included a line of code to change it without giving any reason in the changelog. It is clear that the commit's intention was to change KVM_REQ_TLB_FLUSH -> KVM_REQ_TLB_FLUSH_CURRENT, so we don't needlessly flush other contexts; however, one of the hunks changed a nearby KVM_REQ_MMU_SYNC instead. This patch changes it back. Link: https://lore.kernel.org/lkml/20200320212833.3507-26-sean.j.christopherson@intel.com/ Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NLai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20200902135421.31158-1-jiangshanlai@gmail.com> fixes: 8c8560b8 ("KVM: x86/mmu: Use KVM_REQ_TLB_FLUSH_CURRENT for MMU specific flushes") Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 24 8月, 2020 1 次提交
-
-
由 Gustavo A. R. Silva 提交于
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-throughSigned-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
-
- 22 8月, 2020 1 次提交
-
-
由 Will Deacon 提交于
The 'flags' field of 'struct mmu_notifier_range' is used to indicate whether invalidate_range_{start,end}() are permitted to block. In the case of kvm_mmu_notifier_invalidate_range_start(), this field is not forwarded on to the architecture-specific implementation of kvm_unmap_hva_range() and therefore the backend cannot sensibly decide whether or not to block. Add an extra 'flags' parameter to kvm_unmap_hva_range() so that architectures are aware as to whether or not they are permitted to block. Cc: <stable@vger.kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: NWill Deacon <will@kernel.org> Message-Id: <20200811102725.7121-2-will@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 31 7月, 2020 5 次提交
-
-
由 Sean Christopherson 提交于
Capture the max TDP level during kvm_configure_mmu() instead of using a kvm_x86_ops hook to do it at every vCPU creation. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200716034122.5998-10-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Rename max_page_level to explicitly call out that it tracks the max huge page level so as to avoid confusion when a future patch moves the max TDP level, i.e. max root level, into the MMU and kvm_configure_mmu(). Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200716034122.5998-9-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Calculate the desired TDP level on the fly using the max TDP level and MAXPHYADDR instead of doing the same when CPUID is updated. This avoids the hidden dependency on cpuid_maxphyaddr() in vmx_get_tdp_level() and also standardizes the "use 5-level paging iff MAXPHYADDR > 48" behavior across x86. Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200716034122.5998-8-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Refactor the shadow NPT role calculation into a separate helper to better differentiate it from the non-nested shadow MMU, e.g. the NPT variant is never direct and derives its root level from the TDP level. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200716034122.5998-3-sean.j.christopherson@intel.com> Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the initialization of shadow NPT MMU's shadow_root_level into kvm_init_shadow_npt_mmu() and explicitly set the level in the shadow NPT MMU's role to be the TDP level. This ensures the role and MMU levels are synchronized and also initialized before __kvm_mmu_new_pgd(), which consumes the level when attempting a fast PGD switch. Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Fixes: 9fa72119 ("kvm: x86: Introduce kvm_mmu_calc_root_page_role()") Fixes: a506fdd2 ("KVM: nSVM: implement nested_svm_load_cr3() and use it for host->guest switch") Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200716034122.5998-2-sean.j.christopherson@intel.com> Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Tested-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 17 7月, 2020 1 次提交
-
-
由 Kees Cook 提交于
Using uninitialized_var() is dangerous as it papers over real bugs[1] (or can in the future), and suppresses unrelated compiler warnings (e.g. "unused variable"). If the compiler thinks it is uninitialized, either simply initialize the variable or make compiler changes. In preparation for removing[2] the[3] macro[4], remove all remaining needless uses with the following script: git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \ xargs perl -pi -e \ 's/\buninitialized_var\(([^\)]+)\)/\1/g; s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;' drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid pathological white-space. No outstanding warnings were found building allmodconfig with GCC 9.3.0 for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64, alpha, and m68k. [1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/ [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/ [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/ [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/ Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5 Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs Signed-off-by: NKees Cook <keescook@chromium.org>
-
- 11 7月, 2020 6 次提交
-
-
由 Mohammed Gamal 提交于
Intel processors of various generations have supported 36, 39, 46 or 52 bits for physical addresses. Until IceLake introduced MAXPHYADDR==52, running on a machine with higher MAXPHYADDR than the guest more or less worked, because software that relied on reserved address bits (like KVM) generally used bit 51 as a marker and therefore the page faults where generated anyway. Unfortunately this is not true anymore if the host MAXPHYADDR is 52, and this can cause problems when migrating from a MAXPHYADDR<52 machine to one with MAXPHYADDR==52. Typically, the latter are machines that support 5-level page tables, so they can be identified easily from the LA57 CPUID bit. When that happens, the guest might have a physical address with reserved bits set, but the host won't see that and trap it. Hence, we need to check page faults' physical addresses against the guest's maximum physical memory and if it's exceeded, we need to add the PFERR_RSVD_MASK bits to the page fault error code. This patch does this for the MMU's page walks. The next patches will ensure that the correct exception and error code is produced whenever no host-reserved bits are set in page table entries. Signed-off-by: NMohammed Gamal <mgamal@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Message-Id: <20200710154811.418214-4-mgamal@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Mohammed Gamal 提交于
Also no point of it being inline since it's always called through function pointers. So remove that. Signed-off-by: NMohammed Gamal <mgamal@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Message-Id: <20200710154811.418214-3-mgamal@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
The mmu_check_root() check in fast_pgd_switch() seems to be superfluous: when GPA is outside of the visible range cached_root_available() will fail for non-direct roots (as we can't have a matching one on the list) and we don't seem to care for direct ones. Also, raising #TF immediately when a non-existent GFN is written to CR3 doesn't seem to mach architectural behavior. Drop the check. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200710141157.1640173-10-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
Undesired triple fault gets injected to L1 guest on SVM when L2 is launched with certain CR3 values. #TF is raised by mmu_check_root() check in fast_pgd_switch() and the root cause is that when kvm_set_cr3() is called from nested_prepare_vmcb_save() with NPT enabled CR3 points to a nGPA so we can't check it with kvm_is_visible_gfn(). Using generic kvm_set_cr3() when switching to nested guest is not a great idea as we'll have to distinguish between 'real' CR3s and 'nested' CR3s to e.g. not call kvm_mmu_new_pgd() with nGPA. Following nVMX implement nested-specific nested_svm_load_cr3() doing the job. To support the change, nested_svm_load_cr3() needs to be re-ordered with nested_svm_init_mmu_context(). Note: the current implementation is sub-optimal as we always do TLB flush/MMU sync but this is still an improvement as we at least stop doing kvm_mmu_reset_context(). Fixes: 7c390d35 ("kvm: x86: Add fast CR3 switch code path") Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200710141157.1640173-8-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
kvm_init_shadow_mmu() was actually the only function that could be called with different vcpu->arch.mmu values. Now that kvm_init_shadow_npt_mmu() is separated from kvm_init_shadow_mmu(), we always know the MMU context we need to use and there is no need to dereference vcpu->arch.mmu pointer. Based on a patch by Vitaly Kuznetsov <vkuznets@redhat.com>. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200710141157.1640173-3-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
As a preparatory change for moving kvm_mmu_new_pgd() from nested_prepare_vmcb_save() to nested_svm_init_mmu_context() split kvm_init_shadow_npt_mmu() from kvm_init_shadow_mmu(). This also makes the code look more like nVMX (kvm_init_shadow_ept_mmu()). No functional change intended. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200710141157.1640173-2-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 10 7月, 2020 13 次提交
-
-
由 Sean Christopherson 提交于
Move x86's memory cache helpers to common KVM code so that they can be reused by arm64 and MIPS in future patches. Suggested-by: NChristoffer Dall <christoffer.dall@arm.com> Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-16-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Rename the memory helpers that will soon be moved to common code and be made globaly available via linux/kvm_host.h. "mmu" alone is not a sufficient namespace for globally available KVM symbols. Opportunistically add "nr_" in mmu_memory_cache_free_objects() to make it clear the function returns the number of free objects, as opposed to freeing existing objects. Suggested-by: NChristoffer Dall <christoffer.dall@arm.com> Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-14-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Don't bother filling the gfn array cache when the caller is a fully direct MMU, i.e. won't need a gfn array for shadow pages. Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-13-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Set __GFP_ZERO for the shadow page memory cache and drop the explicit clear_page() from kvm_mmu_get_page(). This moves the cost of zeroing a page to the allocation time of the physical page, i.e. when topping up the memory caches, and thus avoids having to zero out an entire page while holding mmu_lock. Cc: Peter Feiner <pfeiner@google.com> Cc: Peter Shier <pshier@google.com> Cc: Junaid Shahid <junaids@google.com> Cc: Jim Mattson <jmattson@google.com> Suggested-by: NBen Gardon <bgardon@google.com> Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-12-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Add a gfp_zero flag to 'struct kvm_mmu_memory_cache' and use it to control __GFP_ZERO instead of hardcoding a call to kmem_cache_zalloc(). A future patch needs such a flag for the __get_free_page() path, as gfn arrays do not need/want the allocator to zero the memory. Convert the kmem_cache paths to __GFP_ZERO now so as to avoid a weird and inconsistent API in the future. No functional change intended. Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-11-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Use separate caches for allocating shadow pages versus gfn arrays. This sets the stage for specifying __GFP_ZERO when allocating shadow pages without incurring extra cost for gfn arrays. No functional change intended. Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-10-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Clean up the minimums in mmu_topup_memory_caches() to document the driving mechanisms behind the minimums. Now that encountering an empty cache is unlikely to trigger BUG_ON(), it is less dangerous to be more precise when defining the minimums. For rmaps, the logic is 1 parent PTE per level, plus a single rmap, and prefetched rmaps. The extra objects in the current '8 + PREFETCH' minimum came about due to an abundance of paranoia in commit c41ef344 ("KVM: MMU: increase per-vcpu rmap cache alloc size"), i.e. it could have increased the minimum to 2 rmaps. Furthermore, the unexpected extra rmap case was killed off entirely by commits f759e2b4 ("KVM: MMU: avoid pte_list_desc running out in kvm_mmu_pte_write") and f5a1e9f8 ("KVM: MMU: remove call to kvm_mmu_pte_write from walk_addr"). For the so called page cache, replace '8' with 2*PT64_ROOT_MAX_LEVEL. The 2x multiplier is needed because the cache is used for both shadow pages and gfn arrays for indirect MMUs. And finally, for page headers, replace '4' with PT64_ROOT_MAX_LEVEL. Note, KVM now supports 5-level paging, i.e. the old minimums that used a baseline derived from 4-level paging were technically wrong. But, KVM always allocates roots in a separate flow, e.g. it's impossible in the current implementation to actually need 5 new shadow pages in a single flow. Use PT64_ROOT_MAX_LEVEL unmodified instead of subtracting 1, as the direct usage is likely more intuitive to uninformed readers, and the inflated minimum is unlikely to affect functionality in practice. Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-9-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Avoid refilling the memory caches and potentially slow reclaim/swap when handling a fast page fault, which does not need to allocate any new objects. Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-7-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Attempt to allocate a new object instead of crashing KVM (and likely the kernel) if a memory cache is unexpectedly empty. Use GFP_ATOMIC for the allocation as the caches are used while holding mmu_lock. The immediate BUG_ON() makes the code unnecessarily explosive and led to confusing minimums being used in the past, e.g. allocating 4 objects where 1 would suffice. Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-6-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Return errors directly from mmu_topup_memory_caches() instead of branching to a label that does the same. No functional change intended. Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-5-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Use "mc" for local variables to shorten line lengths and provide consistent names, which will be especially helpful when some of the helpers are moved to common KVM code in future patches. No functional change intended. Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-4-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Drop the "page" variants of the topup/free memory cache helpers, using the existence of an associated kmem_cache to select the correct alloc or free routine. No functional change intended. Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-3-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Track the kmem_cache used for non-page KVM MMU memory caches instead of passing in the associated kmem_cache when filling the cache. This will allow consolidating code and other cleanups. No functional change intended. Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-2-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 7月, 2020 3 次提交
-
-
由 Vitaly Kuznetsov 提交于
OVMF booted guest running on shadow pages crashes on TRIPLE FAULT after enabling paging from SMM. The crash is triggered from mmu_check_root() and is caused by kvm_is_visible_gfn() searching through memslots with as_id = 0 while vCPU may be in a different context (address space). Introduce kvm_vcpu_is_visible_gfn() and use it from mmu_check_root(). Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200708140023.1476020-1-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Rename KVM's accessor for retrieving a 'struct kvm_mmu_page' from the associated host physical address to better convey what the function is doing. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200622202034.15093-7-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Introduce sptep_to_sp() to reduce the boilerplate code needed to get the shadow page associated with a spte pointer, and to improve readability as it's not immediately obvious that "page_header" is a KVM-specific accessor for retrieving a shadow page. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200622202034.15093-6-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-