- 07 5月, 2021 5 次提交
-
-
由 Sean Christopherson 提交于
Squish the Intel and AMD emulation of MSR_TSC_AUX together and tie it to the guest CPU model instead of the host CPU behavior. While not strictly necessary to avoid guest breakage, emulating cross-vendor "architecture" will provide consistent behavior for the guest, e.g. WRMSR fault behavior won't change if the vCPU is migrated to a host with divergent behavior. Note, the "new" kvm_is_supported_user_return_msr() checks do not add new functionality on either SVM or VMX. On SVM, the equivalent was "tsc_aux_uret_slot < 0", and on VMX the check was buried in the vmx_find_uret_msr() call at the find_uret_msr label. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-15-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Now that SVM and VMX both probe MSRs before "defining" user return slots for them, consolidate the code for probe+define into common x86 and eliminate the odd behavior of having the vendor code define the slot for a given MSR. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-14-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Split out and export the number of configured user return MSRs so that VMX can iterate over the set of MSRs without having to do its own tracking. Keep the list itself internal to x86 so that vendor code still has to go through the "official" APIs to add/modify entries. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-13-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Drop VMX's global list of user return MSRs now that VMX doesn't resort said list to isolate "active" MSRs, i.e. now that VMX's list and x86's list have the same MSRs in the same order. In addition to eliminating the redundant list, this will also allow moving more of the list management into common x86. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-11-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Disable preemption when probing a user return MSR via RDSMR/WRMSR. If the MSR holds a different value per logical CPU, the WRMSR could corrupt the host's value if KVM is preempted between the RDMSR and WRMSR, and then rescheduled on a different CPU. Opportunistically land the helper in common x86, SVM will use the helper in a future commit. Fixes: 4be53410 ("KVM: VMX: Initialize vmx->guest_msrs[] right after allocation") Cc: stable@vger.kernel.org Cc: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-6-seanjc@google.com> Reviewed-by: NJim Mattson <jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 03 5月, 2021 1 次提交
-
-
由 Maxim Levitsky 提交于
* Define and use an invalid GPA (all ones) for init value of last and current nested vmcb physical addresses. * Reset the current vmcb12 gpa to the invalid value when leaving the nested mode, similar to what is done on nested vmexit. * Reset the last seen vmcb12 address when disabling the nested SVM, as it relies on vmcb02 fields which are freed at that point. Fixes: 4995a368 ("KVM: SVM: Use a separate vmcb for the nested L2 guest") Signed-off-by: NMaxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210503125446.1353307-3-mlevitsk@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 22 4月, 2021 1 次提交
-
-
由 Nathan Tempelman 提交于
Add a capability for userspace to mirror SEV encryption context from one vm to another. On our side, this is intended to support a Migration Helper vCPU, but it can also be used generically to support other in-guest workloads scheduled by the host. The intention is for the primary guest and the mirror to have nearly identical memslots. The primary benefits of this are that: 1) The VMs do not share KVM contexts (think APIC/MSRs/etc), so they can't accidentally clobber each other. 2) The VMs can have different memory-views, which is necessary for post-copy migration (the migration vCPUs on the target need to read and write to pages, when the primary guest would VMEXIT). This does not change the threat model for AMD SEV. Any memory involved is still owned by the primary guest and its initial state is still attested to through the normal SEV_LAUNCH_* flows. If userspace wanted to circumvent SEV, they could achieve the same effect by simply attaching a vCPU to the primary VM. This patch deliberately leaves userspace in charge of the memslots for the mirror, as it already has the power to mess with them in the primary guest. This patch does not support SEV-ES (much less SNP), as it does not handle handing off attested VMSAs to the mirror. For additional context, we need a Migration Helper because SEV PSP migration is far too slow for our live migration on its own. Using an in-guest migrator lets us speed this up significantly. Signed-off-by: NNathan Tempelman <natet@google.com> Message-Id: <20210408223214.2582277-1-natet@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 20 4月, 2021 4 次提交
-
-
由 Sean Christopherson 提交于
Add an ECREATE handler that will be used to intercept ECREATE for the purpose of enforcing and enclave's MISCSELECT, ATTRIBUTES and XFRM, i.e. to allow userspace to restrict SGX features via CPUID. ECREATE will be intercepted when any of the aforementioned masks diverges from hardware in order to enforce the desired CPUID model, i.e. inject #GP if the guest attempts to set a bit that hasn't been enumerated as allowed-1 in CPUID. Note, access to the PROVISIONKEY is not yet supported. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Co-developed-by: NKai Huang <kai.huang@intel.com> Signed-off-by: NKai Huang <kai.huang@intel.com> Message-Id: <c3a97684f1b71b4f4626a1fc3879472a95651725.1618196135.git.kai.huang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Page faults that are signaled by the SGX Enclave Page Cache Map (EPCM), as opposed to the traditional IA32/EPT page tables, set an SGX bit in the error code to indicate that the #PF was induced by SGX. KVM will need to emulate this behavior as part of its trap-and-execute scheme for virtualizing SGX Launch Control, e.g. to inject SGX-induced #PFs if EINIT faults in the host, and to support live migration. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NKai Huang <kai.huang@intel.com> Message-Id: <e170c5175cb9f35f53218a7512c9e3db972b97a2.1618196135.git.kai.huang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Keqian Zhu 提交于
kvm_mmu_slot_largepage_remove_write_access() is decared but not used, just remove it. Signed-off-by: NKeqian Zhu <zhukeqian1@huawei.com> Message-Id: <20210406063504.17552-1-zhukeqian1@huawei.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
To analyze some performance issues with lock contention and scheduling, it is nice to know when directed yield are successful or failing. Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1617941911-5338-2-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 19 4月, 2021 1 次提交
-
-
由 Ben Gardon 提交于
Protect the contents of the TDP MMU roots list with RCU in preparation for a future patch which will allow the iterator macro to be used under the MMU lock in read mode. Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-9-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 17 4月, 2021 4 次提交
-
-
由 Sean Christopherson 提交于
Yank out the hva-based MMU notifier APIs now that all architectures that use the notifiers have moved to the gfn-based APIs. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-7-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the hva->gfn lookup for MMU notifiers into common code. Every arch does a similar lookup, and some arch code is all but identical across multiple architectures. In addition to consolidating code, this will allow introducing optimizations that will benefit all architectures without incurring multiple walks of the memslots, e.g. by taking mmu_lock if and only if a relevant range exists in the memslots. The use of __always_inline to avoid indirect call retpolines, as done by x86, may also benefit other architectures. Consolidating the lookups also fixes a wart in x86, where the legacy MMU and TDP MMU each do their own memslot walks. Lastly, future enhancements to the memslot implementation, e.g. to add an interval tree to track host address, will need to touch far less arch specific code. MIPS, PPC, and arm64 will be converted one at a time in future patches. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-3-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Maxim Levitsky 提交于
Store the supported bits into KVM_GUESTDBG_VALID_MASK macro, similar to how arm does this. Signed-off-by: NMaxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401135451.1004564-4-mlevitsk@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the prototypes for the MMU notifier callbacks out of arch code and into common code. There is no benefit to having each arch replicate the prototypes since any deviation from the invocation in common code will explode. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210326021957.1424875-9-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 19 3月, 2021 1 次提交
-
-
由 Sean Christopherson 提交于
Fix a plethora of issues with MSR filtering by installing the resulting filter as an atomic bundle instead of updating the live filter one range at a time. The KVM_X86_SET_MSR_FILTER ioctl() isn't truly atomic, as the hardware MSR bitmaps won't be updated until the next VM-Enter, but the relevant software struct is atomically updated, which is what KVM really needs. Similar to the approach used for modifying memslots, make arch.msr_filter a SRCU-protected pointer, do all the work configuring the new filter outside of kvm->lock, and then acquire kvm->lock only when the new filter has been vetted and created. That way vCPU readers either see the old filter or the new filter in their entirety, not some half-baked state. Yuan Yao pointed out a use-after-free in ksm_msr_allowed() due to a TOCTOU bug, but that's just the tip of the iceberg... - Nothing is __rcu annotated, making it nigh impossible to audit the code for correctness. - kvm_add_msr_filter() has an unpaired smp_wmb(). Violation of kernel coding style aside, the lack of a smb_rmb() anywhere casts all code into doubt. - kvm_clear_msr_filter() has a double free TOCTOU bug, as it grabs count before taking the lock. - kvm_clear_msr_filter() also has memory leak due to the same TOCTOU bug. The entire approach of updating the live filter is also flawed. While installing a new filter is inherently racy if vCPUs are running, fixing the above issues also makes it trivial to ensure certain behavior is deterministic, e.g. KVM can provide deterministic behavior for MSRs with identical settings in the old and new filters. An atomic update of the filter also prevents KVM from getting into a half-baked state, e.g. if installing a filter fails, the existing approach would leave the filter in a half-baked state, having already committed whatever bits of the filter were already processed. [*] https://lkml.kernel.org/r/20210312083157.25403-1-yaoyuan0329os@gmail.com Fixes: 1a155254 ("KVM: x86: Introduce MSR filtering") Cc: stable@vger.kernel.org Cc: Alexander Graf <graf@amazon.com> Reported-by: NYuan Yao <yaoyuan0329os@gmail.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210316184436.2544875-2-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 18 3月, 2021 2 次提交
-
-
由 Ingo Molnar 提交于
Fix ~144 single-word typos in arch/x86/ code comments. Doing this in a single commit should reduce the churn. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-kernel@vger.kernel.org
-
由 Vitaly Kuznetsov 提交于
Create an infrastructure for tracking Hyper-V TSC page status, i.e. if it was updated from guest/host side or if we've failed to set it up (because e.g. guest wrote some garbage to HV_X64_MSR_REFERENCE_TSC) and there's no need to retry. Also, in a hypothetical situation when we are in 'always catchup' mode for TSC we can now avoid contending 'hv->hv_lock' on every guest enter by setting the state to HV_TSC_PAGE_BROKEN after compute_tsc_page_parameters() returns false. Check for HV_TSC_PAGE_SET state instead of '!hv->tsc_ref.tsc_sequence' in get_time_ref_counter() to properly handle the situation when we failed to write the updated TSC page values to the guest. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210316143736.964151-4-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 15 3月, 2021 8 次提交
-
-
由 Sean Christopherson 提交于
Retrieve the active PCID only when writing a guest CR3 value, i.e. don't get the PCID when using EPT or NPT. The PCID is especially problematic for EPT as the bits have different meaning, and so the PCID and must be manually stripped, which is annoying and unnecessary. And on VMX, getting the active PCID also involves reading the guest's CR3 and CR4.PCIDE, i.e. may add pointless VMREADs. Opportunistically rename the pgd/pgd_level params to root_hpa and root_level to better reflect their new roles. Keep the function names, as "load the guest PGD" is still accurate/correct. Last, and probably least, pass root_hpa as a hpa_t/u64 instead of an unsigned long. The EPTP holds a 64-bit value, even in 32-bit mode, so in theory EPT could support HIGHMEM for 32-bit KVM. Never mind that doing so would require changing the MMU page allocators and reworking the MMU to use kmap(). Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210305183123.3978098-2-seanjc@google.com> Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Let the MMU deal with the SPTE masks to avoid splitting the logic and knowledge across the MMU and VMX. The SPTE masks that are used for EPT are very, very tightly coupled to the MMU implementation. The use of available bits, the existence of A/D types, the fact that shadow_x_mask even exists, and so on and so forth are all baked into the MMU implementation. Cross referencing the params to the masks is also a nightmare, as pretty much every param is a u64. A future patch will make the location of the MMU_WRITABLE and HOST_WRITABLE bits MMU specific, to free up bit 11 for a MMU_PRESENT bit. Doing that change with the current kvm_mmu_set_mask_ptes() would be an absolute mess. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210225204749.1512652-18-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the entirety of the accelerated RDPMC emulation to x86.c, and assign the common handler directly to the exit handler array for VMX. SVM has bizarre nrips behavior that prevents it from directly invoking the common handler. The nrips goofiness will be addressed in a future patch. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210205005750.3841462-8-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the trivial exit handlers, e.g. for instructions that KVM "emulates" as nops, to common x86 code. Assign the common handlers directly to the exit handler arrays and drop the vendor trampolines. Opportunistically use pr_warn_once() where appropriate. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210205005750.3841462-7-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the entirety of XSETBV emulation to x86.c, and assign the function directly to both VMX's and SVM's exit handlers, i.e. drop the unnecessary trampolines. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210205005750.3841462-6-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Synthesize a nested VM-Exit if L2 triggers an emulated triple fault instead of exiting to userspace, which likely will kill L1. Any flow that does KVM_REQ_TRIPLE_FAULT is suspect, but the most common scenario for L2 killing L1 is if L0 (KVM) intercepts a contributory exception that is _not_intercepted by L1. E.g. if KVM is intercepting #GPs for the VMware backdoor, a #GP that occurs in L2 while vectoring an injected #DF will cause KVM to emulate triple fault. Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Jim Mattson <jmattson@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210302174515.2812275-2-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Unexport the MMU load and unload helpers now that they are no longer used (incorrectly) in vendor code. Opportunistically move the kvm_mmu_sync_roots() declaration into mmu.h, it should not be exposed to vendor code. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210305011101.3597423-16-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Dongli Zhang 提交于
The new per-cpu stat 'nested_run' is introduced in order to track if L1 VM is running or used to run L2 VM. An example of the usage of 'nested_run' is to help the host administrator to easily track if any L1 VM is used to run L2 VM. Suppose there is issue that may happen with nested virtualization, the administrator will be able to easily narrow down and confirm if the issue is due to nested virtualization via 'nested_run'. For example, whether the fix like commit 88dddc11 ("KVM: nVMX: do not use dangling shadow VMCS after guest reset") is required. Cc: Joe Jin <joe.jin@oracle.com> Signed-off-by: NDongli Zhang <dongli.zhang@oracle.com> Message-Id: <20210305225747.7682-1-dongli.zhang@oracle.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 13 3月, 2021 1 次提交
-
-
由 Muhammad Usama Anjum 提交于
This patch adds the annotation to fix the following sparse errors: arch/x86/kvm//x86.c:8147:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//x86.c:8147:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//x86.c:8147:15: struct kvm_apic_map * arch/x86/kvm//x86.c:10628:16: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//x86.c:10628:16: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//x86.c:10628:16: struct kvm_apic_map * arch/x86/kvm//x86.c:10629:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//x86.c:10629:15: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//x86.c:10629:15: struct kvm_pmu_event_filter * arch/x86/kvm//lapic.c:267:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:267:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:267:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:269:9: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:269:9: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:269:9: struct kvm_apic_map * arch/x86/kvm//lapic.c:637:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:637:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:637:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:994:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:994:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:994:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:1036:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:1036:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:1036:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:1173:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:1173:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:1173:15: struct kvm_apic_map * arch/x86/kvm//pmu.c:190:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:190:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:190:18: struct kvm_pmu_event_filter * arch/x86/kvm//pmu.c:251:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:251:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:251:18: struct kvm_pmu_event_filter * arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter * arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter * Signed-off-by: NMuhammad Usama Anjum <musamaanjum@gmail.com> Message-Id: <20210305191123.GA497469@LEGION> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 03 3月, 2021 1 次提交
-
-
由 David Woodhouse 提交于
This is how Xen guests do steal time accounting. The hypervisor records the amount of time spent in each of running/runnable/blocked/offline states. In the Xen accounting, a vCPU is still in state RUNSTATE_running while in Xen for a hypercall or I/O trap, etc. Only if Xen explicitly schedules does the state become RUNSTATE_blocked. In KVM this means that even when the vCPU exits the kvm_run loop, the state remains RUNSTATE_running. The VMM can explicitly set the vCPU to RUNSTATE_blocked by using the KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT attribute, and can also use KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST to retrospectively add a given amount of time to the blocked state and subtract it from the running state. The state_entry_time corresponds to get_kvmclock_ns() at the time the vCPU entered the current state, and the total times of all four states should always add up to state_entry_time. Co-developed-by: NJoao Martins <joao.m.martins@oracle.com> Signed-off-by: NJoao Martins <joao.m.martins@oracle.com> Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk> Message-Id: <20210301125309.874953-2-dwmw2@infradead.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 2月, 2021 1 次提交
-
-
由 Dongli Zhang 提交于
The 'mmu_page_hash' is used as hash table while 'active_mmu_pages' is a list. Remove the misplaced comment as it's mostly stating the obvious anyways. Signed-off-by: NDongli Zhang <dongli.zhang@oracle.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210226061945.1222-1-dongli.zhang@oracle.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 19 2月, 2021 5 次提交
-
-
由 Sean Christopherson 提交于
Remove several exports from the MMU that are no longer necessary. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-15-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Stop setting dirty bits for MMU pages when dirty logging is disabled for a memslot, as PML is now completely disabled when there are no memslots with dirty logging enabled. This means that spurious PML entries will be created for memslots with dirty logging disabled if at least one other memslot has dirty logging enabled. However, spurious PML entries are already possible since dirty bits are set only when a dirty logging is turned off, i.e. memslots that are never dirty logged will have dirty bits cleared. In the end, it's faster overall to eat a few spurious PML entries in the window where dirty logging is being disabled across all memslots. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-13-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Makarand Sonare 提交于
Currently, if enable_pml=1 PML remains enabled for the entire lifetime of the VM irrespective of whether dirty logging is enable or disabled. When dirty logging is disabled, all the pages of the VM are manually marked dirty, so that PML is effectively non-operational. Setting the dirty bits is an expensive operation which can cause severe MMU lock contention in a performance sensitive path when dirty logging is disabled after a failed or canceled live migration. Manually setting dirty bits also fails to prevent PML activity if some code path clears dirty bits, which can incur unnecessary VM-Exits. In order to avoid this extra overhead, dynamically enable/disable PML when dirty logging gets turned on/off for the first/last memslot. Signed-off-by: NMakarand Sonare <makarandsonare@google.com> Co-developed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-12-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Drop the facade of KVM's PML logic being vendor specific and move the bits that aren't truly VMX specific into common x86 code. The MMU logic for dealing with PML is tightly coupled to the feature and to VMX's implementation, bouncing through kvm_x86_ops obfuscates the code without providing any meaningful separation of concerns or encapsulation. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-10-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Store the vendor-specific dirty log size in a variable, there's no need to wrap it in a function since the value is constant after hardware_setup() runs. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-9-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 2月, 2021 5 次提交
-
-
由 Vitaly Kuznetsov 提交于
Hyper-V emulation is enabled in KVM unconditionally. This is bad at least from security standpoint as it is an extra attack surface. Ideally, there should be a per-VM capability explicitly enabled by VMM but currently it is not the case and we can't mandate one without breaking backwards compatibility. We can, however, check guest visible CPUIDs and only enable Hyper-V emulation when "Hv#1" interface was exposed in HYPERV_CPUID_INTERFACE. Note, VMMs are free to act in any sequence they like, e.g. they can try to set MSRs first and CPUIDs later so we still need to allow the host to read/write Hyper-V specific MSRs unconditionally. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210126134816.1880136-14-vkuznets@redhat.com> [Add selftest vcpu_set_hv_cpuid API to avoid breaking xen_vmcall_test. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
Hyper-V context is only needed for guests which use Hyper-V emulation in KVM (e.g. Windows/Hyper-V guests). 'struct kvm_vcpu_hv' is, however, quite big, it accounts for more than 1/4 of the total 'struct kvm_vcpu_arch' which is also quite big already. This all looks like a waste. Allocate 'struct kvm_vcpu_hv' dynamically. This patch does not bring any (intentional) functional change as we still allocate the context unconditionally but it paves the way to doing that only when needed. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210126134816.1880136-13-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
Current KVM_USER_MEM_SLOTS limits are arch specific (512 on Power, 509 on x86, 32 on s390, 16 on MIPS) but they don't really need to be. Memory slots are allocated dynamically in KVM when added so the only real limitation is 'id_to_index' array which is 'short'. We don't have any other KVM_MEM_SLOTS_NUM/KVM_USER_MEM_SLOTS-sized statically defined structures. Low KVM_USER_MEM_SLOTS can be a limiting factor for some configurations. In particular, when QEMU tries to start a Windows guest with Hyper-V SynIC enabled and e.g. 256 vCPUs the limit is hit as SynIC requires two pages per vCPU and the guest is free to pick any GFN for each of them, this fragments memslots as QEMU wants to have a separate memslot for each of these pages (which are supposed to act as 'overlay' pages). Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210127175731.2020089-3-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
kvm_get_dr and emulator_get_dr except an in-range value for the register number so they cannot fail. Change the return type to void. Suggested-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The TDP MMU assumes that it can do atomic accesses to 64-bit PTEs. Rather than just disabling it, compile it out completely so that it is possible to use for example 64-bit xchg. To limit the number of stubs, wrap all accesses to tdp_mmu_enabled or tdp_mmu_page with a function. Calls to all other functions in tdp_mmu.c are eliminated and do not even reach the linker. Reviewed-by: NSean Christopherson <seanjc@google.com> Tested-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-