- 13 February 2020, 3 commits
-
-
Submitted by Paolo Bonzini
The check cpu->hv_clock.system_time < 0 is redundant since system_time is a u64 and hence can never be less than zero. What was actually meant is to check that the result is positive, since kernel_ns and v->kvm->arch.kvmclock_offset are both s64. Reported-by: Colin King <colin.king@canonical.com> Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Addresses-Coverity: ("Macro compares unsigned to 0") Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
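A minimal sketch of the intended check, using the variable names from the message; the exact hunk in kvm_guest_time_update() may differ:

```c
/* Sketch: compare a signed sum and clamp it, rather than testing the
 * u64 system_time against zero (always false). kernel_ns and
 * kvmclock_offset are both s64, as the message notes. */
s64 sum = kernel_ns + v->kvm->arch.kvmclock_offset;

vcpu->hv_clock.system_time = sum < 0 ? 0 : sum;
```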
-
Submitted by Sean Christopherson
Wrap calls to ->page_fault() with a small shim to directly invoke the TDP fault handler when the kernel is using retpolines and TDP is being used. Single out the TDP fault handler and annotate the TDP path as likely to coerce the compiler into preferring it over the indirect function call. Rename tdp_page_fault() to kvm_tdp_page_fault(), as it's exposed outside of mmu.c to allow inlining the shim. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
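A sketch of the shim and its retpoline bypass, assuming the usual KVM MMU fault-handler arguments; the real helper lives alongside the MMU code and may differ in detail:

```c
/* Sketch: when retpolines are enabled, the indirect ->page_fault() call
 * is expensive; call the known TDP handler directly and mark that path
 * as likely so the compiler prefers it. */
static int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
                                 u32 err, bool prefault)
{
#ifdef CONFIG_RETPOLINE
        if (likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault))
                return kvm_tdp_page_fault(vcpu, cr2_or_gpa, err, prefault);
#endif
        return vcpu->arch.mmu->page_fault(vcpu, cr2_or_gpa, err, prefault);
}
```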
-
Submitted by Miaohe Lin
The KVM_REQ_EVENT request is already made in kvm_set_rflags(). We should not make it again. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 12 February 2020, 2 commits
-
-
Submitted by Oliver Upton
KVM allows the deferral of exception payloads when a vCPU is in guest mode to allow the L1 hypervisor to intercept certain events (#PF, #DB) before register state has been modified. However, this behavior is incompatible with the KVM_{GET,SET}_VCPU_EVENTS ABI, as userspace expects register state to have been immediately modified. Userspace may opt in to the payload deferral behavior with the KVM_CAP_EXCEPTION_PAYLOAD per-VM capability. As such, kvm_multiple_exception() will immediately manipulate guest registers if the capability hasn't been requested. Since the deferral is only necessary if a userspace ioctl were to be serviced at the same time as a payload-bearing exception is recognized, this behavior can be relaxed. Instead, opportunistically defer the payload from kvm_multiple_exception() and deliver the payload before completing a KVM_GET_VCPU_EVENTS ioctl. Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
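A hedged sketch of the delivery point; the field and helper names follow the message and nearby KVM code, and the exact condition in kvm_vcpu_ioctl_x86_get_vcpu_events() may differ:

```c
/* Sketch: if userspace has not opted into KVM_CAP_EXCEPTION_PAYLOAD,
 * fold any deferred payload into architectural state (CR2/DR6) before
 * reporting events, preserving the legacy KVM_GET_VCPU_EVENTS ABI. */
if (!vcpu->kvm->arch.exception_payload_enabled &&
    vcpu->arch.exception.pending && vcpu->arch.exception.has_payload)
        kvm_deliver_exception_payload(vcpu);
```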
-
Submitted by Oliver Upton
KVM defines the #DB payload as compatible with the 'pending debug exceptions' field under VMX, not DR6. Mask off bit 12 when applying the payload to DR6, as it is reserved on DR6 but not in the 'pending debug exceptions' field. Fixes: f10c729f ("kvm: vmx: Defer setting of DR6 until #DB delivery") Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
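A minimal sketch of the masking where the #DB payload is merged into DR6 (surrounding handling elided):

```c
/* Sketch: bit 12 is defined in the VMX 'pending debug exceptions'
 * field but reserved in DR6, so strip it when applying the payload. */
vcpu->arch.dr6 |= vcpu->arch.exception.payload;
vcpu->arch.dr6 &= ~BIT(12);
```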
-
- 05 February 2020, 9 commits
-
-
Submitted by Sean Christopherson
Re-add code to mark CR4.UMIP as reserved if UMIP is not supported by the host. The UMIP handling was unintentionally dropped during a recent refactoring. Not flagging CR4.UMIP allows the guest to set its CR4.UMIP regardless of host support or userspace desires. On CPUs with UMIP support, including emulated UMIP, this allows the guest to enable UMIP against the wishes of the userspace VMM. On CPUs without any form of UMIP, this results in a failed VM-Enter due to invalid guest state. Fixes: 345599f9 ("KVM: x86: Add macro to ensure reserved cr4 bits checks stay in sync") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
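A sketch of the restored reservation check; the variable holding the reserved mask is an assumption, not necessarily the macro used by the actual fix:

```c
/* Sketch: unless UMIP (possibly emulated) is exposed to the guest,
 * treat CR4.UMIP as a reserved bit so guest writes of it fail. */
if (!guest_cpuid_has(vcpu, X86_FEATURE_UMIP))
        reserved_bits |= X86_CR4_UMIP;

if (cr4 & reserved_bits)
        return 1;                       /* #GP to the guest */
```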
-
Submitted by Paolo Bonzini
Commit 53fafdbb ("KVM: x86: switch KVMCLOCK base to monotonic raw clock") changed kvmclock to use tkr_raw instead of tkr_mono. However, the default kvmclock_offset for the VM was still based on the monotonic clock and, if the raw clock drifted enough from the monotonic clock, this could cause a negative system_time to be written to the guest's struct pvclock. RHEL5 does not like it and (if it boots fast enough to observe a negative time value) it hangs. There is another thing to be careful about: getboottime64 returns the host boot time with tkr_mono frequency, and subtracting the tkr_raw-based kvmclock value will cause the wallclock to be off if tkr_raw drifts from tkr_mono. To avoid this, compute the wallclock delta from the current time instead of being clever and using getboottime64. Fixes: 53fafdbb ("KVM: x86: switch KVMCLOCK base to monotonic raw clock") Cc: stable@vger.kernel.org Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini
We will need a copy of tk->offs_boot in the next patch. Store it and clean up the struct: instead of storing tk->tkr_xxx.base with the tk->offs_boot included, store the raw value in struct pvclock_clock and sum it in do_monotonic_raw and do_realtime. tk->tkr_xxx.xtime_nsec also moves to struct pvclock_clock. While at it, fix a (usually harmless) typo in do_monotonic_raw, which was using gtod->clock.shift instead of gtod->raw_clock.shift. Fixes: 53fafdbb ("KVM: x86: switch KVMCLOCK base to monotonic raw clock") Cc: stable@vger.kernel.org Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
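A rough sketch of the reworked bookkeeping described above; field names are assumptions consistent with the message:

```c
/* Sketch: per-clock copy kept by KVM's gtod code. The base is now the
 * raw tk->tkr_xxx value; the boot offset is stored separately and added
 * in do_monotonic_raw()/do_realtime() instead of being baked in. */
struct pvclock_clock {
        int vclock_mode;
        u64 cycle_last;
        u64 mask;
        u32 mult;
        u32 shift;
        u64 base_cycles;   /* tk->tkr_xxx.xtime_nsec, unmodified */
        u64 offset;        /* includes tk->offs_boot where appropriate */
};
```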
-
Submitted by Suravee Suthikulpanit
Since disabling APICv has to be done for all vcpus on AMD-based systems, adopt the newly introduced kvm_request_apicv_update() interface, and introduce a new APICV_INHIBIT_REASON_HYPERV. Also, remove kvm_vcpu_deactivate_apicv() since it is no longer used. Cc: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Suravee Suthikulpanit
AMD SVM AVIC needs to update the APIC backing page mapping before changing APICv mode. Introduce a struct kvm_x86_ops.pre_update_apicv_exec_ctrl function hook to be called prior to the KVM APICv update request to each vcpu. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Suravee Suthikulpanit
Inhibit reason bits are used to determine if APICv deactivation is applicable for a particular hardware virtualization architecture. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Suravee Suthikulpanit
Add trace points when sending a request to (de)activate APICv. Suggested-by: Alexander Graf <graf@amazon.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Suravee Suthikulpanit
Certain runtime conditions require APICv to be temporarily deactivated. The current implementation only supports runtime deactivation of APICv when Hyper-V SynIC is enabled, which is not temporary. In addition, for AMD, when APICv is (de)activated at runtime, all vcpus in the VM have to operate in the same mode. Thus the requesting vcpu must notify the others. So, introduce the following: * A new KVM_REQ_APICV_UPDATE request bit * Interfaces to request all vcpus to update APICv status * A new interface to update APICv-related parameters for each vcpu Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
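A sketch of the update-request interface; the real code also invokes the new pre-update hook and skips the request when the activation state does not actually change:

```c
/* Sketch: record the (de)activation reason, then kick every vCPU with
 * KVM_REQ_APICV_UPDATE so all of them refresh APICv state before the
 * next guest entry, keeping the whole VM in one mode. */
void kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
{
        if (activate)
                clear_bit(bit, &kvm->arch.apicv_inhibit_reasons);
        else
                set_bit(bit, &kvm->arch.apicv_inhibit_reasons);

        kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_UPDATE);
}
```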
-
Submitted by Suravee Suthikulpanit
There are several reasons why a VM needs to deactivate APICv, e.g. APICv is disabled via a module parameter at load time, or Hyper-V SynIC support is enabled. Additional inhibit reasons will be introduced later on when dynamic APICv is supported. Introduce KVM APICv inhibit reason bits along with a new variable, apicv_inhibit_reasons, to help keep track of APICv state for each VM. Initially, the APICV_INHIBIT_REASON_DISABLE bit is used to indicate the case where APICv is disabled during KVM module load (e.g. insmod kvm_amd avic=0 or insmod kvm_intel enable_apicv=0). Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> [Do not use get_enable_apicv; consider irqchip_split in svm.c. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
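A small sketch of the new tracking; the accessor shape is an assumption consistent with the message:

```c
/* Sketch: per-VM bitmap of reasons APICv is currently inhibited.
 * APICv is considered active only while no reason bit is set. */
#define APICV_INHIBIT_REASON_DISABLE    0       /* disabled via module param */

bool kvm_apicv_activated(struct kvm *kvm)
{
        return READ_ONCE(kvm->arch.apicv_inhibit_reasons) == 0;
}
```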
-
- 31 January 2020, 4 commits
-
-
Submitted by Boris Ostrovsky
Now that we are mapping kvm_steal_time from the guest directly, we don't need to keep a copy of it in kvm_vcpu_arch.st. The same is true for the stime field. This is part of CVE-2019-3016. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Boris Ostrovsky
There is a potential race in record_steal_time() between setting host-local vcpu->arch.st.steal.preempted to zero (i.e. clearing KVM_VCPU_PREEMPTED) and propagating this value to the guest with kvm_write_guest_cached(). Between those two events the guest may still see KVM_VCPU_PREEMPTED in its copy of kvm_steal_time, set KVM_VCPU_FLUSH_TLB and assume that the hypervisor will do the right thing. Which it won't. Instead of copying, we should map kvm_steal_time; that guarantees atomicity of accesses to @preempted. This is part of CVE-2019-3016. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
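A hedged sketch of the map-then-update pattern, shown here with the generic kvm_vcpu_map()/kvm_vcpu_unmap() helpers for illustration; the actual series adds a cached gfn-to-pfn variant (see the "stashing" commit below):

```c
/* Sketch: update the guest's kvm_steal_time in place through a kernel
 * mapping. Reading and clearing st->preempted with xchg() means a
 * guest-set KVM_VCPU_FLUSH_TLB bit cannot be lost to a stale copy. */
struct kvm_host_map map;
struct kvm_steal_time *st;
gpa_t gpa = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;

if (kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map))
        return;
st = map.hva + offset_in_page(gpa);

if (xchg(&st->preempted, 0) & KVM_VCPU_FLUSH_TLB)
        kvm_vcpu_flush_tlb(vcpu, false);        /* assumed local helper */

kvm_vcpu_unmap(vcpu, &map, true);
```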
-
Submitted by Boris Ostrovsky
__kvm_map_gfn()'s call to gfn_to_pfn_memslot() is relatively expensive and, in certain cases (such as when done from atomic context), cannot be called at all. Stashing the gfn-to-pfn mapping should help with both cases. This is part of CVE-2019-3016. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Boris Ostrovsky
kvm_steal_time_set_preempted() may accidentally clear the KVM_VCPU_FLUSH_TLB bit if it is called more than once while the vCPU is preempted. This is part of CVE-2019-3016. (This bug was also independently discovered by Jim Mattson <jmattson@google.com>.) Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
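A sketch of the guard; the field path follows KVM's pre-existing cached copy and may differ slightly from the final hunk:

```c
/* Sketch: publish KVM_VCPU_PREEMPTED at most once per preemption. A
 * second write while the vCPU is still preempted could overwrite a
 * KVM_VCPU_FLUSH_TLB bit the guest set in the meantime. */
if (vcpu->arch.st.steal.preempted & KVM_VCPU_PREEMPTED)
        return;

vcpu->arch.st.steal.preempted = KVM_VCPU_PREEMPTED;
```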
-
- 28 January 2020, 14 commits
-
-
Submitted by Peng Hao
kvm_vector_hashing_enabled() is only called from within the kvm.ko module. Signed-off-by: Peng Hao <richard.peng@oppo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Krish Sadhukhan
According to the section "Checks on Guest Control Registers, Debug Registers, and MSRs" in Intel SDM vol 3C, the following check is performed on vmentry of nested guests: if the "load debug controls" VM-entry control is 1, bits 63:32 in the DR7 field must be 0. In KVM, GUEST_DR7 is set prior to the vmcs02 VM-entry by kvm_set_dr(), and the latter synthesizes a #GP if any bit in the high dword is set. Hence this field needs to be checked in software. Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
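A sketch of the software check; kvm_dr7_valid() is assumed as a small helper enforcing that DR7[63:32] are clear:

```c
/* Sketch: helper usable by both the nested checks and kvm_set_dr(). */
static inline bool kvm_dr7_valid(u64 data)
{
        return !(data >> 32);           /* bits 63:32 must be zero */
}

/* In the vmcs12 guest-state checks (assumed location): */
if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS) &&
    !kvm_dr7_valid(vmcs12->guest_dr7))
        return -EINVAL;
```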
-
Submitted by Peter Xu
The helper x86_set_memory_region() is only used in vmx_set_tss_addr() and kvm_arch_destroy_vm(). Push the lock up into both callers. With that, drop x86_set_memory_region(). This prepares for allowing __x86_set_memory_region() to return the mapped HVA, because the HVA will need to be protected by the lock too, even after __x86_set_memory_region() returns. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
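A sketch of the pushed-up locking in one of the two callers (vmx_set_tss_addr(), sizes per the existing TSS setup):

```c
/* Sketch: the caller now takes slots_lock itself and calls the
 * double-underscore variant directly, so the wrapper can go away. */
mutex_lock(&kvm->slots_lock);
r = __x86_set_memory_region(kvm, TSS_PRIVATE_MEMSLOT, addr, PAGE_SIZE * 3);
mutex_unlock(&kvm->slots_lock);
```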
-
Submitted by Sean Christopherson
Reload the current thread's FPU state, which contains the guest's FPU state, to the CPU registers if necessary during vcpu_enter_guest(). TIF_NEED_FPU_LOAD can be set any time control is transferred out of KVM, e.g. if I/O is triggered during a KVM call to get_user_pages() or if a softirq occurs while KVM is scheduled in. Moving the handling of TIF_NEED_FPU_LOAD from vcpu_enter_guest() to kvm_arch_vcpu_load(), effectively kvm_sched_in(), papered over a bug where kvm_put_guest_fpu() failed to account for TIF_NEED_FPU_LOAD. The easiest way to hit the kvm_put_guest_fpu() bug was to run with involuntary preemption enabled, thus handling TIF_NEED_FPU_LOAD during kvm_sched_in() made the bug go away. But, removing the handling in vcpu_enter_guest() exposed KVM to the rare case of a softirq triggering kernel_fpu_begin() between vcpu_load() and vcpu_enter_guest(). Now that kvm_{load,put}_guest_fpu() correctly handle TIF_NEED_FPU_LOAD, revert the commit to both restore the vcpu_enter_guest() behavior and eliminate the superfluous switch_fpu_return() in kvm_arch_vcpu_load(). Note, leaving the handling in kvm_arch_vcpu_load() isn't wrong per se, but it is unnecessary, and most critically, makes it extremely difficult to find bugs such as the kvm_put_guest_fpu() issue due to shrinking the window where a softirq can corrupt state. A sample trace triggered by warning if TIF_NEED_FPU_LOAD is set while vcpu state is loaded: <IRQ> gcmaes_crypt_by_sg.constprop.12+0x26e/0x660 ? 0xffffffffc024547d ? __qdisc_run+0x83/0x510 ? __dev_queue_xmit+0x45e/0x990 ? ip_finish_output2+0x1a8/0x570 ? fib4_rule_action+0x61/0x70 ? fib4_rule_action+0x70/0x70 ? fib_rules_lookup+0x13f/0x1c0 ? helper_rfc4106_decrypt+0x82/0xa0 ? crypto_aead_decrypt+0x40/0x70 ? crypto_aead_decrypt+0x40/0x70 ? crypto_aead_decrypt+0x40/0x70 ? esp_output_tail+0x8f4/0xa5a [esp4] ? skb_ext_add+0xd3/0x170 ? xfrm_input+0x7a6/0x12c0 ? xfrm4_rcv_encap+0xae/0xd0 ? xfrm4_transport_finish+0x200/0x200 ? udp_queue_rcv_one_skb+0x1ba/0x460 ? udp_unicast_rcv_skb.isra.63+0x72/0x90 ? __udp4_lib_rcv+0x51b/0xb00 ? ip_protocol_deliver_rcu+0xd2/0x1c0 ? ip_local_deliver_finish+0x44/0x50 ? ip_local_deliver+0xe0/0xf0 ? ip_protocol_deliver_rcu+0x1c0/0x1c0 ? ip_rcv+0xbc/0xd0 ? ip_rcv_finish_core.isra.19+0x380/0x380 ? __netif_receive_skb_one_core+0x7e/0x90 ? netif_receive_skb_internal+0x3d/0xb0 ? napi_gro_receive+0xed/0x150 ? 0xffffffffc0243c77 ? net_rx_action+0x149/0x3b0 ? __do_softirq+0xe4/0x2f8 ? handle_irq_event_percpu+0x6a/0x80 ? irq_exit+0xe6/0xf0 ? do_IRQ+0x7f/0xd0 ? common_interrupt+0xf/0xf </IRQ> ? irq_entries_start+0x20/0x660 ? vmx_get_interrupt_shadow+0x2f0/0x710 [kvm_intel] ? kvm_set_msr_common+0xfc7/0x2380 [kvm] ? recalibrate_cpu_khz+0x10/0x10 ? ktime_get+0x3a/0xa0 ? kvm_arch_vcpu_ioctl_run+0x107/0x560 [kvm] ? kvm_init+0x6bf/0xd00 [kvm] ? __seccomp_filter+0x7a/0x680 ? do_vfs_ioctl+0xa4/0x630 ? security_file_ioctl+0x32/0x50 ? ksys_ioctl+0x60/0x90 ? __x64_sys_ioctl+0x16/0x20 ? do_syscall_64+0x5f/0x1a0 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 ---[ end trace 9564a1ccad733a90 ]--- This reverts commit e7517324.
Fixes: e7517324 ("KVM: X86: Fix fpu state crash in kvm guest") Reported-by: Derek Yerger <derek@djy.llc> Reported-by: kernel@najdan.com Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Thomas Lambertz <mail@thomaslambertz.de> Cc: Rik van Riel <riel@surriel.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Borislav Petkov <bp@suse.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson
Handle TIF_NEED_FPU_LOAD similarly to how fpu__copy() handles the flag when duplicating FPU state to a new task struct. TIF_NEED_FPU_LOAD can be set any time control is transferred out of KVM, be it voluntarily, e.g. if I/O is triggered during a KVM call to get_user_pages, or involuntarily, e.g. if a softirq runs after an IRQ occurs. Therefore, KVM must account for TIF_NEED_FPU_LOAD whenever it is (potentially) accessing CPU FPU state. Fixes: 5f409e20 ("x86/fpu: Defer FPU state load until return to userspace") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
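A sketch of a helper shape that embodies the rule above (naming assumed; the same save direction applies to both user and guest FPU buffers):

```c
/* Sketch: save current's FPU state into the given buffer. If the CPU
 * registers no longer hold that state (TIF_NEED_FPU_LOAD is set), copy
 * the in-memory state instead of the registers, mirroring fpu__copy(). */
static void kvm_save_current_fpu(struct fpu *fpu)
{
        if (test_thread_flag(TIF_NEED_FPU_LOAD))
                memcpy(&fpu->state, &current->thread.fpu.state,
                       fpu_kernel_xstate_size);
        else
                copy_fpregs_to_fpstate(fpu);
}
```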
-
Submitted by Paolo Bonzini
This reverts commit 95145c25. The next few patches will fix the issue, so the warning is not needed anymore; revert it separately to simplify application to stable kernels. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Miaohe Lin
The exception pending event is cleared by kvm_clear_exception_queue(). We shouldn't clear it again. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Marios Pomonis
This fixes a Spectre-v1/L1TF vulnerability in __kvm_set_dr() and kvm_get_dr(). Both kvm_get_dr() and kvm_set_dr() (a wrapper of __kvm_set_dr()) are exported symbols, so KVM should treat them conservatively from a security perspective. Fixes: 020df079 ("KVM: move DR register access handling into generic code") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
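The usual mitigation is to clamp the attacker-influenced index with array_index_nospec(); a sketch against the DR accessors (surrounding switch elided):

```c
#include <linux/nospec.h>

/* Sketch: bound the debug-register index before it addresses the
 * per-vCPU array, closing the speculative out-of-bounds window. */
if (dr <= 3) {
        dr = array_index_nospec(dr, 4);
        vcpu->arch.db[dr] = val;
}
```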
-
Submitted by Marios Pomonis
This fixes a Spectre-v1/L1TF vulnerability in set_msr_mce() and get_msr_mce(). Both functions contain index computations based on the (attacker-controlled) MSR number. Fixes: 890ca9ae ("KVM: Add MCE support") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson
Remove kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() now that all arch specific implementations are nops. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson
Fold init() into create() now that the two are called back-to-back by common KVM code (kvm_vcpu_init() calls kvm_arch_vcpu_init() as its last action, and kvm_vm_ioctl_create_vcpu() calls kvm_arch_vcpu_create() immediately thereafter). This paves the way for removing kvm_arch_vcpu_init() entirely. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson
Remove kvm_arch_vcpu_setup() now that all arch specific implementations are nops. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson
Fold setup() into create() now that the two are called back-to-back by common KVM code. This paves the way for removing kvm_arch_vcpu_setup(). Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson
Now that all architectures tightly couple vcpu allocation/free with the mandatory calls to kvm_{un}init_vcpu(), move the sequences verbatim to common KVM code. Move both allocation and initialization in a single patch to eliminate thrash in arch specific code. The bisection benefits of moving the two pieces in separate patches are marginal at best, whereas the odds of introducing a transient arch specific bug are non-zero. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 24 January 2020, 8 commits
-
-
Submitted by Sean Christopherson
Add kvm_vcpu_destroy() and wire up all architectures to call the common function instead of their arch specific implementation. The common destruction function will be used by future patches to move allocation and initialization of vCPUs to common KVM code, i.e. to free resources that are allocated by arch agnostic code. No functional change intended. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson
Move the call to kvm_vcpu_uninit() in kvm_arch_vcpu_destroy() down a few lines so that it is invoked immediately prior to freeing the vCPU. This paves the way for moving the uninit and free sequence to common KVM code without an associated functional change. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson
Add a pre-allocation arch hook to handle checks that are currently done by arch specific code prior to allocating the vCPU object. This paves the way for moving the allocation to common KVM code. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
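A sketch of the hook shape; the name and the example body are assumptions for illustration, not the actual per-arch checks:

```c
/* Sketch: arch veto point invoked before common code allocates the
 * vCPU object; returning non-zero aborts KVM_CREATE_VCPU early. */
int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
{
        if (id >= KVM_MAX_VCPUS)        /* illustrative check only */
                return -EINVAL;

        return 0;
}
```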
-
Submitted by Sean Christopherson
Remove the superfluous kvm_arch_vcpu_free() as it is no longer called from common KVM code. Note, kvm_arch_vcpu_destroy() *is* called from common code, i.e. choosing which function to whack is not completely arbitrary. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson
Remove a bogus clearing of apf.msr_val from kvm_arch_vcpu_destroy(). apf.msr_val is only set to a non-zero value by kvm_pv_enable_async_pf(), which is only reachable by kvm_set_msr_common(), i.e. by writing MSR_KVM_ASYNC_PF_EN. KVM does not autonomously write said MSR, i.e. it can only be written via KVM_SET_MSRS or KVM_RUN. Since KVM_SET_MSRS and KVM_RUN are vcpu ioctls, they require a valid vcpu file descriptor. kvm_arch_vcpu_destroy() is only called if KVM_CREATE_VCPU fails, and KVM declares KVM_CREATE_VCPU successful once the vcpu fd is installed and thus visible to userspace. Ergo, apf.msr_val cannot be non-zero when kvm_arch_vcpu_destroy() is called. Fixes: 344d9588 ("KVM: Add PV MSR to enable asynchronous page faults delivery.") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson
x86 does not load its MMU until KVM_RUN, which cannot be invoked until after vCPU creation succeeds. Given that kvm_arch_vcpu_destroy() is called if and only if vCPU creation fails, it is impossible for the MMU to be loaded. Note, the bogus kvm_mmu_unload() call was added during an unrelated refactoring of vCPU allocation, i.e. was presumably added as an opportunistic "fix" for a perceived leak. Fixes: fb3f0f51 ("KVM: Dynamically allocate vcpus") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson
Move the kvm_cpu_{un}init() calls to common x86 code as an intermediate step toward removing kvm_cpu_{un}init() altogether. Note, VMX's alloc_apic_access_page() and init_rmode_identity_map() are per-VM allocations and are intentionally kept if vCPU creation fails. They are freed by kvm_arch_destroy_vm(). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson
Allocate the pio_data page after creating the MMU and local APIC so that all direct memory allocations are grouped together. This allows setting the return value to -ENOMEM prior to starting the allocations instead of setting it in the fail path for every allocation. The pio_data page is only consumed when KVM_RUN is invoked, i.e. moving its allocation has no real functional impact. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-