- 30 11月, 2021 1 次提交
-
-
由 Paolo Bonzini 提交于
The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even if APICv is disabled on the vCPU that receives it. In that case, the interrupt will just cause a vmexit and leave the ON bit set together with the PIR bit corresponding to the interrupt. Right now, the interrupt would not be delivered until APICv is re-enabled. However, fixing this is just a matter of always doing the PIR->IRR synchronization, even if the vCPU has temporarily disabled APICv. This is not a problem for performance, or if anything it is an improvement. First, in the common case where vcpu->arch.apicv_active is true, one fewer check has to be performed. Second, static_call_cond will elide the function call if APICv is not present or disabled. Finally, in the case for AMD hardware we can remove the sync_pir_to_irr callback: it is only needed for apic_has_interrupt_for_ppr, and that function already has a fallback for !APICv. Cc: stable@vger.kernel.org Co-developed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com> Reviewed-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20211123004311.2954158-4-pbonzini@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 11 11月, 2021 2 次提交
-
-
由 Vitaly Kuznetsov 提交于
KVM: x86: Don't update vcpu->arch.pv_eoi.msr_val when a bogus value was written to MSR_KVM_PV_EOI_EN When kvm_gfn_to_hva_cache_init() call from kvm_lapic_set_pv_eoi() fails, MSR write to MSR_KVM_PV_EOI_EN results in #GP so it is reasonable to expect that the value we keep internally in KVM wasn't updated. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211108152819.12485-3-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
kvm_lapic_enable_pv_eoi() is a misnomer as the function is also used to disable PV EOI. Rename it to kvm_lapic_set_pv_eoi(). No functional change intended. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211108152819.12485-2-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 19 10月, 2021 2 次提交
-
-
由 Sean Christopherson 提交于
WARN if the static keys used to track if any vCPU has disabled its APIC are left elevated at module exit. Unlike the underflow case, nothing in the static key infrastructure will complain if a key is left elevated, and because an elevated key only affects performance, nothing in KVM will fail if either key is improperly incremented. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20211013003554.47705-3-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Revert a change to open code bits of kvm_lapic_set_base() when emulating APIC RESET to fix an apic_hw_disabled underflow bug due to arch.apic_base and apic_hw_disabled being unsyncrhonized when the APIC is created. If kvm_arch_vcpu_create() fails after creating the APIC, kvm_free_lapic() will see the initialized-to-zero vcpu->arch.apic_base and decrement apic_hw_disabled without KVM ever having incremented apic_hw_disabled. Using kvm_lapic_set_base() in kvm_lapic_reset() is also desirable for a potential future where KVM supports RESET outside of vCPU creation, in which case all the side effects of kvm_lapic_set_base() are needed, e.g. to handle the transition from x2APIC => xAPIC. Alternatively, KVM could temporarily increment apic_hw_disabled (and call kvm_lapic_set_base() at RESET), but that's a waste of cycles and would impact the performance of other vCPUs and VMs. The other subtle side effect is that updating the xAPIC ID needs to be done at RESET regardless of whether the APIC was previously enabled, i.e. kvm_lapic_reset() needs an explicit call to kvm_apic_set_xapic_id() regardless of whether or not kvm_lapic_set_base() also performs the update. That makes stuffing the enable bit at vCPU creation slightly more palatable, as doing so affects only the apic_hw_disabled key. Opportunistically tweak the comment to explicitly call out the connection between vcpu->arch.apic_base and apic_hw_disabled, and add a comment to call out the need to always do kvm_apic_set_xapic_id() at RESET. Underflow scenario: kvm_vm_ioctl() { kvm_vm_ioctl_create_vcpu() { kvm_arch_vcpu_create() { if (something_went_wrong) goto fail_free_lapic; /* vcpu->arch.apic_base is initialized when something_went_wrong is false. */ kvm_vcpu_reset() { kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event) { vcpu->arch.apic_base = APIC_DEFAULT_PHYS_BASE | MSR_IA32_APICBASE_ENABLE; } } return 0; fail_free_lapic: kvm_free_lapic() { /* vcpu->arch.apic_base is not yet initialized when something_went_wrong is true. */ if (!(vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE)) static_branch_slow_dec_deferred(&apic_hw_disabled); // <= underflow bug. } return r; } } } This (mostly) reverts commit 42122123. Fixes: 42122123 ("KVM: x86: Open code necessary bits of kvm_lapic_set_base() at vCPU RESET") Reported-by: syzbot+9fc046ab2b0cf295a063@syzkaller.appspotmail.com Debugged-by: NTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20211013003554.47705-2-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 02 8月, 2021 6 次提交
-
-
由 Sean Christopherson 提交于
Consolidate the APIC base RESET logic, which is currently spread out across both x86 and vendor code. For an in-kernel APIC, the vendor code is redundant. But for a userspace APIC, KVM relies on the vendor code to initialize vcpu->arch.apic_base. Hoist the vcpu->arch.apic_base initialization above the !apic check so that it applies to both flavors of APIC emulation, and delete the vendor code. Reviewed-by: NReiji Watanabe <reijiw@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-19-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Stuff vcpu->arch.apic_base and apic->base_address directly during APIC reset, as opposed to bouncing through kvm_set_apic_base() while fudging the ENABLE bit during creation to avoid the other, unwanted side effects. This is a step towards consolidating the APIC RESET logic across x86, VMX, and SVM. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-18-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Set the BSP bit appropriately during local APIC "reset" instead of relying on vendor code to clean up at a later point. This is a step towards consolidating the local APIC, VMX, and SVM xAPIC initialization code. Reviewed-by: NReiji Watanabe <reijiw@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-16-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Don't set the BSP bit in vcpu->arch.apic_base when the local APIC is managed by userspace. Forcing all vCPUs to be BSPs is non-sensical, and was dead code when it was added by commit 97222cc8 ("KVM: Emulate local APIC in kernel"). At the time, kvm_lapic_set_base() was invoked if and only if the local APIC was in-kernel (and it couldn't be called before the vCPU created its APIC). kvm_lapic_set_base() eventually gained generic usage, but the latent bug escaped notice because the only true consumer would be the guest itself in the form of an explicit RDMSRs on APs. Out of Linux, SeaBIOS, and EDK2/OVMF, only OVMF consumes the BSP bit from the APIC_BASE MSR. For the vast majority of usage in OVMF, BSP confusion would be benign. OVMF's BSP election upon SMI rendezvous might be broken, but practically no one runs KVM with an out-of-kernel local APIC, let alone does so while utilizing SMIs with OVMF. Fixes: 97222cc8 ("KVM: Emulate local APIC in kernel") Reviewed-by: NReiji Watanabe <reijiw@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-15-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Remove a BSP APIC update in kvm_lapic_reset() that is a glorified and confusing nop. When the code was originally added, kvm_vcpu_is_bsp() queried kvm->arch.bsp_vcpu, i.e. the intent was to set the BSP bit in the BSP vCPU's APIC. But, stuffing the BSP bit at INIT was wrong since the guest can change its BSP(s); this was fixed by commit 58d269d8 ("KVM: x86: BSP in MSR_IA32_APICBASE is writable"). In other words, kvm_vcpu_is_bsp() is now purely a reflection of vcpu->arch.apic_base.MSR_IA32_APICBASE_BSP, thus the update will always set the current value and kvm_lapic_set_base() is effectively a nop if the new and old values match. The RESET case, which does need to stuff the BSP for the reset vCPU, is handled by vendor code (though this will soon be moved to common code). No functional change intended. Reviewed-by: NReiji Watanabe <reijiw@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-13-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
WARN if KVM ends up in a state where it thinks its APIC map needs to be recalculated, but KVM is not emulating the local APIC. This is mostly to document KVM's "rules" in order to provide clarity in future cleanups. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-12-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 18 6月, 2021 2 次提交
-
-
由 Wanpeng Li 提交于
KVM_GET_LAPIC stores the current value of TMCCT and KVM_SET_LAPIC's memcpy stores it in vcpu->arch.apic->regs, KVM_SET_LAPIC could store zero in vcpu->arch.apic->regs after it uses it, and then the stored value would always be zero. In addition, the TMCCT is always computed on-demand and never directly readable. Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1623223000-18116-1-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jim Mattson 提交于
No functional change intended. At present, the only negative value returned by kvm_check_nested_events is -EBUSY. Signed-off-by: NJim Mattson <jmattson@google.com> Message-Id: <20210604172611.281819-6-jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 10 6月, 2021 1 次提交
-
-
由 Jim Mattson 提交于
Per the SDM, "any access that touches bytes 4 through 15 of an APIC register may cause undefined behavior and must not be executed." Worse, such an access in kvm_lapic_reg_read can result in a leak of kernel stack contents. Prior to commit 01402cf8 ("kvm: LAPIC: write down valid APIC registers"), such an access was explicitly disallowed. Restore the guard that was removed in that commit. Fixes: 01402cf8 ("kvm: LAPIC: write down valid APIC registers") Signed-off-by: NJim Mattson <jmattson@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Message-Id: <20210602205224.3189316-1-jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 6月, 2021 1 次提交
-
-
由 Wanpeng Li 提交于
According to the SDM 10.5.4.1: A write of 0 to the initial-count register effectively stops the local APIC timer, in both one-shot and periodic mode. However, the lapic timer oneshot/periodic mode which is emulated by vmx-preemption timer doesn't stop by writing 0 to TMICT since vmx->hv_deadline_tsc is still programmed and the guest will receive the spurious timer interrupt later. This patch fixes it by also cancelling the vmx-preemption timer when writing 0 to the initial-count register. Reviewed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1623050385-100988-1-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 5月, 2021 2 次提交
-
-
由 Marcelo Tosatti 提交于
KVM_REQ_UNBLOCK will be used to exit a vcpu from its inner vcpu halt emulation loop. Rename KVM_REQ_PENDING_TIMER to KVM_REQ_UNBLOCK, switch PowerPC to arch specific request bit. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Message-Id: <20210525134321.303768132@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Let's treat lapic_timer_advance_ns automatic tuning logic as hypervisor overhead, move it before wait_lapic_expire instead of between wait_lapic_expire and the world switch, the wait duration should be calculated by the up-to-date guest_tsc after the overhead of automatic tuning logic. This patch reduces ~30+ cycles for kvm-unit-tests/tscdeadline-latency when testing busy waits. Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1621339235-11131-5-git-send-email-wanpengli@tencent.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 03 5月, 2021 1 次提交
-
-
由 Wanpeng Li 提交于
Commit ee66e453 (KVM: lapic: Busy wait for timer to expire when using hv_timer) tries to set ktime->expired_tscdeadline by checking ktime->hv_timer_in_use since lapic timer oneshot/periodic modes which are emulated by vmx preemption timer also get advanced, they leverage the same vmx preemption timer logic with tsc-deadline mode. However, ktime->hv_timer_in_use is cleared before apic_timer_expired() handling, let's delay this clearing in preemption-disabled region. Fixes: ee66e453 ("KVM: lapic: Busy wait for timer to expire when using hv_timer") Reviewed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1619608082-4187-1-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 4月, 2021 1 次提交
-
-
由 Vitaly Kuznetsov 提交于
Async PF 'page ready' event may happen when LAPIC is (temporary) disabled. In particular, Sebastien reports that when Linux kernel is directly booted by Cloud Hypervisor, LAPIC is 'software disabled' when APF mechanism is initialized. On initialization KVM tries to inject 'wakeup all' event and puts the corresponding token to the slot. It is, however, failing to inject an interrupt (kvm_apic_set_irq() -> __apic_accept_irq() -> !apic_enabled()) so the guest never gets notified and the whole APF mechanism gets stuck. The same issue is likely to happen if the guest temporary disables LAPIC and a previously unavailable page becomes available. Do two things to resolve the issue: - Avoid dequeuing 'page ready' events from APF queue when LAPIC is disabled. - Trigger an attempt to deliver pending 'page ready' events when LAPIC becomes enabled (SPIV or MSR_IA32_APICBASE). Reported-by: NSebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210422092948.568327-1-vkuznets@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 15 3月, 2021 1 次提交
-
-
由 Sean Christopherson 提交于
Synthesize a nested VM-Exit if L2 triggers an emulated triple fault instead of exiting to userspace, which likely will kill L1. Any flow that does KVM_REQ_TRIPLE_FAULT is suspect, but the most common scenario for L2 killing L1 is if L0 (KVM) intercepts a contributory exception that is _not_intercepted by L1. E.g. if KVM is intercepting #GPs for the VMware backdoor, a #GP that occurs in L2 while vectoring an injected #DF will cause KVM to emulate triple fault. Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Jim Mattson <jmattson@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210302174515.2812275-2-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 13 3月, 2021 1 次提交
-
-
由 Wanpeng Li 提交于
Advancing the timer expiration should only be necessary on guest initiated writes. When we cancel the timer and clear .pending during state restore, clear expired_tscdeadline as well. Reviewed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1614818118-965-1-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 05 3月, 2021 1 次提交
-
-
由 Sean Christopherson 提交于
When posting a deadline timer interrupt, open code the checks guarding __kvm_wait_lapic_expire() in order to skip the lapic_timer_int_injected() check in kvm_wait_lapic_expire(). The injection check will always fail since the interrupt has not yet be injected. Moving the call after injection would also be wrong as that wouldn't actually delay delivery of the IRQ if it is indeed sent via posted interrupt. Fixes: 010fd37f ("KVM: LAPIC: Reduce world switch latency caused by timer_advance_ns") Cc: stable@vger.kernel.org Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210305021808.3769732-1-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 2月, 2021 2 次提交
-
-
由 Vitaly Kuznetsov 提交于
Currently, Hyper-V context is part of 'struct kvm_vcpu_arch' and is always available. As a preparation to allocating it dynamically, check that it is not NULL at call sites which can normally proceed without it i.e. the behavior is identical to the situation when Hyper-V emulation is not being used by the guest. When Hyper-V context for a particular vCPU is not allocated, we may still need to get 'vp_index' from there. E.g. in a hypothetical situation when Hyper-V emulation was enabled on one CPU and wasn't on another, Hyper-V style send-IPI hypercall may still be used. Luckily, vp_index is always initialized to kvm_vcpu_get_idx() and can only be changed when Hyper-V context is present. Introduce kvm_hv_get_vpindex() helper for simplification. No functional change intended. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210126134816.1880136-12-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
vcpu_to_synic()'s argument is almost always 'vcpu' so there's no need to have an additional prefix. Also, as this is used outside of hyper-v emulation code, add '_hv_' part to make it clear what this s. This makes the naming more consistent with to_hv_vcpu(). Rename synic_to_vcpu() to hv_synic_to_vcpu() for consistency. No functional change intended. Suggested-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210126134816.1880136-6-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 04 2月, 2021 2 次提交
-
-
由 Jason Baron 提交于
Convert kvm_x86_ops to use static calls. Note that all kvm_x86_ops are covered here except for 'pmu_ops and 'nested ops'. Here are some numbers running cpuid in a loop of 1 million calls averaged over 5 runs, measured in the vm (lower is better). Intel Xeon 3000MHz: |default |mitigations=off ------------------------------------- vanilla |.671s |.486s static call|.573s(-15%)|.458s(-6%) AMD EPYC 2500MHz: |default |mitigations=off ------------------------------------- vanilla |.710s |.609s static call|.664s(-6%) |.609s(0%) Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: NJason Baron <jbaron@akamai.com> Message-Id: <e057bf1b8a7ad15652df6eeba3f907ae758d3399.1610680941.git.jbaron@akamai.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Cun Li 提交于
The use of 'struct static_key' and 'static_key_false' is deprecated. Use the new API. Signed-off-by: NCun Li <cun.jia.li@gmail.com> Message-Id: <20210111152435.50275-1-cun.jia.li@gmail.com> [Make it compile. While at it, rename kvm_no_apic_vcpu to kvm_has_noapic_vcpu; the former reads too much like "true if no vCPU has an APIC". - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 08 1月, 2021 2 次提交
-
-
由 Tom Lendacky 提交于
Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence, where the guest vCPU register state is updated and then the vCPU is VMRUN to begin execution of the AP. For an SEV-ES guest, this won't work because the guest register state is encrypted. Following the GHCB specification, the hypervisor must not alter the guest register state, so KVM must track an AP/vCPU boot. Should the guest want to park the AP, it must use the AP Reset Hold exit event in place of, for example, a HLT loop. First AP boot (first INIT-SIPI-SIPI sequence): Execute the AP (vCPU) as it was initialized and measured by the SEV-ES support. It is up to the guest to transfer control of the AP to the proper location. Subsequent AP boot: KVM will expect to receive an AP Reset Hold exit event indicating that the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to awaken it. When the AP Reset Hold exit event is received, KVM will place the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI sequence, KVM will make the vCPU runnable. It is again up to the guest to then transfer control of the AP to the proper location. To differentiate between an actual HLT and an AP Reset Hold, a new MP state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is placed in upon receiving the AP Reset Hold exit event. Additionally, to communicate the AP Reset Hold exit event up to userspace (if needed), a new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD. A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds a new function that, for non SEV-ES guests, invokes the original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests, implements the logic above. Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600.git.thomas.lendacky@amd.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Stephen Zhang 提交于
Signed-off-by: NStephen Zhang <stephenzhangzsd@gmail.com> Message-Id: <1608277897-1932-1-git-send-email-stephenzhangzsd@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 10 12月, 2020 1 次提交
-
-
由 Maxim Levitsky 提交于
In the commit 1c96dcce ("KVM: x86: fix apic_accept_events vs check_nested_events"), we accidently started latching SIPIs that are received while the cpu is not waiting for them. This causes vCPUs to never enter a halted state. Fixes: 1c96dcce ("KVM: x86: fix apic_accept_events vs check_nested_events") Signed-off-by: NMaxim Levitsky <mlevitsk@redhat.com> Message-Id: <20201203143319.159394-2-mlevitsk@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 11月, 2020 1 次提交
-
-
由 Paolo Bonzini 提交于
Centralize handling of interrupts from the userspace APIC in kvm_cpu_has_extint and kvm_cpu_get_extint, since userspace APIC interrupts are handled more or less the same as ExtINTs are with split irqchip. This removes duplicated code from kvm_cpu_has_injectable_intr and kvm_cpu_has_interrupt, and makes the code more similar between kvm_cpu_has_{extint,interrupt} on one side and kvm_cpu_get_{extint,interrupt} on the other. Cc: stable@vger.kernel.org Reviewed-by: NFilippo Sironi <sironi@amazon.de> Reviewed-by: NDavid Woodhouse <dwmw@amazon.co.uk> Tested-by: NDavid Woodhouse <dwmw@amazon.co.uk> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 15 11月, 2020 1 次提交
-
-
由 Paolo Bonzini 提交于
vmx_apic_init_signal_blocked is buggy in that it returns true even in VMX non-root mode. In non-root mode, however, INITs are not latched, they just cause a vmexit. Previously, KVM was waiting for them to be processed when kvm_apic_accept_events and in the meanwhile it ate the SIPIs that the processor received. However, in order to implement the wait-for-SIPI activity state, KVM will have to process KVM_APIC_SIPI in vmx_check_nested_events, and it will not be possible anymore to disregard SIPIs in non-root mode as the code is currently doing. By calling kvm_x86_ops.nested_ops->check_events, we can force a vmexit (with the side-effect of latching INITs) before incorrectly injecting an INIT or SIPI in a guest, and therefore vmx_apic_init_signal_blocked can do the right thing. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 28 9月, 2020 6 次提交
-
-
由 Sean Christopherson 提交于
On successful nested VM-Enter, check for pending interrupts and convert the highest priority interrupt to a pending posted interrupt if it matches L2's notification vector. If the vCPU receives a notification interrupt before nested VM-Enter (assuming L1 disables IRQs before doing VM-Enter), the pending interrupt (for L1) should be recognized and processed as a posted interrupt when interrupts become unblocked after VM-Enter to L2. This fixes a bug where L1/L2 will get stuck in an infinite loop if L1 is trying to inject an interrupt into L2 by setting the appropriate bit in L2's PIR and sending a self-IPI prior to VM-Enter (as opposed to KVM's method of manually moving the vector from PIR->vIRR/RVI). KVM will observe the IPI while the vCPU is in L1 context and so won't immediately morph it to a posted interrupt for L2. The pending interrupt will be seen by vmx_check_nested_events(), cause KVM to force an immediate exit after nested VM-Enter, and eventually be reflected to L1 as a VM-Exit. After handling the VM-Exit, L1 will see that L2 has a pending interrupt in PIR, send another IPI, and repeat until L2 is killed. Note, posted interrupts require virtual interrupt deliveriy, and virtual interrupt delivery requires exit-on-interrupt, ergo interrupts will be unconditionally unmasked on VM-Enter if posted interrupts are enabled. Fixes: 705699a1 ("KVM: nVMX: Enable nested posted interrupt processing") Cc: stable@vger.kernel.org Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200812175129.12172-1-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
All the checks in lapic_timer_int_injected(), __kvm_wait_lapic_expire(), and these function calls waste cpu cycles when the timer mode is not tscdeadline. We can observe ~1.3% world switch time overhead by kvm-unit-tests/vmexit.flat vmcall testing on AMD server. This patch reduces the world switch latency caused by timer_advance_ns feature when the timer mode is not tscdeadline by simpling move the check against apic->lapic_timer.expired_tscdeadline much earlier. Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1599731444-3525-7-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
The kick after setting KVM_REQ_PENDING_TIMER is used to handle the timer fires on a different pCPU which vCPU is running on. This kick costs about 1000 clock cycles and we don't need this when injecting already-expired timer or when using the VMX preemption timer because kvm_lapic_expired_hv_timer() is called from the target vCPU. Reviewed-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1599731444-3525-6-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Check apic_lvtt_tscdeadline() mode directly instead of apic_lvtt_oneshot() and apic_lvtt_period() to guarantee the timer is in tsc-deadline mode when wrmsr MSR_IA32_TSCDEADLINE. Reviewed-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1599731444-3525-3-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Return 0 when getting the tscdeadline timer if the lapic is hw disabled. Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1599731444-3525-2-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
There is missing apic map recalculation after updating DFR, if it is INIT RESET, in x2apic mode, local apic is software enabled before. This patch fix it by introducing the function kvm_apic_set_dfr() to be called in INIT RESET handling path. Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1597827327-25055-1-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 24 8月, 2020 1 次提交
-
-
由 Gustavo A. R. Silva 提交于
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-throughSigned-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
-
- 31 7月, 2020 2 次提交
-
-
由 Wanpeng Li 提交于
It is a little different between Intel and AMD, Intel's bit 2 is 0 and AMD is reserved. On bare-metal, Intel will refuse to set APIC_TDCR once bits except 0, 1, 3 are setting, however, AMD will accept bits 0, 1, 3 and ignore other bits setting as patch does. Before the patch, we can get back anything what we set to the APIC_TDCR, this patch improves it. Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1596165141-28874-2-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Prevent setting the tscdeadline timer if the lapic is hw disabled. Fixes: bce87cce (KVM: x86: consolidate different ways to test for in-kernel LAPIC) Cc: <stable@vger.kernel.org> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1596165141-28874-1-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-