- 09 Dec 2021, 1 commit
-
-
Committed by Maxim Levitsky

This allows seeing how many interrupts were delivered via APICv/AVIC from the host.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20211209115440.394441-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 08 Dec 2021, 24 commits
-
-
Committed by Hou Wenlong

em_rdmsr() and em_wrmsr() return X86EMUL_IO_NEEDED if an MSR access requires an exit to userspace. However, x86_emulate_insn() doesn't return X86EMUL_*, so x86_emulate_instruction() doesn't directly act on X86EMUL_IO_NEEDED; instead, it looks for other signals to differentiate between PIO, MMIO, etc., so RDMSR/WRMSR emulation currently does not exit to userspace. Nevertheless, if the userspace_msr_exit_test testcase in selftests is changed to test RDMSR/WRMSR with a forced emulation prefix, the test passes. What happens is that the userspace exit information is filled in first, but the userspace exit does not happen. Because x86_emulate_instruction() returns 1, the guest retries the instruction---but this time RIP has already been adjusted past the forced emulation prefix, so the guest executes RDMSR/WRMSR and the userspace exit finally happens. Since the X86EMUL_IO_NEEDED path has provided a complete_userspace_io callback, x86_emulate_instruction() can just return 0 if the callback is not NULL. Then RDMSR/WRMSR instruction emulation will exit to userspace directly, without the RDMSR/WRMSR vmexit.

Fixes: 1ae09954 ("KVM: x86: Allow deflecting unknown MSR accesses to user space")
Signed-off-by: Hou Wenlong <houwenlong93@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <56f9df2ee5c05a81155e2be366c9dc1f7adc8817.1635842679.git.houwenlong93@linux.alibaba.com>
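For illustration, a minimal sketch of the idea described above (a hypothetical fragment, not the exact upstream diff) inside x86_emulate_instruction(): once the MSR path has filled in the exit information and registered a completion callback, returning 0 hands control to userspace immediately.

    /*
     * Hedged sketch (not the exact upstream code): if the emulator's MSR
     * access already filled in the KVM_EXIT_X86_RDMSR/WRMSR exit info and
     * installed a completion callback, bail out to userspace now instead
     * of re-entering the guest and retrying the instruction.
     */
    if (vcpu->arch.complete_userspace_io)
            return 0;   /* userspace exit; the callback resumes emulation */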
-
Committed by Hou Wenlong

If an MSR access triggers an exit to userspace, the complete_userspace_io callback skips the instruction via the vendor callback used by kvm_skip_emulated_instruction(). However, when the MSR access comes from the emulator, e.g. if kvm.force_emulation_prefix is enabled and the guest uses rdmsr/wrmsr with the KVM prefix, VM_EXIT_INSTRUCTION_LEN in the VMCS is invalid and kvm_emulate_instruction() should be used to skip the instruction instead. As Sean noted, unlike the previous case, there's no #UD if unrestricted guest is disabled and the guest accesses an MSR in Big Real Mode. So the correct way to fix this is to attach a different callback when the MSR access comes from the emulator.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Hou Wenlong <houwenlong93@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <34208da8f51580a06e45afefac95afea0e3f96e3.1635842679.git.houwenlong93@linux.alibaba.com>
-
Committed by Hou Wenlong

The next patch will use kvm_emulate_instruction() with EMULTYPE_SKIP in the complete_userspace_io callback to fix a problem in MSR access emulation. However, EMULTYPE_SKIP only updates RIP; other work, such as updating the interruptibility state and injecting single-step #DBs, would have to be done in the callback. Since the emulator also does those things after x86_emulate_insn(), add a new emulation type to pair with EMULTYPE_SKIP that does them when completing user exits within the emulator.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Hou Wenlong <houwenlong93@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <8f8c8e268b65f31d55c2881a4b30670946ecfa0d.1635842679.git.houwenlong93@linux.alibaba.com>
-
Committed by Sean Christopherson

Truncate the new EIP to a 32-bit value when handling EMULTYPE_SKIP, as the decode phase does not truncate _eip. Wrapping the 32-bit boundary is legal if and only if CS is a flat code segment, but that check is implicitly handled in the form of limit checks in the decode phase. Opportunistically prepare for a future fix by storing the result of any truncation in "eip" instead of "_eip".

Fixes: 1957aa63 ("KVM: VMX: Handle single-step #DB for EMULTYPE_SKIP on EPT misconfig")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <093eabb1eab2965201c9b018373baf26ff256d85.1635842679.git.houwenlong93@linux.alibaba.com>
-
Committed by Lai Jiangshan

It always uses vcpu->arch.walk_mmu; nested EPT does not have PDPTRs, and nested NPT treats them like all other non-leaf page table levels instead of caching them.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211124122055.64424-11-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Lai Jiangshan

This reduces an indirect function call (retpoline) and some initialization code.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211124122055.64424-4-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Lai Jiangshan

mmu->gva_to_gpa() has no "struct kvm_mmu *mmu" parameter, so an extra FNAME(gva_to_gpa_nested) is needed. Adding the parameter simplifies the code. Via the new parameter, it also makes explicit that the walk is done on vcpu->arch.walk_mmu for a GVA and on vcpu->arch.mmu for an L2 GPA in translate_nested_gpa().

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211124122055.64424-3-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Lai Jiangshan

It is unchanged in most cases.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211111144527.88852-1-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Lai Jiangshan

When vcpu->arch.cr3 is changed, it should be marked dirty unless it is being updated to the value of the architectural guest CR3 (i.e. VMX.GUEST_CR3, or vmcb->save.cr3 when TDP is enabled). This patch introduces no functional change, because kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3) is a superset of kvm_register_mark_available(vcpu, VCPU_EXREG_CR3) with the additional change to vcpu->arch.regs_dirty, and no code uses regs_dirty for VCPU_EXREG_CR3. (vmx_load_mmu_pgd() instead uses vcpu->arch.regs_avail to test whether VCPU_EXREG_CR3 is dirty, which means the current code (ab)uses regs_avail for VCPU_EXREG_CR3 dirty information.)

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211108124407.12187-11-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Lai Jiangshan

No functional change.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211108124407.12187-7-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Lai Jiangshan

In set_cr4_guest_host_mask(), all CR4 PDPTR bits are already set to be intercepted, but in an unclear way. Add X86_CR4_PDPTR_BITS to make this clear and self-documenting. No functional change.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211108124407.12187-6-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
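For reference, a self-documenting macro along these lines could look as follows. This is a sketch: the exact bit set assumed here follows the SDM's list of CR4 bits whose modification reloads the PDPTEs under PAE paging, and may differ from the actual upstream definition.

    /* CR4 bits whose modification requires the PDPTEs to be reloaded
     * under PAE paging (sketch; assumed definition). */
    #define X86_CR4_PDPTR_BITS \
            (X86_CR4_PGE | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_SMEP)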
-
Committed by Lai Jiangshan

For VMX with EPT, dirty PDPTRs need to be loaded before the next vmentry via vmx_load_mmu_pgd(), but not all paths that call load_pdptrs() will cause vmx_load_mmu_pgd() to be invoked. Normally, kvm_mmu_reset_context() is used to cause KVM_REQ_LOAD_MMU_PGD, but sometimes it is skipped:

* commit d81135a5 ("KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed") skips kvm_mmu_reset_context() after load_pdptrs() when changing CR0.CD and CR0.NW.
* commit 21823fbd ("KVM: x86: Invalidate all PGDs for the current PCID on MOV CR3 w/ flush") skips KVM_REQ_LOAD_MMU_PGD after load_pdptrs() when rewriting CR3 with the same value.
* commit a91a7c70 ("KVM: X86: Don't reset mmu context when toggling X86_CR4_PGE") skips kvm_mmu_reset_context() after load_pdptrs() when changing CR4.PGE.

Fixes: d81135a5 ("KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed")
Fixes: 21823fbd ("KVM: x86: Invalidate all PGDs for the current PCID on MOV CR3 w/ flush")
Fixes: a91a7c70 ("KVM: X86: Don't reset mmu context when toggling X86_CR4_PGE")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211108124407.12187-2-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson

Call kvm_vcpu_block() directly for all wait states except HALTED so that kvm_vcpu_halt() is no longer a misnomer on x86. Functionally, this means KVM will never attempt halt-polling or adjust vcpu->halt_poll_ns for INIT_RECEIVED (a.k.a. Wait-For-SIPI (WFS)) or AP_RESET_HOLD; UNINITIALIZED is handled in kvm_arch_vcpu_ioctl_run(), and x86 doesn't use any other "wait" states. As mentioned above, the motivation for this is purely so that "halt" isn't overloaded on x86, e.g. in KVM's stats. Skipping halt-polling for WFS (and RESET_HOLD) has no meaningful effect on guest performance, as there are typically single-digit numbers of INIT-SIPI sequences per AP vCPU per boot, versus thousands of HLTs just to boot to console.

Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211009021236.4122790-19-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson

Go directly to kvm_vcpu_block() when handling the case where userspace attempts to run an UNINITIALIZED vCPU. The vCPU is not halted, nor is it likely that halt-polling will be successful in this case.

Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211009021236.4122790-18-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson

Rename kvm_vcpu_block() to kvm_vcpu_halt() in preparation for splitting the actual "block" sequences into a separate helper (to be named kvm_vcpu_block()). x86 will use the standalone block-only path to handle non-halt cases where the vCPU is not runnable. Rename block_ns to halt_ns to match the new function name. No functional change intended.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211009021236.4122790-14-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson

Rename a variety of HLT-related helpers to free up the function name "kvm_vcpu_halt" for future use in generic KVM code, e.g. to differentiate between "block" and "halt". No functional change intended.

Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211009021236.4122790-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Maciej S. Szmigiero

There is no point in recalculating from scratch the total number of pages in all memslots each time a memslot is created or deleted. Use KVM's cached nr_memslot_pages to compute the default maximum number of MMU pages. Note that even with nr_memslot_pages capped at ULONG_MAX we can't safely multiply it by KVM_PERMILLE_MMU_PAGES (20), since this operation can overflow an unsigned long variable. Write this "* 20 / 1000" operation as "/ 50" instead to avoid such overflow.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
[sean: use common KVM field and rework changelog accordingly]
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <d14c5a24535269606675437d5602b7dac4ad8c0e.1638817640.git.maciej.szmigiero@oracle.com>
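To see why the order of operations matters, here is a small standalone C example (illustrative only, not KVM code) showing that "n * 20 / 1000" wraps around for large n, while the algebraically equivalent "n / 50" does not:

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
            unsigned long pages = ULONG_MAX;    /* worst-case page count */

            /* the multiplication wraps before the division happens... */
            unsigned long overflowed = pages * 20 / 1000;
            /* ...while dividing first gives the intended 2% (20/1000 == 1/50) */
            unsigned long safe = pages / 50;

            printf("overflowed: %lu\nsafe:       %lu\n", overflowed, safe);
            return 0;
    }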
-
Committed by Maciej S. Szmigiero

There is no point in calling kvm_mmu_change_mmu_pages() for memslot operations that don't change the total page count, so do it just for KVM_MR_CREATE and KVM_MR_DELETE.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <9e56b7616a11f5654e4ab486b3237366b7ba9f2a.1638817640.git.maciej.szmigiero@oracle.com>
-
Committed by Sean Christopherson

Play nice with a NULL @old or @new when handling memslot updates so that common KVM can pass NULL for one or the other in the CREATE and DELETE cases instead of having to synthesize a dummy memslot. No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <2eb7788adbdc2bc9a9c5f86844dd8ee5c8428732.1638817640.git.maciej.szmigiero@oracle.com>
-
Committed by Sean Christopherson

Drop the @mem param from kvm_arch_{prepare,commit}_memory_region() now that its use has been removed in all architectures. No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <aa5ed3e62c27e881d0d8bc0acbc1572bc336dc19.1638817640.git.maciej.szmigiero@oracle.com>
-
Committed by Sean Christopherson

Get the number of pages directly from the new memslot instead of computing the same from the userspace memory region when allocating memslot metadata. This will allow a future patch to drop @mem. No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <ef44892eb615f5c28e682bbe06af96aff9ce2a9f.1638817639.git.maciej.szmigiero@oracle.com>
-
Committed by Sean Christopherson

Pass the "old" slot to kvm_arch_prepare_memory_region() and force arch code to handle propagating arch-specific data from "new" to "old" when necessary. This is a baby step towards dynamically allocating "new" from the get-go, and is a (very) minor performance boost on x86 due to not unnecessarily copying arch data. For PPC HV, copy the rmap in the !CREATE and !DELETE paths, i.e. for MOVE and FLAGS_ONLY. This is functionally a nop, as the previous behavior would overwrite the pointer for CREATE and eventually discard/ignore it for DELETE. For x86, copy the arch data only for FLAGS_ONLY changes. Unlike PPC HV, x86 needs to reallocate arch data in the MOVE case, as the size of x86's allocations depends on the alignment of the memslot's gfn. Opportunistically tweak kvm_arch_prepare_memory_region()'s param order to match the "commit" prototype.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
[mss: add missing RISCV kvm_arch_prepare_memory_region() change]
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <67dea5f11bbcfd71e3da5986f11e87f5dd4013f9.1638817639.git.maciej.szmigiero@oracle.com>
-
Committed by Marc Zyngier

Everywhere we use kvm_for_each_vcpu(), we use an int as the vcpu index. Unfortunately, we're about to rework the iterator, which requires this to be upgraded to an unsigned long. Let's bite the bullet and repaint all of it in one go.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20211116160403.4074052-7-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
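The repaint amounts to changing the type of the iterator variable at every call site, roughly like this (an illustrative sketch, not a specific call site from the patch):

    /* Before the rework the index was "int i;"; afterwards it must be
     * an unsigned long to match the new iterator. */
    unsigned long i;
    struct kvm_vcpu *vcpu;

    kvm_for_each_vcpu(i, vcpu, kvm)
            kvm_vcpu_kick(vcpu);    /* arbitrary per-vCPU work for the example */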
-
Committed by Marc Zyngier

All architectures have similar loops iterating over the vcpus, freeing one vcpu at a time, and eventually wiping the reference off the vcpus array. They are also inconsistent in whether they take the kvm->lock mutex when wiping the references from the array. Make this code common, which will simplify further changes. The locking is dropped altogether, as this should only be called when there are no further references on the kvm structure.

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20211116160403.4074052-2-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 02 Dec 2021, 1 commit
-
-
Committed by Paolo Bonzini

kvm_vcpu_apicv_active() returns false if a virtual machine has no in-kernel local APIC; however, kvm_apicv_activated() might still be true if there are no reasons to disable APICv. In fact, it is quite likely that there is none, because APICv is inhibited by specific configurations of the local APIC and those configurations cannot be programmed. This triggers a WARN:

  WARN_ON_ONCE(kvm_apicv_activated(vcpu->kvm) != kvm_vcpu_apicv_active(vcpu));

To avoid this, introduce another cause for APICv inhibition, namely the absence of an in-kernel local APIC. This cause is enabled by default, and is dropped by either KVM_CREATE_IRQCHIP or the enabling of KVM_CAP_SPLIT_IRQCHIP.

Reported-by: Ignat Korchagin <ignat@cloudflare.com>
Fixes: ee49a893 ("KVM: x86: Move SVM's APICv sanity check to common x86", 2021-10-22)
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Ignat Korchagin <ignat@cloudflare.com>
Message-Id: <20211130123746.293379-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 30 Nov 2021, 1 commit
-
-
Committed by Paolo Bonzini

The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even if APICv is disabled on the vCPU that receives it. In that case, the interrupt will just cause a vmexit and leave the ON bit set together with the PIR bit corresponding to the interrupt. Right now, the interrupt would not be delivered until APICv is re-enabled. However, fixing this is just a matter of always doing the PIR->IRR synchronization, even if the vCPU has temporarily disabled APICv. This is not a problem for performance; if anything, it is an improvement. First, in the common case where vcpu->arch.apicv_active is true, one fewer check has to be performed. Second, static_call_cond will elide the function call if APICv is not present or disabled. Finally, in the case of AMD hardware we can remove the sync_pir_to_irr callback: it is only needed for apic_has_interrupt_for_ppr, and that function already has a fallback for !APICv.

Cc: stable@vger.kernel.org
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Message-Id: <20211123004311.2954158-4-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
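The "always synchronize" part boils down to a call site along these lines (a hedged sketch of the concept, not the exact upstream hunk; the hook name follows the KVM sources):

    /*
     * Sketch: sync PIR->IRR unconditionally before checking for pending
     * interrupts.  static_call_cond() becomes a no-op when the vendor
     * module does not implement sync_pir_to_irr (e.g. AMD, where the
     * callback can be dropped entirely).
     */
    static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);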
-
- 26 Nov 2021, 4 commits
-
-
Committed by Vitaly Kuznetsov

Commit 63f5a190 ("KVM: x86: Alert userspace that KVM_SET_CPUID{,2} after KVM_RUN is broken") officially deprecated the KVM_SET_CPUID{,2} ioctls after the first successful KVM_RUN and promised to make this sequence forbidden in 5.16. It's time to fulfil the promise.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211122175818.608220-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
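A minimal sketch of what "forbidding" the sequence could look like in the ioctl handler; both the exact condition and the error code here are assumptions for illustration, not necessarily the upstream choice:

    /* Hedged sketch: reject KVM_SET_CPUID{,2} once the vCPU has run. */
    if (vcpu->arch.last_vmentry_cpu != -1)
            return -EBUSY;  /* error code assumed for this sketch */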
-
Committed by Sean Christopherson

Like KVM_REQ_TLB_FLUSH_CURRENT, the GUEST variant needs to be serviced at nested transitions, as KVM doesn't track requests for L1 vs L2. E.g. if there's a pending flush when a nested VM-Exit occurs, then the flush was requested in the context of L2 and needs to be handled before switching to L1; otherwise the flush for L2 would effectively be lost. Opportunistically add a helper to handle CURRENT and GUEST as a pair; the logic for when they need to be serviced is identical, as both requests are tied to L1 vs. L2, and the only difference is the scope of the flush.

Reported-by: Lai Jiangshan <jiangshanlai+lkml@gmail.com>
Fixes: 07ffaf34 ("KVM: nVMX: Sync all PGDs on nested transition with shadow paging")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211125014944.536398-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

The capability, albeit present, was never exposed via KVM_CHECK_EXTENSION.

Fixes: b5663931 ("KVM: SEV: Add support for SEV intra host migration")
Cc: Peter Gonda <pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

Synchronize the two calls to kvm_x86_sync_pir_to_irr. The one in the reenter-guest fast path invoked the callback unconditionally, even if the LAPIC is present but disabled. In this case, there are no interrupts to deliver, and therefore posted interrupts can be ignored.

Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 18 Nov 2021, 3 commits
-
-
Committed by Vitaly Kuznetsov

It doesn't make sense for the recommended maximum number of vCPUs to exceed the maximum possible number of vCPUs.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211116163443.88707-7-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
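Conceptually, the recommended value is clamped to the possible maximum, roughly as below. This is a sketch of the KVM_CHECK_EXTENSION handling; the exact form of the upstream code is assumed, not quoted:

    case KVM_CAP_NR_VCPUS:
            /* recommended number of vCPUs: never above the hard maximum */
            r = min_t(unsigned int, num_online_cpus(), KVM_MAX_VCPUS);
            break;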
-
Committed by Tom Lendacky

When processing a hypercall for a guest with protected state, currently SEV-ES guests, the guest CS segment register can't be checked to determine if the guest is in 64-bit mode. For an SEV-ES guest, it is expected that communication between the guest and the hypervisor is performed through shared memory using the GHCB. In order to use the GHCB, the guest must have been in long mode; otherwise writes by the guest to the GHCB would be encrypted and could not be understood by the hypervisor. Create a new helper function, is_64_bit_hypercall(), that assumes the guest is in 64-bit mode when the guest has protected state and returns true, otherwise invoking is_64_bit_mode() to determine the mode. Update the hypercall-related routines to use is_64_bit_hypercall() instead of is_64_bit_mode(). Add a WARN_ON_ONCE() to is_64_bit_mode() to catch occurrences of calls to this helper function for a guest running with protected state.

Fixes: f1c6366e ("KVM: SVM: Add required changes to support intercepts under SEV-ES")
Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <e0b20c770c9d0d1403f23d83e785385104211f74.1621878537.git.thomas.lendacky@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
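The helper described above is essentially a one-liner; a sketch, assuming the guest_state_protected flag is what marks a vCPU as having protected state:

    /* Sketch: guests with protected state (e.g. SEV-ES) are assumed to be
     * in 64-bit mode, since the GHCB protocol requires long mode anyway. */
    static inline bool is_64_bit_hypercall(struct kvm_vcpu *vcpu)
    {
            return vcpu->arch.guest_state_protected || is_64_bit_mode(vcpu);
    }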
-
Committed by David Woodhouse

In 64-bit mode, x86 instruction encoding allows us to use the low 8 bits of any GPR as an 8-bit operand. In 32-bit mode, however, we can only use the [abcd] registers, for which GCC has the "q" constraint instead of the less restrictive "r". Also fix st->preempted, which is an input/output operand rather than an input.

Fixes: 7e2175eb ("KVM: x86: Fix recording of guest steal time / preempted status")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <89bf72db1b859990355f9c40713a34e0d2d86c98.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
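A standalone illustration of the constraint difference (not the steal-time code itself): on 32-bit x86 only %eax/%ebx/%ecx/%edx have byte sub-registers, so an 8-bit asm operand should use the "q" constraint; "r" could hand GCC e.g. %esi, which has no 8-bit form there.

    #include <stdio.h>

    /* Store an 8-bit value via inline asm.  "q" limits the input to a
     * register with a byte sub-register (a/b/c/d) on 32-bit x86; plain
     * "r" may pick %esi/%edi and fail to assemble in 32-bit builds. */
    static void store_byte(unsigned char *dst, unsigned char val)
    {
            asm volatile("movb %1, %0" : "=m" (*dst) : "q" (val));
    }

    int main(void)
    {
            unsigned char b = 0;
            store_byte(&b, 0x5a);
            printf("%#x\n", b);
            return 0;
    }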
-
- 16 Nov 2021, 1 commit
-
-
Committed by Huang Le

In vcpu_load_eoi_exitmap(), the eoi_exit_bitmap[4] array is currently initialized only when a Hyper-V context is available; in the other path, the uninitialized stack array is passed to kvm_x86_ops.load_eoi_exitmap() directly, which can cause unexpected interrupt delivery/handling issues, e.g. an *old* Linux kernel that relies on the PIT to do clock calibration on KVM might randomly fail to boot. Fix it by passing ioapic_handled_vectors to load_eoi_exitmap() when the Hyper-V context is not available.

Fixes: f2bc14b6 ("KVM: x86: hyper-v: Prepare to meet unallocated Hyper-V context")
Cc: stable@vger.kernel.org
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Huang Le <huangle1@jd.com>
Message-Id: <62115b277dab49ea97da5633f8522daf@jd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 12 Nov 2021, 1 commit
-
-
Committed by Paolo Bonzini

When UBSAN is enabled, the code emitted for the call to guest_pv_has() includes a call to __ubsan_handle_load_invalid_value. objtool complains that this call happens with UACCESS enabled; to avoid the warning, pull the calls to user_access_begin() into both arms of the "if" statement, after the check for guest_pv_has().

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 11 Nov 2021, 4 commits
-
-
Committed by Vitaly Kuznetsov

KVM_CAP_NR_VCPUS is used to get the "recommended" maximum number of VCPUs: arm64/mips/riscv report num_online_cpus(), powerpc reports either num_online_cpus() or num_present_cpus(), and s390 has multiple constants depending on hardware features. On x86, KVM reports an arbitrary value of '710', which is supposed to be the maximum tested value, but it's possible to test all KVM_MAX_VCPUS even when fewer physical CPUs are available. Drop the arbitrary '710' value and return num_online_cpus() on x86 as well. The recommendation will match other architectures and will mean 'no CPU overcommit'. For reference, QEMU only queries KVM_CAP_NR_VCPUS to print a warning when the requested vCPU number exceeds it. The static limit of '710' is quite weird, as smaller systems with just a few physical CPUs should certainly "recommend" fewer.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211111134733.86601-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Vipin Sharma

Handle #GP on INVPCID due to an invalid type in the common switch statement instead of relying on the callers (VMX and SVM) to manually validate the type. Unlike INVVPID and INVEPT, INVPCID is not explicitly documented to check the type before reading the operand from memory, so deferring the type validity check until after that point is architecturally allowed.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211109174426.2350547-3-vipinsh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
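The common handling amounts to letting the type switch's default arm inject the fault, along these lines. This is a sketch: the constant names follow the KVM sources, but the surrounding invalidation code is elided and the exact structure is assumed:

    switch (type) {
    case INVPCID_TYPE_INDIV_ADDR:
    case INVPCID_TYPE_SINGLE_CTXT:
    case INVPCID_TYPE_ALL_INCL_GLOBAL:
    case INVPCID_TYPE_ALL_NON_GLOBAL:
            /* ... perform the requested invalidation ... */
            break;
    default:
            /* invalid type: #GP(0); VMX/SVM no longer pre-check it */
            kvm_inject_gp(vcpu, 0);
            return 1;
    }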
-
Committed by Vitaly Kuznetsov

kvm_lapic_enable_pv_eoi() is a misnomer, as the function is also used to disable PV EOI. Rename it to kvm_lapic_set_pv_eoi(). No functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211108152819.12485-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Maxim Levitsky

KVM_GUESTDBG_BLOCKIRQ relies on interrupts being injected via KVM's standard inject_pending_event, and not via APICv/AVIC. Since this is a debug feature, just inhibit APICv/AVIC while KVM_GUESTDBG_BLOCKIRQ is in use on at least one vCPU.

Fixes: 61e5f69e ("KVM: x86: implement KVM_GUESTDBG_BLOCKIRQ")
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211108090245.166408-1-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-