- 03 11月, 2022 5 次提交
-
-
由 Chenyi Qiang 提交于
mainline inclusion from mainline-v5.12-rc1 commit 9a3ecd5e category: feature feature: KVM Bus Lock Debug Exception bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RHW7 CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=9a3ecd5e Intel-SIG: commit 9a3ecd5e ("KVM: X86: Rename DR6_INIT to DR6_ACTIVE_LOW") ------------------------------------- KVM: X86: Rename DR6_INIT to DR6_ACTIVE_LOW DR6_INIT contains the 1-reserved bits as well as the bit that is cleared to 0 when the condition (e.g. RTM) happens. The value can be used to initialize dr6 and also be the XOR mask between the #DB exit qualification (or payload) and DR6. Concerning that DR6_INIT is used as initial value only once, rename it to DR6_ACTIVE_LOW and apply it in other places, which would make the incoming changes for bus lock debug exception more simple. Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20210202090433.13441-2-chenyi.qiang@intel.com> [Define DR6_FIXED_1 from DR6_ACTIVE_LOW and DR6_VOLATILE. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Tao Xu 提交于
mainline inclusion from mainline-v6.0-rc1 commit 2f4073e0 category: feature feature: Notify VM exit bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5PAJ5 CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=2f4073e0 Intel-SIG: commit 2f4073e0 ("KVM: VMX: Enable Notify VM exit") ------------------------------------- KVM: VMX: Enable Notify VM exit There are cases that malicious virtual machines can cause CPU stuck (due to event windows don't open up), e.g., infinite loop in microcode when nested #AC (CVE-2015-5307). No event window means no event (NMI, SMI and IRQ) can be delivered. It leads the CPU to be unavailable to host or other VMs. VMM can enable notify VM exit that a VM exit generated if no event window occurs in VM non-root mode for a specified amount of time (notify window). Feature enabling: - The new vmcs field SECONDARY_EXEC_NOTIFY_VM_EXITING is introduced to enable this feature. VMM can set NOTIFY_WINDOW vmcs field to adjust the expected notify window. - Add a new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT so that user space can query and enable this feature in per-VM scope. The argument is a 64bit value: bits 63:32 are used for notify window, and bits 31:0 are for flags. Current supported flags: - KVM_X86_NOTIFY_VMEXIT_ENABLED: enable the feature with the notify window provided. - KVM_X86_NOTIFY_VMEXIT_USER: exit to userspace once the exits happen. - It's safe to even set notify window to zero since an internal hardware threshold is added to vmcs.notify_window. VM exit handling: - Introduce a vcpu state notify_window_exits to records the count of notify VM exits and expose it through the debugfs. - Notify VM exit can happen incident to delivery of a vector event. Allow it in KVM. - Exit to userspace unconditionally for handling when VM_CONTEXT_INVALID bit is set. Nested handling - Nested notify VM exits are not supported yet. Keep the same notify window control in vmcs02 as vmcs01, so that L1 can't escape the restriction of notify VM exits through launching L2 VM. Notify VM exit is defined in latest Intel Architecture Instruction Set Extensions Programming Reference, chapter 9.2. Co-developed-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NTao Xu <tao3.xu@intel.com> Co-developed-by: NChenyi Qiang <chenyi.qiang@intel.com> Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220524135624.22988-5-chenyi.qiang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Chenyi Qiang 提交于
mainline inclusion from mainline-v6.0-rc1 commit ed235117 category: feature feature: Notify VM exit bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5PAJ5 CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=ed235117 Intel-SIG: commit ed235117 ("KVM: x86: Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault") ------------------------------------- KVM: x86: Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault For the triple fault sythesized by KVM, e.g. the RSM path or nested_vmx_abort(), if KVM exits to userspace before the request is serviced, userspace could migrate the VM and lose the triple fault. Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault with a new event KVM_VCPUEVENT_VALID_FAULT_FAULT so that userspace can save and restore the triple fault event. This extension is guarded by a new KVM capability KVM_CAP_TRIPLE_FAULT_EVENT. Note that in the set_vcpu_events path, userspace is able to set/clear the triple fault request through triple_fault.pending field. Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220524135624.22988-2-chenyi.qiang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Chenyi Qiang 提交于
mainline inclusion from mainline-v5.12-rc1 commit fe6b6bc8 category: feature feature: KVM Bus Lock VM Exit bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RJCB CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=fe6b6bc8 Intel-SIG: commit fe6b6bc8 ("KVM: VMX: Enable bus lock VM exit") ------------------------------------- KVM: VMX: Enable bus lock VM exit Virtual Machine can exploit bus locks to degrade the performance of system. Bus lock can be caused by split locked access to writeback(WB) memory or by using locks on uncacheable(UC) memory. The bus lock is typically >1000 cycles slower than an atomic operation within a cache line. It also disrupts performance on other cores (which must wait for the bus lock to be released before their memory operations can complete). To address the threat, bus lock VM exit is introduced to notify the VMM when a bus lock was acquired, allowing it to enforce throttling or other policy based mitigations. A VMM can enable VM exit due to bus locks by setting a new "Bus Lock Detection" VM-execution control(bit 30 of Secondary Processor-based VM execution controls). If delivery of this VM exit was preempted by a higher priority VM exit (e.g. EPT misconfiguration, EPT violation, APIC access VM exit, APIC write VM exit, exception bitmap exiting), bit 26 of exit reason in vmcs field is set to 1. In current implementation, the KVM exposes this capability through KVM_CAP_X86_BUS_LOCK_EXIT. The user can get the supported mode bitmap (i.e. off and exit) and enable it explicitly (disabled by default). If bus locks in guest are detected by KVM, exit to user space even when current exit reason is handled by KVM internally. Set a new field KVM_RUN_BUS_LOCK in vcpu->run->flags to inform the user space that there is a bus lock detected in guest. Document for Bus Lock VM exit is now available at the latest "Intel Architecture Instruction Set Extensions Programming Reference". Document Link: https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.htmlCo-developed-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20201106090315.18606-4-chenyi.qiang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Chenyi Qiang 提交于
mainline inclusion from mainline-v5.12-rc1 commit 15aad3be category: feature feature: KVM Bus Lock VM Exit bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RJCB CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=15aad3be Intel-SIG: commit 15aad3be ("KVM: X86: Reset the vcpu->run->flags at the beginning of vcpu_run") ------------------------------------- KVM: X86: Reset the vcpu->run->flags at the beginning of vcpu_run Reset the vcpu->run->flags at the beginning of kvm_arch_vcpu_ioctl_run. It can avoid every thunk of code that needs to set the flag clear it, which increases the odds of missing a case and ending up with a flag in an undefined state. Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20201106090315.18606-3-chenyi.qiang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
- 02 11月, 2022 1 次提交
-
-
由 Vitaly Kuznetsov 提交于
stable inclusion from stable-v5.10.132 commit eb58fd350a851b5cda9f4c9a2cefb15c7ccf33f3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YS3T Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=eb58fd350a851b5cda9f4c9a2cefb15c7ccf33f3 -------------------------------- [ Upstream commit 8a414f94 ] 'vector' and 'trig_mode' fields of 'struct kvm_lapic_irq' are left uninitialized in kvm_pv_kick_cpu_op(). While these fields are normally not needed for APIC_DM_REMRD, they're still referenced by __apic_accept_irq() for trace_kvm_apic_accept_irq(). Fully initialize the structure to avoid consuming random stack memory. Fixes: a183b638 ("KVM: x86: make apic_accept_irq tracepoint more generic") Reported-by: syzbot+d6caa905917d353f0d07@syzkaller.appspotmail.com Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Message-Id: <20220708125147.593975-1-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
- 08 10月, 2022 3 次提交
-
-
由 Chao Gao 提交于
mainline inclusion from mainline-v6.0-rc1 commit d588bb9b category: feature feature: IPI Virtualization bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5ODSC CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d588bb9be1da6aa750aa64875fe57369db983d8b Intel-SIG: commit d588bb9b ("KVM: VMX: enable IPI virtualization") ------------------------------------- KVM: VMX: enable IPI virtualization With IPI virtualization enabled, the processor emulates writes to APIC registers that would send IPIs. The processor sets the bit corresponding to the vector in target vCPU's PIR and may send a notification (IPI) specified by NDST and NV fields in target vCPU's Posted-Interrupt Descriptor (PID). It is similar to what IOMMU engine does when dealing with posted interrupt from devices. A PID-pointer table is used by the processor to locate the PID of a vCPU with the vCPU's APIC ID. The table size depends on maximum APIC ID assigned for current VM session from userspace. Allocating memory for PID-pointer table is deferred to vCPU creation, because irqchip mode and VM-scope maximum APIC ID is settled at that point. KVM can skip PID-pointer table allocation if !irqchip_in_kernel(). Like VT-d PI, if a vCPU goes to blocked state, VMM needs to switch its notification vector to wakeup vector. This can ensure that when an IPI for blocked vCPUs arrives, VMM can get control and wake up blocked vCPUs. And if a VCPU is preempted, its posted interrupt notification is suppressed. Note that IPI virtualization can only virualize physical-addressing, flat mode, unicast IPIs. Sending other IPIs would still cause a trap-like APIC-write VM-exit and need to be handled by VMM. Signed-off-by: NChao Gao <chao.gao@intel.com> Signed-off-by: NZeng Guang <guang.zeng@intel.com> Message-Id: <20220419154510.11938-1-guang.zeng@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com>
-
由 Zeng Guang 提交于
mainline inclusion from mainline-v6.0-rc1 commit 35875316 category: feature feature: IPI Virtualization bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5ODSC CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=35875316384b71d23dc2a45a969732fc8cab16af Intel-SIG: commit 35875316 ("KVM: x86: Allow userspace to set maximum VCPU id for VM") ------------------------------------- KVM: x86: Allow userspace to set maximum VCPU id for VM Introduce new max_vcpu_ids in KVM for x86 architecture. Userspace can assign maximum possible vcpu id for current VM session using KVM_CAP_MAX_VCPU_ID of KVM_ENABLE_CAP ioctl(). This is done for x86 only because the sole use case is to guide memory allocation for PID-pointer table, a structure needed to enable VMX IPI. By default, max_vcpu_ids set as KVM_MAX_VCPU_ID. Suggested-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com> Signed-off-by: NZeng Guang <guang.zeng@intel.com> Message-Id: <20220419154444.11888-1-guang.zeng@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com>
-
由 Zeng Guang 提交于
mainline inclusion from mainline-v6.0-rc1 commit 1d5e740d category: feature feature: IPI Virtualization bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5ODSC CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1d5e740d518e02cea46325b3d37135bf9c08982a Intel-SIG: commit 1d5e740d ("KVM: Move kvm_arch_vcpu_precreate() under kvm->lock") ------------------------------------- KVM: Move kvm_arch_vcpu_precreate() under kvm->lock kvm_arch_vcpu_precreate() targets to handle arch specific VM resource to be prepared prior to the actual creation of vCPU. For example, x86 platform may need do per-VM allocation based on max_vcpu_ids at the first vCPU creation. It probably leads to concurrency control on this allocation as multiple vCPU creation could happen simultaneously. From the architectual point of view, it's necessary to execute kvm_arch_vcpu_precreate() under protect of kvm->lock. Currently only arm64, x86 and s390 have non-nop implementations at the stage of vCPU pre-creation. Remove the lock acquiring in s390's design and make sure all architecture can run kvm_arch_vcpu_precreate() safely under kvm->lock without recrusive lock issue. Suggested-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NZeng Guang <guang.zeng@intel.com> Message-Id: <20220419154409.11842-1-guang.zeng@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com>
-
- 20 9月, 2022 2 次提交
-
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 7070bbb66c5303117e4c7651711ea7daae4c64b5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5PTAS CVE: CVE-2022-29900,CVE-2022-23816,CVE-2022-29901 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7070bbb66c5303117e4c7651711ea7daae4c64b5 -------------------------------- commit 742ab6df upstream. The recent mmio_stale_data fixes broke the noinstr constraints: vmlinux.o: warning: objtool: vmx_vcpu_enter_exit+0x15b: call to wrmsrl.constprop.0() leaves .noinstr.text section vmlinux.o: warning: objtool: vmx_vcpu_enter_exit+0x1bf: call to kvm_arch_has_assigned_device() leaves .noinstr.text section make it all happy again. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLin Yujun <linyujun809@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Paolo Bonzini 提交于
mainline inclusion from mainline-v5.19-rc2 commit 6cd88243 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5PJ7H CVE: CVE-2022-39189 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6cd88243c7e03845a450795e134b488fc2afb736 ---------------------------------------- If a vCPU is outside guest mode and is scheduled out, it might be in the process of making a memory access. A problem occurs if another vCPU uses the PV TLB flush feature during the period when the vCPU is scheduled out, and a virtual address has already been translated but has not yet been accessed, because this is equivalent to using a stale TLB entry. To avoid this, only report a vCPU as preempted if sure that the guest is at an instruction boundary. A rescheduling request will be delivered to the host physical CPU as an external interrupt, so for simplicity consider any vmexit *not* instruction boundary except for external interrupts. It would in principle be okay to report the vCPU as preempted also if it is sleeping in kvm_vcpu_block(): a TLB flush IPI will incur the vmentry/vmexit overhead unnecessarily, and optimistic spinning is also unlikely to succeed. However, leave it for later because right now kvm_vcpu_check_block() is doing memory accesses. Even though the TLB flush issue only applies to virtual memory address, it's very much preferrable to be conservative. Reported-by: NJann Horn <jannh@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> conflict: arch/x86/kvm/x86.c Signed-off-by: NGuo Mengqi <guomengqi3@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 01 9月, 2022 1 次提交
-
-
由 Vitaly Kuznetsov 提交于
stable inclusion from stable-v5.10.119 commit 74c6e5d584354c6126f1231667f9d8e85d7f536f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5L6BB Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=74c6e5d584354c6126f1231667f9d8e85d7f536f -------------------------------- commit 2f15d027 upstream. Async PF 'page ready' event may happen when LAPIC is (temporary) disabled. In particular, Sebastien reports that when Linux kernel is directly booted by Cloud Hypervisor, LAPIC is 'software disabled' when APF mechanism is initialized. On initialization KVM tries to inject 'wakeup all' event and puts the corresponding token to the slot. It is, however, failing to inject an interrupt (kvm_apic_set_irq() -> __apic_accept_irq() -> !apic_enabled()) so the guest never gets notified and the whole APF mechanism gets stuck. The same issue is likely to happen if the guest temporary disables LAPIC and a previously unavailable page becomes available. Do two things to resolve the issue: - Avoid dequeuing 'page ready' events from APF queue when LAPIC is disabled. - Trigger an attempt to deliver pending 'page ready' events when LAPIC becomes enabled (SPIV or MSR_IA32_APICBASE). Reported-by: NSebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210422092948.568327-1-vkuznets@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> [Guoqing: backport to 5.10-stable ] Signed-off-by: NGuoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
- 19 7月, 2022 1 次提交
-
-
由 Sean Christopherson 提交于
stable inclusion from stable-v5.10.112 commit 342454231ee5f2c2782f5510cab2e7a968486fef category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5HL0X Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=342454231ee5f2c2782f5510cab2e7a968486fef -------------------------------- commit 1d0e8480 upstream. Resolve nx_huge_pages to true/false when kvm.ko is loaded, leaving it as -1 is technically undefined behavior when its value is read out by param_get_bool(), as boolean values are supposed to be '0' or '1'. Alternatively, KVM could define a custom getter for the param, but the auto value doesn't depend on the vendor module in any way, and printing "auto" would be unnecessarily unfriendly to the user. In addition to fixing the undefined behavior, resolving the auto value also fixes the scenario where the auto value resolves to N and no vendor module is loaded. Previously, -1 would result in Y being printed even though KVM would ultimately disable the mitigation. Rename the existing MMU module init/exit helpers to clarify that they're invoked with respect to the vendor module, and add comments to document why KVM has two separate "module init" flows. ========================================================================= UBSAN: invalid-load in kernel/params.c:320:33 load of value 255 is not a valid value for type '_Bool' CPU: 6 PID: 892 Comm: tail Not tainted 5.17.0-rc3+ #799 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 ubsan_epilogue+0x5/0x40 __ubsan_handle_load_invalid_value.cold+0x43/0x48 param_get_bool.cold+0xf/0x14 param_attr_show+0x55/0x80 module_attr_show+0x1c/0x30 sysfs_kf_seq_show+0x93/0xc0 seq_read_iter+0x11c/0x450 new_sync_read+0x11b/0x1a0 vfs_read+0xf0/0x190 ksys_read+0x5f/0xe0 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae </TASK> ========================================================================= Fixes: b8e8c830 ("kvm: mmu: ITLB_MULTIHIT mitigation") Cc: stable@vger.kernel.org Reported-by: NBruno Goncalves <bgoncalv@redhat.com> Reported-by: NJan Stancek <jstancek@redhat.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20220331221359.3912754-1-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
- 18 7月, 2022 1 次提交
-
-
由 Hou Wenlong 提交于
stable inclusion from stable-v5.10.111 commit a24479c5e9f44c2a799ca2178b1ba9fe1290706e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5GL1Z Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a24479c5e9f44c2a799ca2178b1ba9fe1290706e -------------------------------- [ Upstream commit a836839c ] When RDTSCP is supported but RDPID is not supported in host, RDPID emulation is available. However, __kvm_get_msr() would only fail when RDTSCP/RDPID both are disabled in guest, so the emulator wouldn't inject a #UD when RDPID is disabled but RDTSCP is enabled in guest. Fixes: fb6d4d34 ("KVM: x86: emulate RDPID") Signed-off-by: NHou Wenlong <houwenlong.hwl@antgroup.com> Message-Id: <1dfd46ae5b76d3ed87bde3154d51c64ea64c99c1.1646226788.git.houwenlong.hwl@antgroup.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
- 08 7月, 2022 2 次提交
-
-
由 Sean Christopherson 提交于
mainline inclusion from mainline-5.13 commit fe7e9488 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5EZEK CVE: NA Intel-SIG: commit fe7e9488 KVM: x86: Add capability to grant VM access to privileged SGX attribute. Backport for SGX virtualization support -------------------------------- Add a capability, KVM_CAP_SGX_ATTRIBUTE, that can be used by userspace to grant a VM access to a priveleged attribute, with args[0] holding a file handle to a valid SGX attribute file. The SGX subsystem restricts access to a subset of enclave attributes to provide additional security for an uncompromised kernel, e.g. to prevent malware from using the PROVISIONKEY to ensure its nodes are running inside a geniune SGX enclave and/or to obtain a stable fingerprint. To prevent userspace from circumventing such restrictions by running an enclave in a VM, KVM restricts guest access to privileged attributes by default. Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NKai Huang <kai.huang@intel.com> Message-Id: <0b099d65e933e068e3ea934b0523bab070cb8cea.1618196135.git.kai.huang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NFan Du <fan.du@intel.com> Signed-off-by: NZhiquan Li <zhiquan1.li@intel.com>
-
由 Sean Christopherson 提交于
mainline inclusion from mainline-5.13 commit 54f958cd category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5EZEK CVE: NA Intel-SIG: commit 54f958cd KVM: x86: Export kvm_mmu_gva_to_gpa_{read,write}() for SGX (VMX). Backport for SGX virtualization support -------------------------------- Export the gva_to_gpa() helpers for use by SGX virtualization when executing ENCLS[ECREATE] and ENCLS[EINIT] on behalf of the guest. To execute ECREATE and EINIT, KVM must obtain the GPA of the target Secure Enclave Control Structure (SECS) in order to get its corresponding HVA. Because the SECS must reside in the Enclave Page Cache (EPC), copying the SECS's data to a host-controlled buffer via existing exported helpers is not a viable option as the EPC is not readable or writable by the kernel. SGX virtualization will also use gva_to_gpa() to obtain HVAs for non-EPC pages in order to pass user pointers directly to ECREATE and EINIT, which avoids having to copy pages worth of data into the kernel. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Acked-by: NJarkko Sakkinen <jarkko@kernel.org> Signed-off-by: NKai Huang <kai.huang@intel.com> Message-Id: <02f37708321bcdfaa2f9d41c8478affa6e84b04d.1618196135.git.kai.huang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NFan Du <fan.du@intel.com> Signed-off-by: NZhiquan Li <zhiquan1.li@intel.com>
-
- 06 7月, 2022 1 次提交
-
-
由 Pawan Gupta 提交于
stable inclusion from stable-v5.10.123 commit bde15fdcce44956278b4f50680b7363ca126ffb9 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=bde15fdcce44956278b4f50680b7363ca126ffb9 -------------------------------- commit 027bbb88 upstream The enumeration of MD_CLEAR in CPUID(EAX=7,ECX=0).EDX{bit 10} is not an accurate indicator on all CPUs of whether the VERW instruction will overwrite fill buffers. FB_CLEAR enumeration in IA32_ARCH_CAPABILITIES{bit 17} covers the case of CPUs that are not vulnerable to MDS/TAA, indicating that microcode does overwrite fill buffers. Guests running in VMM environments may not be aware of all the capabilities/vulnerabilities of the host CPU. Specifically, a guest may apply MDS/TAA mitigations when a virtual CPU is enumerated as vulnerable to MDS/TAA even when the physical CPU is not. On CPUs that enumerate FB_CLEAR_CTRL the VMM may set FB_CLEAR_DIS to skip overwriting of fill buffers by the VERW instruction. This is done by setting FB_CLEAR_DIS during VMENTER and resetting on VMEXIT. For guests that enumerate FB_CLEAR (explicitly asking for fill buffer clear capability) the VMM will not use FB_CLEAR_DIS. Irrespective of guest state, host overwrites CPU buffers before VMENTER to protect itself from an MMIO capable guest, as part of mitigation for MMIO Stale Data vulnerabilities. Signed-off-by: NPawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Conflicts: arch/x86/kvm/vmx/vmx.h Signed-off-by: NYipeng Zou <zouyipeng@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 28 6月, 2022 1 次提交
-
-
由 Sean Christopherson 提交于
mainline inclusion from mainline-v5.19-rc1 commit fee060cd category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5BFBW CVE: CVE-2022-1852 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fee060cd52d69c114b62d1a2948ea9648b5131f9 -------------------------------- Whenever x86_decode_emulated_instruction() detects a breakpoint, it returns the value that kvm_vcpu_check_breakpoint() writes into its pass-by-reference second argument. Unfortunately this is completely bogus because the expected outcome of x86_decode_emulated_instruction is an EMULATION_* value. Then, if kvm_vcpu_check_breakpoint() does "*r = 0" (corresponding to a KVM_EXIT_DEBUG userspace exit), it is misunderstood as EMULATION_OK and x86_emulate_instruction() is called without having decoded the instruction. This causes various havoc from running with a stale emulation context. The fix is to move the call to kvm_vcpu_check_breakpoint() where it was before commit 4aa2691d ("KVM: x86: Factor out x86 instruction emulation with decoding") introduced x86_decode_emulated_instruction(). The other caller of the function does not need breakpoint checks, because it is invoked as part of a vmexit and the processor has already checked those before executing the instruction that #GP'd. This fixes CVE-2022-1852. conficts: arch/x86/kvm/x86.c Reported-by: NQiuhao Li <qiuhao@sysec.org> Reported-by: NGaoning Pan <pgn@zju.edu.cn> Reported-by: NYongkang Jia <kangel@zju.edu.cn> Fixes: 4aa2691d ("KVM: x86: Factor out x86 instruction emulation with decoding") Cc: stable@vger.kernel.org Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20220311032801.3467418-2-seanjc@google.com> [Rewrote commit message according to Qiuhao's report, since a patch already existed to fix the bug. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NMa Junhai <majunhai2@huawei.com> Signed-off-by: NGuan Jing <guanjing6@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 07 6月, 2022 1 次提交
-
-
由 Mingwei Zhang 提交于
mainline inclusion from mainline-v5.18-rc4 commit 683412cc category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I58DFI CVE: CVE-2022-0171 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=683412ccf61294d727ead4a73d97397396e69a6b -------------------------------- Flush the CPU caches when memory is reclaimed from an SEV guest (where reclaim also includes it being unmapped from KVM's memslots). Due to lack of coherency for SEV encrypted memory, failure to flush results in silent data corruption if userspace is malicious/broken and doesn't ensure SEV guest memory is properly pinned and unpinned. Cache coherency is not enforced across the VM boundary in SEV (AMD APM vol.2 Section 15.34.7). Confidential cachelines, generated by confidential VM guests have to be explicitly flushed on the host side. If a memory page containing dirty confidential cachelines was released by VM and reallocated to another user, the cachelines may corrupt the new user at a later time. KVM takes a shortcut by assuming all confidential memory remain pinned until the end of VM lifetime. Therefore, KVM does not flush cache at mmu_notifier invalidation events. Because of this incorrect assumption and the lack of cache flushing, malicous userspace can crash the host kernel: creating a malicious VM and continuously allocates/releases unpinned confidential memory pages when the VM is running. Add cache flush operations to mmu_notifier operations to ensure that any physical memory leaving the guest VM get flushed. In particular, hook mmu_notifier_invalidate_range_start and mmu_notifier_release events and flush cache accordingly. The hook after releasing the mmu lock to avoid contention with other vCPUs. Cc: stable@vger.kernel.org Suggested-by: NSean Christpherson <seanjc@google.com> Reported-by: NMingwei Zhang <mizhang@google.com> Signed-off-by: NMingwei Zhang <mizhang@google.com> Message-Id: <20220421031407.2516575-4-mizhang@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> conflicts: arch/x86/include/asm/kvm-x86-ops.h arch/x86/kvm/x86.c virt/kvm/kvm_main.c Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 10 5月, 2022 2 次提交
-
-
由 Sean Christopherson 提交于
stable inclusion from stable-v5.10.97 commit 080dbe7e9b86a0392d8dffc00d9971792afc121f bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=080dbe7e9b86a0392d8dffc00d9971792afc121f -------------------------------- commit f7e57078 upstream. Forcibly leave nested virtualization operation if userspace toggles SMM state via KVM_SET_VCPU_EVENTS or KVM_SYNC_X86_EVENTS. If userspace forces the vCPU out of SMM while it's post-VMXON and then injects an SMI, vmx_enter_smm() will overwrite vmx->nested.smm.vmxon and end up with both vmxon=false and smm.vmxon=false, but all other nVMX state allocated. Don't attempt to gracefully handle the transition as (a) most transitions are nonsencial, e.g. forcing SMM while L2 is running, (b) there isn't sufficient information to handle all transitions, e.g. SVM wants access to the SMRAM save state, and (c) KVM_SET_VCPU_EVENTS must precede KVM_SET_NESTED_STATE during state restore as the latter disallows putting the vCPU into L2 if SMM is active, and disallows tagging the vCPU as being post-VMXON in SMM if SMM is not active. Abuse of KVM_SET_VCPU_EVENTS manifests as a WARN and memory leak in nVMX due to failure to free vmcs01's shadow VMCS, but the bug goes far beyond just a memory leak, e.g. toggling SMM on while L2 is active puts the vCPU in an architecturally impossible state. WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline] WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656 Modules linked in: CPU: 1 PID: 3606 Comm: syz-executor725 Not tainted 5.17.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline] RIP: 0010:free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656 Code: <0f> 0b eb b3 e8 8f 4d 9f 00 e9 f7 fe ff ff 48 89 df e8 92 4d 9f 00 Call Trace: <TASK> kvm_arch_vcpu_destroy+0x72/0x2f0 arch/x86/kvm/x86.c:11123 kvm_vcpu_destroy arch/x86/kvm/../../../virt/kvm/kvm_main.c:441 [inline] kvm_destroy_vcpus+0x11f/0x290 arch/x86/kvm/../../../virt/kvm/kvm_main.c:460 kvm_free_vcpus arch/x86/kvm/x86.c:11564 [inline] kvm_arch_destroy_vm+0x2e8/0x470 arch/x86/kvm/x86.c:11676 kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1217 [inline] kvm_put_kvm+0x4fa/0xb00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1250 kvm_vm_release+0x3f/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1273 __fput+0x286/0x9f0 fs/file_table.c:311 task_work_run+0xdd/0x1a0 kernel/task_work.c:164 exit_task_work include/linux/task_work.h:32 [inline] do_exit+0xb29/0x2a30 kernel/exit.c:806 do_group_exit+0xd2/0x2f0 kernel/exit.c:935 get_signal+0x4b0/0x28c0 kernel/signal.c:2862 arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868 handle_signal_work kernel/entry/common.c:148 [inline] exit_to_user_mode_loop kernel/entry/common.c:172 [inline] exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline] syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300 do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x44/0xae </TASK> Cc: stable@vger.kernel.org Reported-by: syzbot+8112db3ab20e70d50c31@syzkaller.appspotmail.com Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20220125220358.2091737-1-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Backported-by: NTadeusz Struk <tadeusz.struk@linaro.org> Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYu Liao <liaoyu15@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Like Xu 提交于
stable inclusion from stable-v5.10.96 commit e92cac1dd803aca5bc326ec22bdcd4f56855d7ce bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e92cac1dd803aca5bc326ec22bdcd4f56855d7ce -------------------------------- commit 4c282e51 upstream. Do a runtime CPUID update for a vCPU if MSR_IA32_XSS is written, as the size in bytes of the XSAVE area is affected by the states enabled in XSS. Fixes: 20300099 ("kvm: vmx: add MSR logic for XSAVES") Cc: stable@vger.kernel.org Signed-off-by: NLike Xu <likexu@tencent.com> [sean: split out as a separate patch, adjust Fixes tag] Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20220126172226.2298529-3-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYu Liao <liaoyu15@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 28 4月, 2022 1 次提交
-
-
由 liangtian 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I53PTV?from=project-issue CVE: NA ----------------------------------------------------- Since the reset function is in kvm_intel module instead of kvm module, the attribute weak function in kvm_main.c could not be found, which would cause st_max in X86 never be refreshed. The solution is to define the reset function in x86.c under the kvm module. Signed-off-by: Nliangtian <liangtian13@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 19 4月, 2022 2 次提交
-
-
由 Wei Wang 提交于
stable inclusion from stable-v5.10.93 commit 4c7fb4d519e599bb69581d80fbfc1392cbea5fea bugzilla: 186204 https://gitee.com/openeuler/kernel/issues/I5311N Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4c7fb4d519e599bb69581d80fbfc1392cbea5fea -------------------------------- commit 9fb12fe5 upstream. The fixed counter 3 is used for the Topdown metrics, which hasn't been enabled for KVM guests. Userspace accessing to it will fail as it's not included in get_fixed_pmc(). This breaks KVM selftests on ICX+ machines, which have this counter. To reproduce it on ICX+ machines, ./state_test reports: Acked-by: NXie XiuQi <xiexiuqi@huawei.com> ==== Test Assertion Failure ==== lib/x86_64/processor.c:1078: r == nmsrs pid=4564 tid=4564 - Argument list too long 1 0x000000000040b1b9: vcpu_save_state at processor.c:1077 2 0x0000000000402478: main at state_test.c:209 (discriminator 6) 3 0x00007fbe21ed5f92: ?? ??:0 4 0x000000000040264d: _start at ??:? Unexpected result from KVM_GET_MSRS, r: 17 (failed MSR was 0x30c) With this patch, it works well. Signed-off-by: NWei Wang <wei.w.wang@intel.com> Message-Id: <20211217124934.32893-1-wei.w.wang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Fixes: e2ada66e ("kvm: x86: Add Intel PMU MSRs to msrs_to_save[]") Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Sean Christopherson 提交于
stable inclusion from stable-v5.10.93 commit 413b427f5fff5d658c2605ca889d6b13b88efd0c bugzilla: 186204 https://gitee.com/openeuler/kernel/issues/I5311N Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=413b427f5fff5d658c2605ca889d6b13b88efd0c -------------------------------- commit f4b027c5 upstream. Override the Processor Trace (PT) interrupt handler for guest mode if and only if PT is configured for host+guest mode, i.e. is being used independently by both host and guest. If PT is configured for system mode, the host fully controls PT and must handle all events. Fixes: 8479e04e ("KVM: x86: Inject PMI for KVM guest") Reported-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Reported-by: NArtem Kashkanov <artem.kashkanov@intel.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211111020738.2512932-4-seanjc@google.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
- 14 1月, 2022 1 次提交
-
-
由 Vitaly Kuznetsov 提交于
stable inclusion from stable-v5.10.88 commit 9439fabfc349c3e8eacfb6b0856d6739de129969 bugzilla: 186058 https://gitee.com/openeuler/kernel/issues/I4QW6A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9439fabfc349c3e8eacfb6b0856d6739de129969 -------------------------------- [ Upstream commit 1aa2abb3 ] The ability to write to MSR_IA32_PERF_CAPABILITIES from the host should not depend on guest visible CPUID entries, even if just to allow creating/restoring guest MSRs and CPUIDs in any sequence. Fixes: 27461da3 ("KVM: x86/pmu: Support full width counting") Suggested-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211216165213.338923-3-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 07 1月, 2022 1 次提交
-
-
由 Like Xu 提交于
mainline inclusion from mainline-v5.12-rc1 commit d855066f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d855066f81726155caf766e47eea58ae10b1fd57 ------------------- SVM already has specific handlers of MSR_IA32_DEBUGCTLMSR in the svm_get/set_msr, so the x86 common part can be safely moved to VMX. This allows KVM to store the bits it supports in GUEST_IA32_DEBUGCTL. Add vmx_supported_debugctl() to refactor the throwing logic of #GP. Signed-off-by: NLike Xu <like.xu@linux.intel.com> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Message-Id: <20210108013704.134985-2-like.xu@linux.intel.com> [Merge parts of Chenyi Qiang's "KVM: X86: Expose bus lock debug exception to guest". - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NWangJian <wangjian161@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 30 12月, 2021 3 次提交
-
-
由 Sean Christopherson 提交于
mainline inclusion from mainline-v5.14-rc1 commit 0aa18375 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0aa1837533e5f4be8cc21bbc06314c23ba2c5447 ------------------- Reset the MMU context at vCPU INIT (and RESET for good measure) if CR0.PG was set prior to INIT. Simply re-initializing the current MMU is not sufficient as the current root HPA may not be usable in the new context. E.g. if TDP is disabled and INIT arrives while the vCPU is in long mode, KVM will fail to switch to the 32-bit pae_root and bomb on the next VM-Enter due to running with a 64-bit CR3 in 32-bit mode. This bug was papered over in both VMX and SVM, but still managed to rear its head in the MMU role on VMX. Because EFER.LMA=1 requires CR0.PG=1, kvm_calc_shadow_mmu_root_page_role() checks for EFER.LMA without first checking CR0.PG. VMX's RESET/INIT flow writes CR0 before EFER, and so an INIT with the vCPU in 64-bit mode will cause the hack-a-fix to generate the wrong MMU role. In VMX, the INIT issue is specific to running without unrestricted guest since unrestricted guest is available if and only if EPT is enabled. Commit 8668a3c4 ("KVM: VMX: Reset mmu context when entering real mode") resolved the issue by forcing a reset when entering emulated real mode. In SVM, commit ebae871a ("kvm: svm: reset mmu on VCPU reset") forced a MMU reset on every INIT to workaround the flaw in common x86. Note, at the time the bug was fixed, the SVM problem was exacerbated by a complete lack of a CR4 update. The vendor resets will be reverted in future patches, primarily to aid bisection in case there are non-INIT flows that rely on the existing VMX logic. Because CR0.PG is unconditionally cleared on INIT, and because CR0.WP and all CR4/EFER paging bits are ignored if CR0.PG=0, simply checking that CR0.PG was '1' prior to INIT/RESET is sufficient to detect a required MMU context reset. Cc: stable@vger.kernel.org Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210622175739.3610207-4-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJingyi Wang <wangjingyi11@huawei.com> Reviewed-by: NZenghui Yu <yuzenghui@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wanpeng Li 提交于
mainline inclusion from mainline-v5.12-rc4 commit c2162e13 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c2162e13d6e2f43e5001a356196871642de070ba ------------------- In order to deal with noncoherent DMA, we should execute wbinvd on all dirty pCPUs when guest wbinvd exits to maintain data consistency. smp_call_function_many() does not execute the provided function on the local core, therefore replace it by on_each_cpu_mask(). Reported-by: NNadav Amit <namit@vmware.com> Cc: Nadav Amit <namit@vmware.com> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1615517151-7465-1-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJingyi Wang <wangjingyi11@huawei.com> Reviewed-by: NZenghui Yu <yuzenghui@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 xulei 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA ------------------- fix losing SMI problem Signed-off-by: Nxulei <stone.xulei@huawei.com> Signed-off-by: NJingyi Wang <wangjingyi11@huawei.com> Reviewed-by: NZenghui Yu <yuzenghui@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 15 11月, 2021 2 次提交
-
-
由 Fares Mehanna 提交于
stable inclusion from stable-5.10.72 commit ae34f26d4a8487e37ab9cfdca72f447c1ef49aa5 bugzilla: 182982 https://gitee.com/openeuler/kernel/issues/I4I3L1 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ae34f26d4a8487e37ab9cfdca72f447c1ef49aa5 -------------------------------- [ Upstream commit e1fc1553 ] Intel PMU MSRs is in msrs_to_save_all[], so add AMD PMU MSRs to have a consistent behavior between Intel and AMD when using KVM_GET_MSRS, KVM_SET_MSRS or KVM_GET_MSR_INDEX_LIST. We have to add legacy and new MSRs to handle guests running without X86_FEATURE_PERFCTR_CORE. Signed-off-by: NFares Mehanna <faresx@amazon.de> Message-Id: <20210915133951.22389-1-faresx@amazon.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Haimin Zhang 提交于
stable inclusion from stable-5.10.71 commit deb2949417677649e2413266d7ce8c2ff73952b4 bugzilla: 182981 https://gitee.com/openeuler/kernel/issues/I4I3KD Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=deb2949417677649e2413266d7ce8c2ff73952b4 -------------------------------- commit eb7511bf upstream. Check the return of init_srcu_struct(), which can fail due to OOM, when initializing the page track mechanism. Lack of checking leads to a NULL pointer deref found by a modified syzkaller. Reported-by: NTCS Robot <tcs_robot@tencent.com> Signed-off-by: NHaimin Zhang <tcs_kernel@tencent.com> Message-Id: <1630636626-12262-1-git-send-email-tcs_kernel@tencent.com> [Move the call towards the beginning of kvm_arch_init_vm. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 19 10月, 2021 3 次提交
-
-
由 Zelin Deng 提交于
stable inclusion from stable-5.10.65 commit c06e6ff2fcc34476eda29c5c14ca0db962f500cf bugzilla: 182361 https://gitee.com/openeuler/kernel/issues/I4EH3U Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c06e6ff2fcc34476eda29c5c14ca0db962f500cf -------------------------------- commit d9130a2d upstream. When MSR_IA32_TSC_ADJUST is written by guest due to TSC ADJUST feature especially there's a big tsc warp (like a new vCPU is hot-added into VM which has been up for a long time), tsc_offset is added by a large value then go back to guest. This causes system time jump as tsc_timestamp is not adjusted in the meantime and pvclock monotonic character. To fix this, just notify kvm to update vCPU's guest time before back to guest. Cc: stable@vger.kernel.org Signed-off-by: NZelin Deng <zelin.deng@linux.alibaba.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Message-Id: <1619576521-81399-2-git-send-email-zelin.deng@linux.alibaba.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wanpeng Li 提交于
stable inclusion from stable-5.10.61 commit 8e100c72b6ded07d5900c20ca5ee5d0c6d9c4293 bugzilla: 177029 https://gitee.com/openeuler/kernel/issues/I4EAXD Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8e100c72b6ded07d5900c20ca5ee5d0c6d9c4293 -------------------------------- [ Upstream commit da6393cd ] Reported by syzkaller: WARNING: CPU: 7 PID: 10526 at linux/arch/x86/kvm//x86.c:7621 x86_emulate_instruction+0x41b/0x510 [kvm] RIP: 0010:x86_emulate_instruction+0x41b/0x510 [kvm] Call Trace: kvm_mmu_page_fault+0x126/0x8f0 [kvm] vmx_handle_exit+0x11e/0x680 [kvm_intel] vcpu_enter_guest+0xd95/0x1b40 [kvm] kvm_arch_vcpu_ioctl_run+0x377/0x6a0 [kvm] kvm_vcpu_ioctl+0x389/0x630 [kvm] __x64_sys_ioctl+0x8e/0xd0 do_syscall_64+0x3c/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Commit 4a1e10d5 ("KVM: x86: handle hardware breakpoints during emulation()) adds hardware breakpoints check before emulation the instruction and parts of emulation context initialization, actually we don't have the EMULTYPE_NO_DECODE flag here and the emulation context will not be reused. Commit c8848cee ("KVM: x86: set ctxt->have_exception in x86_decode_insn()) triggers the warning because it catches the stale emulation context has #UD, however, it is not during instruction decoding which should result in EMULATION_FAILED. This patch fixes it by moving the second part emulation context initialization into init_emulate_ctxt() and before hardware breakpoints check. The ctxt->ud will be dropped by a follow-up patch. syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=134683fdd00000 Reported-by: syzbot+71271244f206d17f6441@syzkaller.appspotmail.com Fixes: 4a1e10d5 (KVM: x86: handle hardware breakpoints during emulation) Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Message-Id: <1622160097-37633-1-git-send-email-wanpengli@tencent.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wei Huang 提交于
stable inclusion from stable-5.10.61 commit 47d4c799979996e512730de23a08c1d17dc373db bugzilla: 177029 https://gitee.com/openeuler/kernel/issues/I4EAXD Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=47d4c799979996e512730de23a08c1d17dc373db -------------------------------- [ Upstream commit 4aa2691d ] Move the instruction decode part out of x86_emulate_instruction() for it to be used in other places. Also kvm_clear_exception_queue() is moved inside the if-statement as it doesn't apply when KVM are coming back from userspace. Co-developed-by: NBandan Das <bsd@redhat.com> Signed-off-by: NBandan Das <bsd@redhat.com> Signed-off-by: NWei Huang <wei.huang2@amd.com> Message-Id: <20210126081831.570253-2-wei.huang2@amd.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 15 10月, 2021 4 次提交
-
-
由 Paolo Bonzini 提交于
stable inclusion from stable-5.10.58 commit 309a31127befd0c9809036f3ea500fcf662f2bf3 bugzilla: 176984 https://gitee.com/openeuler/kernel/issues/I4E2P4 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=309a31127befd0c9809036f3ea500fcf662f2bf3 -------------------------------- commit fa7a549d upstream. Once an exception has been injected, any side effects related to the exception (such as setting CR2 or DR6) have been taked place. Therefore, once KVM sets the VM-entry interruption information field or the AMD EVENTINJ field, the next VM-entry must deliver that exception. Pending interrupts are processed after injected exceptions, so in theory it would not be a problem to use KVM_INTERRUPT when an injected exception is present. However, DOSEMU is using run->ready_for_interrupt_injection to detect interrupt windows and then using KVM_SET_SREGS/KVM_SET_REGS to inject the interrupt manually. For this to work, the interrupt window must be delayed after the completion of the previous event injection. Cc: stable@vger.kernel.org Reported-by: NStas Sergeev <stsp2@yandex.ru> Tested-by: NStas Sergeev <stsp2@yandex.ru> Fixes: 71cc849b ("KVM: x86: Fix split-irqchip vs interrupt injection window request") Reviewed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Vitaly Kuznetsov 提交于
stable inclusion from stable-5.10.56 commit 01f3581d4400f6586403409c5c3a3128fbb3e336 bugzilla: 176004 https://gitee.com/openeuler/kernel/issues/I4DYZ4 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=01f3581d4400f6586403409c5c3a3128fbb3e336 -------------------------------- [ Upstream commit 0a31df68 ] MSR_KVM_ASYNC_PF_ACK MSR is part of interrupt based asynchronous page fault interface and not the original (deprecated) KVM_FEATURE_ASYNC_PF. This is stated in Documentation/virt/kvm/msr.rst. Fixes: 66570e96 ("kvm: x86: only provide PV features if enabled in guest's CPUID") Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com> Reviewed-by: NOliver Upton <oupton@google.com> Message-Id: <20210722123018.260035-1-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Maxim Levitsky 提交于
stable inclusion from stable-5.10.55 commit dee8119eaa9e00f0d1e14f050e1867813a49a1bb bugzilla: 175636 https://gitee.com/openeuler/kernel/issues/I4DYWD Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=dee8119eaa9e00f0d1e14f050e1867813a49a1bb -------------------------------- commit b97f0745 upstream. A page fault can be queued while vCPU is in real paged mode on AMD, and AMD manual asks the user to always intercept it (otherwise result is undefined). The resulting VM exit, does have an error code. Signed-off-by: NMaxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210225154135.405125-2-mlevitsk@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NZubin Mithra <zsm@chromium.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Lai Jiangshan 提交于
stable inclusion from stable-5.10.52 commit 8b2ae2de536cca8ee725add5bccce4abec235880 bugzilla: 175542 https://gitee.com/openeuler/kernel/issues/I4DTKU Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8b2ae2de536cca8ee725add5bccce4abec235880 -------------------------------- commit f85d4016 upstream. When the host is using debug registers but the guest is not using them nor is the guest in guest-debug state, the kvm code does not reset the host debug registers before kvm_x86->run(). Rather, it relies on the hardware vmentry instruction to automatically reset the dr7 registers which ensures that the host breakpoints do not affect the guest. This however violates the non-instrumentable nature around VM entry and exit; for example, when a host breakpoint is set on vcpu->arch.cr2, Another issue is consistency. When the guest debug registers are active, the host breakpoints are reset before kvm_x86->run(). But when the guest debug registers are inactive, the host breakpoints are delayed to be disabled. The host tracing tools may see different results depending on what the guest is doing. To fix the problems, we clear %db7 unconditionally before kvm_x86->run() if the host has set any breakpoints, no matter if the guest is using them or not. Signed-off-by: NLai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20210628172632.81029-1-jiangshanlai@gmail.com> Cc: stable@vger.kernel.org [Only clear %db7 instead of reloading all debug registers. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 13 10月, 2021 1 次提交
-
-
由 Sean Christopherson 提交于
stable inclusion from stable-5.10.50 commit b2c5af71ce4b6e21f6af9fa292e72fd4662a2490 bugzilla: 174522 https://gitee.com/openeuler/kernel/issues/I4DNFY Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b2c5af71ce4b6e21f6af9fa292e72fd4662a2490 -------------------------------- [ Upstream commit 07ffaf34 ] Trigger a full TLB flush on behalf of the guest on nested VM-Enter and VM-Exit when VPID is disabled for L2. kvm_mmu_new_pgd() syncs only the current PGD, which can theoretically leave stale, unsync'd entries in a previous guest PGD, which could be consumed if L2 is allowed to load CR3 with PCID_NOFLUSH=1. Rename KVM_REQ_HV_TLB_FLUSH to KVM_REQ_TLB_FLUSH_GUEST so that it can be utilized for its obvious purpose of emulating a guest TLB flush. Note, there is no change the actual TLB flush executed by KVM, even though the fast PGD switch uses KVM_REQ_TLB_FLUSH_CURRENT. When VPID is disabled for L2, vpid02 is guaranteed to be '0', and thus nested_get_vpid02() will return the VPID that is shared by L1 and L2. Generate the request outside of kvm_mmu_new_pgd(), as getting the common helper to correctly identify which requested is needed is quite painful. E.g. using KVM_REQ_TLB_FLUSH_GUEST when nested EPT is in play is wrong as a TLB flush from the L1 kernel's perspective does not invalidate EPT mappings. And, by using KVM_REQ_TLB_FLUSH_GUEST, nVMX can do future simplification by moving the logic into nested_vmx_transition_tlb_flush(). Fixes: 41fab65e ("KVM: nVMX: Skip MMU sync on nested VMX transition when possible") Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210609234235.1244004-2-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 06 7月, 2021 1 次提交
-
-
由 Wanpeng Li 提交于
stable inclusion from stable-5.10.46 commit 3a9934d6b8dd8a91d61ed2d0d538fa27cb9192a3 bugzilla: 168323 CVE: NA -------------------------------- commit dfdc0a71 upstream. Commit c9b8b07c (KVM: x86: Dynamically allocate per-vCPU emulation context) tries to allocate per-vCPU emulation context dynamically, however, the x86_emulator slab cache is still exiting after the kvm module is unload as below after destroying the VM and unloading the kvm module. grep x86_emulator /proc/slabinfo x86_emulator 36 36 2672 12 8 : tunables 0 0 0 : slabdata 3 3 0 This patch fixes this slab cache leak by destroying the x86_emulator slab cache when the kvm module is unloaded. Fixes: c9b8b07c (KVM: x86: Dynamically allocate per-vCPU emulation context) Cc: stable@vger.kernel.org Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1623387573-5969-1-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-