Commits · 54d8e6b98fb03cb95cd8b416315d1048d1dcd80b · openeuler / Kernel

06 Jan, 2023 1 commit

KVM: VMX: Execute IBPB on emulated VM-exit when guest has IBRS · 54d8e6b9

Committed by Jim Mattson on Jan 06, 2023

mainline inclusion
from mainline-v6.2-rc1
commit 2e7eab81
category: bugfix
bugzilla: 188212
CVE: CVE-2022-2196

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=2e7eab81425ad6c875f2ed47c0ce01e78afc38a5

--------------------------------

According to Intel's document on Indirect Branch Restricted
Speculation, "Enabling IBRS does not prevent software from controlling
the predicted targets of indirect branches of unrelated software
executed later at the same predictor mode (for example, between two
different user applications, or two different virtual machines). Such
isolation can be ensured through use of the Indirect Branch Predictor
Barrier (IBPB) command." This applies to both basic and enhanced IBRS.

Since L1 and L2 VMs share hardware predictor modes (guest-user and
guest-kernel), hardware IBRS is not sufficient to virtualize
IBRS. (The way that basic IBRS is implemented on pre-eIBRS parts,
hardware IBRS is actually sufficient in practice, even though it isn't
sufficient architecturally.)

For virtual CPUs that support IBRS, add an indirect branch prediction
barrier on emulated VM-exit, to ensure that the predicted targets of
indirect branches executed in L1 cannot be controlled by software that
was executed in L2.

Since we typically don't intercept guest writes to IA32_SPEC_CTRL,
perform the IBPB at emulated VM-exit regardless of the current
IA32_SPEC_CTRL.IBRS value, even though the IBPB could technically be
deferred until L1 sets IA32_SPEC_CTRL.IBRS, if IA32_SPEC_CTRL.IBRS is
clear at emulated VM-exit.

This is CVE-2022-2196.
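
The policy described above can be sketched as a small user-space model. This is only an illustration, not the kernel code: `struct vcpu_model`, `issue_ibpb()` and `emulated_vmexit()` are hypothetical names, and the barrier is replaced by a counter so the behavior can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a vCPU; not the kernel's struct kvm_vcpu. */
struct vcpu_model {
    bool guest_has_ibrs;     /* IBRS is advertised to the guest */
    bool spec_ctrl_ibrs;     /* current IA32_SPEC_CTRL.IBRS (deliberately ignored) */
    unsigned int ibpb_count; /* counts barriers instead of issuing a real IBPB */
};

/* Stand-in for the hardware barrier: only records that it was requested. */
static void issue_ibpb(struct vcpu_model *v)
{
    v->ibpb_count++;
}

/*
 * Emulated VM-exit from L2 to L1: if the guest can use IBRS at all, issue
 * the barrier regardless of the current IA32_SPEC_CTRL.IBRS value, because
 * guest writes to that MSR are typically not intercepted.
 */
static void emulated_vmexit(struct vcpu_model *v)
{
    assert(v != NULL);
    if (v->guest_has_ibrs)
        issue_ibpb(v);
}
```

In this model a vCPU with `guest_has_ibrs` set gets one barrier per emulated VM-exit even when `spec_ctrl_ibrs` is clear, which mirrors the "regardless of the current IA32_SPEC_CTRL.IBRS value" choice in the message.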

Fixes: 5c911bef ("KVM: nVMX: Skip IBPB when switching between vmcs01 and vmcs02")
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20221019213620.1953281-3-jmattson@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
(cherry picked from commit 04856b0e)

54d8e6b9

24 Nov, 2022 3 commits

kvm: x86: Disable interception for IA32_XFD on demand · 7b32cbb5

Committed by Kevin Tian on Jan 05, 2022

mainline inclusion
from mainline-v5.17-rc1
commit b5274b1b
category: feature
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RQLJ
CVE: NA

Intel-SIG: commit b5274b1b kvm: x86: Disable interception for IA32_XFD on demand.

--------------------------------

Always intercepting IA32_XFD causes non-negligible overhead when this
register is updated frequently in the guest.

Disable r/w emulation after intercepting the first WRMSR(IA32_XFD)
with a non-zero value.

Disabling WRMSR emulation implies that IA32_XFD becomes out-of-sync
with the software states in fpstate and the per-cpu xfd cache. This
leads to two additional changes accordingly:

  - Call fpu_sync_guest_vmexit_xfd_state() after vm-exit to bring
    software states back in-sync with the MSR, before handle_exit_irqoff()
    is called.

  - Always trap #NM once write interception is disabled for IA32_XFD.
    The #NM exception is rare if the guest doesn't use dynamic
    features. Otherwise, there is at most one exception per guest
    task given a dynamic feature.
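
The on-demand behavior can be sketched with a small state machine. This is a user-space model under stated assumptions, not KVM code: `struct xfd_model`, `guest_wrmsr_xfd()` and `sync_after_vmexit()` are hypothetical names, and the MSR is an ordinary variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the IA32_XFD interception state for one vCPU. */
struct xfd_model {
    bool write_intercepted; /* WRMSR(IA32_XFD) still traps to the hypervisor */
    bool nm_trapped;        /* #NM is always trapped once passthrough is on */
    uint64_t hw_xfd;        /* value held by the (modeled) MSR */
    uint64_t sw_xfd;        /* software copy: fpstate and per-CPU cache */
    unsigned int exits;     /* VM-exits caused by WRMSR(IA32_XFD) */
};

/* Guest executes WRMSR(IA32_XFD, val). */
static void guest_wrmsr_xfd(struct xfd_model *m, uint64_t val)
{
    assert(m != NULL);
    m->hw_xfd = val;
    if (!m->write_intercepted)
        return;              /* passthrough: the software copy goes stale */

    m->exits++;
    m->sw_xfd = val;         /* emulated write keeps software state in sync */
    if (val != 0) {
        /* First non-zero write: stop intercepting, start trapping #NM. */
        m->write_intercepted = false;
        m->nm_trapped = true;
    }
}

/* After any VM-exit: resynchronize software state if writes pass through. */
static void sync_after_vmexit(struct xfd_model *m)
{
    assert(m != NULL);
    if (!m->write_intercepted)
        m->sw_xfd = m->hw_xfd;
}
```

Starting from `write_intercepted = true`, a write of zero keeps interception on, the first non-zero write turns it off, and later writes cost no exit but leave `sw_xfd` stale until `sync_after_vmexit()` runs.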

p.s. We have confirmed that SDM is being revised to say that
when setting IA32_XFD[18] the AMX register state is not guaranteed
to be preserved. This clarification avoids adding mess for a creative
guest which sets IA32_XFD[18]=1 before saving active AMX state to
its own storage.

Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220105123532.12586-22-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Wang <lin.x.wang@intel.com>

7b32cbb5

kvm: x86: Disable RDMSR interception of IA32_XFD_ERR · 8f85b372

Committed by Jing Liu on Jan 05, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 61f20813
category: feature
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RQLJ
CVE: NA

Intel-SIG: commit 61f20813 kvm: x86: Disable RDMSR interception of IA32_XFD_ERR.

--------------------------------

This saves one unnecessary VM-exit in guest #NM handler, given that the
MSR is already restored with the guest value before the guest is resumed.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220105123532.12586-15-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Wang <lin.x.wang@intel.com>

8f85b372

kvm: x86: Intercept #NM for saving IA32_XFD_ERR · 4a642360

Committed by Jing Liu on Jan 05, 2022

mainline inclusion
from mainline-v5.17-rc1
commit ec5be88a
category: feature
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RQLJ
CVE: NA

Intel-SIG: commit ec5be88a kvm: x86: Intercept #NM for saving IA32_XFD_ERR.

--------------------------------

Guest IA32_XFD_ERR is generally modified in two places:

  - Set by CPU when #NM is triggered;
  - Cleared by guest in its #NM handler;

Intercept #NM for the first case when a nonzero value is written
to IA32_XFD. Nonzero indicates that the guest is willing to do
dynamic fpstate expansion for certain xfeatures, thus KVM needs to
manage and virtualize guest XFD_ERR properly. The vcpu exception
bitmap is updated in XFD write emulation according to guest_fpu::xfd.

Save the current XFD_ERR value to the guest_fpu container in the #NM
VM-exit handler. This must be done with interrupts disabled, otherwise
the unsaved MSR value may be clobbered by host activity.

The saving operation is conducted conditionally only when guest_fpu::xfd
includes a non-zero value. Doing so also avoids a misread on a platform
which doesn't support XFD but #NM is triggered due to L1 interception.

Queueing #NM to the guest is postponed to handle_exception_nmi(). This
goes through the nested_vmx check so a virtual vmexit is queued instead
when #NM is triggered in L2 but L1 wants to intercept it.
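
The save and restore ordering that this message describes can be modeled in user space as follows. This is a sketch under assumptions: the MSR is a plain variable, and `struct xfd_err_model` and the three helpers are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the XFD_ERR state for one vCPU. */
struct xfd_err_model {
    uint64_t msr_xfd_err;   /* stand-in for the hardware IA32_XFD_ERR MSR */
    uint64_t guest_xfd;     /* guest_fpu::xfd */
    uint64_t guest_xfd_err; /* guest XFD_ERR saved in the guest_fpu container */
};

/* #NM VM-exit handler, interrupts still disabled: save only if XFD is in use. */
static void nm_exit_irqoff(struct xfd_err_model *m)
{
    assert(m != NULL);
    if (m->guest_xfd != 0)
        m->guest_xfd_err = m->msr_xfd_err;
}

/* Before interrupts are enabled: the host expects the MSR to read zero. */
static void restore_host_xfd_err(struct xfd_err_model *m)
{
    assert(m != NULL);
    if (m->guest_xfd_err != 0)
        m->msr_xfd_err = 0;
}

/* Right before VM-entry, interrupts disabled: put the guest value back. */
static void load_guest_xfd_err(struct xfd_err_model *m)
{
    assert(m != NULL);
    if (m->guest_xfd_err != 0)
        m->msr_xfd_err = m->guest_xfd_err;
}
```

When `guest_xfd` is zero the model never reads the MSR, which corresponds to the case of a platform without XFD where #NM is intercepted only on behalf of L1.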

Restore the host value (always ZERO outside of the host #NM
handler) before enabling interrupts.

Restore the guest value from the guest_fpu container right before
entering the guest (with interrupts disabled).

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220105123532.12586-13-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Wang <lin.x.wang@intel.com>

4a642360

18 Nov, 2022 4 commits

x86/fpu: Replace the includes of fpu/internal.h · 6a08c572

Committed by Thomas Gleixner on Oct 15, 2021

mainline inclusion
from mainline-v5.16-rc1
commit b56d2795
category: feature
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC
CVE: NA

Intel-SIG: commit b56d2795 x86/fpu: Replace the includes of fpu/internal.h.

--------------------------------

Now that the file is empty, fix up all references with the proper includes
and delete the former kitchen sink.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011540.001197214@linutronix.de
Signed-off-by: Lin Wang <lin.x.wang@intel.com>
Signed-off-by: Aichun Shi <aichun.shi@intel.com>

6a08c572

KVM: x86: Move vendor CR4 validity check to dedicated kvm_x86_ops hook · 3c6b4607

Committed by Sean Christopherson on Nov 18, 2022

stable inclusion
from stable-v5.10.137
commit c72a9b1d0dadfd85d19eaea81f61f7b286c57a31
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I60PLB

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c72a9b1d0dadfd85d19eaea81f61f7b286c57a31

--------------------------------

[ Upstream commit c2fe3cd4 ]

Split out VMX's checks on CR4.VMXE to a dedicated hook, .is_valid_cr4(),
and invoke the new hook from kvm_valid_cr4().  This fixes an issue where
KVM_SET_SREGS would return success while failing to actually set CR4.

Fixing the issue by explicitly checking kvm_x86_ops.set_cr4()'s return
in __set_sregs() is not a viable option as KVM has already stuffed a
variety of vCPU state.

Note, kvm_valid_cr4() and is_valid_cr4() have different return types and
inverted semantics.  This will be remedied in a future patch.
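
The idea of validating CR4 through a vendor hook before any vCPU state is modified can be sketched as below. This is a user-space model, not the kernel's implementation: the struct layouts, the `in_smm` field and the function names are hypothetical. It keeps the point made in the note above, namely that the hook returns a bool (true means valid) while the common check returns an int (0 means valid).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CR4_VMXE (1ULL << 13) /* CR4.VMXE is bit 13 */

/* Hypothetical vCPU model. */
struct vcpu_model {
    uint64_t cr4;
    uint64_t cr4_rsvd_bits; /* bits the guest may never set */
    bool in_smm;            /* assumption: vendor rule rejects VMXE in SMM */
};

/* Hypothetical vendor ops table with a dedicated validity hook. */
struct x86_ops_model {
    bool (*is_valid_cr4)(const struct vcpu_model *v, uint64_t cr4);
};

/* Vendor hook: true means the value is acceptable. */
static bool vendor_is_valid_cr4(const struct vcpu_model *v, uint64_t cr4)
{
    return !((cr4 & CR4_VMXE) && v->in_smm);
}

/* Common check: 0 means valid, non-zero means invalid (inverted semantics). */
static int valid_cr4(const struct x86_ops_model *ops,
                     const struct vcpu_model *v, uint64_t cr4)
{
    assert(ops != NULL && ops->is_valid_cr4 != NULL);
    if (cr4 & v->cr4_rsvd_bits)
        return -1;
    return ops->is_valid_cr4(v, cr4) ? 0 : -1;
}

/* Model of a set-sregs path: validate first, then commit state. */
static int set_sregs_model(const struct x86_ops_model *ops,
                           struct vcpu_model *v, uint64_t cr4)
{
    int err = valid_cr4(ops, v, cr4);

    if (err)
        return err; /* nothing has been modified yet */
    v->cr4 = cr4;
    return 0;
}
```

Because the vendor check runs inside the validity test, a rejected value produces an error before `v->cr4` is touched, instead of reporting success after a silent failure to set CR4.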

Fixes: 5e1746d6 ("KVM: nVMX: Allow setting the VMXE bit in CR4")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20201007014417.29276-5-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>

3c6b4607

KVM: VMX: Drop explicit 'nested' check from vmx_set_cr4() · 26296e1c

Committed by Sean Christopherson on Nov 18, 2022

stable inclusion
from stable-v5.10.137
commit da7f731f2ed5b4a082567967ce74be274aab2daf
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I60PLB

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=da7f731f2ed5b4a082567967ce74be274aab2daf

--------------------------------

[ Upstream commit a447e38a ]

Drop vmx_set_cr4()'s explicit check on the 'nested' module param now
that common x86 handles the check by incorporating VMXE into the CR4
reserved bits, via kvm_cpu_caps.  X86_FEATURE_VMX is set in kvm_cpu_caps
(by vmx_set_cpu_caps()), if and only if 'nested' is true.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20201007014417.29276-3-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>

26296e1c

KVM: VMX: Drop guest CPUID check for VMXE in vmx_set_cr4() · be916451

Committed by Sean Christopherson on Nov 18, 2022

stable inclusion
from stable-v5.10.137
commit 8b8b376903b32d3d854f39eeebe018169c920cb6
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I60PLB

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8b8b376903b32d3d854f39eeebe018169c920cb6

--------------------------------

[ Upstream commit d3a9e414 ]

Drop vmx_set_cr4()'s somewhat hidden guest_cpuid_has() check on VMXE now
that common x86 handles the check by incorporating VMXE into the CR4
reserved bits, i.e. in cr4_guest_rsvd_bits.  This fixes a bug where KVM
incorrectly rejects KVM_SET_SREGS with CR4.VMXE=1 if it's executed
before KVM_SET_CPUID{,2}.

Fixes: 5e1746d6 ("KVM: nVMX: Allow setting the VMXE bit in CR4")
Reported-by: Stas Sergeev <stsp@users.sourceforge.net>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20201007014417.29276-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>

be916451

03 Nov, 2022 5 commits

KVM: X86: Expose bus lock debug exception to guest · b9ddddea

Committed by Paolo Bonzini on May 06, 2021

mainline inclusion
from mainline-v5.13-rc2
commit 76ea438b
category: feature
feature: KVM Bus Lock Debug Exception
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RHW7
CVE: N/A
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=76ea438b

Intel-SIG: commit 76ea438b ("KVM: X86: Expose bus lock debug exception to guest")

-------------------------------------

KVM: X86: Expose bus lock debug exception to guest

The bus lock debug exception notifies the kernel with a #DB trap after an
instruction that acquired a bus lock has executed at CPL>0. This allows
the kernel to enforce user application throttling or mitigations.

Existence of bus lock debug exception is enumerated via
CPUID.(EAX=7,ECX=0).ECX[24]. Software can enable these exceptions by
setting bit 2 of the MSR_IA32_DEBUGCTL. Expose the CPUID to guest and
emulate the MSR handling when guest enables it.
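
The enumeration bit and the enable bit named above can be expressed as two small helpers. This is a minimal sketch that operates on register values passed in by the caller rather than executing CPUID or WRMSR; the macro and function names are hypothetical, and rejecting the write for a guest without the feature is an assumption about how the emulation might behave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CPUID_7_0_ECX_BUS_LOCK_DETECT (1u << 24) /* CPUID.(EAX=7,ECX=0).ECX[24] */
#define DEBUGCTL_BUS_LOCK_DETECT      (1ULL << 2) /* bit 2 of IA32_DEBUGCTL */

/* True if the given CPUID.(EAX=7,ECX=0).ECX value enumerates the feature. */
static bool has_bus_lock_detect(uint32_t cpuid_7_0_ecx)
{
    return (cpuid_7_0_ecx & CPUID_7_0_ECX_BUS_LOCK_DETECT) != 0;
}

/*
 * Model of an emulated guest write to IA32_DEBUGCTL: the enable bit is
 * accepted only if the feature is exposed in the guest's CPUID.
 * Returns 0 and updates *debugctl on success, -1 otherwise.
 */
static int guest_set_debugctl(uint64_t *debugctl, uint64_t val,
                              uint32_t guest_cpuid_7_0_ecx)
{
    assert(debugctl != NULL);
    if ((val & DEBUGCTL_BUS_LOCK_DETECT) &&
        !has_bus_lock_detect(guest_cpuid_7_0_ecx))
        return -1;
    *debugctl = val;
    return 0;
}
```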

Support for this feature was originally developed by Xiaoyao Li and
Chenyi Qiang, but code has since changed enough that this patch has
nothing in common with theirs, except for this commit message.

Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210202090433.13441-4-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aichun Shi <aichun.shi@intel.com>

b9ddddea

KVM: X86: Rename DR6_INIT to DR6_ACTIVE_LOW · 232e5522

Committed by Chenyi Qiang on Feb 02, 2021

mainline inclusion
from mainline-v5.12-rc1
commit 9a3ecd5e
category: feature
feature: KVM Bus Lock Debug Exception
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RHW7
CVE: N/A
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9a3ecd5e

Intel-SIG: commit 9a3ecd5e ("KVM: X86: Rename DR6_INIT to DR6_ACTIVE_LOW")

-------------------------------------

KVM: X86: Rename DR6_INIT to DR6_ACTIVE_LOW

DR6_INIT contains the 1-reserved bits as well as the bit that is cleared
to 0 when the condition (e.g. RTM) happens. The value can be used to
initialize dr6 and also be the XOR mask between the #DB exit
qualification (or payload) and DR6.
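
The XOR-mask role of this value can be illustrated with a short sketch. The constants below are written from memory of the kernel headers at the time of this change and should be treated as assumptions, as should the helper name `apply_db_payload()`; the payload is taken to be active-high for every bit, while DR6 stores the RTM bit active-low.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; check them against the kernel headers before relying on them. */
#define DR6_ACTIVE_LOW 0xffff0ff0u /* 1-reserved bits plus active-low bits */
#define DR6_VOLATILE   0x0001e00fu /* bits that hardware or software may change */
#define DR6_FIXED_1    (DR6_ACTIVE_LOW & ~DR6_VOLATILE)
#define DR6_BS         (1u << 14)  /* single-step, active-high in DR6 */
#define DR6_RTM        (1u << 16)  /* active-low in DR6 */
#define DR_TRAP_BITS   0xfu        /* B0..B3 */

static_assert(DR6_FIXED_1 == 0xfffe0ff0u, "fixed-1 bits follow from the two masks");

/* Fold an active-high #DB payload into a DR6 value. */
static uint32_t apply_db_payload(uint32_t dr6, uint32_t payload)
{
    dr6 &= ~DR_TRAP_BITS;            /* drop stale breakpoint hits */
    dr6 |= DR6_ACTIVE_LOW;           /* start from the "nothing happened" state */
    dr6 |= payload;                  /* set the active-high bits */
    dr6 ^= payload & DR6_ACTIVE_LOW; /* flip active-low bits named in the payload */
    return dr6;
}
```

With these values, a payload of `DR6_RTM` clears bit 16 in the result, while a payload of `DR6_BS` sets bit 14, so one mask serves as both the initial value and the polarity converter.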

Since DR6_INIT is used as an initial value only once, rename it to
DR6_ACTIVE_LOW and apply it in other places, which simplifies the
upcoming changes for bus lock debug exception.

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210202090433.13441-2-chenyi.qiang@intel.com>
[Define DR6_FIXED_1 from DR6_ACTIVE_LOW and DR6_VOLATILE. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aichun Shi <aichun.shi@intel.com>

232e5522

KVM: VMX: Enable Notify VM exit · 0cbdfd9b

Committed by Tao Xu on May 24, 2022

mainline inclusion
from mainline-v6.0-rc1
commit 2f4073e0
category: feature
feature: Notify VM exit
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5PAJ5
CVE: N/A
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2f4073e0

Intel-SIG: commit 2f4073e0 ("KVM: VMX: Enable Notify VM exit")

-------------------------------------

KVM: VMX: Enable Notify VM exit

A malicious virtual machine can cause a CPU to get stuck because no event
window opens up, e.g. an infinite loop in microcode when handling nested
#AC (CVE-2015-5307). Without an event window, no event (NMI, SMI or IRQ)
can be delivered, which makes the CPU unavailable to the host and to
other VMs.

A VMM can enable notify VM exit, which generates a VM exit if no event
window occurs in VMX non-root mode for a specified amount of time (the
notify window).

Feature enabling:
- The new vmcs field SECONDARY_EXEC_NOTIFY_VM_EXITING is introduced to
  enable this feature. VMM can set NOTIFY_WINDOW vmcs field to adjust
  the expected notify window.
- Add a new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT so that user space
  can query and enable this feature in per-VM scope. The argument is a
  64bit value: bits 63:32 are used for notify window, and bits 31:0 are
  for flags. Current supported flags:
  - KVM_X86_NOTIFY_VMEXIT_ENABLED: enable the feature with the notify
    window provided.
  - KVM_X86_NOTIFY_VMEXIT_USER: exit to userspace once the exits happen.
- It's safe to even set notify window to zero since an internal hardware
  threshold is added to vmcs.notify_window.
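
The 64-bit capability argument described in the list above can be packed and unpacked as follows. This is a sketch only: the macro names are hypothetical stand-ins for the flags named in the message, and their bit positions (bit 0 and bit 1) are an assumption, not taken from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions for the two flags named in the message. */
#define NOTIFY_VMEXIT_ENABLED (1u << 0)
#define NOTIFY_VMEXIT_USER    (1u << 1)

/* Build the argument: bits 63:32 hold the notify window, bits 31:0 the flags. */
static uint64_t notify_vmexit_arg(uint32_t window, uint32_t flags)
{
    /* Only the two flags listed in the message are known to this sketch. */
    assert((flags & ~(NOTIFY_VMEXIT_ENABLED | NOTIFY_VMEXIT_USER)) == 0);
    return ((uint64_t)window << 32) | flags;
}

/* Extract the notify window from bits 63:32. */
static uint32_t notify_vmexit_window(uint64_t arg)
{
    return (uint32_t)(arg >> 32);
}

/* Extract the flags from bits 31:0. */
static uint32_t notify_vmexit_flags(uint64_t arg)
{
    return (uint32_t)arg;
}
```

A window of zero is a legal input here, matching the remark that the hardware adds an internal threshold to the programmed value.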

VM exit handling:
- Introduce a vcpu state notify_window_exits to record the count of
  notify VM exits and expose it through the debugfs.
- Notify VM exit can happen incident to delivery of a vectored event.
  Allow it in KVM.
- Exit to userspace unconditionally for handling when VM_CONTEXT_INVALID
  bit is set.

Nested handling
- Nested notify VM exits are not supported yet. Keep the same notify
  window control in vmcs02 as vmcs01, so that L1 can't escape the
  restriction of notify VM exits through launching L2 VM.

Notify VM exit is defined in the latest Intel Architecture Instruction Set
Extensions Programming Reference, chapter 9.2.

Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20220524135624.22988-5-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aichun Shi <aichun.shi@intel.com>

0cbdfd9b

KVM: VMX: Remove redundant handling of bus lock vmexit · 265cc29f

Committed by Hao Xiang on Oct 15, 2021

mainline inclusion
from mainline-v5.15-rc7
commit d61863c6
category: feature
feature: KVM Bus Lock VM Exit
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RJCB
CVE: N/A
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d61863c6

Intel-SIG: commit d61863c6 ("KVM: VMX: Remove redundant handling of bus lock vmexit")

-------------------------------------

KVM: VMX: Remove redundant handling of bus lock vmexit

Hardware may or may not set exit_reason.bus_lock_detected on BUS_LOCK
VM-Exits. Dealing with KVM_RUN_X86_BUS_LOCK in handle_bus_lock_vmexit
could be redundant when exit_reason.basic is EXIT_REASON_BUS_LOCK.

We can remove redundant handling of bus lock vmexit. Unconditionally set
exit_reason.bus_lock_detected in handle_bus_lock_vmexit(), and deal with
KVM_RUN_X86_BUS_LOCK only in vmx_handle_exit().

Signed-off-by: Hao Xiang <hao.xiang@linux.alibaba.com>
Message-Id: <1634299161-30101-1-git-send-email-hao.xiang@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aichun Shi <aichun.shi@intel.com>

265cc29f

KVM: VMX: Enable bus lock VM exit · 26bba696

Committed by Chenyi Qiang on Nov 06, 2020

mainline inclusion
from mainline-v5.12-rc1
commit fe6b6bc8
category: feature
feature: KVM Bus Lock VM Exit
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RJCB
CVE: N/A
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe6b6bc8

Intel-SIG: commit fe6b6bc8 ("KVM: VMX: Enable bus lock VM exit")

-------------------------------------

KVM: VMX: Enable bus lock VM exit

A virtual machine can exploit bus locks to degrade system performance.
A bus lock can be caused by a split-locked access to writeback (WB)
memory or by using locks on uncacheable (UC) memory. A bus lock is
typically >1000 cycles slower than an atomic operation within a cache
line. It also disrupts performance on other cores (which must wait for
the bus lock to be released before their memory operations can
complete).

To address the threat, bus lock VM exit is introduced to notify the VMM
when a bus lock was acquired, allowing it to enforce throttling or other
policy based mitigations.

A VMM can enable VM exit due to bus locks by setting a new "Bus Lock
Detection" VM-execution control(bit 30 of Secondary Processor-based VM
execution controls). If delivery of this VM exit was preempted by a
higher priority VM exit (e.g. EPT misconfiguration, EPT violation, APIC
access VM exit, APIC write VM exit, exception bitmap exiting), bit 26 of
exit reason in vmcs field is set to 1.

In current implementation, the KVM exposes this capability through
KVM_CAP_X86_BUS_LOCK_EXIT. The user can get the supported mode bitmap
(i.e. off and exit) and enable it explicitly (disabled by default). If
bus locks in guest are detected by KVM, exit to user space even when
current exit reason is handled by KVM internally. Set a new flag,
KVM_RUN_X86_BUS_LOCK, in vcpu->run->flags to inform user space that there
is a bus lock detected in guest.
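
The two ways a bus lock can be reported (as the basic exit reason, or as bit 26 when a higher-priority exit preempted it) can be decoded as in this sketch. The numeric value 74 for the bus lock exit reason and the 16-bit basic-reason mask are assumptions recalled from the VMX definitions, and the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXIT_REASON_BASIC_MASK        0xffffu    /* low 16 bits: basic reason */
#define EXIT_REASON_BUS_LOCK          74u        /* assumed basic exit reason */
#define EXIT_REASON_BUS_LOCK_DETECTED (1u << 26) /* bit 26 of the exit reason */

static_assert((EXIT_REASON_BUS_LOCK & ~EXIT_REASON_BASIC_MASK) == 0,
              "basic exit reason must fit in the low 16 bits");

/* True if this exit reports a bus lock, directly or via the preempted bit. */
static bool exit_reports_bus_lock(uint32_t exit_reason)
{
    if ((exit_reason & EXIT_REASON_BASIC_MASK) == EXIT_REASON_BUS_LOCK)
        return true;
    return (exit_reason & EXIT_REASON_BUS_LOCK_DETECTED) != 0;
}
```

A VMM following the message would set the bus lock flag in the run structure and exit to user space whenever this predicate holds, even if the basic exit reason itself was handled inside KVM.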

Documentation for bus lock VM exit is available in the latest "Intel
Architecture Instruction Set Extensions Programming Reference".

Document Link:
https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20201106090315.18606-4-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aichun Shi <aichun.shi@intel.com>

26bba696

27 Oct, 2022 1 commit

KVM: x86: Account a variety of miscellaneous allocations · 3ed16a7d

Committed by Sean Christopherson on Oct 26, 2022

stable inclusion
from stable-v5.10.124
commit d6be031a2f5e27f27f3648bac98d2a35874eaddc
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5L6E7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d6be031a2f5e27f27f3648bac98d2a35874eaddc

--------------------------------

commit eba04b20 upstream.

Switch to GFP_KERNEL_ACCOUNT for a handful of allocations that are
clearly associated with a single task/VM.

Note, there are several SEV allocations that aren't accounted, but
those can (hopefully) be fixed by using the local stack for memory.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210331023025.2485960-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[sudip: adjust context]
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>

3ed16a7d

08 Oct, 2022 4 commits

KVM: VMX: enable IPI virtualization · c52cf141

Committed by Chao Gao on Apr 19, 2022

mainline inclusion
from mainline-v6.0-rc1
commit d588bb9b
category: feature
feature: IPI Virtualization
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5ODSC
CVE: N/A
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d588bb9be1da6aa750aa64875fe57369db983d8b

Intel-SIG: commit d588bb9b ("KVM: VMX: enable IPI virtualization")

-------------------------------------

KVM: VMX: enable IPI virtualization

With IPI virtualization enabled, the processor emulates writes to
APIC registers that would send IPIs. The processor sets the bit
corresponding to the vector in target vCPU's PIR and may send a
notification (IPI) specified by NDST and NV fields in target vCPU's
Posted-Interrupt Descriptor (PID). It is similar to what IOMMU
engine does when dealing with posted interrupt from devices.

A PID-pointer table is used by the processor to locate the PID of a
vCPU with the vCPU's APIC ID. The table size depends on maximum APIC
ID assigned for current VM session from userspace. Allocating memory
for PID-pointer table is deferred to vCPU creation, because irqchip
mode and VM-scope maximum APIC ID is settled at that point. KVM can
skip PID-pointer table allocation if !irqchip_in_kernel().
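
How the table size follows from the maximum APIC ID can be sketched as below. This is an illustration under assumptions: entries are taken to be 8 bytes with bit 0 as a valid flag, the page size is fixed at 4096, and pages are simply rounded up (a real allocator may round to a power-of-two page count instead). All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE       4096u
#define PID_TABLE_ENTRY_VALID 1ULL /* assumed: bit 0 marks a valid entry */

/* Pages needed for a table indexed by APIC ID 0..max_apic_id. */
static size_t pid_table_pages(uint32_t max_apic_id)
{
    size_t bytes = ((size_t)max_apic_id + 1) * sizeof(uint64_t);

    return (bytes + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}

/* Build a table entry from the physical address of a vCPU's PID. */
static uint64_t pid_table_entry(uint64_t pid_phys_addr)
{
    /* A posted-interrupt descriptor is 64-byte aligned. */
    assert((pid_phys_addr & 63) == 0);
    return pid_phys_addr | PID_TABLE_ENTRY_VALID;
}
```

For example, APIC IDs 0 through 511 fit in a single 4096-byte page, and ID 512 pushes the table to a second page, which is why the allocation waits until the VM-scope maximum APIC ID is settled.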

Like VT-d PI, if a vCPU goes to blocked state, VMM needs to switch its
notification vector to wakeup vector. This can ensure that when an IPI
for blocked vCPUs arrives, VMM can get control and wake up blocked
vCPUs. And if a VCPU is preempted, its posted interrupt notification
is suppressed.

Note that IPI virtualization can only virtualize physical-addressing,
flat mode, unicast IPIs. Sending other IPIs would still cause a
trap-like APIC-write VM-exit and need to be handled by VMM.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Message-Id: <20220419154510.11938-1-guang.zeng@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Zeng <jason.zeng@intel.com>

c52cf141

KVM: VMX: Clean up vmx_refresh_apicv_exec_ctrl() · d178b8c3

Committed by Zeng Guang on Apr 19, 2022

mainline inclusion
from mainline-v6.0-rc1
commit f08a06c9
category: feature
feature: IPI Virtualization
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5ODSC
CVE: N/A
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f08a06c9a35706349f74b7a18deefe3f89f73e8e

Intel-SIG: commit f08a06c9 ("KVM: VMX: Clean up vmx_refresh_apicv_exec_ctrl()")

-------------------------------------

KVM: VMX: Clean up vmx_refresh_apicv_exec_ctrl()

Remove the cpu_has_secondary_exec_ctrls() check. Calling
vmx_refresh_apicv_exec_ctrl() presupposes that secondary controls are
activated and that the APICv-related VMCS fields are valid. If it is
invoked in the wrong circumstances, the worst case is that the VMX
operation reports a VMfailValid error without further harmful impact and
the CPU functions as if all the secondary controls were 0.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Message-Id: <20220419153604.11786-1-guang.zeng@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Zeng <jason.zeng@intel.com>

d178b8c3

KVM: VMX: Report tertiary_exec_control field in dump_vmcs() · 88642907

Committed by Robert Hoo on Apr 19, 2022

mainline inclusion
from mainline-v6.0-rc1
commit 0b85baa5
category: feature
feature: IPI Virtualization
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5ODSC
CVE: N/A
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0b85baa5f46de1c6ad6e4b987905df041f2f80f0

Intel-SIG: commit 0b85baa5 ("KVM: VMX: Report tertiary_exec_control field in dump_vmcs()")

-------------------------------------

KVM: VMX: Report tertiary_exec_control field in dump_vmcs()

Add tertiary_exec_control field report in dump_vmcs(). Meanwhile,
reorganize the dump output of VMCS category as follows.

Before change:
*** Control State ***
 PinBased=0x000000ff CPUBased=0xb5a26dfa SecondaryExec=0x061037eb
 EntryControls=0000d1ff ExitControls=002befff

After change:
*** Control State ***
 CPUBased=0xb5a26dfa SecondaryExec=0x061037eb TertiaryExec=0x0000000000000010
 PinBased=0x000000ff EntryControls=0000d1ff ExitControls=002befff
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Message-Id: <20220419153441.11687-1-guang.zeng@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Zeng <jason.zeng@intel.com>

88642907

KVM: VMX: Detect Tertiary VM-Execution control when setup VMCS config · d2ad10f2

Committed by Robert Hoo on Apr 19, 2022

mainline inclusion
from mainline-v6.0-rc1
commit 1ad4e543
category: feature
feature: IPI Virtualization
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5ODSC
CVE: N/A
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1ad4e5438c67a01620ed67cea959de89f4430515

Intel-SIG: commit 1ad4e543 ("KVM: VMX: Detect Tertiary VM-Execution control when setup VMCS config")

-------------------------------------

KVM: VMX: Detect Tertiary VM-Execution control when setup VMCS config

Check VMX features on tertiary execution control in VMCS config setup.
Sub-features in tertiary execution control to be enabled are adjusted
according to hardware capabilities although no sub-feature is enabled
in this patch.

EVMCSv1 doesn't support tertiary VM-execution control, so disable it
when EVMCSv1 is in use. And define the auxiliary functions for Tertiary
control field here, using the new BUILD_CONTROLS_SHADOW().

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Message-Id: <20220419153400.11642-1-guang.zeng@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Zeng <jason.zeng@intel.com>

d2ad10f2

20 Sep, 2022 6 commits

KVM: VMX: Fix IBRS handling after vmexit · f52a3871

Committed by Josh Poimboeuf on Sep 20, 2022

stable inclusion
from stable-v5.10.133
commit 47ae76fb27398e867980d63789058ff7c4f12a35
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5PTAS
CVE: CVE-2022-29900,CVE-2022-23816,CVE-2022-29901

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=47ae76fb27398e867980d63789058ff7c4f12a35

--------------------------------

commit bea7e31a upstream.

For legacy IBRS to work, the IBRS bit needs to be always re-written
after vmexit, even if it's already on.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Lin Yujun <linyujun809@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

f52a3871

KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS · c261aef7

Committed by Josh Poimboeuf on Sep 20, 2022

stable inclusion
from stable-v5.10.133
commit 5269be9111e2b66572e78647f2e8948f7fc96466
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5PTAS
CVE: CVE-2022-29900,CVE-2022-23816,CVE-2022-29901

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5269be9111e2b66572e78647f2e8948f7fc96466

--------------------------------

commit fc02735b upstream.

On eIBRS systems, the returns in the vmexit return path from
__vmx_vcpu_run() to vmx_vcpu_run() are exposed to RSB poisoning attacks.

Fix that by moving the post-vmexit spec_ctrl handling to immediately
after the vmexit.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

conflict:
	arch/x86/kvm/vmx/vmx.h

Signed-off-by: Lin Yujun <linyujun809@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

c261aef7

KVM: VMX: Convert launched argument to flags · decddf55

Committed by Josh Poimboeuf on Sep 20, 2022

stable inclusion
from stable-v5.10.133
commit 84061fff2ad98a7809f00e88a54f584f84830388
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5PTAS
CVE: CVE-2022-29900,CVE-2022-23816,CVE-2022-29901

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=84061fff2ad98a7809f00e88a54f584f84830388

--------------------------------

commit bb066506 upstream.

Convert __vmx_vcpu_run()'s 'launched' argument to 'flags', in
preparation for doing SPEC_CTRL handling immediately after vmexit, which
will need another flag.

This is much easier than adding a fourth argument, because this code
supports both 32-bit and 64-bit, and the fourth argument on 32-bit would
have to be pushed on the stack.
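
Folding the boolean into a flags word can be sketched like this. It is a user-space illustration, not the kernel function: the flag names and bit values are assumptions, the second flag stands for the "another flag" that the later SPEC_CTRL handling would need, and `run_flags()` with its two boolean inputs is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed flag layout; one bit per former or future boolean argument. */
#define VMX_RUN_VMRESUME       (1u << 0) /* VMCS already launched: use VMRESUME */
#define VMX_RUN_SAVE_SPEC_CTRL (1u << 1) /* guest may have changed SPEC_CTRL */

static_assert((VMX_RUN_VMRESUME & VMX_RUN_SAVE_SPEC_CTRL) == 0,
              "run flags must not overlap");

/* Compute the flags word that replaces the old 'launched' argument. */
static unsigned int run_flags(bool launched, bool spec_ctrl_write_intercepted)
{
    unsigned int flags = 0;

    if (launched)
        flags |= VMX_RUN_VMRESUME;
    /* If guest writes are not intercepted, the value must be read back. */
    if (!spec_ctrl_write_intercepted)
        flags |= VMX_RUN_SAVE_SPEC_CTRL;
    return flags;
}
```

The assembly entry point then still takes three arguments, and new conditions cost one more bit in the same register rather than a stack slot on 32-bit.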

Note that __vmx_vcpu_run_flags() is called outside of the noinstr
critical section because it will soon start calling potentially
traceable functions.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

conflict:
	arch/x86/kvm/vmx/vmx.h

Signed-off-by: Lin Yujun <linyujun809@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

decddf55

x86/kvm/vmx: Make noinstr clean · 55b31b25

Committed by Peter Zijlstra on Sep 20, 2022

stable inclusion
from stable-v5.10.133
commit 7070bbb66c5303117e4c7651711ea7daae4c64b5
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5PTAS
CVE: CVE-2022-29900,CVE-2022-23816,CVE-2022-29901

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7070bbb66c5303117e4c7651711ea7daae4c64b5

--------------------------------

commit 742ab6df upstream.

The recent mmio_stale_data fixes broke the noinstr constraints:

  vmlinux.o: warning: objtool: vmx_vcpu_enter_exit+0x15b: call to wrmsrl.constprop.0() leaves .noinstr.text section
  vmlinux.o: warning: objtool: vmx_vcpu_enter_exit+0x1bf: call to kvm_arch_has_assigned_device() leaves .noinstr.text section

Make it all happy again.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Lin Yujun <linyujun809@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

55b31b25

KVM/nVMX: Use __vmx_vcpu_run in nested_vmx_check_vmentry_hw · 45a8cb07

Committed by Uros Bizjak on Sep 20, 2022

stable inclusion
from stable-v5.10.133
commit dd87aa5f610be44f195cf5a99b7bc153faf30a3d
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5PTAS
CVE: CVE-2022-29900,CVE-2022-23816,CVE-2022-29901

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=dd87aa5f610be44f195cf5a99b7bc153faf30a3d

--------------------------------

commit 150f17bf upstream.

Replace inline assembly in nested_vmx_check_vmentry_hw
with a call to __vmx_vcpu_run.  The function is not
performance critical, so (double) GPR save/restore
in __vmx_vcpu_run can be tolerated, as far as performance
effects are concerned.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Reviewed-and-tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
[sean: dropped versioning info from changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20201231002702.22237077-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

conflict:
	arch/x86/kvm/vmx/vmx.h
Signed-off-by: Lin Yujun <linyujun809@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

45a8cb07

KVM: x86: do not report a vCPU as preempted outside instruction boundaries · 0b66a631

Committed by Paolo Bonzini on Sep 20, 2022

mainline inclusion
from mainline-v5.19-rc2
commit 6cd88243
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5PJ7H
CVE: CVE-2022-39189

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6cd88243c7e03845a450795e134b488fc2afb736

----------------------------------------

If a vCPU is outside guest mode and is scheduled out, it might be in the
process of making a memory access.  A problem occurs if another vCPU uses
the PV TLB flush feature during the period when the vCPU is scheduled
out, and a virtual address has already been translated but has not yet
been accessed, because this is equivalent to using a stale TLB entry.

To avoid this, only report a vCPU as preempted if it is certain that the
guest is at an instruction boundary.  A rescheduling request will be
delivered to the host physical CPU as an external interrupt, so for
simplicity consider any vmexit *not* to be at an instruction boundary,
except for external interrupts.
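
The rule can be sketched as a tiny userspace model, under the assumption that one boolean per vCPU is enough to remember how the last exit happened. The type and function names below (`vcpu_model`, `on_vmexit()`, `on_sched_out()`) are hypothetical and only illustrate the decision, not the real KVM data structures.

```c
#include <stdbool.h>

/* Simplified exit classification: only external interrupts count as a boundary. */
enum exit_kind { EXIT_EXTERNAL_INTERRUPT, EXIT_OTHER };

/* Hypothetical per-vCPU state for this sketch. */
struct vcpu_model {
	bool at_instruction_boundary;  /* last exit was an external interrupt */
	bool preempted_reported;       /* what the guest would see as "preempted" */
};

/* Record whether the most recent vmexit is known to sit on an instruction boundary. */
static void on_vmexit(struct vcpu_model *v, enum exit_kind kind)
{
	v->at_instruction_boundary = (kind == EXIT_EXTERNAL_INTERRUPT);
}

/* When the vCPU is scheduled out, report preemption only at a known boundary. */
static void on_sched_out(struct vcpu_model *v)
{
	v->preempted_reported = v->at_instruction_boundary;
}
```

Being conservative here costs at most an unnecessary IPI, whereas a wrong "preempted" report could let the guest skip a TLB flush that is still needed.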

It would in principle be okay to report the vCPU as preempted also
if it is sleeping in kvm_vcpu_block(): a TLB flush IPI will incur the
vmentry/vmexit overhead unnecessarily, and optimistic spinning is
also unlikely to succeed.  However, leave it for later because right
now kvm_vcpu_check_block() is doing memory accesses.  Even
though the TLB flush issue only applies to virtual memory addresses,
it's very much preferable to be conservative.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

conflict:
	arch/x86/kvm/x86.c
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

0b66a631

08 Jul, 2022 4 commits

KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC · 5698b7e8

Committed by Sean Christopherson on Apr 12, 2021

mainline inclusion
from mainline-5.13
commit 72add915
category: feature
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5EZEK
CVE: NA

Intel-SIG: commit 72add915 KVM: VMX: Enable SGX virtualization for
SGX1, SGX2 and LC.
Backport for SGX virtualization support

--------------------------------

Enable SGX virtualization now that KVM has the VM-Exit handlers needed
to trap-and-execute ENCLS to ensure correctness and/or enforce the CPU
model exposed to the guest.  Add a KVM module param, "sgx", to allow an
admin to disable SGX virtualization independent of the kernel.

When supported in hardware and the kernel, advertise SGX1, SGX2 and SGX
LC to userspace via CPUID and wire up the ENCLS_EXITING bitmap based on
the guest's SGX capabilities, i.e. to allow ENCLS to be executed in an
SGX-enabled guest.  With the exception of the provision key, all SGX
attribute bits may be exposed to the guest.  Guest access to the
provision key, which is controlled via securityfs, will be added in a
future patch.

Note, KVM does not yet support exposing ENCLS_C leafs or ENCLV leafs.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Message-Id: <a99e9c23310c79f2f4175c1af4c4cbcef913c3e5.1618196135.git.kai.huang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>

5698b7e8

KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs · 1766b14e

Committed by Sean Christopherson on Apr 12, 2021

mainline inclusion
from mainline-5.13
commit 8f102445
category: feature
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5EZEK
CVE: NA

Intel-SIG: commit 8f102445 KVM: VMX: Add emulation of SGX Launch
Control LE hash MSRs.
Backport for SGX virtualization support

--------------------------------

Emulate the four Launch Enclave public key hash MSRs (LE hash MSRs) that
exist on CPUs that support SGX Launch Control (LC).  SGX LC modifies the
behavior of ENCLS[EINIT] to use the LE hash MSRs when verifying the key
used to sign an enclave.  On CPUs without LC support, the LE hash is
hardwired into the CPU to an Intel controlled key (the Intel key is also
the reset value of the LE hash MSRs). Track the guest's desired hash so
that a future patch can stuff the hash into the hardware MSRs when
executing EINIT on behalf of the guest, when those MSRs are writable in
host.
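
A minimal sketch of "tracking the guest's desired hash" might look like the following. It assumes the four LE hash MSRs occupy indices 0x8C through 0x8F, and it leaves out every permission check (for example whether the guest may write the MSRs at all); `vcpu_model` and the two helpers are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed MSR indices of the four LE public key hash MSRs. */
#define MSR_IA32_SGXLEPUBKEYHASH0 0x8cu
#define MSR_IA32_SGXLEPUBKEYHASH3 0x8fu

/* Hypothetical per-vCPU storage for the guest's desired hash. */
struct vcpu_model {
	uint64_t le_pubkey_hash[4];
};

/* Store a guest write; returns false if the MSR is not one of the four. */
static bool set_le_hash_msr(struct vcpu_model *v, uint32_t msr, uint64_t data)
{
	if (msr < MSR_IA32_SGXLEPUBKEYHASH0 || msr > MSR_IA32_SGXLEPUBKEYHASH3)
		return false;
	v->le_pubkey_hash[msr - MSR_IA32_SGXLEPUBKEYHASH0] = data;
	return true;
}

/* Return the tracked value for a guest read; false if out of range. */
static bool get_le_hash_msr(const struct vcpu_model *v, uint32_t msr, uint64_t *data)
{
	if (msr < MSR_IA32_SGXLEPUBKEYHASH0 || msr > MSR_IA32_SGXLEPUBKEYHASH3)
		return false;
	*data = v->le_pubkey_hash[msr - MSR_IA32_SGXLEPUBKEYHASH0];
	return true;
}
```

Keeping the values in software is what allows a later step to load them into the hardware MSRs only around EINIT.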

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Message-Id: <c58ef601ddf88f3a113add837969533099b1364a.1618196135.git.kai.huang@intel.com>
[Add a comment regarding the MSRs being available until SGX is locked.
 - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>

1766b14e

KVM: VMX: Frame in ENCLS handler for SGX virtualization · e4e22234

Committed by Sean Christopherson on Apr 12, 2021

mainline inclusion
from mainline-5.13
commit 9798adbc
category: feature
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5EZEK
CVE: NA

Intel-SIG: commit 9798adbc KVM: VMX: Frame in ENCLS handler for
SGX virtualization.
Backport for SGX virtualization support

--------------------------------

Introduce sgx.c and sgx.h, along with the framework for handling ENCLS
VM-Exits.  Add a bool, enable_sgx, that will eventually be wired up to a
module param to control whether or not SGX virtualization is enabled at
runtime.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Message-Id: <1c782269608b2f5e1034be450f375a8432fb705d.1618196135.git.kai.huang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>

e4e22234

KVM: VMX: Add basic handling of VM-Exit from SGX enclave · 08f41fc4

Committed by Sean Christopherson on Apr 12, 2021

mainline inclusion
from mainline-5.13
commit 3c0c2ad1
category: feature
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5EZEK
CVE: NA

Intel-SIG: commit 3c0c2ad1 KVM: VMX: Add basic handling of VM-Exit
from SGX enclave
Backport for SGX virtualization support

--------------------------------

Add support for handling VM-Exits that originate from a guest SGX
enclave.  In SGX, an "enclave" is a new CPL3-only execution environment,
wherein the CPU and memory state is protected by hardware to make the
state inaccessible to code running outside of the enclave.  When exiting
an enclave due to an asynchronous event (from the perspective of the
enclave), e.g. exceptions, interrupts, and VM-Exits, the enclave's state
is automatically saved and scrubbed (the CPU loads synthetic state), and
then reloaded when re-entering the enclave.  E.g. after an instruction
based VM-Exit from an enclave, vmcs.GUEST_RIP will not contain the RIP
of the enclave instruction that triggered VM-Exit, but will instead point
to a RIP in the enclave's untrusted runtime (the guest userspace code
that coordinates entry/exit to/from the enclave).

To help a VMM recognize and handle exits from enclaves, SGX adds bits to
existing VMCS fields, VM_EXIT_REASON.VMX_EXIT_REASON_FROM_ENCLAVE and
GUEST_INTERRUPTIBILITY_INFO.GUEST_INTR_STATE_ENCLAVE_INTR.  Define the
new architectural bits, and add a boolean to struct vcpu_vmx to cache
VMX_EXIT_REASON_FROM_ENCLAVE.  Clear the bit in exit_reason so that
checks against exit_reason do not need to account for SGX, e.g.
"if (exit_reason == EXIT_REASON_EXCEPTION_NMI)" continues to work.

KVM is largely a passive observer of the new bits, e.g. KVM needs to
account for the bits when propagating information to a nested VMM, but
otherwise doesn't need to act differently for the majority of VM-Exits
from enclaves.
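
The "cache the bit, then clear it" handling of the exit reason can be sketched as below. This is a simplified model, assuming the from-enclave indication is bit 27 of the raw exit reason; the macro names and `vcpu_model` are illustrative rather than the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the "exit came from an enclave" bit in the raw exit reason. */
#define EXIT_REASON_FROM_ENCLAVE  (UINT32_C(1) << 27)
#define EXIT_REASON_EXCEPTION_NMI UINT32_C(0)

/* Hypothetical per-vCPU state for this sketch. */
struct vcpu_model {
	uint32_t exit_reason;        /* exit reason with the enclave bit stripped */
	bool     exit_from_enclave;  /* cached copy of the enclave bit */
};

/* Cache the enclave bit, then clear it so plain comparisons keep working. */
static void record_exit_reason(struct vcpu_model *v, uint32_t raw)
{
	v->exit_from_enclave = (raw & EXIT_REASON_FROM_ENCLAVE) != 0;
	v->exit_reason = raw & ~EXIT_REASON_FROM_ENCLAVE;
}
```

With the bit stripped, a check such as `exit_reason == EXIT_REASON_EXCEPTION_NMI` gives the same answer whether or not the exit originated inside an enclave.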

The one scenario that is directly impacted is emulation, which is for
all intents and purposes impossible[1] since KVM does not have access to
the RIP or instruction stream that triggered the VM-Exit.  The inability
to emulate is a non-issue for KVM, as most instructions that might
trigger VM-Exit unconditionally #UD in an enclave (before the VM-Exit
check).  For the few instructions that conditionally #UD, KVM either never
sets the exiting control, e.g. PAUSE_EXITING[2], or sets it if and only
if the feature is not exposed to the guest in order to inject a #UD,
e.g. RDRAND_EXITING.

But, because it is still possible for a guest to trigger emulation,
e.g. MMIO, inject a #UD if KVM ever attempts emulation after a VM-Exit
from an enclave.  This is architecturally accurate for instruction
VM-Exits, and for MMIO it's the least bad choice, e.g. it's preferable
to killing the VM.  In practice, only broken or particularly stupid
guests should ever encounter this behavior.

Add a WARN in skip_emulated_instruction to detect any attempt to
modify the guest's RIP during an SGX enclave VM-Exit as all such flows
should either be unreachable or must handle exits from enclaves before
getting to skip_emulated_instruction.

[1] Impossible for all practical purposes.  Not truly impossible
    since KVM could implement some form of para-virtualization scheme.

[2] PAUSE_LOOP_EXITING only affects CPL0 and enclaves exist only at
    CPL3, so we also don't need to worry about that interaction.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Message-Id: <315f54a8507d09c292463ef29104e1d4c62e9090.1618196135.git.kai.huang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>

08f41fc4

06 Jul, 2022 2 commits

KVM: x86/speculation: Disable Fill buffer clear within guests · 5f200803

Committed by Pawan Gupta on Jul 06, 2022

stable inclusion
from stable-v5.10.123
commit bde15fdcce44956278b4f50680b7363ca126ffb9
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS
CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=bde15fdcce44956278b4f50680b7363ca126ffb9

--------------------------------

commit 027bbb88 upstream

The enumeration of MD_CLEAR in CPUID(EAX=7,ECX=0).EDX{bit 10} is not an
accurate indicator on all CPUs of whether the VERW instruction will
overwrite fill buffers. FB_CLEAR enumeration in
IA32_ARCH_CAPABILITIES{bit 17} covers the case of CPUs that are not
vulnerable to MDS/TAA, indicating that microcode does overwrite fill
buffers.

Guests running in VMM environments may not be aware of all the
capabilities/vulnerabilities of the host CPU. Specifically, a guest may
apply MDS/TAA mitigations when a virtual CPU is enumerated as vulnerable
to MDS/TAA even when the physical CPU is not. On CPUs that enumerate
FB_CLEAR_CTRL the VMM may set FB_CLEAR_DIS to skip overwriting of fill
buffers by the VERW instruction. This is done by setting FB_CLEAR_DIS
during VMENTER and resetting on VMEXIT. For guests that enumerate
FB_CLEAR (explicitly asking for fill buffer clear capability) the VMM
will not use FB_CLEAR_DIS.
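
The decision described above can be sketched as a pure function over the host and guest IA32_ARCH_CAPABILITIES values. This is a simplified model of the rule as stated in this message only: FB_CLEAR is bit 17 as given above, while the position of FB_CLEAR_CTRL (bit 18) and the helper name are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define ARCH_CAP_FB_CLEAR      (UINT64_C(1) << 17)  /* VERW overwrites fill buffers */
#define ARCH_CAP_FB_CLEAR_CTRL (UINT64_C(1) << 18)  /* assumed bit: FB_CLEAR_DIS is available */

/*
 * Decide whether the VMM sets FB_CLEAR_DIS while the guest runs:
 * only when the host supports the control and the guest did not ask
 * for fill buffer clearing by enumerating FB_CLEAR.
 */
static bool want_fb_clear_dis(uint64_t host_caps, uint64_t guest_caps)
{
	if (!(host_caps & ARCH_CAP_FB_CLEAR_CTRL))
		return false;
	if (guest_caps & ARCH_CAP_FB_CLEAR)
		return false;
	return true;
}
```

When the function returns true, the control would be set around VMENTER and reset on VMEXIT, as the paragraph above describes.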

Irrespective of guest state, host overwrites CPU buffers before VMENTER
to protect itself from an MMIO capable guest, as part of mitigation for
MMIO Stale Data vulnerabilities.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Conflicts:
	arch/x86/kvm/vmx/vmx.h
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

5f200803

x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data · 7f6007bc

Committed by Pawan Gupta on Jul 06, 2022

stable inclusion
from stable-v5.10.123
commit 26f6f231f6a5a79ccc274967939b22602dec76e8
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS
CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=26f6f231f6a5a79ccc274967939b22602dec76e8

--------------------------------

commit 8cb861e9 upstream

Processor MMIO Stale Data is a class of vulnerabilities that may
expose data after an MMIO operation. For details please refer to
Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst.

These vulnerabilities are broadly categorized as:

Device Register Partial Write (DRPW):
Some endpoint MMIO registers incorrectly handle writes that are
smaller than the register size. Instead of aborting the write or only
copying the correct subset of bytes (for example, 2 bytes for a 2-byte
write), more bytes than specified by the write transaction may be
written to the register. On some processors, this may expose stale
data from the fill buffers of the core that created the write
transaction.

Shared Buffers Data Sampling (SBDS):
After propagators may have moved data around the uncore and copied
stale data into client core fill buffers, processors affected by MFBDS
can leak data from the fill buffer.

Shared Buffers Data Read (SBDR):
It is similar to Shared Buffer Data Sampling (SBDS) except that the
data is directly read into the architectural software-visible state.

An attacker can use these vulnerabilities to extract data from CPU fill
buffers using MDS and TAA methods. Mitigate it by clearing the CPU fill
buffers using the VERW instruction before returning to a user or a
guest.

On CPUs not affected by MDS and TAA, user applications cannot sample data
from CPU fill buffers using MDS or TAA. A guest with MMIO access can
still use DRPW or SBDR to extract data architecturally. Mitigate this by
using the VERW instruction to clear fill buffers before VMENTER for MMIO
capable guests.

Add a kernel parameter mmio_stale_data={off|full|full,nosmt} to control
the mitigation.
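
To illustrate the three accepted values, here is a small userspace parser sketch. It is not the kernel's option handler: the enum, struct and function names are hypothetical, and rejecting unknown strings with -1 is a choice made for this example.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical representation of the selected mitigation. */
enum mmio_mitigation { MMIO_MITIGATION_OFF, MMIO_MITIGATION_FULL };

struct mmio_cfg {
	enum mmio_mitigation mode;
	bool nosmt;  /* additionally disable SMT */
};

/* Parse "off", "full" or "full,nosmt"; returns 0 on success, -1 otherwise. */
static int parse_mmio_stale_data(const char *arg, struct mmio_cfg *cfg)
{
	if (!arg || !cfg)
		return -1;

	if (strcmp(arg, "off") == 0) {
		cfg->mode = MMIO_MITIGATION_OFF;
		cfg->nosmt = false;
	} else if (strcmp(arg, "full") == 0) {
		cfg->mode = MMIO_MITIGATION_FULL;
		cfg->nosmt = false;
	} else if (strcmp(arg, "full,nosmt") == 0) {
		cfg->mode = MMIO_MITIGATION_FULL;
		cfg->nosmt = true;
	} else {
		return -1;
	}
	return 0;
}
```

On a real system the value is passed on the kernel command line, for example `mmio_stale_data=full,nosmt`.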

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

7f6007bc

19 May, 2022 1 commit

KVM: VMX: Set vmcs.PENDING_DBG.BS on #DB in STI/MOVSS blocking shadow · 617a368d

Committed by Sean Christopherson on May 18, 2022

stable inclusion
from stable-v5.10.101
commit 3aa5c8657292e05e6dfa8fe2316951001dab7e3a
bugzilla: https://gitee.com/openeuler/kernel/issues/I5669Z

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3aa5c8657292e05e6dfa8fe2316951001dab7e3a

--------------------------------

[ Upstream commit b9bed78e ]

Set vmcs.GUEST_PENDING_DBG_EXCEPTIONS.BS, a.k.a. the pending single-step
breakpoint flag, when re-injecting a #DB with RFLAGS.TF=1, and STI or
MOVSS blocking is active.  Setting the flag is necessary to make VM-Entry
consistency checks happy, as VMX has an invariant that if RFLAGS.TF is
set and STI/MOVSS blocking is true, then the previous instruction must
have been STI or MOV/POP, and therefore a single-step #DB must be pending
since the RFLAGS.TF cannot have been set by the previous instruction,
i.e. the one instruction delay after setting RFLAGS.TF must have already
expired.

Normally, the CPU sets vmcs.GUEST_PENDING_DBG_EXCEPTIONS.BS appropriately
when recording guest state as part of a VM-Exit, but #DB VM-Exits
intentionally do not treat the #DB as "guest state" as interception of
the #DB effectively makes the #DB host-owned, thus KVM needs to manually
set PENDING_DBG.BS when forwarding/re-injecting the #DB to the guest.
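
The condition can be sketched as a pure function. This is a simplified model that assumes the usual bit positions (RFLAGS.TF is bit 8, STI and MOV SS blocking are bits 0 and 1 of the interruptibility state, and BS is bit 14 of the pending debug exceptions field); the function name is hypothetical.

```c
#include <stdint.h>

#define X86_EFLAGS_TF           (UINT64_C(1) << 8)   /* single-step trap flag */
#define GUEST_INTR_STATE_STI    UINT32_C(0x1)        /* blocking by STI */
#define GUEST_INTR_STATE_MOV_SS UINT32_C(0x2)        /* blocking by MOV SS */
#define PENDING_DBG_BS          (UINT64_C(1) << 14)  /* pending single-step #DB */

/*
 * Return the pending-debug-exceptions value to use when re-injecting a #DB:
 * BS must be set if TF is set while STI or MOV SS blocking is active.
 */
static uint64_t fixup_pending_dbg(uint64_t rflags, uint32_t interruptibility,
				  uint64_t pending_dbg)
{
	if ((rflags & X86_EFLAGS_TF) &&
	    (interruptibility & (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS)))
		pending_dbg |= PENDING_DBG_BS;
	return pending_dbg;
}
```

In every other combination the field is returned unchanged, since the consistency check only concerns TF together with an active blocking shadow.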

Note, although this bug can be triggered by guest userspace, doing so
requires IOPL=3, and guest userspace running with IOPL=3 has full access
to all I/O ports (from the guest's perspective) and can crash/reboot the
guest any number of ways.  IOPL=3 is required because STI blocking kicks
in if and only if RFLAGS.IF is toggled 0=>1, and if CPL>IOPL, STI either
takes a #GP or modifies RFLAGS.VIF, not RFLAGS.IF.

MOVSS blocking can be initiated by userspace, but can be coincident with
a #DB if and only if DR7.GD=1 (General Detect enabled) and a MOV DR is
executed in the MOVSS shadow.  MOV DR #GPs at CPL>0, thus MOVSS blocking
is problematic only for CPL0 (and only if the guest is crazy enough to
access a DR in a MOVSS shadow).  All other sources of #DBs are either
suppressed by MOVSS blocking (single-step, code fetch, data, and I/O),
are mutually exclusive with MOVSS blocking (T-bit task switch), or are
already handled by KVM (ICEBP, a.k.a. INT1).

This bug was originally found by running tests[1] created for XSA-308[2].
Note that Xen's userspace test emits ICEBP in the MOVSS shadow, which is
presumably why the Xen bug was deemed to be an exploitable DOS from guest
userspace.  KVM already handles ICEBP by skipping the ICEBP instruction
and thus clears MOVSS blocking as a side effect of its "emulation".

[1] http://xenbits.xenproject.org/docs/xtf/xsa-308_2main_8c_source.html
[2] https://xenbits.xen.org/xsa/advisory-308.html

Reported-by: David Woodhouse <dwmw2@infradead.org>
Reported-by: Alexander Graf <graf@amazon.de>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220120000624.655815-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

617a368d

28 Apr, 2022 1 commit

x86: KVM: Fixed the bug that WAITmax cannot be updated in real time · af36aed7

Committed by liangtian on Apr 27, 2022

virt inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I53PTV?from=project-issue
CVE: NA

-----------------------------------------------------

Because the reset function lives in the kvm_intel module instead of the
kvm module, the weak-attribute function in kvm_main.c cannot be resolved
to it, so st_max is never refreshed on x86.
The solution is to define the reset function in x86.c, which is part of
the kvm module.

Signed-off-by: liangtian <liangtian13@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

af36aed7

19 Apr, 2022 1 commit

KVM: x86: Register Processor Trace interrupt hook iff PT enabled in guest · d58ae1b4

Committed by Sean Christopherson on Apr 19, 2022

stable inclusion
from stable-v5.10.93
commit 413b427f5fff5d658c2605ca889d6b13b88efd0c
bugzilla: 186204 https://gitee.com/openeuler/kernel/issues/I5311N

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=413b427f5fff5d658c2605ca889d6b13b88efd0c

--------------------------------

commit f4b027c5 upstream.

Override the Processor Trace (PT) interrupt handler for guest mode if and
only if PT is configured for host+guest mode, i.e. is being used
independently by both host and guest.  If PT is configured for system
mode, the host fully controls PT and must handle all events.

Fixes: 8479e04e ("KVM: x86: Inject PMI for KVM guest")
Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reported-by: Artem Kashkanov <artem.kashkanov@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211111020738.2512932-4-seanjc@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

d58ae1b4

26 Jan, 2022 1 commit

KVM: VMX: Wake vCPU when delivering posted IRQ even if vCPU == this vCPU · fec8390f

Committed by Sean Christopherson on Jan 26, 2022

stable inclusion
from stable-v5.10.89
commit 28626e76baf50e6b37d8a92564844d873aa6b51f
bugzilla: 186140 https://gitee.com/openeuler/kernel/issues/I4S8HA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=28626e76baf50e6b37d8a92564844d873aa6b51f

--------------------------------

commit fdba608f upstream.

Drop a check that guards triggering a posted interrupt on the currently
running vCPU, and more importantly guards waking the target vCPU if
triggering a posted interrupt fails because the vCPU isn't IN_GUEST_MODE.
If a vIRQ is delivered from asynchronous context, the target vCPU can be
the currently running vCPU and can also be blocking, in which case
skipping kvm_vcpu_wake_up() is effectively dropping what is supposed to
be a wake event for the vCPU.
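
The behavior after the change can be modeled in a few lines. This is a userspace sketch with hypothetical names (`vcpu_model`, `trigger_posted_interrupt()`, `deliver_posted_interrupt()`); clearing `blocking` merely stands in for the kvm_vcpu_wake_up() call mentioned above.

```c
#include <stdbool.h>

/* Hypothetical model of a vCPU; field names are illustrative only. */
struct vcpu_model {
	bool in_guest_mode;  /* currently executing guest code */
	bool blocking;       /* sleeping, waiting for a wake event */
	int  notifications;  /* posted-interrupt notifications sent */
};

/* Fails when the target is not in guest mode, so no notification is sent. */
static bool trigger_posted_interrupt(struct vcpu_model *target)
{
	if (!target->in_guest_mode)
		return false;
	target->notifications++;
	return true;
}

/*
 * Deliberately has no "target is the currently running vCPU" shortcut:
 * if the notification cannot be sent, the target is always woken.
 */
static void deliver_posted_interrupt(struct vcpu_model *target)
{
	if (!trigger_posted_interrupt(target))
		target->blocking = false;  /* stands in for kvm_vcpu_wake_up() */
}
```

With the old shortcut in place, the first branch would have been skipped entirely for the running vCPU, which is how the wake event got lost.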

The "do nothing" logic when "vcpu == running_vcpu" mostly works only
because the majority of calls to ->deliver_posted_interrupt(), especially
when using posted interrupts, come from synchronous KVM context.  But if
a device is exposed to the guest using vfio-pci passthrough, the VFIO IRQ
and vCPU are bound to the same pCPU, and the IRQ is _not_ configured to
use posted interrupts, wake events from the device will be delivered to
KVM from IRQ context, e.g.

  vfio_msihandler()
  |
  |-> eventfd_signal()
      |
      |-> ...
          |
          |->  irqfd_wakeup()
               |
               |->kvm_arch_set_irq_inatomic()
                  |
                  |-> kvm_irq_delivery_to_apic_fast()
                      |
                      |-> kvm_apic_set_irq()

This also aligns the non-nested and nested usage of triggering posted
interrupts, and will allow for additional cleanups.

Fixes: 379a3c8e ("KVM: VMX: Optimize posted-interrupt delivery for timer fastpath")
Cc: stable@vger.kernel.org
Reported-by: Longpeng (Mike) <longpeng2@huawei.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20211208015236.1616697-18-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

fec8390f

14 Jan, 2022 1 commit

KVM: nVMX: Flush current VPID (L1 vs. L2) for KVM_REQ_TLB_FLUSH_GUEST · 9cbea932

Committed by Sean Christopherson on Jan 14, 2022

stable inclusion
from stable-v5.10.84
commit 7722e88505226d64d7b2158b470e6945ef759832
bugzilla: 186030 https://gitee.com/openeuler/kernel/issues/I4QV2F

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7722e88505226d64d7b2158b470e6945ef759832

--------------------------------

commit 2b4a5a5d upstream.

Flush the current VPID when handling KVM_REQ_TLB_FLUSH_GUEST instead of
always flushing vpid01.  Any TLB flush that is triggered when L2 is
active is scoped to L2's VPID (if it has one), e.g. if L2 toggles CR4.PGE
and L1 doesn't intercept PGE writes, then KVM's emulation of the TLB
flush needs to be applied to L2's VPID.
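
Selecting the "current" VPID can be sketched as follows. This is a simplified model: `vcpu_model` and `current_vpid()` are hypothetical, and a vpid02 of zero is assumed to mean that L2 has no VPID of its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-vCPU state for this sketch. */
struct vcpu_model {
	bool     guest_mode;  /* true while L2 is active */
	uint16_t vpid01;      /* VPID used for L1 */
	uint16_t vpid02;      /* VPID used for L2, or 0 if L2 has none */
};

/* Pick the VPID that a guest-initiated TLB flush must target. */
static uint16_t current_vpid(const struct vcpu_model *v)
{
	if (v->guest_mode && v->vpid02 != 0)
		return v->vpid02;
	return v->vpid01;
}
```

Always returning `vpid01` corresponds to the old behavior, which leaves L2's translations untouched when L2 is the one requesting the flush.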

Reported-by: Lai Jiangshan <jiangshanlai+lkml@gmail.com>
Fixes: 07ffaf34 ("KVM: nVMX: Sync all PGDs on nested transition with shadow paging")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211125014944.536398-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

9cbea932

07 Jan, 2022 5 commits

KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI · 82d2ccbb

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit e6209a3b
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e6209a3bef793e8fe29c873a7612023916eaa611

-------------------

The current vPMU only supports Architecture Version 2. According to
Intel SDM "17.4.7 Freezing LBR and Performance Counters on PMI", if
IA32_DEBUGCTL.Freeze_LBR_On_PMI = 1, the LBR is frozen on the virtual
PMI, and KVM emulates this by clearing the LBR bit (bit 0) in
IA32_DEBUGCTL. The guest must then re-enable IA32_DEBUGCTL.LBR
to resume recording branches.
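
The emulated freeze amounts to a single bit manipulation on the guest's IA32_DEBUGCTL value. The sketch below assumes Freeze_LBR_On_PMI is bit 11 (the LBR bit is bit 0, as stated above); the function name is hypothetical.

```c
#include <stdint.h>

#define DEBUGCTLMSR_LBR                (UINT64_C(1) << 0)   /* LBR recording enabled */
#define DEBUGCTLMSR_FREEZE_LBRS_ON_PMI (UINT64_C(1) << 11)  /* assumed bit position */

/* Return the guest's IA32_DEBUGCTL value after a virtual PMI is delivered. */
static uint64_t debugctl_after_virtual_pmi(uint64_t debugctl)
{
	if (debugctl & DEBUGCTLMSR_FREEZE_LBRS_ON_PMI)
		debugctl &= ~DEBUGCTLMSR_LBR;  /* freeze: stop recording branches */
	return debugctl;
}
```

The freeze request itself stays set, so every later PMI freezes the LBR again once the guest has re-enabled it.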

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210201051039.255478-9-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

82d2ccbb

KVM: vmx/pmu: Pass-through LBR msrs when the guest LBR event is ACTIVE · e6aea025

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit 1b5ac322
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1b5ac3226a1aa071135fe0ee5d1055d9e88b717c

-------------------

In addition to a write that sets DEBUGCTLMSR_LBR, any KVM trap caused by an
access to the LBR MSRs results in the creation of a per-vCPU guest LBR event.

If the guest LBR event is scheduled in along with the corresponding vCPU
context, KVM passes all LBR record MSRs through to the guest. The LBR
callstack mechanism implemented in the host can save/restore the guest LBR
records during event context switches, which is far cheaper than
saving/restoring tens of LBR MSRs (e.g. 32 LBR record entries) on the much
more frequent VMX transitions.

To avoid reclaiming LBR resources from any higher-priority event on the host,
KVM always checks for the existence of the guest LBR event and its state as
late as possible before VM-entry. A negative result cancels the pass-through
state, which also prevents accesses to the real registers and potential data
leakage. If the host reclaims the LBR between two checks, the interception
state and LBR records can be safely preserved thanks to the native
save/restore support of the guest LBR event.

KVM emits a pr_warn() when the LBR hardware is unavailable to the guest LBR
event. The administrator is expected to remind users that the guest results
may be inaccurate if someone is using LBR to record the hypervisor on the
host side.

Suggested-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210201051039.255478-7-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

e6aea025

KVM: vmx/pmu: Create a guest LBR event when vcpu sets DEBUGCTLMSR_LBR · 157add0e

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit 8e12911b
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8e12911b243e485f5e4c7c5fbc79cdf185728700

-------------------

When a vCPU sets DEBUGCTLMSR_LBR in MSR_IA32_DEBUGCTLMSR, the KVM handler
creates a guest LBR event that enables callstack mode and has no hardware
counter assigned. Host perf schedules and enables this event as usual, but
in an exclusive way.

For now, the guest LBR event is released when the vPMU is reset; soon, the
lazy release mechanism will be applied to this event, as it is for a vPMC.

Suggested-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210201051039.255478-6-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

157add0e

KVM: vmx/pmu: Add PMU_CAP_LBR_FMT check when guest LBR is enabled · 1c78a8dc

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit c6462363
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c646236344e9054cc84cd5a9f763163b9654cf7e

-------------------

Userspace can set bits [0, 5] of the IA32_PERF_CAPABILITIES
MSR, which describe the record format stored in the LBR records.

The LBR will be enabled on the guest if host perf supports LBR
(checked via x86_perf_get_lbr()) and the vcpu model is compatible
with the host one.
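
A reduced form of the format check might look like this. It is a sketch only: it compares the six format bits and deliberately leaves out the vCPU model compatibility test, and the helper name and parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits [0, 5] of IA32_PERF_CAPABILITIES hold the LBR record format. */
#define PMU_CAP_LBR_FMT UINT64_C(0x3f)

/*
 * Accept a userspace-provided PERF_CAPABILITIES value if it either leaves
 * the LBR format at zero (no guest LBR) or matches the host's format on a
 * host whose perf subsystem supports LBR.
 */
static bool guest_lbr_fmt_ok(uint64_t host_perf_cap, uint64_t guest_perf_cap,
			     bool host_supports_lbr)
{
	uint64_t fmt = guest_perf_cap & PMU_CAP_LBR_FMT;

	if (fmt == 0)
		return true;
	if (!host_supports_lbr)
		return false;
	return fmt == (host_perf_cap & PMU_CAP_LBR_FMT);
}
```

Requiring an exact match reflects that the guest reads the passed-through records directly, so it has to interpret them in the host's format.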

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210201051039.255478-4-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

1c78a8dc

KVM: vmx/pmu: Add PMU_CAP_LBR_FMT check when guest LBR is enabled · 81d99ef6

Committed by Paolo Bonzini on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit 9c9520ce
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9c9520ce883386dc3794c7d60204487ff1db09cb

-------------------

Userspace can set bits [0, 5] of the IA32_PERF_CAPABILITIES
MSR, which describe the record format stored in the LBR records.

The LBR will be enabled on the guest if host perf supports LBR
(checked via x86_perf_get_lbr()) and the vcpu model is compatible
with the host one.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210201051039.255478-4-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

81d99ef6

openeuler / Kernel synced successfully nearly 2 years ago