提交 · 5ef8acbdd687c9d72582e2c05c0b9756efb37863 · openeuler / Kernel

23 2月, 2020 1 次提交

KVM: nVMX: Emulate MTF when performing instruction emulation · 5ef8acbd

由 Oliver Upton 提交于 2月 07, 2020

Since commit 5f3d45e7 ("kvm/x86: add support for
MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
flag processor-based execution control for its L2 guest. KVM simply
forwards any MTF VM-exits to the L1 guest, which works for normal
instruction execution.

However, when KVM needs to emulate an instruction on the behalf of an L2
guest, the monitor trap flag is not emulated. Add the necessary logic to
kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
instruction emulation for L2.

Fixes: 5f3d45e7 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
Signed-off-by: NOliver Upton <oupton@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5ef8acbd

22 2月, 2020 3 次提交

KVM: nVMX: clear PIN_BASED_POSTED_INTR from nested pinbased_ctls only when... · a4443267

由 Vitaly Kuznetsov 提交于 2月 20, 2020

KVM: nVMX: clear PIN_BASED_POSTED_INTR from nested pinbased_ctls only when apicv is globally disabled

When apicv is disabled on a vCPU (e.g. by enabling KVM_CAP_HYPERV_SYNIC*),
nothing happens to VMX MSRs on the already existing vCPUs, however, all new
ones are created with PIN_BASED_POSTED_INTR filtered out. This is very
confusing and results in the following picture inside the guest:

$ rdmsr -ax 0x48d
ff00000016
7f00000016
7f00000016
7f00000016

This is observed with QEMU and 4-vCPU guest: QEMU creates vCPU0, does
KVM_CAP_HYPERV_SYNIC2 and then creates the remaining three.

L1 hypervisor may only check CPU0's controls to find out what features
are available and it will be very confused later. Switch to setting
PIN_BASED_POSTED_INTR control based on global 'enable_apicv' setting.
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a4443267

KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1 · 91a5f413

由 Vitaly Kuznetsov 提交于 2月 20, 2020

Even when APICv is disabled for L1 it can (and, actually, is) still
available for L2, this means we need to always call
vmx_deliver_nested_posted_interrupt() when attempting an interrupt
delivery.
Suggested-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

91a5f413

KVM: nVMX: Hold KVM's srcu lock when syncing vmcs12->shadow · c9dfd3fb

由 wanpeng li 提交于 2月 17, 2020

For the duration of mapping eVMCS, it derefences ->memslots without holding
->srcu or ->slots_lock when accessing hv assist page. This patch fixes it by
moving nested_sync_vmcs12_to_shadow to prepare_guest_switch, where the SRCU
is already taken.

It can be reproduced by running kvm's evmcs_test selftest.

  =============================
  warning: suspicious rcu usage
  5.6.0-rc1+ #53 tainted: g        w ioe
  -----------------------------
  ./include/linux/kvm_host.h:623 suspicious rcu_dereference_check() usage!

  other info that might help us debug this:

   rcu_scheduler_active = 2, debug_locks = 1
  1 lock held by evmcs_test/8507:
   #0: ffff9ddd156d00d0 (&vcpu->mutex){+.+.}, at:
kvm_vcpu_ioctl+0x85/0x680 [kvm]

  stack backtrace:
  cpu: 6 pid: 8507 comm: evmcs_test tainted: g        w ioe     5.6.0-rc1+ #53
  hardware name: dell inc. optiplex 7040/0jctf8, bios 1.4.9 09/12/2016
  call trace:
   dump_stack+0x68/0x9b
   kvm_read_guest_cached+0x11d/0x150 [kvm]
   kvm_hv_get_assist_page+0x33/0x40 [kvm]
   nested_enlightened_vmentry+0x2c/0x60 [kvm_intel]
   nested_vmx_handle_enlightened_vmptrld.part.52+0x32/0x1c0 [kvm_intel]
   nested_sync_vmcs12_to_shadow+0x439/0x680 [kvm_intel]
   vmx_vcpu_run+0x67a/0xe60 [kvm_intel]
   vcpu_enter_guest+0x35e/0x1bc0 [kvm]
   kvm_arch_vcpu_ioctl_run+0x40b/0x670 [kvm]
   kvm_vcpu_ioctl+0x370/0x680 [kvm]
   ksys_ioctl+0x235/0x850
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x77/0x780
   entry_syscall_64_after_hwframe+0x49/0xbe
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

c9dfd3fb

13 2月, 2020 1 次提交

KVM: nVMX: Use correct root level for nested EPT shadow page tables · 148d735e

由 Sean Christopherson 提交于 2月 07, 2020

Hardcode the EPT page-walk level for L2 to be 4 levels, as KVM's MMU
currently also hardcodes the page walk level for nested EPT to be 4
levels. The L2 guest is all but guaranteed to soft hang on its first
instruction when L1 is using EPT, as KVM will construct 4-level page
tables and then tell hardware to use 5-level page tables.

Fixes: 855feb67 ("KVM: MMU: Add 5 level EPT & Shadow page table support.")
Cc: stable@vger.kernel.org
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

148d735e

12 2月, 2020 1 次提交

KVM: x86: do not reset microcode version on INIT or RESET · bab0c318

由 Paolo Bonzini 提交于 2月 11, 2020

Do not initialize the microcode version at RESET or INIT, only on vCPU
creation.   Microcode updates are not lost during INIT, and exact
behavior across a warm RESET is not specified by the architecture.

Since we do not support a microcode update directly from the hypervisor,
but only as a result of userspace setting the microcode version MSR,
it's simpler for userspace if we do nothing in KVM and let userspace
emulate behavior for RESET as it sees fit.

Userspace can tie the fix to the availability of MSR_IA32_UCODE_REV in
the list of emulated MSRs.
Reported-by: NAlex Williamson <alex.williamson@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

bab0c318

05 2月, 2020 6 次提交

KVM: vmx: delete meaningless vmx_decache_cr0_guest_bits() declaration · a8be1ad0

由 Miaohe Lin 提交于 2月 05, 2020

The function vmx_decache_cr0_guest_bits() is only called below its
implementation. So this is meaningless and should be removed.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a8be1ad0

x86/kvm/hyper-v: move VMX controls sanitization out of nested_enable_evmcs() · 31de3d25

由 Vitaly Kuznetsov 提交于 2月 05, 2020

With fine grained VMX feature enablement QEMU>=4.2 tries to do KVM_SET_MSRS
with default (matching CPU model) values and in case eVMCS is also enabled,
fails.

It would be possible to drop VMX feature filtering completely and make
this a guest's responsibility: if it decides to use eVMCS it should know
which fields are available and which are not. Hyper-V mostly complies to
this, however, there are some problematic controls:
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES
VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL

which Hyper-V enables. As there are no corresponding fields in eVMCS, we
can't handle this properly in KVM. This is a Hyper-V issue.

Move VMX controls sanitization from nested_enable_evmcs() to vmx_get_msr(),
and do the bare minimum (only clear controls which are known to cause issues).
This allows userspace to keep setting controls it wants and at the same
time hides them from the guest.
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

31de3d25

kvm: x86: hyperv: Use APICv update request interface · f4fdc0a2

由 Suravee Suthikulpanit 提交于 11月 14, 2019

Since disabling APICv has to be done for all vcpus on AMD-based
system, adopt the newly introduced kvm_request_apicv_update()
interface, and introduce a new APICV_INHIBIT_REASON_HYPERV.

Also, remove the kvm_vcpu_deactivate_apicv() since no longer used.

Cc: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f4fdc0a2

kvm: x86: Introduce APICv x86 ops for checking APIC inhibit reasons · ef8efd7a

由 Suravee Suthikulpanit 提交于 11月 14, 2019

Inibit reason bits are used to determine if APICv deactivation is
applicable for a particular hardware virtualization architecture.
Signed-off-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ef8efd7a

P
KVM: x86: remove get_enable_apicv from kvm_x86_ops · 7e3e67a9
由 Paolo Bonzini 提交于 1月 22, 2020
```
It is unused now.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
```
7e3e67a9

kvm: x86: Introduce APICv inhibit reason bits · 4e19c36f

由 Suravee Suthikulpanit 提交于 11月 14, 2019

There are several reasons in which a VM needs to deactivate APICv
e.g. disable APICv via parameter during module loading, or when
enable Hyper-V SynIC support. Additional inhibit reasons will be
introduced later on when dynamic APICv is supported,

Introduce KVM APICv inhibit reason bits along with a new variable,
apicv_inhibit_reasons, to help keep track of APICv state for each VM,

Initially, the APICV_INHIBIT_REASON_DISABLE bit is used to indicate
the case where APICv is disabled during KVM module load.
(e.g. insmod kvm_amd avic=0 or insmod kvm_intel enable_apicv=0).
Signed-off-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
[Do not use get_enable_apicv; consider irqchip_split in svm.c. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4e19c36f

28 1月, 2020 5 次提交

KVM: VMX: remove duplicated segment cache clear · cef6db76

由 Miaohe Lin 提交于 1月 21, 2020

vmx_set_segment() clears segment cache unconditionally, so we should not
clear it again by calling vmx_segment_cache_clear().
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

cef6db76

KVM: X86: Drop x86_set_memory_region() · 6a3c623b

由 Peter Xu 提交于 1月 09, 2020

The helper x86_set_memory_region() is only used in vmx_set_tss_addr()
and kvm_arch_destroy_vm().  Push the lock upper in both cases.  With
that, drop x86_set_memory_region().

This prepares to allow __x86_set_memory_region() to return a HVA
mapped, because the HVA will need to be protected by the lock too even
after __x86_set_memory_region() returns.
Signed-off-by: NPeter Xu <peterx@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

6a3c623b

KVM: X86: Don't take srcu lock in init_rmode_identity_map() · 2a5755bb

由 Peter Xu 提交于 1月 09, 2020

We've already got the slots_lock, so we should be safe.
Signed-off-by: NPeter Xu <peterx@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

2a5755bb

KVM: x86: Protect exit_reason from being used in Spectre-v1/L1TF attacks · c926f2f7

由 Marios Pomonis 提交于 12月 11, 2019

This fixes a Spectre-v1/L1TF vulnerability in vmx_handle_exit().
While exit_reason is set by the hardware and therefore should not be
attacker-influenced, an unknown exit_reason could potentially be used to
perform such an attack.

Fixes: 55d2375e ("KVM: nVMX: Move nested code to dedicated files")
Signed-off-by: NMarios Pomonis <pomonis@google.com>
Signed-off-by: NNick Finco <nifi@google.com>
Suggested-by: NSean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: NAndrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

c926f2f7

kvm/svm: PKU not currently supported · a47970ed

由 John Allen 提交于 12月 19, 2019

Current SVM implementation does not have support for handling PKU. Guests
running on a host with future AMD cpus that support the feature will read
garbage from the PKRU register and will hit segmentation faults on boot as
memory is getting marked as protected that should not be. Ensure that cpuid
from SVM does not advertise the feature.
Signed-off-by: NJohn Allen <john.allen@amd.com>
Cc: stable@vger.kernel.org
Fixes: 0556cbdc ("x86/pkeys: Don't check if PKRU is zero before writing it")
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a47970ed

24 1月, 2020 6 次提交

KVM: x86: Move kvm_vcpu_init() invocation to common code · 987b2594

由 Sean Christopherson 提交于 12月 18, 2019

Move the kvm_cpu_{un}init() calls to common x86 code as an intermediate
step to removing kvm_cpu_{un}init() altogether.

Note, VMX'x alloc_apic_access_page() and init_rmode_identity_map() are
per-VM allocations and are intentionally kept if vCPU creation fails.
They are freed by kvm_arch_destroy_vm().

No functional change intended.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

987b2594

KVM: x86: Move FPU allocation to common x86 code · fc6e2a18

由 Sean Christopherson 提交于 12月 18, 2019

The allocation of FPU structs is identical across VMX and SVM, move it
to common x86 code. Somewhat arbitrarily place the allocation so that
it resides directly above the associated initialization via fx_init(),
e.g. instead of retaining its position with respect to the overall vcpu
creation flow. Although the names names kvm_arch_vcpu_create() and
kvm_arch_vcpu_init() might suggest otherwise, x86 does not have a clean
split between 'create' and 'init'. Allocating the struct immediately
prior to the first use arguably improves readability *now*, and will
yield even bigger improvements when kvm_arch_vcpu_init() is removed in
a future patch.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fc6e2a18

KVM: x86: Allocate vcpu struct in common x86 code · a9dd6f09

由 Sean Christopherson 提交于 12月 18, 2019

Move allocation of VMX and SVM vcpus to common x86. Although the struct
being allocated is technically a VMX/SVM struct, it can be interpreted
directly as a 'struct kvm_vcpu' because of the pre-existing requirement
that 'struct kvm_vcpu' be located at offset zero of the arch/vendor vcpu
struct.

Remove the message from the build-time assertions regarding placement of
the struct, as compatibility with the arch usercopy region is no longer
the sole dependent on 'struct kvm_vcpu' being at offset zero.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a9dd6f09

KVM: VMX: Use direct vcpu pointer during vCPU create/free · 34109c04

由 Sean Christopherson 提交于 12月 18, 2019

Capture the vcpu pointer in a local varaible and replace '&vmx->vcpu'
references with a direct reference to the pointer in anticipation of
moving bits of the code to common x86 and passing the vcpu pointer into
vmx_create_vcpu(), i.e. eliminate unnecessary noise from future patches.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

34109c04

KVM: VMX: Allocate VPID after initializing VCPU · 034d8e2c

由 Sean Christopherson 提交于 12月 18, 2019

Do VPID allocation after calling the common kvm_vcpu_init() as a step
towards doing vCPU allocation (via kmem_cache_zalloc()) and calling
kvm_vcpu_init() back-to-back.  Squishing allocation and initialization
together will eventually allow the sequence to be moved to arch-agnostic
creation code.

Note, the VPID is not consumed until KVM_RUN, slightly delaying its
allocation should have no real function impact.  VPID allocation was
arbitrarily placed in the original patch, commit 2384d2b3 ("KVM:
VMX: Enable Virtual Processor Identification (VPID)").
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

034d8e2c

KVM: x86: avoid incorrect writes to host MSR_IA32_SPEC_CTRL · 6441fa61

由 Paolo Bonzini 提交于 1月 20, 2020

If the guest is configured to have SPEC_CTRL but the host does not
(which is a nonsensical configuration but these are not explicitly
forbidden) then a host-initiated MSR write can write vmx->spec_ctrl
(respectively svm->spec_ctrl) and trigger a #GP when KVM tries to
restore the host value of the MSR.  Add a more comprehensive check
for valid bits of SPEC_CTRL, covering host CPUID flags and,
since we are at it and it is more correct that way, guest CPUID
flags too.

For AMD, remove the unnecessary is_guest_mode check around setting
the MSR interception bitmap, so that the code looks the same as
for Intel.

Cc: Jim Mattson <jmattson@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

6441fa61

21 1月, 2020 8 次提交

KVM: x86: Refactor and rename bit() to feature_bit() macro · 87382003

由 Sean Christopherson 提交于 12月 17, 2019

Rename bit() to __feature_bit() to give it a more descriptive name, and
add a macro, feature_bit(), to stuff the X68_FEATURE_ prefix to keep
line lengths manageable for code that hardcodes the bit to be retrieved.

No functional change intended.

Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

87382003

KVM: x86: Drop special XSAVE handling from guest_cpuid_has() · 96be4e06

由 Sean Christopherson 提交于 12月 10, 2019

Now that KVM prevents setting host-reserved CR4 bits, drop the dedicated
XSAVE check in guest_cpuid_has() in favor of open coding similar checks
in the SVM/VMX XSAVES enabling flows.

Note, checking boot_cpu_has(X86_FEATURE_XSAVE) in the XSAVES flows is
technically redundant with respect to the CR4 reserved bit checks, e.g.
XSAVES #UDs if CR4.OSXSAVE=0 and arch.xsaves_enabled is consumed if and
only if CR4.OXSAVE=1 in guest. Keep (add?) the explicit boot_cpu_has()
checks to help document KVM's usage of arch.xsaves_enabled.

No functional change intended.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

96be4e06

KVM: VMX: Add helper to consolidate up PT/RTIT WRMSR fault logic · e348ac7c

由 Sean Christopherson 提交于 12月 10, 2019

Add a helper to consolidate the common checks for writing PT MSRs,
and opportunistically clean up the formatting of the affected code.

No functional change intended.

Cc: Chao Peng <chao.p.peng@linux.intel.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e348ac7c

KVM: VMX: Add non-canonical check on writes to RTIT address MSRs · fe6ed369

由 Sean Christopherson 提交于 12月 10, 2019

Reject writes to RTIT address MSRs if the data being written is a
non-canonical address as the MSRs are subject to canonical checks, e.g.
KVM will trigger an unchecked #GP when loading the values to hardware
during pt_guest_enter().

Cc: stable@vger.kernel.org
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fe6ed369

KVM: Fix some writing mistakes · 311497e0

由 Miaohe Lin 提交于 12月 11, 2019

Fix some writing mistakes in the comments.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

311497e0

KVM: Fix some comment typos and missing parentheses · 67b0ae43

由 Miaohe Lin 提交于 12月 11, 2019

Fix some typos and add missing parentheses in the comments.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

67b0ae43

KVM: Fix some out-dated function names in comment · 4d516fe7

由 Miaohe Lin 提交于 12月 11, 2019

Since commit b1346ab2 ("KVM: nVMX: Rename prepare_vmcs02_*_full to
prepare_vmcs02_*_rare"), prepare_vmcs02_full has been renamed to
prepare_vmcs02_rare.
nested_vmx_merge_msr_bitmap is renamed to nested_vmx_prepare_msr_bitmap
since commit c992384b ("KVM: vmx: speed up MSR bitmap merge").
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4d516fe7

KVM: VMX: FIXED+PHYSICAL mode single target IPI fastpath · 1e9e2622

由 Wanpeng Li 提交于 11月 21, 2019

ICR and TSCDEADLINE MSRs write cause the main MSRs write vmexits in our
product observation, multicast IPIs are not as common as unicast IPI like
RESCHEDULE_VECTOR and CALL_FUNCTION_SINGLE_VECTOR etc.

This patch introduce a mechanism to handle certain performance-critical
WRMSRs in a very early stage of KVM VMExit handler.

This mechanism is specifically used for accelerating writes to x2APIC ICR
that attempt to send a virtual IPI with physical destination-mode, fixed
delivery-mode and single target. Which was found as one of the main causes
of VMExits for Linux workloads.

The reason this mechanism significantly reduce the latency of such virtual
IPIs is by sending the physical IPI to the target vCPU in a very early stage
of KVM VMExit handler, before host interrupts are enabled and before expensive
operations such as reacquiring KVM’s SRCU lock.
Latency is reduced even more when KVM is able to use APICv posted-interrupt
mechanism (which allows to deliver the virtual IPI directly to target vCPU
without the need to kick it to host).

Testing on Xeon Skylake server:

The virtual IPI latency from sender send to receiver receive reduces
more than 200+ cpu cycles.
Reviewed-by: NLiran Alon <liran.alon@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

1e9e2622

14 1月, 2020 4 次提交

KVM: VMX: Check for full VMX support when verifying CPU compatibility · ff10e22e

由 Sean Christopherson 提交于 12月 20, 2019

Explicitly check the current CPU's IA32_FEAT_CTL and VMX feature flags
when verifying compatibility across physical CPUs.  This effectively
adds a check on IA32_FEAT_CTL to ensure that VMX is fully enabled on
all CPUs.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20191221044513.21680-17-sean.j.christopherson@intel.com

ff10e22e

KVM: VMX: Use VMX feature flag to query BIOS enabling · a4d0b2fd

由 Sean Christopherson 提交于 12月 20, 2019

Replace KVM's manual checks on IA32_FEAT_CTL with a query on the boot
CPU's MSR_IA32_FEAT_CTL and VMX feature flags. The MSR_IA32_FEAT_CTL
indicates that IA32_FEAT_CTL has been configured and that dependent
features are accurately reflected in cpufeatures, e.g. the VMX flag is
now cleared during boot if VMX isn't fully enabled via IA32_FEAT_CTL,
including the case where the MSR isn't supported.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20191221044513.21680-16-sean.j.christopherson@intel.com

a4d0b2fd

KVM: VMX: Drop initialization of IA32_FEAT_CTL MSR · 21bd3467

由 Sean Christopherson 提交于 12月 20, 2019

Remove KVM's code to initialize IA32_FEAT_CTL MSR when KVM is loaded now
that the MSR is initialized during boot on all CPUs that support VMX,
i.e. on all CPUs that can possibly load kvm_intel.

Note, don't WARN if IA32_FEAT_CTL is unlocked, even though the MSR is
unconditionally locked by init_ia32_feat_ctl(). KVM isn't tied directly
to a CPU vendor detection, whereas init_ia32_feat_ctl() is invoked if
and only if the CPU vendor is recognized and known to support VMX. As a
result, vmx_disabled_by_bios() may be reached without going through
init_ia32_feat_ctl() and thus without locking IA32_FEAT_CTL. This quirk
will be eliminated in a future patch.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NJim Mattson <jmattson@google.com>
Link: https://lkml.kernel.org/r/20191221044513.21680-15-sean.j.christopherson@intel.com

21bd3467

x86/msr-index: Clean up bit defines for IA32_FEATURE_CONTROL MSR · 32ad73db

由 Sean Christopherson 提交于 12月 20, 2019

As pointed out by Boris, the defines for bits in IA32_FEATURE_CONTROL
are quite a mouthful, especially the VMX bits which must differentiate
between enabling VMX inside and outside SMX (TXT) operation.  Rename the
MSR and its bit defines to abbreviate FEATURE_CONTROL as FEAT_CTL to
make them a little friendlier on the eyes.

Arguably, the MSR itself should keep the full IA32_FEATURE_CONTROL name
to match Intel's SDM, but a future patch will add a dedicated Kconfig,
file and functions for the MSR. Using the full name for those assets is
rather unwieldy, so bite the bullet and use IA32_FEAT_CTL so that its
nomenclature is consistent throughout the kernel.

Opportunistically, fix a few other annoyances with the defines:

  - Relocate the bit defines so that they immediately follow the MSR
    define, e.g. aren't mistaken as belonging to MISC_FEATURE_CONTROL.
  - Add whitespace around the block of feature control defines to make
    it clear they're all related.
  - Use BIT() instead of manually encoding the bit shift.
  - Use "VMX" instead of "VMXON" to match the SDM.
  - Append "_ENABLED" to the LMCE (Local Machine Check Exception) bit to
    be consistent with the kernel's verbiage used for all other feature
    control bits.  Note, the SDM refers to the LMCE bit as LMCE_ON,
    likely to differentiate it from IA32_MCG_EXT_CTL.LMCE_EN.  Ignore
    the (literal) one-off usage of _ON, the SDM is simply "wrong".
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20191221044513.21680-2-sean.j.christopherson@intel.com

32ad73db

09 1月, 2020 4 次提交

KVM: VMX: Fix the spelling of CPU_BASED_USE_TSC_OFFSETTING · 5e3d394f

由 Xiaoyao Li 提交于 12月 06, 2019

The mis-spelling is found by checkpatch.pl, so fix them.
Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5e3d394f

KVM: VMX: Rename NMI_PENDING to NMI_WINDOW · 4e2a0bc5

由 Xiaoyao Li 提交于 12月 06, 2019

Rename the NMI-window exiting related definitions to match the latest
Intel SDM. No functional changes.
Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4e2a0bc5

KVM: VMX: Rename INTERRUPT_PENDING to INTERRUPT_WINDOW · 9dadc2f9

由 Xiaoyao Li 提交于 12月 06, 2019

Rename interrupt-windown exiting related definitions to match the
latest Intel SDM. No functional changes.
Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

9dadc2f9

KVM: vmx: remove unreachable statement in vmx_get_msr_feature() · 4fb7b452

由 Miaohe Lin 提交于 12月 05, 2019

We have no way to reach the final statement, remove it.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4fb7b452

04 12月, 2019 1 次提交

kvm: vmx: Stop wasting a page for guest_msrs · 7d73710d

由 Jim Mattson 提交于 12月 03, 2019

We will never need more guest_msrs than there are indices in
vmx_msr_index. Thus, at present, the guest_msrs array will not exceed
168 bytes.
Signed-off-by: NJim Mattson <jmattson@google.com>
Reviewed-by: NLiran Alon <liran.alon@oracle.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7d73710d

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功