提交 · 5c8beb474665220374abcafa79e40d78778606d1 · openeuler / Kernel

07 4月, 2020 1 次提交

KVM: nVMX: don't clear mtf_pending when nested events are blocked · 5c8beb47

由 Oliver Upton 提交于 4月 06, 2020

If nested events are blocked, don't clear the mtf_pending flag to avoid
missing later delivery of the MTF VM-exit.

Fixes: 5ef8acbd ("KVM: nVMX: Emulate MTF when performing instruction emulation")
Signed-off-by: NOliver Upton <oupton@google.com>
Message-Id: <20200406201237.178725-1-oupton@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5c8beb47

31 3月, 2020 2 次提交

KVM: x86: Copy kvm_x86_ops by value to eliminate layer of indirection · afaf0b2f

由 Sean Christopherson 提交于 3月 21, 2020

Replace the kvm_x86_ops pointer in common x86 with an instance of the
struct to save one pointer dereference when invoking functions. Copy the
struct by value to set the ops during kvm_init().

Arbitrarily use kvm_x86_ops.hardware_enable to track whether or not the
ops have been initialized, i.e. a vendor KVM module has been loaded.
Suggested-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200321202603.19355-7-sean.j.christopherson@intel.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

afaf0b2f

KVM: VMX: Configure runtime hooks using vmx_x86_ops · 72b0eaa9

由 Sean Christopherson 提交于 3月 21, 2020

Configure VMX's runtime hooks by modifying vmx_x86_ops directly instead
of using the global kvm_x86_ops.  This sets the stage for waiting until
after ->hardware_setup() to set kvm_x86_ops with the vendor's
implementation.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200321202603.19355-5-sean.j.christopherson@intel.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

72b0eaa9

18 3月, 2020 1 次提交

KVM: nVMX: remove side effects from nested_vmx_exit_reflected · 96b100cd

由 Paolo Bonzini 提交于 3月 17, 2020

The name of nested_vmx_exit_reflected suggests that it's purely
a test, but it actually marks VMCS12 pages as dirty.  Move this to
vmx_handle_exit, observing that the initial nested_run_pending check in
nested_vmx_exit_reflected is pointless---nested_run_pending has just
been cleared in vmx_vcpu_run and won't be set until handle_vmlaunch
or handle_vmresume.
Suggested-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

96b100cd

17 3月, 2020 14 次提交

KVM: nVMX: properly handle errors in nested_vmx_handle_enlightened_vmptrld() · b6a0653a

由 Vitaly Kuznetsov 提交于 3月 09, 2020

nested_vmx_handle_enlightened_vmptrld() fails in two cases:
- when we fail to kvm_vcpu_map() the supplied GPA
- when revision_id is incorrect.
Genuine Hyper-V raises #UD in the former case (at least with *some*
incorrect GPAs) and does VMfailInvalid() in the later. KVM doesn't do
anything so L1 just gets stuck retrying the same faulty VMLAUNCH.

nested_vmx_handle_enlightened_vmptrld() has two call sites:
nested_vmx_run() and nested_get_vmcs12_pages(). The former needs to queue
do much: the failure there happens after migration when L2 was running (and
L1 did something weird like wrote to VP assist page from a different vCPU),
just kill L1 with KVM_EXIT_INTERNAL_ERROR.
Reported-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
[Squash kbuild autopatch. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b6a0653a

KVM: nVMX: stop abusing need_vmcs12_to_shadow_sync for eVMCS mapping · e942dbf8

由 Vitaly Kuznetsov 提交于 3月 09, 2020

When vmx_set_nested_state() happens, we may not have all the required
data to map enlightened VMCS: e.g. HV_X64_MSR_VP_ASSIST_PAGE MSR may not
yet be restored so we need a postponed action. Currently, we (ab)use
need_vmcs12_to_shadow_sync/nested_sync_vmcs12_to_shadow() for that but
this is not ideal:
- We may not need to sync anything if L2 is running
- It is hard to propagate errors from nested_sync_vmcs12_to_shadow()
 as we call it from vmx_prepare_switch_to_guest() which happens just
 before we do VMLAUNCH, the code is not ready to handle errors there.

Move eVMCS mapping to nested_get_vmcs12_pages() and request
KVM_REQ_GET_VMCS12_PAGES, it seems to be is less abusive in nature.
It would probably be possible to introduce a specialized KVM_REQ_EVMCS_MAP
but it is undesirable to propagate eVMCS specifics all the way up to x86.c

Note, we don't need to request KVM_REQ_GET_VMCS12_PAGES from
vmx_set_nested_state() directly as nested_vmx_enter_non_root_mode() already
does that. Requesting KVM_REQ_GET_VMCS12_PAGES is done to document the
(non-obvious) side-effect and to be future proof.
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e942dbf8

KVM: nVMX: Consolidate nested MTF checks to helper function · 212617db

由 Oliver Upton 提交于 2月 24, 2020

commit 5ef8acbd ("KVM: nVMX: Emulate MTF when performing
instruction emulation") introduced a helper to check the MTF
VM-execution control in vmcs12. Change pre-existing check in
nested_vmx_exit_reflected() to instead use the helper.
Signed-off-by: NOliver Upton <oupton@google.com>
Reviewed-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

212617db

KVM: x86: rename set_cr3 callback and related flags to load_mmu_pgd · 727a7e27

由 Paolo Bonzini 提交于 3月 05, 2020

The set_cr3 callback is not setting the guest CR3, it is setting the
root of the guest page tables, either shadow or two-dimensional.
To make this clearer as well as to indicate that the MMU calls it
via kvm_mmu_load_cr3, rename it to load_mmu_pgd.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

727a7e27

KVM: x86: unify callbacks to load paging root · 689f3bf2

由 Paolo Bonzini 提交于 3月 03, 2020

Similar to what kvm-intel.ko is doing, provide a single callback that
merges svm_set_cr3, set_tdp_cr3 and nested_svm_set_tdp_cr3.

This lets us unify the set_cr3 and set_tdp_cr3 entries in kvm_x86_ops.
I'm doing that in this same patch because splitting it adds quite a bit
of churn due to the need for forward declarations. For the same reason
the assignment to vcpu->arch.mmu->set_cr3 is moved to kvm_init_shadow_mmu
from init_kvm_softmmu and nested_svm_init_mmu_context.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

689f3bf2

KVM: VMX: Add helpers to query Intel PT mode · 2ef7619d

由 Sean Christopherson 提交于 3月 02, 2020

Add helpers to query which of the (two) supported PT modes is active.
The primary motivation is to help document that there is a third PT mode
(host-only) that's currently not supported by KVM.  As is, it's not
obvious that PT_MODE_SYSTEM != !PT_MODE_HOST_GUEST and vice versa, e.g.
that "pt_mode == PT_MODE_SYSTEM" and "pt_mode != PT_MODE_HOST_GUEST" are
two distinct checks.

No functional change intended.
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

2ef7619d

KVM: nVMX: Drop unnecessary check on ept caps for execute-only · 96d47010

由 Sean Christopherson 提交于 3月 02, 2020

Drop the call to cpu_has_vmx_ept_execute_only() when calculating which
EPT capabilities will be exposed to L1 for nested EPT. The resulting
configuration is immediately sanitized by the passed in @ept_caps, and
except for the call from vmx_check_processor_compat(), @ept_caps is the
capabilities that are queried by cpu_has_vmx_ept_execute_only(). For
vmx_check_processor_compat(), KVM *wants* to ignore vmx_capability.ept
so that a divergence in EPT capabilities between CPUs is detected.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

96d47010

KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_pgd() · d8dd54e0

由 Sean Christopherson 提交于 3月 02, 2020

Rename kvm_mmu->get_cr3() to call out that it is retrieving a guest
value, as opposed to kvm_mmu->set_cr3(), which sets a host value, and to
note that it will return something other than CR3 when nested EPT is in
use.  Hopefully the new name will also make it more obvious that L1's
nested_cr3 is returned in SVM's nested NPT case.

No functional change intended.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d8dd54e0

KVM: nVMX: Rename EPTP validity helper and associated variables · ac6389ab

由 Sean Christopherson 提交于 3月 02, 2020

Rename valid_ept_address() to nested_vmx_check_eptp() to follow the nVMX
nomenclature and to reflect that the function now checks a lot more than
just the address contained in the EPTP.  Rename address to new_eptp in
associated code.

No functional change intended.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ac6389ab

KVM: nVMX: Rename nested_ept_get_cr3() to nested_ept_get_eptp() · ac69dfaa

由 Sean Christopherson 提交于 3月 02, 2020

Rename the accessor for vmcs12.EPTP to use "eptp" instead of "cr3".  The
accessor has no relation to cr3 whatsoever, other than it being assigned
to the also poorly named kvm_mmu->get_cr3() hook.

No functional change intended.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ac69dfaa

KVM: nVMX: Allow L1 to use 5-level page walks for nested EPT · bb1fcc70

由 Sean Christopherson 提交于 3月 02, 2020

Add support for 5-level nested EPT, and advertise said support in the
EPT capabilities MSR. KVM's MMU can already handle 5-level legacy page
tables, there's no reason to force an L1 VMM to use shadow paging if it
wants to employ 5-level page tables.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

bb1fcc70

KVM: nVMX: Properly handle userspace interrupt window request · a1c77abb

由 Sean Christopherson 提交于 3月 02, 2020

Return true for vmx_interrupt_allowed() if the vCPU is in L2 and L1 has
external interrupt exiting enabled.  IRQs are never blocked in hardware
if the CPU is in the guest (L2 from L1's perspective) when IRQs trigger
VM-Exit.

The new check percolates up to kvm_vcpu_ready_for_interrupt_injection()
and thus vcpu_run(), and so KVM will exit to userspace if userspace has
requested an interrupt window (to inject an IRQ into L1).

Remove the @external_intr param from vmx_check_nested_events(), which is
actually an indicator that userspace wants an interrupt window, e.g.
it's named @req_int_win further up the stack.  Injecting a VM-Exit into
L1 to try and bounce out to L0 userspace is all kinds of broken and is
no longer necessary.

Remove the hack in nested_vmx_vmexit() that attempted to workaround the
breakage in vmx_check_nested_events() by only filling interrupt info if
there's an actual interrupt pending.  The hack actually made things
worse because it caused KVM to _never_ fill interrupt info when the
LAPIC resides in userspace (kvm_cpu_has_interrupt() queries
interrupt.injected, which is always cleared by prepare_vmcs12() before
reaching the hack in nested_vmx_vmexit()).

Fixes: 6550c4df ("KVM: nVMX: Fix interrupt window request with "Acknowledge interrupt on exit"")
Cc: stable@vger.kernel.org
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a1c77abb

KVM: Fix some obsolete comments · 49f933d4

由 Miaohe Lin 提交于 2月 27, 2020

Remove some obsolete comments, fix wrong function name and description.
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

49f933d4

KVM: x86: Fix print format and coding style · e6302698

由 Miaohe Lin 提交于 2月 15, 2020

Use %u to print u32 var and correct some coding style.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e6302698

14 3月, 2020 1 次提交

KVM: nVMX: avoid NULL pointer dereference with incorrect EVMCS GPAs · 95fa1010

由 Vitaly Kuznetsov 提交于 3月 09, 2020

When an EVMCS enabled L1 guest on KVM will tries doing enlightened VMEnter
with EVMCS GPA = 0 the host crashes because the

evmcs_gpa != vmx->nested.hv_evmcs_vmptr

condition in nested_vmx_handle_enlightened_vmptrld() will evaluate to
false (as nested.hv_evmcs_vmptr is zeroed after init). The crash will
happen on vmx->nested.hv_evmcs pointer dereference.

Another problematic EVMCS ptr value is '-1' but it only causes host crash
after nested_release_evmcs() invocation. The problem is exactly the same as
with '0', we mistakenly think that the EVMCS pointer hasn't changed and
thus nested.hv_evmcs_vmptr is valid.

Resolve the issue by adding an additional !vmx->nested.hv_evmcs
check to nested_vmx_handle_enlightened_vmptrld(), this way we will
always be trying kvm_vcpu_map() when nested.hv_evmcs is NULL
and this is supposed to catch all invalid EVMCS GPAs.

Also, initialize hv_evmcs_vmptr to '0' in nested_release_evmcs()
to be consistent with initialization where we don't currently
set hv_evmcs_vmptr to '-1'.

Cc: stable@vger.kernel.org
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

95fa1010

23 2月, 2020 3 次提交

KVM: nVMX: Check IO instruction VM-exit conditions · 35a57134

由 Oliver Upton 提交于 2月 04, 2020

Consult the 'unconditional IO exiting' and 'use IO bitmaps' VM-execution
controls when checking instruction interception. If the 'use IO bitmaps'
VM-execution control is 1, check the instruction access against the IO
bitmaps to determine if the instruction causes a VM-exit.
Signed-off-by: NOliver Upton <oupton@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

35a57134

KVM: nVMX: Refactor IO bitmap checks into helper function · e71237d3

由 Oliver Upton 提交于 2月 04, 2020

Checks against the IO bitmap are useful for both instruction emulation
and VM-exit reflection. Refactor the IO bitmap checks into a helper
function.
Signed-off-by: NOliver Upton <oupton@google.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e71237d3

KVM: nVMX: Emulate MTF when performing instruction emulation · 5ef8acbd

由 Oliver Upton 提交于 2月 07, 2020

Since commit 5f3d45e7 ("kvm/x86: add support for
MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
flag processor-based execution control for its L2 guest. KVM simply
forwards any MTF VM-exits to the L1 guest, which works for normal
instruction execution.

However, when KVM needs to emulate an instruction on the behalf of an L2
guest, the monitor trap flag is not emulated. Add the necessary logic to
kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
instruction emulation for L2.

Fixes: 5f3d45e7 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
Signed-off-by: NOliver Upton <oupton@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5ef8acbd

22 2月, 2020 1 次提交

KVM: nVMX: clear PIN_BASED_POSTED_INTR from nested pinbased_ctls only when... · a4443267

由 Vitaly Kuznetsov 提交于 2月 20, 2020

KVM: nVMX: clear PIN_BASED_POSTED_INTR from nested pinbased_ctls only when apicv is globally disabled

When apicv is disabled on a vCPU (e.g. by enabling KVM_CAP_HYPERV_SYNIC*),
nothing happens to VMX MSRs on the already existing vCPUs, however, all new
ones are created with PIN_BASED_POSTED_INTR filtered out. This is very
confusing and results in the following picture inside the guest:

$ rdmsr -ax 0x48d
ff00000016
7f00000016
7f00000016
7f00000016

This is observed with QEMU and 4-vCPU guest: QEMU creates vCPU0, does
KVM_CAP_HYPERV_SYNIC2 and then creates the remaining three.

L1 hypervisor may only check CPU0's controls to find out what features
are available and it will be very confused later. Switch to setting
PIN_BASED_POSTED_INTR control based on global 'enable_apicv' setting.
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a4443267

17 2月, 2020 1 次提交

KVM: nVMX: Fix some obsolete comments and grammar error · 463bfeee

由 Miaohe Lin 提交于 2月 14, 2020

Fix wrong variable names and grammar error in comment.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

463bfeee

13 2月, 2020 1 次提交

KVM: nVMX: Fix some comment typos and coding style · ffdbd50d

由 Miaohe Lin 提交于 2月 07, 2020

Fix some typos in the comments. Also fix coding style.
[Sean Christopherson rewrites the comment of write_fault_to_shadow_pgtable
field in struct kvm_vcpu_arch.]
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ffdbd50d

12 2月, 2020 1 次提交

KVM: nVMX: Handle pending #DB when injecting INIT VM-exit · 684c0422

由 Oliver Upton 提交于 2月 07, 2020

SDM 27.3.4 states that the 'pending debug exceptions' VMCS field will
be populated if a VM-exit caused by an INIT signal takes priority over a
debug-trap. Emulate this behavior when synthesizing an INIT signal
VM-exit into L1.

Fixes: 4b9852f4 ("KVM: x86: Fix INIT signal handling in various CPU states")
Signed-off-by: NOliver Upton <oupton@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

684c0422

05 2月, 2020 3 次提交

x86/kvm/hyper-v: don't allow to turn on unsupported VMX controls for nested guests · a8350231

由 Vitaly Kuznetsov 提交于 2月 05, 2020

Sane L1 hypervisors are not supposed to turn any of the unsupported VMX
controls on for its guests and nested_vmx_check_controls() checks for
that. This is, however, not the case for the controls which are supported
on the host but are missing in enlightened VMCS and when eVMCS is in use.

It would certainly be possible to add these missing checks to
nested_check_vm_execution_controls()/_vm_exit_controls()/.. but it seems
preferable to keep eVMCS-specific stuff in eVMCS and reduce the impact on
non-eVMCS guests by doing less unrelated checks. Create a separate
nested_evmcs_check_controls() for this purpose.
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a8350231

KVM: nVMX: Remove stale comment from nested_vmx_load_cr3() · ea79a750

由 Sean Christopherson 提交于 2月 04, 2020

The blurb pertaining to the return value of nested_vmx_load_cr3() no
longer matches reality, remove it entirely as the behavior it is
attempting to document is quite obvious when reading the actual code.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ea79a750

KVM: nVMX: delete meaningless nested_vmx_run() declaration · 33aabd02

由 Miaohe Lin 提交于 1月 23, 2020

The function nested_vmx_run() declaration is below its implementation. So
this is meaningless and should be removed.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

33aabd02

28 1月, 2020 3 次提交

KVM: nVMX: Check GUEST_DR7 on vmentry of nested guests · b91991bf

由 Krish Sadhukhan 提交于 1月 15, 2020

According to section "Checks on Guest Control Registers, Debug Registers, and
and MSRs" in Intel SDM vol 3C, the following checks are performed on vmentry
of nested guests:

    If the "load debug controls" VM-entry control is 1, bits 63:32 in the DR7
    field must be 0.

In KVM, GUEST_DR7 is set prior to the vmcs02 VM-entry by kvm_set_dr() and the
latter synthesizes a #GP if any bit in the high dword in the former is set.
Hence this field needs to be checked in software.
Signed-off-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: NKarl Heubaum <karl.heubaum@oracle.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b91991bf

KVM: x86: Perform non-canonical checks in 32-bit KVM · de761ea7

由 Sean Christopherson 提交于 1月 15, 2020

Remove the CONFIG_X86_64 condition from the low level non-canonical
helpers to effectively enable non-canonical checks on 32-bit KVM.
Non-canonical checks are performed by hardware if the CPU *supports*
64-bit mode, whether or not the CPU is actually in 64-bit mode is
irrelevant.

For the most part, skipping non-canonical checks on 32-bit KVM is ok-ish
because 32-bit KVM always (hopefully) drops bits 63:32 of whatever value
it's checking before propagating it to hardware, and architecturally,
the expected behavior for the guest is a bit of a grey area since the
vCPU itself doesn't support 64-bit mode. I.e. a 32-bit KVM guest can
observe the missed checks in several paths, e.g. INVVPID and VM-Enter,
but it's debatable whether or not the missed checks constitute a bug
because technically the vCPU doesn't support 64-bit mode.

The primary motivation for enabling the non-canonical checks is defense
in depth. As mentioned above, a guest can trigger a missed check via
INVVPID or VM-Enter. INVVPID is straightforward as it takes a 64-bit
virtual address as part of its 128-bit INVVPID descriptor and fails if
the address is non-canonical, even if INVVPID is executed in 32-bit PM.
Nested VM-Enter is a bit more convoluted as it requires the guest to
write natural width VMCS fields via memory accesses and then VMPTRLD the
VMCS, but it's still possible. In both cases, KVM is saved from a true
bug only because its flows that propagate values to hardware (correctly)
take "unsigned long" parameters and so drop bits 63:32 of the bad value.

Explicitly performing the non-canonical checks makes it less likely that
a bad value will be propagated to hardware, e.g. in the INVVPID case,
if __invvpid() didn't implicitly drop bits 63:32 then KVM would BUG() on
the resulting unexpected INVVPID failure due to hardware rejecting the
non-canonical address.

The only downside to enabling the non-canonical checks is that it adds a
relatively small amount of overhead, but the affected flows are not hot
paths, i.e. the overhead is negligible.

Note, KVM technically could gate the non-canonical checks on 32-bit KVM
with static_cpu_has(X86_FEATURE_LM), but on bare metal that's an even
bigger waste of code for everyone except the 0.00000000000001% of the
population running on Yonah, and nested 32-bit on 64-bit already fudges
things with respect to 64-bit CPU behavior.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
[Also do so in nested_vmx_check_host_state as reported by Krish. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

de761ea7

KVM: nVMX: WARN on failure to set IA32_PERF_GLOBAL_CTRL · d1968421

由 Oliver Upton 提交于 12月 13, 2019

Writes to MSR_CORE_PERF_GLOBAL_CONTROL should never fail if the VM-exit
and VM-entry controls are exposed to L1. Promote the checks to perform a
full WARN if kvm_set_msr() fails and remove the now unused macro
SET_MSR_OR_WARN().
Suggested-by: NSean Christopherson <sean.j.christopherson@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NOliver Upton <oupton@google.com>
Reviewed-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d1968421

21 1月, 2020 3 次提交

KVM: nVMX: vmread should not set rflags to specify success in case of #PF · a4d956b9

由 Miaohe Lin 提交于 12月 28, 2019

In case writing to vmread destination operand result in a #PF, vmread
should not call nested_vmx_succeed() to set rflags to specify success.
Similar to as done in VMPTRST (See handle_vmptrst()).
Reviewed-by: NLiran Alon <liran.alon@oracle.com>
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Cc: stable@vger.kernel.org
Reviewed-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a4d956b9

KVM: vmx: delete meaningless nested_vmx_prepare_msr_bitmap() declaration · d8010a77

由 Miaohe Lin 提交于 12月 14, 2019

The function nested_vmx_prepare_msr_bitmap() declaration is below its
implementation. So this is meaningless and should be removed.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Reviewed-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d8010a77

KVM: Fix some comment typos and missing parentheses · 67b0ae43

由 Miaohe Lin 提交于 12月 11, 2019

Fix some typos and add missing parentheses in the comments.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

67b0ae43

14 1月, 2020 1 次提交

x86/msr-index: Clean up bit defines for IA32_FEATURE_CONTROL MSR · 32ad73db

由 Sean Christopherson 提交于 12月 20, 2019

As pointed out by Boris, the defines for bits in IA32_FEATURE_CONTROL
are quite a mouthful, especially the VMX bits which must differentiate
between enabling VMX inside and outside SMX (TXT) operation.  Rename the
MSR and its bit defines to abbreviate FEATURE_CONTROL as FEAT_CTL to
make them a little friendlier on the eyes.

Arguably, the MSR itself should keep the full IA32_FEATURE_CONTROL name
to match Intel's SDM, but a future patch will add a dedicated Kconfig,
file and functions for the MSR. Using the full name for those assets is
rather unwieldy, so bite the bullet and use IA32_FEAT_CTL so that its
nomenclature is consistent throughout the kernel.

Opportunistically, fix a few other annoyances with the defines:

  - Relocate the bit defines so that they immediately follow the MSR
    define, e.g. aren't mistaken as belonging to MISC_FEATURE_CONTROL.
  - Add whitespace around the block of feature control defines to make
    it clear they're all related.
  - Use BIT() instead of manually encoding the bit shift.
  - Use "VMX" instead of "VMXON" to match the SDM.
  - Append "_ENABLED" to the LMCE (Local Machine Check Exception) bit to
    be consistent with the kernel's verbiage used for all other feature
    control bits.  Note, the SDM refers to the LMCE bit as LMCE_ON,
    likely to differentiate it from IA32_MCG_EXT_CTL.LMCE_EN.  Ignore
    the (literal) one-off usage of _ON, the SDM is simply "wrong".
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20191221044513.21680-2-sean.j.christopherson@intel.com

32ad73db

09 1月, 2020 4 次提交

kvm: nVMX: Aesthetic cleanup of handle_vmread and handle_vmwrite · c90f4d03

由 Jim Mattson 提交于 12月 06, 2019

Apply reverse fir tree declaration order, shorten some variable names
to avoid line wrap, reformat a block comment, delete an extra blank
line, and use BIT(10) instead of (1u << 10).
Signed-off-by: NJim Mattson <jmattson@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: NPeter Shier <pshier@google.com>
Reviewed-by: NOliver Upton <oupton@google.com>
Reviewed-by: NJon Cargille <jcargill@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

c90f4d03

kvm: nVMX: VMWRITE checks unsupported field before read-only field · 693e02cc

由 Jim Mattson 提交于 12月 06, 2019

According to the SDM, VMWRITE checks to see if the secondary source
operand corresponds to an unsupported VMCS field before it checks to
see if the secondary source operand corresponds to a VM-exit
information field and the processor does not support writing to
VM-exit information fields.

Fixes: 49f705c5 ("KVM: nVMX: Implement VMREAD and VMWRITE")
Signed-off-by: NJim Mattson <jmattson@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: NPeter Shier <pshier@google.com>
Reviewed-by: NOliver Upton <oupton@google.com>
Reviewed-by: NJon Cargille <jcargill@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

693e02cc

kvm: nVMX: VMWRITE checks VMCS-link pointer before VMCS field · dd2d6042

由 Jim Mattson 提交于 12月 06, 2019

According to the SDM, a VMWRITE in VMX non-root operation with an
invalid VMCS-link pointer results in VMfailInvalid before the validity
of the VMCS field in the secondary source operand is checked.

For consistency, modify both handle_vmwrite and handle_vmread, even
though there was no problem with the latter.

Fixes: 6d894f49 ("KVM: nVMX: vmread/vmwrite: Use shadow vmcs12 if running L2")
Signed-off-by: NJim Mattson <jmattson@google.com>
Cc: Liran Alon <liran.alon@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: NPeter Shier <pshier@google.com>
Reviewed-by: NOliver Upton <oupton@google.com>
Reviewed-by: NJon Cargille <jcargill@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

dd2d6042

KVM: VMX: Fix the spelling of CPU_BASED_USE_TSC_OFFSETTING · 5e3d394f

由 Xiaoyao Li 提交于 12月 06, 2019

The mis-spelling is found by checkpatch.pl, so fix them.
Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5e3d394f

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功