提交 · 8a8b097c6cd0041da6ba3a0701f911bfee05c652 · openeuler / Kernel

21 4月, 2020 6 次提交

KVM: VMX: Move vpid_sync_vcpu_addr() down a few lines · 8a8b097c

由 Sean Christopherson 提交于 3月 20, 2020

Move vpid_sync_vcpu_addr() below vpid_sync_context() so that it can be
refactored in a future patch to call vpid_sync_context() directly when
the "individual address" INVVPID variant isn't supported.

No functional change intended.
Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200320212833.3507-11-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8a8b097c

KVM: VMX: Use vpid_sync_context() directly when possible · 446ace4b

由 Sean Christopherson 提交于 3月 20, 2020

Use vpid_sync_context() directly for flows that run if and only if
enable_vpid=1, or more specifically, nested VMX flows that are gated by
vmx->nested.msrs.secondary_ctls_high.SECONDARY_EXEC_ENABLE_VPID being
set, which is allowed if and only if enable_vpid=1.  Because these flows
call __vmx_flush_tlb() with @invalidate_gpa=false, the if-statement that
decides between INVEPT and INVVPID will always go down the INVVPID path,
i.e. call vpid_sync_context() because
"enable_ept && (invalidate_gpa || !enable_vpid)" always evaluates false.

This helps pave the way toward removing @invalidate_gpa and @vpid from
__vmx_flush_tlb() and its callers.

Opportunstically drop unnecessary brackets in handle_invvpid() around an
affected __vmx_flush_tlb()->vpid_sync_context() conversion.

No functional change intended.
Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200320212833.3507-10-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

446ace4b

KVM: VMX: Skip global INVVPID fallback if vpid==0 in vpid_sync_context() · c746b3a4

由 Sean Christopherson 提交于 3月 20, 2020

Skip the global INVVPID in the unlikely scenario that vpid==0 and the
SINGLE_CONTEXT variant of INVVPID is unsupported.  If vpid==0, there's
no need to INVVPID as it's impossible to do VM-Enter with VPID enabled
and vmcs.VPID==0, i.e. there can't be any TLB entries for the vCPU with
vpid==0.  The fact that the SINGLE_CONTEXT variant isn't supported is
irrelevant.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200320212833.3507-9-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

c746b3a4

KVM: x86: Sync SPTEs when injecting page/EPT fault into L1 · ee1fa209

由 Junaid Shahid 提交于 3月 20, 2020

When injecting a page fault or EPT violation/misconfiguration, KVM is
not syncing any shadow PTEs associated with the faulting address,
including those in previous MMUs that are associated with L1's current
EPTP (in a nested EPT scenario), nor is it flushing any hardware TLB
entries. All this is done by kvm_mmu_invalidate_gva.

Page faults that are either !PRESENT or RSVD are exempt from the flushing,
as the CPU is not allowed to cache such translations.
Signed-off-by: NJunaid Shahid <junaids@google.com>
Co-developed-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200320212833.3507-8-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ee1fa209

KVM: x86: cleanup kvm_inject_emulated_page_fault · 0cd665bd

由 Paolo Bonzini 提交于 3月 25, 2020

To reconstruct the kvm_mmu to be used for page fault injection, we
can simply use fault->nested_page_fault.  This matches how
fault->nested_page_fault is assigned in the first place by
FNAME(walk_addr_generic).
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0cd665bd

KVM: x86: introduce kvm_mmu_invalidate_gva · 5efac074

由 Paolo Bonzini 提交于 3月 23, 2020

Wrap the combination of mmu->invlpg and kvm_x86_ops->tlb_flush_gva
into a new function. This function also lets us specify the host PGD to
invalidate and also the MMU, both of which will be useful in fixing and
simplifying kvm_inject_emulated_page_fault.

A nested guest's MMU however has g_context->invlpg == NULL. Instead of
setting it to nonpaging_invlpg, make kvm_mmu_invalidate_gva the only
entry point to mmu->invlpg and make a NULL invlpg pointer equivalent
to nonpaging_invlpg, saving a retpoline.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5efac074

16 4月, 2020 24 次提交

KVM: x86: Export kvm_propagate_fault() (as kvm_inject_emulated_page_fault) · 53b3d8e9

由 Sean Christopherson 提交于 3月 20, 2020

Export the page fault propagation helper so that VMX can use it to
correctly emulate TLB invalidation on page faults in an upcoming patch.

In the (hopefully) not-too-distant future, SGX virtualization will also
want access to the helper for injecting page faults to the correct level
(L1 vs. L2) when emulating ENCLS instructions.

Rename the function to kvm_inject_emulated_page_fault() to clarify that
it is (a) injecting a fault and (b) only for page faults.  WARN if it's
invoked with an exception other than PF_VECTOR.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200320212833.3507-6-sean.j.christopherson@intel.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

53b3d8e9

KVM: nVMX: Invalidate all roots when emulating INVVPID without EPT · d6e3f838

由 Junaid Shahid 提交于 3月 20, 2020

Free all roots when emulating INVVPID for L1 and EPT is disabled, as
outstanding changes to the page tables managed by L1 need to be
recognized.  Because L1 and L2 share an MMU when EPT is disabled, and
because VPID is not tracked by the MMU role, all roots in the current
MMU (root_mmu) need to be freed, otherwise a future nested VM-Enter or
VM-Exit could do a fast CR3 switch (without a flush/sync) and consume
stale SPTEs.

Fixes: 5c614b35 ("KVM: nVMX: nested VPID emulation")
Signed-off-by: NJunaid Shahid <junaids@google.com>
[sean: ported to upstream KVM, reworded the comment and changelog]
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200320212833.3507-5-sean.j.christopherson@intel.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d6e3f838

KVM: nVMX: Invalidate all EPTP contexts when emulating INVEPT for L1 · f8aa7e39

由 Sean Christopherson 提交于 3月 20, 2020

Free all L2 (guest_mmu) roots when emulating INVEPT for L1.  Outstanding
changes to the EPT tables managed by L1 need to be recognized, and
relying on KVM to always flush L2's EPTP context on nested VM-Enter is
dangerous.

Similar to handle_invpcid(), rely on kvm_mmu_free_roots() to do a remote
TLB flush if necessary, e.g. if L1 has never entered L2 then there is
nothing to be done.

Nuking all L2 roots is overkill for the single-context variant, but it's
the safe and easy bet.  A more precise zap mechanism will be added in
the future.  Add a TODO to call out that KVM only needs to invalidate
affected contexts.

Fixes: 14c07ad8 ("x86/kvm/mmu: introduce guest_mmu")
Reported-by: NJim Mattson <jmattson@google.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200320212833.3507-4-sean.j.christopherson@intel.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f8aa7e39

KVM: nVMX: Validate the EPTP when emulating INVEPT(EXTENT_CONTEXT) · eed0030e

由 Sean Christopherson 提交于 3月 20, 2020

Signal VM-Fail for the single-context variant of INVEPT if the specified
EPTP is invalid.  Per the INEVPT pseudocode in Intel's SDM, it's subject
to the standard EPT checks:

  If VM entry with the "enable EPT" VM execution control set to 1 would
  fail due to the EPTP value then VMfail(Invalid operand to INVEPT/INVVPID);

Fixes: bfd0a56b ("nEPT: Nested INVEPT")
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200320212833.3507-3-sean.j.christopherson@intel.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

eed0030e

KVM: VMX: Flush all EPTP/VPID contexts on remote TLB flush · e8eff282

由 Sean Christopherson 提交于 3月 20, 2020

Flush all EPTP/VPID contexts if a TLB flush _may_ have been triggered by
a remote or deferred TLB flush, i.e. by KVM_REQ_TLB_FLUSH.  Remote TLB
flushes require all contexts to be invalidated, not just the active
contexts, e.g. all mappings in all contexts for a given HVA need to be
invalidated on a mmu_notifier invalidation.  Similarly, the instigator
of the deferred TLB flush may be expecting all contexts to be flushed,
e.g. vmx_vcpu_load_vmcs().

Without nested VMX, flushing only the current EPTP/VPID context isn't
problematic because KVM uses a constant VPID for each vCPU, and
mmu_alloc_direct_roots() all but guarantees KVM will use a single EPTP
for L1.  In the rare case where a different EPTP is created or reused,
KVM (currently) unconditionally flushes the new EPTP context prior to
entering the guest.

With nested VMX, KVM conditionally uses a different VPID for L2, and
unconditionally uses a different EPTP for L2.  Because KVM doesn't
_intentionally_ guarantee L2's EPTP/VPID context is flushed on nested
VM-Enter, it'd be possible for a malicious L1 to attack the host and/or
different VMs by exploiting the lack of flushing for L2.

  1) Launch nested guest from malicious L1.

  2) Nested VM-Enter to L2.

  3) Access target GPA 'g'.  CPU inserts TLB entry tagged with L2's ASID
     mapping 'g' to host PFN 'x'.

  2) Nested VM-Exit to L1.

  3) L1 triggers kernel same-page merging (ksm) by duplicating/zeroing
     the page for PFN 'x'.

  4) Host kernel merges PFN 'x' with PFN 'y', i.e. unmaps PFN 'x' and
     remaps the page to PFN 'y'.  mmu_notifier sends invalidate command,
     KVM flushes TLB only for L1's ASID.

  4) Host kernel reallocates PFN 'x' to some other task/guest.

  5) Nested VM-Enter to L2.  KVM does not invalidate L2's EPTP or VPID.

  6) L2 accesses GPA 'g' and gains read/write access to PFN 'x' via its
     stale TLB entry.

However, current KVM unconditionally flushes L1's EPTP/VPID context on
nested VM-Exit.  But, that behavior is mostly unintentional, KVM doesn't
go out of its way to flush EPTP/VPID on nested VM-Enter/VM-Exit, rather
a TLB flush is guaranteed to occur prior to re-entering L1 due to
__kvm_mmu_new_cr3() always being called with skip_tlb_flush=false.  On
nested VM-Enter, this happens via kvm_init_shadow_ept_mmu() (nested EPT
enabled) or in nested_vmx_load_cr3() (nested EPT disabled).  On nested
VM-Exit it occurs via nested_vmx_load_cr3().

This also fixes a bug where a deferred TLB flush in the context of L2,
with EPT disabled, would flush L1's VPID instead of L2's VPID, as
vmx_flush_tlb() flushes L1's VPID regardless of is_guest_mode().

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Ben Gardon <bgardon@google.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Junaid Shahid <junaids@google.com>
Cc: Liran Alon <liran.alon@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: John Haxby <john.haxby@oracle.com>
Reviewed-by: NLiran Alon <liran.alon@oracle.com>
Fixes: efebf0aa ("KVM: nVMX: Do not flush TLB on L1<->L2 transitions if L1 uses VPID and EPT")
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200320212833.3507-2-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e8eff282

selftests: kvm: Add testcase for creating max number of memslots · 909e0aba

由 Wainer dos Santos Moschetta 提交于 4月 10, 2020

This patch introduces test_add_max_memory_regions(), which checks
that a VM can have added memory slots up to the limit defined in
KVM_CAP_NR_MEMSLOTS. Then attempt to add one more slot to
verify it fails as expected.
Signed-off-by: NWainer dos Santos Moschetta <wainersm@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200410231707.7128-11-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

909e0aba

KVM: selftests: Make set_memory_region_test common to all architectures · 5b4f758f

由 Sean Christopherson 提交于 4月 10, 2020

Make set_memory_region_test available on all architectures by wrapping
the bits that are x86-specific in ifdefs.  A future testcase
to create the maximum number of memslots will be architecture
agnostic.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200410231707.7128-10-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5b4f758f

KVM: selftests: Add "zero" testcase to set_memory_region_test · 8cc2dd63

由 Sean Christopherson 提交于 4月 10, 2020

Add a testcase for running a guest with no memslots to the memory region
test. The expected result on x86_64 is that the guest will trigger an
internal KVM error due to the initial code fetch encountering a
non-existent memslot and resulting in an emulation failure.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200410231707.7128-9-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8cc2dd63

selftests: kvm: Add vm_get_fd() in kvm_util · 4cd94d12

由 Wainer dos Santos Moschetta 提交于 4月 10, 2020

Introduces the vm_get_fd() function in kvm_util which returns
the VM file descriptor.
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NWainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200410231707.7128-8-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4cd94d12

KVM: selftests: Add "delete" testcase to set_memory_region_test · 8fb38f05

由 Sean Christopherson 提交于 4月 10, 2020

Add a testcase for deleting memslots while the guest is running.
Like the "move" testcase, this is x86_64-only as it relies on MMIO
happening when a non-existent memslot is encountered.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200410231707.7128-7-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8fb38f05

KVM: sefltests: Add explicit synchronization to move mem region test · 8a0639fe

由 Sean Christopherson 提交于 4月 10, 2020

Use sem_post() and sem_timedwait() to synchronize test stages between
the vCPU thread and the main thread instead of using usleep() to wait
for the vCPU thread and hoping for the best.

Opportunistically refactor the code to make it suck less in general,
and to prepare for adding more testcases.
Suggested-by: NPeter Xu <peterx@redhat.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200410231707.7128-6-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8a0639fe

KVM: selftests: Add GUEST_ASSERT variants to pass values to host · 3e6b9412

由 Sean Christopherson 提交于 4月 10, 2020

Add variants of GUEST_ASSERT to pass values back to the host, e.g. to
help debug/understand a failure when the the cause of the assert isn't
necessarily binary.

It'd probably be possible to auto-calculate the number of arguments and
just have a single GUEST_ASSERT, but there are a limited number of
variants and silently eating arguments could lead to subtle code bugs.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200410231707.7128-5-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

3e6b9412

KVM: selftests: Add util to delete memory region · 8c996e4d

由 Sean Christopherson 提交于 4月 10, 2020

Add a utility to delete a memory region, it will be used by x86's
set_memory_region_test.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: NWainer dos Santos Moschetta <wainersm@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Message-Id: <20200410231707.7128-4-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8c996e4d

KVM: selftests: Use kernel's list instead of homebrewed replacement · 4d9bba90

由 Sean Christopherson 提交于 4月 10, 2020

Replace the KVM selftests' homebrewed linked lists for vCPUs and memory
regions with the kernel's 'struct list_head'.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Message-Id: <20200410231707.7128-3-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4d9bba90

KVM: selftests: Take vcpu pointer instead of id in vm_vcpu_rm() · 238022ff

由 Sean Christopherson 提交于 4月 10, 2020

The sole caller of vm_vcpu_rm() already has the vcpu pointer, take it
directly instead of doing an extra lookup.
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: NWainer dos Santos Moschetta <wainersm@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Message-Id: <20200410231707.7128-2-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

238022ff

KVM: pass through CPUID(0x80000006) · 43d05de2

由 Eric Northup 提交于 4月 14, 2020

Return the host's L2 cache and TLB information for CPUID.0x80000006
instead of zeroing out the entry as part of KVM_GET_SUPPORTED_CPUID.
This allows a userspace VMM to feed KVM_GET_SUPPORTED_CPUID's output
directly into KVM_SET_CPUID2 (without breaking the guest).
Signed-off-by: NEric Northup (Google) <digitaleric@gmail.com>
Signed-off-by: NJim Mattson <jmattson@google.com>
Signed-off-by: NJon Cargille <jcargill@google.com>
Message-Id: <20200415012320.236065-1-jcargill@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

43d05de2

KVM: x86: Return updated timer current count register from KVM_GET_LAPIC · 24647e0a

由 Peter Shier 提交于 10月 10, 2018

kvm_vcpu_ioctl_get_lapic (implements KVM_GET_LAPIC ioctl) does a bulk copy
of the LAPIC registers but must take into account that the one-shot and
periodic timer current count register is computed upon reads and is not
present in register state. When restoring LAPIC state (e.g. after
migration), restart timers from their their current count values at time of
save.

Note: When a one-shot timer expires, the code in arch/x86/kvm/lapic.c does
not zero the value of the LAPIC initial count register (emulating HW
behavior). If no other timer is run and pending prior to a subsequent
KVM_GET_LAPIC call, the returned register set will include the expired
one-shot initial count. On a subsequent KVM_SET_LAPIC call the code will
see a non-zero initial count and start a new one-shot timer using the
expired timer's count. This is a prior existing bug and will be addressed
in a separate patch. Thanks to jmattson@google.com for this find.
Signed-off-by: NPeter Shier <pshier@google.com>
Reviewed-by: NJim Mattson <jmattson@google.com>
Reviewed-by: NWanpeng Li <wanpengli@tencent.com>
Message-Id: <20181010225653.238911-1-pshier@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

24647e0a

KVM: remove redundant assignment to variable r · 788109c1

由 Colin Ian King 提交于 4月 10, 2020

The variable r is being assigned  with a value that is never read
and it is being updated later with a new value.  The initialization is
redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Message-Id: <20200410113526.13822-1-colin.king@canonical.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

788109c1

KVM: SVM: Fix __svm_vcpu_run declaration. · 56a87e5d

由 Uros Bizjak 提交于 4月 09, 2020

The function returns no value.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Fixes: 199cd1d7 ("KVM: SVM: Split svm_vcpu_run inline assembly to separate file")
Signed-off-by: NUros Bizjak <ubizjak@gmail.com>
Message-Id: <20200409114926.1407442-1-ubizjak@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

56a87e5d

KVM: SVM: Do not setup frame pointer in __svm_vcpu_run · b61f62d4

由 Uros Bizjak 提交于 4月 09, 2020

__svm_vcpu_run is a leaf function and does not need
a frame pointer.  %rbp is also destroyed a few instructions
later when guest registers are loaded.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NUros Bizjak <ubizjak@gmail.com>
Message-Id: <20200409120440.1427215-1-ubizjak@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b61f62d4

KVM: SVM: Fix build error due to missing release_pages() include · b2bce0a5

由 Borislav Petkov 提交于 4月 11, 2020

Fix:

  arch/x86/kvm/svm/sev.c: In function ‘sev_pin_memory’:
  arch/x86/kvm/svm/sev.c:360:3: error: implicit declaration of function ‘release_pages’;\
	  did you mean ‘reclaim_pages’? [-Werror=implicit-function-declaration]
    360 |   release_pages(pages, npinned);
        |   ^~~~~~~~~~~~~
        |   reclaim_pages

because svm.c includes pagemap.h but the carved out sev.c needs it too.
Triggered by a randconfig build.

Fixes: eaf78265 ("KVM: SVM: Move SEV code to separate file")
Signed-off-by: NBorislav Petkov <bp@suse.de>
Message-Id: <20200411160927.27954-1-bp@alien8.de>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b2bce0a5

KVM: SVM: Do not mark svm_vcpu_run with STACK_FRAME_NON_STANDARD · b4fd6308

由 Uros Bizjak 提交于 4月 14, 2020

svm_vcpu_run does not change stack or frame pointer anymore.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NUros Bizjak <ubizjak@gmail.com>
Message-Id: <20200414113612.104501-1-ubizjak@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b4fd6308

kvm: nVMX: match comment with return type for nested_vmx_exit_reflected · 69c09755

由 Oliver Upton 提交于 4月 14, 2020

nested_vmx_exit_reflected() returns a bool, not int. As such, refer to
the return values as true/false in the comment instead of 1/0.
Signed-off-by: NOliver Upton <oupton@google.com>
Message-Id: <20200414221241.134103-1-oupton@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

69c09755

kvm: nVMX: reflect MTF VM-exits if injected by L1 · b045ae90

由 Oliver Upton 提交于 4月 14, 2020

According to SDM 26.6.2, it is possible to inject an MTF VM-exit via the
VM-entry interruption-information field regardless of the 'monitor trap
flag' VM-execution control. KVM appropriately copies the VM-entry
interruption-information field from vmcs12 to vmcs02. However, if L1
has not set the 'monitor trap flag' VM-execution control, KVM fails to
reflect the subsequent MTF VM-exit into L1.

Fix this by consulting the VM-entry interruption-information field of
vmcs12 to determine if L1 has injected the MTF VM-exit. If so, reflect
the exit, regardless of the 'monitor trap flag' VM-execution control.

Fixes: 5f3d45e7 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
Signed-off-by: NOliver Upton <oupton@google.com>
Reviewed-by: NPeter Shier <pshier@google.com>
Reviewed-by: NJim Mattson <jmattson@google.com>
Message-Id: <20200414224746.240324-1-oupton@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b045ae90

14 4月, 2020 5 次提交

KVM: s390: Return last valid slot if approx index is out-of-bounds · 97daa028

由 Sean Christopherson 提交于 4月 07, 2020

Return the index of the last valid slot from gfn_to_memslot_approx() if
its binary search loop yielded an out-of-bounds index.  The index can
be out-of-bounds if the specified gfn is less than the base of the
lowest memslot (which is also the last valid memslot).

Note, the sole caller, kvm_s390_get_cmma(), ensures used_slots is
non-zero.

Fixes: afdad616 ("KVM: s390: Fix storage attributes migration with memory slots")
Cc: stable@vger.kernel.org # 4.19.x: 0774a964: KVM: Fix out of range accesses to memslots
Cc: stable@vger.kernel.org # 4.19.x
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200408064059.8957-3-sean.j.christopherson@intel.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

97daa028

KVM: Check validity of resolved slot when searching memslots · b6467ab1

由 Sean Christopherson 提交于 4月 07, 2020

Check that the resolved slot (somewhat confusingly named 'start') is a
valid/allocated slot before doing the final comparison to see if the
specified gfn resides in the associated slot. The resolved slot can be
invalid if the binary search loop terminated because the search index
was incremented beyond the number of used slots.

This bug has existed since the binary search algorithm was introduced,
but went unnoticed because KVM statically allocated memory for the max
number of slots, i.e. the access would only be truly out-of-bounds if
all possible slots were allocated and the specified gfn was less than
the base of the lowest memslot. Commit 36947254 ("KVM: Dynamically
size memslot array based on number of used slots") eliminated the "all
possible slots allocated" condition and made the bug embarrasingly easy
to hit.

Fixes: 9c1a5d38 ("kvm: optimize GFN to memslot lookup with large slots amount")
Reported-by: syzbot+d889b59b2bb87d4047a2@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200408064059.8957-2-sean.j.christopherson@intel.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b6467ab1

KVM: VMX: Enable machine check support for 32bit targets · fb56baae

由 Uros Bizjak 提交于 4月 14, 2020

There is no reason to limit the use of do_machine_check
to 64bit targets. MCE handling works for both target familes.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: stable@vger.kernel.org
Fixes: a0861c02 ("KVM: Add VT-x machine check support")
Signed-off-by: NUros Bizjak <ubizjak@gmail.com>
Message-Id: <20200414071414.45636-1-ubizjak@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fb56baae

KVM: SVM: move more vmentry code to assembly · f14eec0a

由 Paolo Bonzini 提交于 4月 13, 2020

Manipulate IF around vmload/vmsave to remove the confusing usage of
local_irq_enable where interrupts are actually disabled via GIF.
And stuff the RSB immediately without waiting for a RET to avoid
Spectre-v2 attacks.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f14eec0a

KVM: SVM: fix compilation with modular PSP and non-modular KVM · 9ef1530c

由 Paolo Bonzini 提交于 4月 13, 2020

Use svm_sev_enabled() in order to cull all calls to PSP code.  Otherwise,
compilation fails with undefined symbols if the PSP device driver is compiled
as a module and KVM is not.
Reported-by: NUros Bizjak <ubizjak@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

9ef1530c

07 4月, 2020 5 次提交

KVM: VMX: fix crash cleanup when KVM wasn't used · dbef2808

由 Vitaly Kuznetsov 提交于 4月 01, 2020

If KVM wasn't used at all before we crash the cleanup procedure fails with
 BUG: unable to handle page fault for address: ffffffffffffffc8
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 23215067 P4D 23215067 PUD 23217067 PMD 0
 Oops: 0000 [#8] SMP PTI
 CPU: 0 PID: 3542 Comm: bash Kdump: loaded Tainted: G      D           5.6.0-rc2+ #823
 RIP: 0010:crash_vmclear_local_loaded_vmcss.cold+0x19/0x51 [kvm_intel]

The root cause is that loaded_vmcss_on_cpu list is not yet initialized,
we initialize it in hardware_enable() but this only happens when we start
a VM.

Previously, we used to have a bitmap with enabled CPUs and that was
preventing [masking] the issue.

Initialized loaded_vmcss_on_cpu list earlier, right before we assign
crash_vmclear_loaded_vmcss pointer. blocked_vcpu_on_cpu list and
blocked_vcpu_on_cpu_lock are moved altogether for consistency.

Fixes: 31603d4f ("KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support")
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200401081348.1345307-1-vkuznets@redhat.com>
Reviewed-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

dbef2808

KVM: X86: Filter out the broadcast dest for IPI fastpath · 4064a4c6

由 Wanpeng Li 提交于 4月 02, 2020

Except destination shorthand, a destination value 0xffffffff is used to
broadcast interrupts, let's also filter out this for single target IPI
fastpath.
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Message-Id: <1585815626-28370-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4064a4c6

Merge tag 'kvm-s390-master-5.7-1' of... · 1b0c58a3

由 Paolo Bonzini 提交于 4月 07, 2020

Merge tag 'kvm-s390-master-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fixes for vsie (nested hypervisors)

- Several fixes for corner cases of nesting. Still relevant as it might
  crash host or first level guest or temporarily leak memory.

1b0c58a3

KVM: s390: vsie: Fix possible race when shadowing region 3 tables · 1493e0f9

由 David Hildenbrand 提交于 4月 03, 2020

We have to properly retry again by returning -EINVAL immediately in case
somebody else instantiated the table concurrently. We missed to add the
goto in this function only. The code now matches the other, similar
shadowing functions.

We are overwriting an existing region 2 table entry. All allocated pages
are added to the crst_list to be freed later, so they are not lost
forever. However, when unshadowing the region 2 table, we wouldn't trigger
unshadowing of the original shadowed region 3 table that we replaced. It
would get unshadowed when the original region 3 table is modified. As it's
not connected to the page table hierarchy anymore, it's not going to get
used anymore. However, for a limited time, this page table will stick
around, so it's in some sense a temporary memory leak.

Identified by manual code inspection. I don't think this classifies as
stable material.

Fixes: 998f637c ("s390/mm: avoid races on region/segment/page table shadowing")
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200403153050.20569-4-david@redhat.comReviewed-by: NClaudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

1493e0f9

KVM: s390: vsie: Fix delivery of addressing exceptions · 4d4cee96

由 David Hildenbrand 提交于 4月 03, 2020

Whenever we get an -EFAULT, we failed to read in guest 2 physical
address space. Such addressing exceptions are reported via a program
intercept to the nested hypervisor.

We faked the intercept, we have to return to guest 2. Instead, right
now we would be returning -EFAULT from the intercept handler, eventually
crashing the VM.
the correct thing to do is to return 1 as rc == 1 is the internal
representation of "we have to go back into g2".

Addressing exceptions can only happen if the g2->g3 page tables
reference invalid g2 addresses (say, either a table or the final page is
not accessible - so something that basically never happens in sane
environments.

Identified by manual code inspection.

Fixes: a3508fbe ("KVM: s390: vsie: initial support for nested virtualization")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200403153050.20569-3-david@redhat.comReviewed-by: NClaudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
[borntraeger@de.ibm.com: fix patch description]
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

4d4cee96

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功