- 03 June 2021, 5 commits
-
-
Committed by Sean Christopherson
stable inclusion
from stable-5.10.38
commit 79abde761e05ea1cb5996d458c0d31f0d80813f1
bugzilla: 51875
CVE: NA

--------------------------------

commit 8aec21c0 upstream.

Clear KVM's RDPID capability if the ENABLE_RDTSCP secondary exec control is unsupported. Despite being enumerated in a separate CPUID flag, RDPID is bundled under the same VMCS control as RDTSCP and will #UD in VMX non-root if ENABLE_RDTSCP is not enabled.

Fixes: 41cd02c6 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210504171734.1434054-2-seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
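A minimal sketch of the fix's shape in vmx_set_cpu_caps(), assuming the 5.10-era helpers cpu_has_vmx_rdtscp() and kvm_cpu_cap_clear(); because RDPID #UDs whenever ENABLE_RDTSCP is clear, the two capabilities have to fall together:

    /* RDTSCP and RDPID are gated by the same VMCS execution control. */
    if (!cpu_has_vmx_rdtscp()) {
            kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
            kvm_cpu_cap_clear(X86_FEATURE_RDPID);
    }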
-
Committed by Sean Christopherson
stable inclusion
from stable-5.10.38
commit 2f86dd3d2bcfda3e14e8ee734e970dc05287d5fc
bugzilla: 51875
CVE: NA

--------------------------------

commit 2183de41 upstream.

Add a dedicated intercept enum for RDPID instead of piggybacking RDTSCP. Unlike VMX's ENABLE_RDTSCP, RDPID is not bound to SVM's RDTSCP intercept.

Fixes: fb6d4d34 ("KVM: x86: emulate RDPID")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210504171734.1434054-5-seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
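The gist, sketched against the emulator's intercept enum (the entry name follows the commit; its exact position in the enum is illustrative):

    enum x86_intercept {
            /* ... */
            x86_intercept_rdtscp,
            x86_intercept_rdpid,    /* dedicated entry, no longer piggybacking RDTSCP */
            /* ... */
    };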
-
Committed by Lai Jiangshan
stable inclusion
from stable-5.10.38
commit bfccc4eade2bec1493f891ebcd3c6751eee971c9
bugzilla: 51875
CVE: NA

--------------------------------

commit a217a659 upstream.

In VMX, the host NMI handler needs to be invoked after NMI VM-Exit. Before commit 1a5488ef ("KVM: VMX: Invoke NMI handler via indirect call instead of INTn"), this was done by INTn ("int $2"). But INTn microcode is relatively expensive, so the commit reworked NMI VM-Exit handling to invoke the kernel handler by function call.

But this missed a detail. The NMI entry point for direct invocation is fetched from the IDT table and called on the kernel stack. But on 64-bit the NMI entry installed in the IDT expects to be invoked on the IST stack. It relies on the "NMI executing" variable on the IST stack to work correctly, which is at a fixed position in the IST stack. When the entry point is unexpectedly called on the kernel stack, the RSP-addressed "NMI executing" variable is obviously also on the kernel stack and is "uninitialized" and can cause the NMI entry code to run in the wrong way.

Provide a non-ist entry point for VMX which shares the C-function with the regular NMI entry and invoke the new asm entry point instead.

On 32-bit this just maps to the regular NMI entry point as 32-bit has no ISTs and is not affected.

[ tglx: Made it independent for backporting, massaged changelog ]

Fixes: 1a5488ef ("KVM: VMX: Invoke NMI handler via indirect call instead of INTn")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Lai Jiangshan <laijs@linux.alibaba.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87r1imi8i1.ffs@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
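A sketch of the non-IST entry point, using the upstream name exc_nmi_noist; the asm stub and the VMX call site are omitted:

    /*
     * Shares the C handler with the IDT's NMI entry but makes no
     * assumptions about the IST stack, so KVM may invoke it on the
     * regular kernel stack after an NMI VM-Exit.
     */
    DEFINE_IDTENTRY_RAW(exc_nmi_noist)
    {
            exc_nmi(regs);
    }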
-
Committed by Sean Christopherson
stable inclusion
from stable-5.10.37
commit 8fcdfa71ba6a1baa7bff73353b914df2a15b1bb8
bugzilla: 51868
CVE: NA

--------------------------------

[ Upstream commit dbdd096a ]

Disable pass-through of the FS and GS base MSRs for 32-bit KVM. Intel's SDM unequivocally states that the MSRs exist if and only if the CPU supports x86-64.

FS_BASE and GS_BASE are mostly a non-issue; a clever guest could opportunistically use the MSRs without issue. KERNEL_GS_BASE is a bigger problem, as a clever guest would subtly be broken if it were migrated, as KVM disallows software access to the MSRs, and unlike the direct variants, KERNEL_GS_BASE needs to be explicitly migrated as it's not captured in the VMCS.

Fixes: 25c5f225 ("KVM: VMX: Enable MSR Bitmap feature")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210422023831.3473491-1-seanjc@google.com>
[*NOT* for stable kernels. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
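The fix amounts to fencing the always-passthrough list, roughly (helper name per the 5.10 sources):

    #ifdef CONFIG_X86_64    /* the SDM defines these MSRs only with x86-64 */
            vmx_disable_intercept_for_msr(vcpu, MSR_FS_BASE, MSR_TYPE_RW);
            vmx_disable_intercept_for_msr(vcpu, MSR_GS_BASE, MSR_TYPE_RW);
            vmx_disable_intercept_for_msr(vcpu, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
    #endif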
-
Committed by David Edmondson
stable inclusion
from stable-5.10.37
commit e9bd1af4c038061c67789e298067478c79cedb2b
bugzilla: 51868
CVE: NA

--------------------------------

[ Upstream commit d9e46d34 ]

If the VM entry/exit controls for loading/saving MSR_EFER are either not available (an older processor or explicitly disabled) or not used (host and guest values are the same), reading GUEST_IA32_EFER from the VMCS returns an inaccurate value.

Because of this, in dump_vmcs() don't use GUEST_IA32_EFER to decide whether to print the PDPTRs - always do so if the fields exist.

Fixes: 4eb64dce ("KVM: x86: dump VMCS on invalid entry")
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20210318120841.133123-2-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
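A sketch of the corrected condition in dump_vmcs(): key the PDPTR dump off the fields' existence (EPT enabled) instead of GUEST_IA32_EFER:

    if (cpu_has_secondary_exec_ctrls() &&
        (secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT)) {
            pr_err("PDPTR0 = 0x%016llx  PDPTR1 = 0x%016llx\n",
                   vmcs_read64(GUEST_PDPTR0), vmcs_read64(GUEST_PDPTR1));
            pr_err("PDPTR2 = 0x%016llx  PDPTR3 = 0x%016llx\n",
                   vmcs_read64(GUEST_PDPTR2), vmcs_read64(GUEST_PDPTR3));
    }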
-
- 26 April 2021, 2 commits
-
-
Committed by Reiji Watanabe
stable inclusion
from stable-5.10.32
commit 7f64753835a78c7d2cc2932a5808ef3b7fd4c050
bugzilla: 51796

--------------------------------

[ Upstream commit 04c4f2ee ]

__vmx_handle_exit() uses vcpu->run->internal.ndata as an index for an array access. Since vcpu->run is (or can be) mapped into a user address space with write permission, 'ndata' could be updated by the user process at any time (the user process can set it to outside the bounds of the array). So it is not safe for __vmx_handle_exit() to use 'ndata' that way.

Fixes: 1aa561b1 ("kvm: x86: Add "last CPU" to some KVM_EXIT information")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20210413154739.490299-1-reijiw@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
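The hardening pattern, sketched (the kvm_run fields are real; the surrounding handler is abbreviated): build the index in a local variable and publish ndata last, so a concurrent userspace write cannot steer the array stores:

    u32 ndata = 0;

    vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
    vcpu->run->internal.data[ndata++] = exit_reason.full;
    vcpu->run->internal.data[ndata++] = vcpu->arch.last_vmentry_cpu;
    vcpu->run->internal.ndata = ndata;      /* written once, at the end */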
-
Committed by Sean Christopherson
stable inclusion
from stable-5.10.32
commit c670ff84fac9c92c4260b359f24fff1312b98218
bugzilla: 51796

--------------------------------

[ Upstream commit 8e533240 ]

Convert vcpu_vmx.exit_reason from a u32 to a union (of size u32). The full VM_EXIT_REASON field is comprised of a 16-bit basic exit reason in bits 15:0, and single-bit modifiers in bits 31:16.

Historically, KVM has only had to worry about handling the "failed VM-Entry" modifier, which could only be set in very specific flows and required dedicated handling. I.e. manually stripping the FAILED_VMENTRY bit was a somewhat viable approach. But even with only a single bit to worry about, KVM has had several bugs related to comparing a basic exit reason against the full exit reason stored in vcpu_vmx.

Upcoming Intel features, e.g. SGX, will add new modifier bits that can be set on more or less any VM-Exit, as opposed to the significantly more restricted FAILED_VMENTRY, i.e. correctly handling everything in one-off flows isn't scalable. Tracking exit reason in a union forces code to explicitly choose between consuming the full exit reason and the basic exit, and is a convenient way to document and access the modifiers.

No functional change intended.

Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20201106090315.18606-2-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
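The union's shape, simplified (the real definition names each modifier bit individually, per the SDM's VM_EXIT_REASON layout):

    union vmx_exit_reason {
            struct {
                    u32 basic               : 16;
                    u32 reserved            : 11;
                    u32 enclave_mode        : 1;
                    u32 smi_pending_mtf     : 1;
                    u32 smi_from_vmx_root   : 1;
                    u32 reserved30          : 1;
                    u32 failed_vmentry      : 1;
            };
            u32 full;
    };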
-
- 19 April 2021, 2 commits
-
-
Committed by chenjiajun
virt inclusion
category: feature
bugzilla: 46853
CVE: NA

Export EXIT_REASON_PREEMPTION_TIMER kvm exits to the vcpu_stat debugfs. Add a new column to vcpu_stat, providing preemption_timer exit statistics to virtualization detection tools.

Signed-off-by: chenjiajun <chenjiajun8@huawei.com>
Reviewed-by: Xiangyou Xie <xiexiangyou@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
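A hypothetical sketch of the wiring, since the out-of-tree diff is not shown here: a new counter in kvm_vcpu_stat, bumped from the preemption-timer exit handler and picked up by the vcpu_stat debugfs code:

    static int handle_preemption_timer(struct kvm_vcpu *vcpu)
    {
            ++vcpu->stat.preemption_timer_exits;    /* hypothetical new field */
            handle_fastpath_preemption_timer(vcpu);
            return 1;
    }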
-
Committed by chenjiajun
virt inclusion
category: feature
bugzilla: 46853
CVE: NA

At present there is a flaw in the debugfs statistics of KVM exits: only invocations of the exit-handling functions in kvm_vmx_exit_handlers are counted. The kvm exits handled in vmx_exit_handlers_fastpath are omitted, so the EXIT_REASON_MSR_WRITE statistics can sometimes show a large numerical error.

Signed-off-by: chenjiajun <chenjiajun8@huawei.com>
Reviewed-by: Xiangyou Xie <xiexiangyou@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
- 09 March 2021, 1 commit
-
-
Committed by Paolo Bonzini
stable inclusion
from stable-5.10.15
commit 6c0e069ac6e8db0c49dcc90d37ede5b1da08fe0b
bugzilla: 48167

--------------------------------

commit 7131636e upstream.

Userspace that does not know about KVM_GET_MSR_FEATURE_INDEX_LIST will generally use the default value for MSR_IA32_ARCH_CAPABILITIES. When this happens and the host has tsx=on, it is possible to end up with virtual machines that have HLE and RTM disabled, but TSX_CTRL available.

If the fleet is then switched to tsx=off, kvm_get_arch_capabilities() will clear the ARCH_CAP_TSX_CTRL_MSR bit and it will not be possible to use the tsx=off hosts as migration destinations, even though the guests do not have TSX enabled.

To allow this migration, allow guests to write to their TSX_CTRL MSR, while keeping the host MSR unchanged for the entire life of the guests. This ensures that TSX remains disabled and also saves MSR reads and writes, and it's okay to do because with tsx=off we know that guests will not have the HLE and RTM features in their CPUID. (If userspace sets bogus CPUID data, we do not expect HLE and RTM to work in guests anyway).

Cc: stable@vger.kernel.org
Fixes: cbbaa272 ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
- 12 January 2021, 2 commits
-
-
Committed by chenjiajun
virt inclusion
category: feature
bugzilla: 46853
CVE: NA

Export vcpu_stat via debugfs for x86, covering the x86 kvm exit items. The path of the vcpu_stat is /sys/kernel/debug/kvm/vcpu_stat, and each line of vcpu_stat is a collection of various kvm exits for one vcpu. Through vcpu_stat, we only need to open one file to track the performance of a virtual machine, which is more convenient.

Signed-off-by: Feng Lin <linfeng23@huawei.com>
Signed-off-by: chenjiajun <chenjiajun8@huawei.com>
Reviewed-by: Xiangyou Xie <xiexiangyou@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
-
Committed by Paolo Bonzini
stable inclusion
from stable-5.10.4
commit 49830b2d1b91e7d840808a6a9809496e70edeeab
bugzilla: 46903

--------------------------------

commit 39485ed9 upstream.

Until commit e7c587da ("x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP"), KVM was testing both Intel and AMD CPUID bits before allowing the guest to write MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD. Testing only Intel bits on VMX processors, or only AMD bits on SVM processors, fails if the guests are created with the "opposite" vendor as the host.

While at it, also tweak the host CPU check to use the vendor-agnostic feature bit X86_FEATURE_IBPB, since we only care about the availability of the MSR on the host here and not about specific CPUID bits.

Fixes: e7c587da ("x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP")
Cc: stable@vger.kernel.org
Reported-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
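The vendor-agnostic guest-side check introduced by the upstream patch looks roughly like this helper in cpuid.h:

    static inline bool guest_has_spec_ctrl_msr(struct kvm_vcpu *vcpu)
    {
            return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
                    guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) ||
                    guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
                    guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD));
    }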
-
- 31 October 2020, 2 commits
-
-
Committed by Paolo Bonzini
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Vitaly Kuznetsov
It was noticed that evmcs_sanitize_exec_ctrls() is not being executed nowadays despite the code checking the 'enable_evmcs' static key looking correct. Turns out, static key magic doesn't work in the '__init' section (and it is unclear when things changed), but setup_vmcs_config() is called only once per CPU so we don't really need it to. Switch to checking 'enlightened_vmcs' instead; it is supposed to be in sync with 'enable_evmcs'.

Opportunistically make evmcs_sanitize_exec_ctrls() '__init' and drop an unneeded extra newline from it.

Reported-by: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20201014143346.2430936-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
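The change reduces to swapping the predicate in setup_vmcs_config(), roughly:

    /*
     * setup_vmcs_config() is __init and runs once per CPU; the static
     * key is not patched that early, so test the module parameter's
     * backing variable directly.
     */
    if (enlightened_vmcs)
            evmcs_sanitize_exec_ctrls(vmcs_conf);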
-
- 24 October 2020, 1 commit
-
-
Committed by Paolo Bonzini
allyesconfig results in:

    ld: drivers/block/paride/paride.o: in function `pi_init':
    (.text+0x1340): multiple definition of `pi_init'; arch/x86/kvm/vmx/posted_intr.o:posted_intr.c:(.init.text+0x0): first defined here
    make: *** [Makefile:1164: vmlinux] Error 1

because commit:

    commit 8888cdd0
    Author: Xiaoyao Li <xiaoyao.li@intel.com>
    Date:   Wed Sep 23 11:31:11 2020 -0700

        KVM: VMX: Extract posted interrupt support to separate files

added another pi_init(), though one already existed in the paride code.

Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 22 October 2020, 4 commits
-
-
Committed by Sean Christopherson
Intercept CR4 bits that are guest reserved so that KVM correctly injects a #GP fault if the guest attempts to set a reserved bit. If a feature is supported by the CPU but is not exposed to the guest, and its associated CR4 bit is not intercepted by KVM by default, then KVM will fail to inject a #GP if the guest sets the CR4 bit without triggering an exit, e.g. by toggling only the bit in question.

Note, KVM doesn't give the guest direct access to any CR4 bits that are also dependent on guest CPUID. Yet.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200930041659.28181-5-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
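Roughly the shape of the resulting mask computation (mirroring the upstream helper): every guest-reserved CR4 bit stays host-owned, so a guest write to it causes a VM-Exit and KVM can inject the #GP itself:

    static void set_cr4_guest_host_mask(struct vcpu_vmx *vmx)
    {
            struct kvm_vcpu *vcpu = &vmx->vcpu;

            vcpu->arch.cr4_guest_owned_bits = KVM_POSSIBLE_CR4_GUEST_BITS &
                                              ~vcpu->arch.cr4_guest_rsvd_bits;
            if (!enable_ept)
                    vcpu->arch.cr4_guest_owned_bits &= ~X86_CR4_PGE;
            if (is_guest_mode(vcpu))
                    vcpu->arch.cr4_guest_owned_bits &=
                            ~get_vmcs12(vcpu)->cr4_guest_host_mask;

            vmcs_writel(CR4_GUEST_HOST_MASK, ~vcpu->arch.cr4_guest_owned_bits);
    }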
-
Committed by Sean Christopherson
Now that vcpu_after_set_cpuid() and update_exception_bitmap() are called back-to-back, subsume the exception bitmap update into the common CPUID update. Drop the SVM invocation entirely as SVM's exception bitmap doesn't vary with respect to guest CPUID.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200930041659.28181-4-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Maxim Levitsky
This will be used to signal an error to userspace in case the vendor code fails during handling of this MSR (e.g. -ENOMEM).

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20201001112954.6258-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Rework the resetting of the MSR bitmap for x2APIC MSRs to ignore userspace filtering. Allowing userspace to intercept reads to x2APIC MSRs when APICV is fully enabled for the guest simply can't work; the LAPIC and thus virtual APIC is in-kernel and cannot be directly accessed by userspace. To keep things simple we will in fact forbid intercepting x2APIC MSRs altogether, independent of the default_allow setting.

Cc: Alexander Graf <graf@amazon.com>
Cc: Aaron Lewis <aaronlewis@google.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20201005195532.8674-3-sean.j.christopherson@intel.com>
[Modified to operate even if APICv is disabled, adjust documentation. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
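A sketch of the reset, with helper names assumed from the MSR-filter rework in this series (vmx_set/clear_msr_bitmap_*): the whole x2APIC range is forced into a known state, bypassing any userspace filter; reads are opened only when APICv is active, and writes start out intercepted:

    unsigned long *msr_bitmap = to_vmx(vcpu)->vmcs01.msr_bitmap;
    bool apicv = mode & MSR_BITMAP_MODE_X2APIC_APICV;
    u32 msr;

    for (msr = 0x800; msr <= 0x8ff; msr++) {
            if (apicv)
                    vmx_clear_msr_bitmap_read(msr_bitmap, msr);
            else
                    vmx_set_msr_bitmap_read(msr_bitmap, msr);
            /* the regular update path re-opens TPR/EOI/SELF_IPI writes */
            vmx_set_msr_bitmap_write(msr_bitmap, msr);
    }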
-
- 20 October 2020, 1 commit
-
-
Committed by Peter Xu
Fix an inverted flag for intercepting x2APIC MSRs and intercept writes by default, even when APICV is enabled.

Fixes: 3eb90017 ("KVM: x86: VMX: Prevent MSR passthrough when MSR access is denied")
Co-developed-by: Peter Xu <peterx@redhat.com>
[sean: added changelog]
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20201005195532.8674-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 03 October 2020, 1 commit
-
-
Committed by Paolo Bonzini
The PFEC_MASK and PFEC_MATCH fields in the VMCS reverse the meaning of the #PF intercept bit in the exception bitmap when they do not match. This means that, if PFEC_MASK and/or PFEC_MATCH are set, the hypervisor can get a vmexit for #PF exceptions even when the corresponding bit is clear in the exception bitmap. This is unexpected and is promptly detected by a WARN_ON_ONCE.

To fix it, reset PFEC_MASK and PFEC_MATCH when the #PF intercept is disabled (as is common with enable_ept && !allow_smaller_maxphyaddr).

Reported-by: Qian Cai <cai@redhat.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
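The fix, roughly as it lands in the exception-bitmap update path (the VMCS field names are real; surrounding logic abbreviated):

    int mask = 0, match = 0;

    if (enable_ept && (eb & (1u << PF_VECTOR))) {
            /*
             * #PF is intercepted only for the MAXPHYADDR workaround,
             * which cares solely about present and reserved faults.
             */
            mask = PFERR_PRESENT_MASK | PFERR_RSVD_MASK;
            match = PFERR_PRESENT_MASK;
    }
    vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK, mask);
    vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, match);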
-
- 29 September 2020, 1 commit
-
-
Committed by kernel test robot
Fixes: 14a61b64 ("KVM: VMX: Rename "vmx_msr_index" to "vmx_uret_msrs_list"")
Signed-off-by: kernel test robot <lkp@intel.com>
Message-Id: <20200928153714.GA6285@a3a878002045>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 28 September 2020, 18 commits
-
-
Committed by Alexander Graf
We will introduce the concept of MSRs that may not be handled in kernel space soon. Some MSRs are directly passed through to the guest, effectively making them handled by KVM from user space's point of view.

This patch introduces all logic required to ensure that MSRs that user space wants trapped are not marked as direct access for guests.

Signed-off-by: Alexander Graf <graf@amazon.com>
Message-Id: <20200925143422.21718-7-graf@amazon.com>
[Replace "_idx" with "_slot". - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
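The core of the interaction, condensed from the passthrough-disable path (kvm_msr_allowed() and the KVM_MSR_FILTER_* constants are the series' real names): an MSR that userspace wants trapped is forced back to intercepted even where KVM would normally pass it through:

    if ((type & MSR_TYPE_R) &&
        !kvm_msr_allowed(vcpu, msr, KVM_MSR_FILTER_READ)) {
            vmx_set_msr_bitmap_read(msr_bitmap, msr);   /* keep the exit */
            type &= ~MSR_TYPE_R;
    }
    if ((type & MSR_TYPE_W) &&
        !kvm_msr_allowed(vcpu, msr, KVM_MSR_FILTER_WRITE)) {
            vmx_set_msr_bitmap_write(msr_bitmap, msr);  /* keep the exit */
            type &= ~MSR_TYPE_W;
    }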
-
Committed by Aaron Lewis
Prepare vmx and svm for a subsequent change that ensures the MSR permission bitmap is set to allow an MSR that userspace is tracking to force a vmx_vmexit in the guest.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
[agraf: rebase, adapt SVM scheme to nested changes that came in between]
Signed-off-by: Alexander Graf <graf@amazon.com>
Message-Id: <20200925143422.21718-5-graf@amazon.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Rename "index" to "slot" in struct vmx_uret_msr to align with the terminology used by common x86's kvm_user_return_msrs, and to avoid conflating "MSR's ECX index" with "MSR's index into an array". No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-16-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Rename "vmx_msr_index" to "vmx_uret_msrs_list" to associate it with the uret MSRs array, and to avoid conflating "MSR's ECX index" with "MSR's index into an array". Similarly, don't use "slot" in the name as that terminology is claimed by the common x86 "user_return_msrs" mechanism. No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-15-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Add "uret" to vmx_set_guest_msr() to explicitly associate it with the guest_uret_msrs array, and to differentiate it from vmx_set_msr() as well as VMX's load/store MSRs. No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-14-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Rename "find_msr_entry" to scope it to VMX and to associate it with guest_uret_msrs. Drop the "entry" so that the function name pairs with the existing __vmx_find_uret_msr(), which intentionally uses a double underscore prefix instead of appending "index" or "slot" as those names are already claimed by other pieces of the user return MSR stack. No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-13-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Add vmx_setup_uret_msr() to wrap the lookup and manipulation of the uret MSRs array during setup_msrs(). In addition to consolidating code, this eliminates move_msr_up(), which, while being a very literal description of the function, isn't exactly helpful in understanding the net effect of the code.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200923180409.32255-12-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Move checking for the existence of MSR_EFER in the uret MSR array into update_transition_efer() so that the lookup and manipulation of the array in setup_msrs() occur back-to-back. This paves the way toward adding a helper to wrap the lookup and manipulation.

To avoid unnecessary overhead, defer the lookup until the uret array would actually be modified in update_transition_efer(). EFER obviously exists on CPUs that support the dedicated VMCS fields for switching EFER, and EFER must exist for the guest and host EFER.NX value to diverge, i.e. there is no danger of attempting to read/write EFER when it doesn't exist.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200923180409.32255-11-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Check for RDTSCP support prior to checking if MSR_TSC_AUX is in the uret MSRs array so that the array lookup and manipulation are back-to-back. This paves the way toward adding a helper to wrap the lookup and manipulation.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200923180409.32255-10-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Rename "__find_msr_index" to scope it to VMX, associate it with guest_uret_msrs, and to avoid conflating "MSR's ECX index" with "MSR's array index". Similarly, don't use "slot" in the name so as to avoid colliding the common x86's half of "user_return_msrs" (the slot in kvm_user_return_msrs is not the same slot in guest_uret_msrs). No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-9-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Add "uret" to "guest_msrs_ready" to explicitly associate it with the "guest_uret_msrs" array, and replace "ready" with "loaded" to more precisely reflect what it tracks, e.g. "ready" could be interpreted as meaning ready for processing (setup_msrs() has run), which is wrong. "loaded" also aligns with the similar "guest_state_loaded" field. No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-8-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Add "uret" into the name of "save_nmsrs" to explicitly associate it with the guest_uret_msrs array, and replace "save" with "active" (for lack of a better word) to better describe what is being tracked. While "save" is more or less accurate when viewed as a literal description of the field, e.g. it holds the number of MSRs that were saved into the array the last time setup_msrs() was invoked, it can easily be misinterpreted by the reader, e.g. as meaning the number of MSRs that were saved from hardware at some point in the past, or as the number of MSRs that need to be saved at some point in the future, both of which are wrong. No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-7-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Rename vcpu_vmx.nmsrs to vcpu_vmx.nr_uret_msrs to explicitly associate it with the guest_uret_msrs array.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200923180409.32255-6-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Rename struct "shared_msr_entry" to "vmx_uret_msr" to align with x86's rename of "shared_msrs" to "user_return_msrs", and to call out that the struct is specific to VMX, i.e. not part of the generic "shared_msrs" framework. Abbreviate "user_return" as "uret" to keep line lengths marginally sane and code more or less readable. No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-5-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Add "loadstore" to vmx_find_msr_index() to differentiate it from the so called shared MSRs helpers (which will soon be renamed), and replace "index" with "slot" to better convey that the helper returns slot in the array, not the MSR index (the value that gets stuffed into ECX). No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-4-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Add "MAX" to the LOADSTORE and so called SHARED MSR defines to make it more clear that the define controls the array size, as opposed to the actual number of valid entries that are in the array. No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-3-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Rename the "shared_msrs" mechanism, which is used to defer restoring MSRs that are only consumed when running in userspace, to a more banal but less likely to be confusing "user_return_msrs". The "shared" nomenclature is confusing as it's not obvious who is sharing what, e.g. reasonable interpretations are that the guest value is shared by vCPUs in a VM, or that the MSR value is shared/common to guest and host, both of which are wrong. "shared" is also misleading as the MSR value (in hardware) is not guaranteed to be shared/reused between VMs (if that's indeed the correct interpretation of the name), as the ability to share values between VMs is simply a side effect (albiet a very nice side effect) of deferring restoration of the host value until returning from userspace. "user_return" avoids the above confusion by describing the mechanism itself instead of its effects. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-2-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Extend the kvm_exit tracepoint to align it with kvm_nested_vmexit in terms of what information is captured. On SVM, add interrupt info and error code, while on VMX it adds IDT vectoring and error code. This sets the stage for macrofying the kvm_exit tracepoint definition so that it can be reused for kvm_nested_vmexit without loss of information.

Opportunistically stuff a zero for VM_EXIT_INTR_INFO if the VM-Enter failed, as the field is guaranteed to be invalid. Note, it'd be possible to further filter the interrupt/exception fields based on the VM-Exit reason, but the helper is intended only for tracepoints, i.e. an extra VMREAD or two is a non-issue; the failed VM-Enter case is just low hanging fruit.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200923201349.16097-5-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
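A sketch of the resulting VMX capture helper (names per this series; is_exception_with_error_code() was added alongside it):

    static void vmx_get_exit_info(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2,
                                  u32 *intr_info, u32 *error_code)
    {
            struct vcpu_vmx *vmx = to_vmx(vcpu);

            *info1 = vmx_get_exit_qual(vcpu);
            if (!(vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)) {
                    *info2 = vmx->idt_vectoring_info;
                    *intr_info = vmx_get_intr_info(vcpu);
                    if (is_exception_with_error_code(*intr_info))
                            *error_code = vmcs_read32(VM_EXIT_INTR_ERROR_CODE);
                    else
                            *error_code = 0;
            } else {
                    /* VM-Enter failed: the interrupt info field is invalid. */
                    *info2 = 0;
                    *intr_info = 0;
                    *error_code = 0;
            }
    }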
-