1. 04 Feb 2021 (3 commits)
  2. 03 Feb 2021 (1 commit)
    • KVM: x86: cleanup CR3 reserved bits checks · c1c35cf7
      Paolo Bonzini committed
      If not in long mode, the low bits of CR3 are reserved but not enforced to
      be zero, so remove those checks.  If in long mode, however, the MBZ bits
      extend down to the highest physical address bit of the guest, excluding
      the encryption bit.
      
      Make the checks consistent with the above, and match them between
      nested_vmcb_checks and KVM_SET_SREGS.
      
      Cc: stable@vger.kernel.org
      Fixes: 761e4169 ("KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests")
      Fixes: a780a3ea ("KVM: X86: Fix reserved bits check for MOV to CR3")
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
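      As a sketch, the long-mode check described above reduces to something
      like the following (assuming a kvm_vcpu_is_illegal_gpa()-style helper
      that tests bits 63:maxphyaddr with the encryption bit already carved
      out of the mask):
      
        /* Long mode: MBZ bits run from bit 63 down to the guest's
         * MAXPHYADDR, excluding the encryption bit. */
        if (is_long_mode(vcpu) && kvm_vcpu_is_illegal_gpa(vcpu, cr3))
                return 1;
        /* Outside long mode, the low CR3 bits are reserved but not
         * enforced to be zero, so no check is performed. */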
  3. 02 Feb 2021 (2 commits)
    • KVM/x86: assign hva with the right value to vm_munmap the pages · b66f9bab
      Zheng Zhan Liang committed
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Zheng Zhan Liang <zhengzhanliang@huorong.cn>
      Message-Id: <20210201055310.267029-1-zhengzhanliang@huorong.cn>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if tsx=off · 7131636e
      Paolo Bonzini committed
      Userspace that does not know about KVM_GET_MSR_FEATURE_INDEX_LIST
      will generally use the default value for MSR_IA32_ARCH_CAPABILITIES.
      When this happens and the host has tsx=on, it is possible to end up with
      virtual machines that have HLE and RTM disabled, but TSX_CTRL available.
      
      If the fleet is then switched to tsx=off, kvm_get_arch_capabilities()
      will clear the ARCH_CAP_TSX_CTRL_MSR bit and it will not be possible to
      use the tsx=off hosts as migration destinations, even though the guests
      do not have TSX enabled.
      
      To allow this migration, allow guests to write to their TSX_CTRL MSR,
      while keeping the host MSR unchanged for the entire life of the guests.
      This ensures that TSX remains disabled and also saves MSR reads and
      writes, and it's okay to do because with tsx=off we know that guests will
      not have the HLE and RTM features in their CPUID.  (If userspace sets
      bogus CPUID data, we do not expect HLE and RTM to work in guests anyway).
      
      Cc: stable@vger.kernel.org
      Fixes: cbbaa272 ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
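      A minimal sketch of the write side, assuming a purely virtual per-vCPU
      field (guest_tsx_ctrl is illustrative; the real patch routes the value
      through the user-return MSR machinery so the host MSR stays untouched):
      
        case MSR_IA32_TSX_CTRL:
                /* Only the architectural bits are accepted. */
                if (data & ~(TSX_CTRL_RTM_DISABLE | TSX_CTRL_CPUID_CLEAR))
                        return 1;
                /* Keep the value virtual: with tsx=off the guest lacks
                 * HLE/RTM in CPUID, so TSX stays effectively disabled. */
                vmx->guest_tsx_ctrl = data;
                break;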
  4. 26 Jan 2021 (3 commits)
    • KVM: x86: allow KVM_REQ_GET_NESTED_STATE_PAGES outside guest mode for VMX · 9a78e158
      Paolo Bonzini committed
      VMX also uses KVM_REQ_GET_NESTED_STATE_PAGES for the Hyper-V eVMCS,
      which may need to be loaded outside guest mode.  Therefore we cannot
      WARN in that case.
      
      However, that part of nested_get_vmcs12_pages is _not_ needed at
      vmentry time.  Split it out of KVM_REQ_GET_NESTED_STATE_PAGES handling,
      so that both vmentry and migration (and in the latter case, independent
      of is_guest_mode) do the parts that are needed.
      
      Cc: <stable@vger.kernel.org> # 5.10.x: f2c7ef3b: KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES
      Cc: <stable@vger.kernel.org> # 5.10.x
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
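      The resulting handler is, in shape, close to the following sketch
      (helper names follow the description above):
      
        static bool vmx_get_nested_state_pages(struct kvm_vcpu *vcpu)
        {
                /* The eVMCS may need (re)mapping even outside guest mode. */
                if (!nested_get_evmcs_page(vcpu))
                        return false;
      
                /* The rest is vmentry-only work. */
                if (is_guest_mode(vcpu) && !nested_get_vmcs12_pages(vcpu))
                        return false;
      
                return true;
        }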
    • kvm: tracing: Fix unmatched kvm_entry and kvm_exit events · d95df951
      Lorenzo Brescia committed
      On VMX, if we exit and then re-enter immediately without leaving
      the vmx_vcpu_run() function, the kvm_entry event is not logged.
      That means we will see one (or more) kvm_exit, without its (their)
      corresponding kvm_entry, as shown here:
      
       CPU-1979 [002] 89.871187: kvm_entry: vcpu 1
       CPU-1979 [002] 89.871218: kvm_exit:  reason MSR_WRITE
       CPU-1979 [002] 89.871259: kvm_exit:  reason MSR_WRITE
      
      It also seems possible for a kvm_entry event to be logged, but then
      we leave vmx_vcpu_run() right away (if vmx->emulation_required is
      true). In this case, we will have a spurious kvm_entry event in the
      trace.
      
      Fix these situations by moving trace_kvm_entry() inside vmx_vcpu_run()
      (where trace_kvm_exit() already is).
      
      A trace obtained with this patch applied looks like this:
      
       CPU-14295 [000] 8388.395387: kvm_entry: vcpu 0
       CPU-14295 [000] 8388.395392: kvm_exit:  reason MSR_WRITE
       CPU-14295 [000] 8388.395393: kvm_entry: vcpu 0
       CPU-14295 [000] 8388.395503: kvm_exit:  reason EXTERNAL_INTERRUPT
      
      Of course, not calling trace_kvm_entry() in common x86 code any
      longer means that we need to adjust the SVM side of things too.
      Signed-off-by: Lorenzo Brescia <lorenzo.brescia@edu.unito.it>
      Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
      Message-Id: <160873470698.11652.13483635328769030605.stgit@Wayrath>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
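      Schematically, the entry tracepoint now fires on every (re)entry; a
      sketch, with the label standing in for the fastpath re-entry loop in
      vmx.c:
      
        static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
        {
                /* ... */
        reenter_guest:
                /* Fires on every entry, including immediate re-entries,
                 * pairing 1:1 with trace_kvm_exit(). */
                trace_kvm_entry(vcpu);
                /* ... enter the guest, handle the exit ... */
        }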
    • KVM: x86: get smi pending status correctly · 1f7becf1
      Jay Zhou committed
      The SMI injection process has two steps:
      
          Qemu                        KVM
      Step1:
          cpu->interrupt_request &= \
              ~CPU_INTERRUPT_SMI;
          kvm_vcpu_ioctl(cpu, KVM_SMI)
      
                                      call kvm_vcpu_ioctl_smi() and
                                      kvm_make_request(KVM_REQ_SMI, vcpu);
      
      Step2:
          kvm_vcpu_ioctl(cpu, KVM_RUN, 0)
      
                                      call process_smi() if
                                      kvm_check_request(KVM_REQ_SMI, vcpu) is
                                      true, mark vcpu->arch.smi_pending = true;
      
      vcpu->arch.smi_pending will be set true in step 2. Unfortunately, if the
      vcpu is paused between step 1 and step 2, kvm_run->immediate_exit will be
      set and the vcpu has to exit to Qemu immediately during step 2, before
      vcpu->arch.smi_pending is marked true.
      During VM migration, Qemu gets the smi pending status from KVM using the
      KVM_GET_VCPU_EVENTS ioctl at the downtime, and the smi pending status
      is then lost.
      Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
      Signed-off-by: Shengen Zhuang <zhuangshengen@huawei.com>
      Message-Id: <20210118084720.1585-1-jianjay.zhou@huawei.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
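      The fix, sketched: fold a not-yet-processed KVM_REQ_SMI into the
      reported state, so a migration between step 1 and step 2 cannot drop
      the SMI:
      
        static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
                                                struct kvm_vcpu_events *events)
        {
                process_nmi(vcpu);
                /* Process a latched KVM_REQ_SMI before reporting. */
                if (kvm_check_request(KVM_REQ_SMI, vcpu))
                        process_smi(vcpu);
                /* ... */
                events->smi.pending = vcpu->arch.smi_pending;
                /* ... */
        }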
  5. 08 Jan 2021 (3 commits)
    • KVM: x86: __kvm_vcpu_halt can be static · 872f36eb
      Paolo Bonzini committed
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: Add support for booting APs in an SEV-ES guest · 647daca2
      Tom Lendacky committed
      Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence,
      where the guest vCPU register state is updated and VMRUN is then issued
      to begin execution of the AP. For an SEV-ES guest, this won't work because
      the guest register state is encrypted.
      
      Following the GHCB specification, the hypervisor must not alter the guest
      register state, so KVM must track an AP/vCPU boot. Should the guest want
      to park the AP, it must use the AP Reset Hold exit event in place of, for
      example, a HLT loop.
      
      First AP boot (first INIT-SIPI-SIPI sequence):
        Execute the AP (vCPU) as it was initialized and measured by the SEV-ES
        support. It is up to the guest to transfer control of the AP to the
        proper location.
      
      Subsequent AP boot:
        KVM will expect to receive an AP Reset Hold exit event indicating that
        the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to
        awaken it. When the AP Reset Hold exit event is received, KVM will place
        the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI
        sequence, KVM will make the vCPU runnable. It is again up to the guest
        to then transfer control of the AP to the proper location.
      
        To differentiate between an actual HLT and an AP Reset Hold, a new MP
        state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is
        placed in upon receiving the AP Reset Hold exit event. Additionally, to
        communicate the AP Reset Hold exit event up to userspace (if needed), a
        new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD.
      
      A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order
      to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the
      original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds
      a new function that, for non SEV-ES guests, invokes the original SIPI
      delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests,
      implements the logic above.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
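      A condensed sketch of the SVM hook (the SEV-ES branch mirrors the logic
      above; names are close to, but not verbatim, the patch):
      
        static void svm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
        {
                if (!sev_es_guest(vcpu->kvm)) {
                        kvm_vcpu_deliver_sipi_vector(vcpu, vector);
                        return;
                }
                /* Encrypted register state: do not touch the VMSA, just
                 * release the vCPU from its AP Reset Hold "HLT". */
                if (vcpu->arch.mp_state == KVM_MP_STATE_AP_RESET_HOLD)
                        vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
        }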
    • KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit · f2c7ef3b
      Maxim Levitsky committed
      It is possible to exit nested guest mode (entered by svm_set_nested_state)
      prior to the first VM entry into it (e.g. due to a pending event), if the
      nested run was not pending during the migration.
      
      In this case we must not switch to the nested MSR permission bitmap.
      Also add a warning to catch similar cases in the future.
      
      Fixes: a7d5c7ce ("KVM: nSVM: delay MSR permission processing to first nested VM run")
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210107093854.882483-2-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. 20 Dec 2020 (1 commit)
  7. 15 Dec 2020 (15 commits)
    • KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests · 16809ecd
      Tom Lendacky committed
      The run sequence is different for an SEV-ES guest compared to a legacy or
      even an SEV guest. The guest vCPU register state of an SEV-ES guest will
      be restored on VMRUN and saved on VMEXIT. There is no need to restore the
      guest registers directly and through VMLOAD before VMRUN and no need to
      save the guest registers directly and through VMSAVE on VMEXIT.
      
      Update the svm_vcpu_run() function to skip register state saving and
      restoring, and provide an alternative function for running an SEV-ES
      guest in vmenter.S.
      
      Additionally, certain host state is restored across an SEV-ES VMRUN. As
      a result certain register states are not required to be restored upon
      VMEXIT (e.g. FS, GS, etc.), so only do that if the guest is not an SEV-ES
      guest.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <fb1c66d32f2194e171b95fc1a8affd6d326e10c1.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
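      The run path then splits roughly as follows (function names as in the
      series; surrounding bookkeeping elided):
      
        if (sev_es_guest(vcpu->kvm)) {
                /* Hardware restores/saves guest state across VMRUN/VMEXIT. */
                __svm_sev_es_vcpu_run(svm->vmcb_pa);
        } else {
                __svm_vcpu_run(svm->vmcb_pa, (unsigned long *)&vcpu->arch.regs);
                /* Legacy path: restore host FS/GS etc. after VMEXIT. */
        }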
    • KVM: SVM: Provide support for SEV-ES vCPU loading · 86137773
      Tom Lendacky committed
      An SEV-ES vCPU requires additional VMCB vCPU load/put requirements. SEV-ES
      hardware will restore certain registers on VMEXIT, but not save them on
      VMRUN (see Table B-3 and Table B-4 of the AMD64 APM Volume 2), so make the
      following changes:
      
      General vCPU load changes:
        - During vCPU loading, perform a VMSAVE to the per-CPU SVM save area and
          save the current values of XCR0, XSS and PKRU to the per-CPU SVM save
          area as these registers will be restored on VMEXIT.
      
      General vCPU put changes:
        - Do not attempt to restore registers that SEV-ES hardware has already
          restored on VMEXIT.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <019390e9cb5e93cd73014fa5a040c17d42588733.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
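      Sketched, the SEV-ES load path stashes the host values that hardware
      will later restore on VMEXIT (this_cpu_host_save_area() and
      host_save_area_pa() are hypothetical helpers standing in for the
      per-CPU SVM save area lookup):
      
        static void sev_es_vcpu_load(struct vcpu_svm *svm, int cpu)
        {
                struct vmcb_save_area *hostsa = this_cpu_host_save_area(cpu);
      
                /* VMEXIT restores these from the save area, so the current
                 * host values must be captured at vCPU load time. */
                vmsave(host_save_area_pa(cpu));
                hostsa->xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
                hostsa->pkru = read_pkru();
                hostsa->xss  = host_xss;
        }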
    • KVM: SVM: Guest FPU state save/restore not needed for SEV-ES guest · ed02b213
      Tom Lendacky committed
      The guest FPU state is automatically restored on VMRUN and saved on VMEXIT
      by the hardware, so there is no reason to do this in KVM. Eliminate the
      allocation of the guest_fpu save area and key off that to skip operations
      related to the guest FPU state.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <173e429b4d0d962c6a443c4553ffdaf31b7665a4.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: Do not report support for SMM for an SEV-ES guest · 5719455f
      Tom Lendacky committed
      SEV-ES guests do not currently support SMM. Update the has_emulated_msr()
      kvm_x86_ops function to take a struct kvm parameter so that the capability
      can be reported at a VM level.
      
      Since this op is also called during KVM initialization and before a struct
      kvm instance is available, comments will be added to each implementation
      of has_emulated_msr() to indicate the kvm parameter can be null.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <75de5138e33b945d2fb17f81ae507bda381808e3.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
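      The SVM side then looks roughly like this sketch (note the NULL check
      for the init-time call):
      
        static bool svm_has_emulated_msr(struct kvm *kvm, u32 index)
        {
                switch (index) {
                case MSR_IA32_SMBASE:
                        /* kvm may be NULL when called during KVM init.
                         * SEV-ES guests do not support SMM. */
                        if (kvm && sev_es_guest(kvm))
                                return false;
                        break;
                default:
                        break;
                }
                return true;
        }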
    • KVM: x86: Update __get_sregs() / __set_sregs() to support SEV-ES · 5265713a
      Tom Lendacky committed
      Since many of the registers used by an SEV-ES guest are encrypted and
      cannot be read or written, adjust __get_sregs() / __set_sregs() to take
      into account whether the VMSA/guest state is encrypted.
      
      For __get_sregs(), return the actual value that is in use by the guest
      for all registers being tracked using the write trap support.
      
      For __set_sregs(), skip setting of all guest registers values.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <23051868db76400a9b07a2020525483a1e62dbcf.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
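      A sketch of the read side, keyed off the encrypted-state flag used by
      this series (guest_state_protected is assumed here; only registers
      mirrored via write traps are reported):
      
        static void __get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
        {
                if (vcpu->arch.guest_state_protected) {
                        /* Report only values tracked via write traps,
                         * e.g. CR0/CR4 and EFER. */
                        sregs->cr0  = kvm_read_cr0(vcpu);
                        sregs->cr4  = kvm_read_cr4(vcpu);
                        sregs->efer = vcpu->arch.efer;
                        return;
                }
                /* ... unencrypted guests: read the full register state ... */
        }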
    • KVM: SVM: Add support for CR4 write traps for an SEV-ES guest · 5b51cb13
      Tom Lendacky committed
      For SEV-ES guests, the interception of control register write access
      is not recommended. Control register interception occurs prior to the
      control register being modified and the hypervisor is unable to modify
      the control register itself because the register is located in the
      encrypted register state.
      
      SEV-ES guests introduce new control register write traps. These traps
      provide intercept support of a control register write after the control
      register has been modified. The new control register value is provided in
      the VMCB EXITINFO1 field, allowing the hypervisor to track the setting
      of the guest control registers.
      
      Add support to track the value of the guest CR4 register using the control
      register write trap so that the hypervisor understands the guest operating
      mode.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <c3880bf2db8693aa26f648528fbc6e967ab46e25.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: Add support for CR0 write traps for an SEV-ES guest · f27ad38a
      Tom Lendacky committed
      For SEV-ES guests, the interception of control register write access
      is not recommended. Control register interception occurs prior to the
      control register being modified and the hypervisor is unable to modify
      the control register itself because the register is located in the
      encrypted register state.
      
      SEV-ES support introduces new control register write traps. These traps
      provide intercept support of a control register write after the control
      register has been modified. The new control register value is provided in
      the VMCB EXITINFO1 field, allowing the hypervisor to track the setting
      of the guest control registers.
      
      Add support to track the value of the guest CR0 register using the control
      register write trap so that the hypervisor understands the guest operating
      mode.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <182c9baf99df7e40ad9617ff90b84542705ef0d7.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
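      Both the CR0 and CR4 traps (this commit and the previous one) funnel
      into one handler shape; a condensed sketch, with the SVM_EXIT_CR*_WRITE_TRAP
      exit codes from the series and illustrative kvm_track_cr* helpers:
      
        static int cr_trap(struct vcpu_svm *svm)
        {
                /* The trap fires after the write: EXITINFO1 already holds
                 * the new control register value. */
                unsigned long val = svm->vmcb->control.exit_info_1;
      
                switch (svm->vmcb->control.exit_code) {
                case SVM_EXIT_CR0_WRITE_TRAP:
                        kvm_track_cr0(&svm->vcpu, val);
                        break;
                case SVM_EXIT_CR4_WRITE_TRAP:
                        kvm_track_cr4(&svm->vcpu, val);
                        break;
                }
                return kvm_complete_insn_gp(&svm->vcpu, 0);
        }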
    • KVM: SVM: Support string IO operations for an SEV-ES guest · 7ed9abfe
      Tom Lendacky committed
      For an SEV-ES guest, string-based port IO is performed to a shared
      (un-encrypted) page so that both the hypervisor and guest can read or
      write to it and each see the contents.
      
      For string-based port IO operations, invoke SEV-ES specific routines that
      can complete the operation using common KVM port IO support.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <9d61daf0ffda496703717218f415cdc8fd487100.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: Support MMIO for an SEV-ES guest · 8f423a80
      Tom Lendacky committed
      For an SEV-ES guest, MMIO is performed to a shared (un-encrypted) page
      so that both the hypervisor and guest can read or write to it and each
      see the contents.
      
      The GHCB specification provides software-defined VMGEXIT exit codes to
      indicate a request for an MMIO read or an MMIO write. Add support to
      recognize the MMIO requests and invoke SEV-ES specific routines that
      can complete the MMIO operation. These routines use common KVM support
      to complete the MMIO operation.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <af8de55127d5bcc3253d9b6084a0144c12307d4d.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
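      The VMGEXIT handler dispatches the software-defined codes along these
      lines (a sketch; the shared GHCB scratch area carries the data):
      
        case SVM_VMGEXIT_MMIO_READ:
                ret = kvm_sev_es_mmio_read(vcpu, control->exit_info_1,
                                           control->exit_info_2,
                                           svm->ghcb_sa);
                break;
        case SVM_VMGEXIT_MMIO_WRITE:
                ret = kvm_sev_es_mmio_write(vcpu, control->exit_info_1,
                                            control->exit_info_2,
                                            svm->ghcb_sa);
                break;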
    • KVM: SVM: Create trace events for VMGEXIT MSR protocol processing · 59e38b58
      Tom Lendacky committed
      Add trace events for entry to and exit from VMGEXIT MSR protocol
      processing. The vCPU will be common for the trace events. The MSR
      protocol processing is guided by the GHCB GPA in the VMCB, so the GHCB
      GPA will represent the input and output values for the entry and exit
      events, respectively. Additionally, the exit event will contain the
      return code for the event.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <c5b3b440c3e0db43ff2fc02813faa94fa54896b0.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: Create trace events for VMGEXIT processing · d523ab6b
      Tom Lendacky committed
      Add trace events for entry to and exit from VMGEXIT processing. The vCPU
      id and the exit reason will be common for the trace events. The exit info
      fields will represent the input and output values for the entry and exit
      events, respectively.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <25357dca49a38372e8f483753fb0c1c2a70a6898.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: Prevent debugging under SEV-ES · 8d4846b9
      Tom Lendacky committed
      Since the guest register state of an SEV-ES guest is encrypted, debugging
      is not supported. Update the code to prevent guest debugging when the
      guest has protected state.
      
      Additionally, an SEV-ES guest must only and always intercept DR7 reads and
      writes. Update set_dr_intercepts() and clr_dr_intercepts() to account for
      this.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <8db966fa2f9803d6454ce773863025d0e2e7f3cc.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: Add required changes to support intercepts under SEV-ES · f1c6366e
      Tom Lendacky committed
      When a guest is running under SEV-ES, the hypervisor cannot access the
      guest register state. There are numerous places in the KVM code where
      certain registers are accessed that are not allowed to be accessed (e.g.
      RIP, CR0, etc.). Add checks to prevent register accesses and add intercept
      update support at various points within the KVM code.
      
      Also, when handling a VMGEXIT, exceptions are passed back through the
      GHCB. Since the RDMSR/WRMSR intercepts (may) inject a #GP on error,
      update the SVM intercepts to handle this for SEV-ES guests.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      [Redo MSR part using the .complete_emulated_msr callback. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: introduce complete_emulated_msr callback · f9a4d621
      Paolo Bonzini committed
      This will be used by SEV-ES to inject MSR failure via the GHCB.
      Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: use kvm_complete_insn_gp in emulating RDMSR/WRMSR · 8b474427
      Paolo Bonzini committed
      Simplify the four functions that handle {kernel,user} {rd,wr}msr. There
      is still some repetition between the two instances of rdmsr, but the
      whole business of calling kvm_inject_gp and kvm_skip_emulated_instruction
      can be unified nicely.
      
      Because complete_emulated_wrmsr now becomes essentially a call to
      kvm_complete_insn_gp, remove complete_emulated_msr.
      Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
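      After the cleanup, the rdmsr completion reduces to a sketch like the
      following, with kvm_complete_insn_gp() deciding between #GP injection
      and the instruction skip:
      
        static int complete_emulated_rdmsr(struct kvm_vcpu *vcpu)
        {
                if (!vcpu->run->msr.error) {
                        kvm_rax_write(vcpu, (u32)vcpu->run->msr.data);
                        kvm_rdx_write(vcpu, vcpu->run->msr.data >> 32);
                }
                return kvm_complete_insn_gp(vcpu, vcpu->run->msr.error);
        }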
  8. 27 Nov 2020 (1 commit)
    • KVM: x86: Fix split-irqchip vs interrupt injection window request · 71cc849b
      Paolo Bonzini committed
      kvm_cpu_accept_dm_intr and kvm_vcpu_ready_for_interrupt_injection are
      a hodge-podge of conditions, hacked together to get something that
      more or less works.  But what is actually needed is much simpler;
      in both cases the fundamental question is, do we have a place to stash
      an interrupt if userspace does KVM_INTERRUPT?
      
      In userspace irqchip mode, that is !vcpu->arch.interrupt.injected.
      Currently kvm_event_needs_reinjection(vcpu) covers it, but it is
      unnecessarily restrictive.
      
      In split irqchip mode it's a bit more complicated, we need to check
      kvm_apic_accept_pic_intr(vcpu) (the IRQ window exit is basically an INTACK
      cycle and thus requires ExtINTs not to be masked) as well as
      !pending_userspace_extint(vcpu).  However, there is no need to
      check kvm_event_needs_reinjection(vcpu), since split irqchip keeps
      pending ExtINT state separate from event injection state, and checking
      kvm_cpu_has_interrupt(vcpu) is wrong too since ExtINT has higher
      priority than APIC interrupts.  In fact the latter fixes a bug:
      when userspace requests an IRQ window vmexit, an interrupt in the
      local APIC can cause kvm_cpu_has_interrupt() to be true and thus
      kvm_vcpu_ready_for_interrupt_injection() to return false.  When this
      happens, vcpu_run does not exit to userspace but the interrupt window
      vmexits keep occurring.  The VM loops without any hope of making progress.
      
      Once we try to fix these with something like
      
           return kvm_arch_interrupt_allowed(vcpu) &&
      -        !kvm_cpu_has_interrupt(vcpu) &&
      -        !kvm_event_needs_reinjection(vcpu) &&
      -        kvm_cpu_accept_dm_intr(vcpu);
      +        (!lapic_in_kernel(vcpu)
      +         ? !vcpu->arch.interrupt.injected
      +         : (kvm_apic_accept_pic_intr(vcpu)
      +            && !pending_userspace_extint(v)));
      
      we realize two things.  First, thanks to the previous patch the complex
      conditional can reuse !kvm_cpu_has_extint(vcpu).  Second, the interrupt
      window request in vcpu_enter_guest()
      
              bool req_int_win =
                      dm_request_for_irq_injection(vcpu) &&
                      kvm_cpu_accept_dm_intr(vcpu);
      
      should be kept in sync with kvm_vcpu_ready_for_interrupt_injection():
      it is unnecessary to ask the processor for an interrupt window
      if we would not be able to return to userspace.  Therefore,
      kvm_cpu_accept_dm_intr(vcpu) is basically !kvm_cpu_has_extint(vcpu)
      ANDed with the existing check for masked ExtINT.  It all makes sense:
      
      - we can accept an interrupt from userspace if there is a place
        to stash it (and, for irqchip split, ExtINTs are not masked).
        Interrupts from userspace _can_ be accepted even if right now
        EFLAGS.IF=0.
      
      - in order to tell userspace we will inject its interrupt ("IRQ
        window open" i.e. kvm_vcpu_ready_for_interrupt_injection), both
        KVM and the vCPU need to be ready to accept the interrupt.
      
      ... and this is what the patch implements.
      Reported-by: David Woodhouse <dwmw@amazon.co.uk>
      Analyzed-by: David Woodhouse <dwmw@amazon.co.uk>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Nikos Tsironis <ntsironis@arrikto.com>
      Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
      Tested-by: David Woodhouse <dwmw@amazon.co.uk>
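      Reconstructed from the description, the resulting checks are close to:
      
        static int kvm_cpu_accept_dm_intr(struct kvm_vcpu *vcpu)
        {
                /* A pending ExtINT leaves no place to stash the interrupt. */
                if (kvm_cpu_has_extint(vcpu))
                        return false;
      
                /* Acknowledging ExtINT does not happen if LINT0 is masked. */
                return !lapic_in_kernel(vcpu) ||
                        kvm_apic_accept_pic_intr(vcpu);
        }
      
        static int kvm_vcpu_ready_for_interrupt_injection(struct kvm_vcpu *vcpu)
        {
                return kvm_arch_interrupt_allowed(vcpu) &&
                        kvm_cpu_accept_dm_intr(vcpu);
        }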
  9. 15 Nov 2020 (6 commits)
    • kvm: x86: Sink cpuid update into vendor-specific set_cr4 functions · 2259c17f
      Jim Mattson committed
      On emulated VM-entry and VM-exit, update the CPUID bits that reflect
      CR4.OSXSAVE and CR4.PKE.
      
      This fixes a bug where the CPUID bits could continue to reflect L2 CR4
      values after emulated VM-exit to L1. It also fixes a related bug where
      the CPUID bits could continue to reflect L1 CR4 values after emulated
      VM-entry to L2. The latter bug is mainly relevant to SVM, wherein
      CPUID is not a required intercept. However, it could also be relevant
      to VMX, because the code to conditionally update these CPUID bits
      assumes that the guest CPUID and the guest CR4 are always in sync.
      
      Fixes: 8eb3f87d ("KVM: nVMX: fix guest CR4 loading when emulating L2 to L1 exit")
      Fixes: 2acf923e ("KVM: VMX: Enable XSAVE/XRSTOR for guest")
      Fixes: b9baba86 ("KVM, pkeys: expose CPUID/CR4 to guest")
      Reported-by: Abhiroop Dabral <adabral@paloaltonetworks.com>
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Ricardo Koller <ricarkol@google.com>
      Reviewed-by: Peter Shier <pshier@google.com>
      Cc: Haozhong Zhang <haozhong.zhang@intel.com>
      Cc: Dexuan Cui <dexuan.cui@intel.com>
      Cc: Huaitong Han <huaitong.han@intel.com>
      Message-Id: <20201029170648.483210-1-jmattson@google.com>
    • KVM: X86: Implement ring-based dirty memory tracking · fb04a1ed
      Peter Xu committed
      This patch is heavily based on previous work from Lei Cao
      <lei.cao@stratus.com> and Paolo Bonzini <pbonzini@redhat.com>. [1]
      
      KVM currently uses large bitmaps to track dirty memory.  These bitmaps
      are copied to userspace when userspace queries KVM for its dirty page
      information.  The use of bitmaps is mostly sufficient for live
      migration, as large parts of memory are dirtied from one log-dirty
      pass to another.  However, in a checkpointing system, the number of
      dirty pages is small and in fact it is often bounded: the VM is
      paused when it has dirtied a pre-defined number of pages. Traversing a
      large, sparsely populated bitmap to find set bits is time-consuming,
      as is copying the bitmap to user-space.
      
      A similar issue will be there for live migration when the guest memory
      is huge while the page dirty procedure is trivial.  In that case for
      each dirty sync we need to pull the whole dirty bitmap to userspace
      and analyse every bit even if it's mostly zeros.
      
      The preferred data structure for the above scenarios is a dense list of
      guest frame numbers (GFN).  This patch series stores the dirty list in
      kernel memory that can be memory mapped into userspace to allow speedy
      harvesting.
      
      This patch enables dirty ring for X86 only.  However it should be
      easily extended to other archs as well.
      
      [1] https://patchwork.kernel.org/patch/10471409/
      Signed-off-by: Lei Cao <lei.cao@stratus.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Message-Id: <20201001012222.5767-1-peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
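      The mmap'able ring entry is a small fixed-size record per dirtied guest
      frame (uapi layout as introduced by this series):
      
        struct kvm_dirty_gfn {
                __u32 flags;    /* KVM_DIRTY_GFN_F_DIRTY / _F_RESET */
                __u32 slot;     /* (as_id << 16) | slot_id */
                __u64 offset;   /* offset, in pages, within the memslot */
        };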
    • KVM: X86: Don't track dirty for KVM_SET_[TSS_ADDR|IDENTITY_MAP_ADDR] · ff5a983c
      Peter Xu committed
      Originally, we have three code paths that can dirty a page without a
      vcpu context on X86:
      
        - init_rmode_identity_map
        - init_rmode_tss
        - kvmgt_rw_gpa
      
      init_rmode_identity_map and init_rmode_tss will be set up on the
      destination VM no matter what (and the guest cannot even see them), so
      it does not make sense to track them at all.
      
      To do this, allow __x86_set_memory_region() to return the userspace
      address that was just allocated to the caller.  Then in both of the
      functions we directly write to the userspace address instead of
      calling kvm_write_*() APIs.
      
      Another trivial change is that we don't need to explicitly clear the
      identity page table root in init_rmode_identity_map() because no
      matter what we'll write to the whole page with 4M huge page entries.
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Message-Id: <20201001012044.5151-4-peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: hyper-v: allow KVM_GET_SUPPORTED_HV_CPUID as a system ioctl · c21d54f0
      Vitaly Kuznetsov committed
      KVM_GET_SUPPORTED_HV_CPUID is a vCPU ioctl, but its output is now
      independent of the vCPU, and in some cases VMMs may want to use it as a
      system ioctl instead. In particular, QEMU does CPU feature expansion
      before any vCPU gets created, so KVM_GET_SUPPORTED_HV_CPUID can't be used.
      
      Convert KVM_GET_SUPPORTED_HV_CPUID to 'dual' system/vCPU ioctl with the
      same meaning.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200929150944.1235688-2-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Return bool instead of int for CR4 and SREGS validity checks · ee69c92b
      Sean Christopherson committed
      Rework the common CR4 and SREGS checks to return a bool instead of an
      int, i.e. true/false instead of 0/-EINVAL, and add "is" to the name to
      clarify the polarity of the return value (which is effectively inverted
      by this change).
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20201007014417.29276-6-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Move vendor CR4 validity check to dedicated kvm_x86_ops hook · c2fe3cd4
      Sean Christopherson committed
      Split out VMX's checks on CR4.VMXE to a dedicated hook, .is_valid_cr4(),
      and invoke the new hook from kvm_valid_cr4().  This fixes an issue where
      KVM_SET_SREGS would return success while failing to actually set CR4.
      
      Fixing the issue by explicitly checking kvm_x86_ops.set_cr4()'s return
      in __set_sregs() is not a viable option as KVM has already stuffed a
      variety of vCPU state.
      
      Note, kvm_valid_cr4() and is_valid_cr4() have different return types and
      inverted semantics.  This will be remedied in a future patch.
      
      Fixes: 5e1746d6 ("KVM: nVMX: Allow setting the VMXE bit in CR4")
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20201007014417.29276-5-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  10. 13 Nov 2020 (1 commit)
    • KVM: x86: Introduce cr3_lm_rsvd_bits in kvm_vcpu_arch · 0107973a
      Babu Moger committed
      SEV guests fail to boot on a system that supports the PCID feature.
      
      While emulating the RSM instruction, KVM reads the guest CR3
      and calls kvm_set_cr3(). If the vCPU is in the long mode,
      kvm_set_cr3() does a sanity check for the CR3 value. In this case,
      it validates whether the value has any reserved bits set. The
      reserved bit range is 63:cpuid_maxphyaddr(). When AMD memory
      encryption is enabled, the memory encryption bit is set in the CR3
      value. The memory encryption bit may fall within the KVM reserved
      bit range, causing a KVM emulation failure.
      
      Introduce a new field, cr3_lm_rsvd_bits, in kvm_vcpu_arch which will
      cache the reserved bits in the CR3 value. It will be initialized
      to rsvd_bits(cpuid_maxphyaddr(vcpu), 63).
      
      Any architecture-specific bits (like the AMD SEV encryption bit) that
      need to be masked from the reserved bits should be cleared in the
      vendor-specific kvm_x86_ops.vcpu_after_set_cpuid handler.
      
      Fixes: a780a3ea ("KVM: X86: Fix reserved bits check for MOV to CR3")
      Signed-off-by: Babu Moger <babu.moger@amd.com>
      Message-Id: <160521947657.32054.3264016688005356563.stgit@bmoger-ubuntu>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
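      Sketched from the description, the default mask and the SVM-side
      adjustment look like this (the C-bit position comes from CPUID
      0x8000001F EBX[5:0]):
      
        /* Common code: cache the default long-mode CR3 reserved bits. */
        vcpu->arch.cr3_lm_rsvd_bits = rsvd_bits(cpuid_maxphyaddr(vcpu), 63);
      
        /* SVM's vcpu_after_set_cpuid: the encryption bit is not reserved. */
        if (sev_guest(vcpu->kvm)) {
                best = kvm_find_cpuid_entry(vcpu, 0x8000001F, 0);
                if (best)
                        vcpu->arch.cr3_lm_rsvd_bits &=
                                ~(1UL << (best->ebx & 0x3f));
        }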
  11. 08 Nov 2020 (4 commits)