提交 · fe7e948837f312d87853b3fce743795d1ae3715a · openeuler / Kernel

20 4月, 2021 1 次提交

KVM: x86: Add capability to grant VM access to privileged SGX attribute · fe7e9488

由 Sean Christopherson 提交于 4月 12, 2021

Add a capability, KVM_CAP_SGX_ATTRIBUTE, that can be used by userspace
to grant a VM access to a priveleged attribute, with args[0] holding a
file handle to a valid SGX attribute file.

The SGX subsystem restricts access to a subset of enclave attributes to
provide additional security for an uncompromised kernel, e.g. to prevent
malware from using the PROVISIONKEY to ensure its nodes are running
inside a geniune SGX enclave and/or to obtain a stable fingerprint.

To prevent userspace from circumventing such restrictions by running an
enclave in a VM, KVM restricts guest access to privileged attributes by
default.

Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NKai Huang <kai.huang@intel.com>
Message-Id: <0b099d65e933e068e3ea934b0523bab070cb8cea.1618196135.git.kai.huang@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fe7e9488

17 4月, 2021 1 次提交

KVM: introduce KVM_CAP_SET_GUEST_DEBUG2 · 8b13c364

由 Paolo Bonzini 提交于 4月 06, 2021

This capability will allow the user to know which KVM_GUESTDBG_* bits
are supported.
Signed-off-by: NMaxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210401135451.1004564-3-mlevitsk@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8b13c364

03 3月, 2021 1 次提交

KVM: x86/xen: Add support for vCPU runstate information · 30b5c851

由 David Woodhouse 提交于 3月 01, 2021

This is how Xen guests do steal time accounting. The hypervisor records
the amount of time spent in each of running/runnable/blocked/offline
states.

In the Xen accounting, a vCPU is still in state RUNSTATE_running while
in Xen for a hypercall or I/O trap, etc. Only if Xen explicitly schedules
does the state become RUNSTATE_blocked. In KVM this means that even when
the vCPU exits the kvm_run loop, the state remains RUNSTATE_running.

The VMM can explicitly set the vCPU to RUNSTATE_blocked by using the
KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT attribute, and can also use
KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST to retrospectively add a given
amount of time to the blocked state and subtract it from the running
state.

The state_entry_time corresponds to get_kvmclock_ns() at the time the
vCPU entered the current state, and the total times of all four states
should always add up to state_entry_time.
Co-developed-by: NJoao Martins <joao.m.martins@oracle.com>
Signed-off-by: NJoao Martins <joao.m.martins@oracle.com>
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20210301125309.874953-2-dwmw2@infradead.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

30b5c851

10 2月, 2021 1 次提交

KVM: PPC: Book3S HV: Introduce new capability for 2nd DAWR · d9a47eda

由 Ravi Bangoria 提交于 12月 16, 2020

Introduce KVM_CAP_PPC_DAWR1 which can be used by QEMU to query whether
KVM supports 2nd DAWR or not. The capability is by default disabled
even when the underlying CPU supports 2nd DAWR. QEMU needs to check
and enable it manually to use the feature.
Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

d9a47eda

04 2月, 2021 11 次提交

KVM: x86: declare Xen HVM shared info capability and add test case · 8d4e7e80

由 David Woodhouse 提交于 12月 04, 2020

Instead of adding a plethora of new KVM_CAP_XEN_FOO capabilities, just
add bits to the return value of KVM_CAP_XEN_HVM.
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>

8d4e7e80

KVM: x86/xen: Add event channel interrupt vector upcall · 40da8ccd

由 David Woodhouse 提交于 12月 09, 2020

It turns out that we can't handle event channels *entirely* in userspace
by delivering them as ExtINT, because KVM is a bit picky about when it
accepts ExtINT interrupts from a legacy PIC. The in-kernel local APIC
has to have LVT0 configured in APIC_MODE_EXTINT and unmasked, which
isn't necessarily the case for Xen guests especially on secondary CPUs.

To cope with this, add kvm_xen_get_interrupt() which checks the
evtchn_pending_upcall field in the Xen vcpu_info, and delivers the Xen
upcall vector (configured by KVM_XEN_ATTR_TYPE_UPCALL_VECTOR) if it's
set regardless of LAPIC LVT0 configuration. This gives us the minimum
support we need for completely userspace-based implementation of event
channels.

This does mean that vcpu_enter_guest() needs to check for the
evtchn_pending_upcall flag being set, because it can't rely on someone
having set KVM_REQ_EVENT unless we were to add some way for userspace to
do so manually.
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>

40da8ccd

KVM: x86/xen: register vcpu time info region · f2340cd9

由 Joao Martins 提交于 7月 23, 2018

Allow the Xen emulated guest the ability to register secondary
vcpu time information. On Xen guests this is used in order to be
mapped to userspace and hence allow vdso gettimeofday to work.
Signed-off-by: NJoao Martins <joao.m.martins@oracle.com>
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>

f2340cd9

KVM: x86/xen: register vcpu info · 73e69a86

由 Joao Martins 提交于 6月 29, 2018

The vcpu info supersedes the per vcpu area of the shared info page and
the guest vcpus will use this instead.
Signed-off-by: NJoao Martins <joao.m.martins@oracle.com>
Signed-off-by: NAnkur Arora <ankur.a.arora@oracle.com>
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>

73e69a86

KVM: x86/xen: Add KVM_XEN_VCPU_SET_ATTR/KVM_XEN_VCPU_GET_ATTR · 3e324615

由 David Woodhouse 提交于 2月 02, 2021

This will be used for per-vCPU setup such as runstate and vcpu_info.
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>

3e324615

KVM: x86/xen: register shared_info page · 13ffb97a

由 Joao Martins 提交于 6月 15, 2018

Add KVM_XEN_ATTR_TYPE_SHARED_INFO to allow hypervisor to know where the
guest's shared info page is.
Signed-off-by: NJoao Martins <joao.m.martins@oracle.com>
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>

13ffb97a

D
KVM: x86/xen: latch long_mode when hypercall page is set up · a3833b81
由 David Woodhouse 提交于 12月 03, 2020
```
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>
```
a3833b81

KVM: x86/xen: add KVM_XEN_HVM_SET_ATTR/KVM_XEN_HVM_GET_ATTR · a76b9641

由 Joao Martins 提交于 12月 03, 2020

This will be used to set up shared info pages etc.
Signed-off-by: NJoao Martins <joao.m.martins@oracle.com>
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>

a76b9641

KVM: x86/xen: intercept xen hypercalls if enabled · 23200b7a

由 Joao Martins 提交于 6月 13, 2018

Add a new exit reason for emulator to handle Xen hypercalls.

Since this means KVM owns the ABI, dispense with the facility for the
VMM to provide its own copy of the hypercall pages; just fill them in
directly using VMCALL/VMMCALL as we do for the Hyper-V hypercall page.

This behaviour is enabled by a new INTERCEPT_HCALL flag in the
KVM_XEN_HVM_CONFIG ioctl structure, and advertised by the same flag
being returned from the KVM_CAP_XEN_HVM check.

Rename xen_hvm_config() to kvm_xen_write_hypercall_page() and move it
to the nascent xen.c while we're at it, and add a test case.
Signed-off-by: NJoao Martins <joao.m.martins@oracle.com>
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>

23200b7a

KVM: VMX: Enable bus lock VM exit · fe6b6bc8

由 Chenyi Qiang 提交于 11月 06, 2020

Virtual Machine can exploit bus locks to degrade the performance of
system. Bus lock can be caused by split locked access to writeback(WB)
memory or by using locks on uncacheable(UC) memory. The bus lock is
typically >1000 cycles slower than an atomic operation within a cache
line. It also disrupts performance on other cores (which must wait for
the bus lock to be released before their memory operations can
complete).

To address the threat, bus lock VM exit is introduced to notify the VMM
when a bus lock was acquired, allowing it to enforce throttling or other
policy based mitigations.

A VMM can enable VM exit due to bus locks by setting a new "Bus Lock
Detection" VM-execution control(bit 30 of Secondary Processor-based VM
execution controls). If delivery of this VM exit was preempted by a
higher priority VM exit (e.g. EPT misconfiguration, EPT violation, APIC
access VM exit, APIC write VM exit, exception bitmap exiting), bit 26 of
exit reason in vmcs field is set to 1.

In current implementation, the KVM exposes this capability through
KVM_CAP_X86_BUS_LOCK_EXIT. The user can get the supported mode bitmap
(i.e. off and exit) and enable it explicitly (disabled by default). If
bus locks in guest are detected by KVM, exit to user space even when
current exit reason is handled by KVM internally. Set a new field
KVM_RUN_BUS_LOCK in vcpu->run->flags to inform the user space that there
is a bus lock detected in guest.

Document for Bus Lock VM exit is now available at the latest "Intel
Architecture Instruction Set Extensions Programming Reference".

Document Link:
https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.htmlCo-developed-by: NXiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20201106090315.18606-4-chenyi.qiang@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fe6b6bc8

KVM/SVM: add support for SEV attestation command · 2c07ded0

由 Brijesh Singh 提交于 1月 04, 2021

The SEV FW version >= 0.23 added a new command that can be used to query
the attestation report containing the SHA-256 digest of the guest memory
encrypted through the KVM_SEV_LAUNCH_UPDATE_{DATA, VMSA} commands and
sign the report with the Platform Endorsement Key (PEK).

See the SEV FW API spec section 6.8 for more details.

Note there already exist a command (KVM_SEV_LAUNCH_MEASURE) that can be
used to get the SHA-256 digest. The main difference between the
KVM_SEV_LAUNCH_MEASURE and KVM_SEV_ATTESTATION_REPORT is that the latter
can be called while the guest is running and the measurement value is
signed with PEK.

Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: John Allen <john.allen@amd.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: linux-crypto@vger.kernel.org
Reviewed-by: NTom Lendacky <thomas.lendacky@amd.com>
Acked-by: NDavid Rientjes <rientjes@google.com>
Tested-by: NJames Bottomley <jejb@linux.ibm.com>
Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com>
Message-Id: <20210104151749.30248-1-brijesh.singh@amd.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

2c07ded0

08 1月, 2021 1 次提交

KVM: SVM: Add support for booting APs in an SEV-ES guest · 647daca2

由 Tom Lendacky 提交于 1月 04, 2021

Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence,
where the guest vCPU register state is updated and then the vCPU is VMRUN
to begin execution of the AP. For an SEV-ES guest, this won't work because
the guest register state is encrypted.

Following the GHCB specification, the hypervisor must not alter the guest
register state, so KVM must track an AP/vCPU boot. Should the guest want
to park the AP, it must use the AP Reset Hold exit event in place of, for
example, a HLT loop.

First AP boot (first INIT-SIPI-SIPI sequence):
Execute the AP (vCPU) as it was initialized and measured by the SEV-ES
support. It is up to the guest to transfer control of the AP to the
proper location.

Subsequent AP boot:
KVM will expect to receive an AP Reset Hold exit event indicating that
the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to
awaken it. When the AP Reset Hold exit event is received, KVM will place
the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI
sequence, KVM will make the vCPU runnable. It is again up to the guest
to then transfer control of the AP to the proper location.

To differentiate between an actual HLT and an AP Reset Hold, a new MP
state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is
placed in upon receiving the AP Reset Hold exit event. Additionally, to
communicate the AP Reset Hold exit event up to userspace (if needed), a
new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD.

A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order
to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the
original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds
a new function that, for non SEV-ES guests, invokes the original SIPI
delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests,
implements the logic above.
Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600.git.thomas.lendacky@amd.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

647daca2

15 11月, 2020 2 次提交

KVM: X86: Implement ring-based dirty memory tracking · fb04a1ed

由 Peter Xu 提交于 9月 30, 2020

This patch is heavily based on previous work from Lei Cao
<lei.cao@stratus.com> and Paolo Bonzini <pbonzini@redhat.com>. [1]

KVM currently uses large bitmaps to track dirty memory. These bitmaps
are copied to userspace when userspace queries KVM for its dirty page
information. The use of bitmaps is mostly sufficient for live
migration, as large parts of memory are be dirtied from one log-dirty
pass to another. However, in a checkpointing system, the number of
dirty pages is small and in fact it is often bounded---the VM is
paused when it has dirtied a pre-defined number of pages. Traversing a
large, sparsely populated bitmap to find set bits is time-consuming,
as is copying the bitmap to user-space.

A similar issue will be there for live migration when the guest memory
is huge while the page dirty procedure is trivial. In that case for
each dirty sync we need to pull the whole dirty bitmap to userspace
and analyse every bit even if it's mostly zeros.

The preferred data structure for above scenarios is a dense list of
guest frame numbers (GFN). This patch series stores the dirty list in
kernel memory that can be memory mapped into userspace to allow speedy
harvesting.

This patch enables dirty ring for X86 only. However it should be
easily extended to other archs as well.

[1] https://patchwork.kernel.org/patch/10471409/Signed-off-by: NLei Cao <lei.cao@stratus.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NPeter Xu <peterx@redhat.com>
Message-Id: <20201001012222.5767-1-peterx@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fb04a1ed

KVM: x86: hyper-v: allow KVM_GET_SUPPORTED_HV_CPUID as a system ioctl · c21d54f0

由 Vitaly Kuznetsov 提交于 9月 29, 2020

KVM_GET_SUPPORTED_HV_CPUID is a vCPU ioctl but its output is now
independent from vCPU and in some cases VMMs may want to use it as a system
ioctl instead. In particular, QEMU doesn CPU feature expansion before any
vCPU gets created so KVM_GET_SUPPORTED_HV_CPUID can't be used.

Convert KVM_GET_SUPPORTED_HV_CPUID to 'dual' system/vCPU ioctl with the
same meaning.
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200929150944.1235688-2-vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

c21d54f0

22 10月, 2020 1 次提交

kvm: x86: only provide PV features if enabled in guest's CPUID · 66570e96

由 Oliver Upton 提交于 8月 18, 2020

KVM unconditionally provides PV features to the guest, regardless of the
configured CPUID. An unwitting guest that doesn't check
KVM_CPUID_FEATURES before use could access paravirt features that
userspace did not intend to provide. Fix this by checking the guest's
CPUID before performing any paravirtual operations.

Introduce a capability, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, to gate the
aforementioned enforcement. Migrating a VM from a host w/o this patch to
a host with this patch could silently change the ABI exposed to the
guest, warranting that we default to the old behavior and opt-in for
the new one.
Reviewed-by: NJim Mattson <jmattson@google.com>
Reviewed-by: NPeter Shier <pshier@google.com>
Signed-off-by: NOliver Upton <oupton@google.com>
Change-Id: I202a0926f65035b872bfe8ad15307c026de59a98
Message-Id: <20200818152429.1923996-4-oupton@google.com>
Reviewed-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

66570e96

28 9月, 2020 2 次提交

KVM: x86: Introduce MSR filtering · 1a155254

由 Alexander Graf 提交于 9月 25, 2020

It's not desireable to have all MSRs always handled by KVM kernel space. Some
MSRs would be useful to handle in user space to either emulate behavior (like
uCode updates) or differentiate whether they are valid based on the CPU model.

To allow user space to specify which MSRs it wants to see handled by KVM,
this patch introduces a new ioctl to push filter rules with bitmaps into
KVM. Based on these bitmaps, KVM can then decide whether to reject MSR access.
With the addition of KVM_CAP_X86_USER_SPACE_MSR it can also deflect the
denied MSR events to user space to operate on.

If no filter is populated, MSR handling stays identical to before.
Signed-off-by: NAlexander Graf <graf@amazon.com>

Message-Id: <20200925143422.21718-8-graf@amazon.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

1a155254

KVM: x86: Allow deflecting unknown MSR accesses to user space · 1ae09954

由 Alexander Graf 提交于 9月 25, 2020

MSRs are weird. Some of them are normal control registers, such as EFER.
Some however are registers that really are model specific, not very
interesting to virtualization workloads, and not performance critical.
Others again are really just windows into package configuration.

Out of these MSRs, only the first category is necessary to implement in
kernel space. Rarely accessed MSRs, MSRs that should be fine tunes against
certain CPU models and MSRs that contain information on the package level
are much better suited for user space to process. However, over time we have
accumulated a lot of MSRs that are not the first category, but still handled
by in-kernel KVM code.

This patch adds a generic interface to handle WRMSR and RDMSR from user
space. With this, any future MSR that is part of the latter categories can
be handled in user space.

Furthermore, it allows us to replace the existing "ignore_msrs" logic with
something that applies per-VM rather than on the full system. That way you
can run productive VMs in parallel to experimental ones where you don't care
about proper MSR handling.
Signed-off-by: NAlexander Graf <graf@amazon.com>
Reviewed-by: NJim Mattson <jmattson@google.com>

Message-Id: <20200925143422.21718-3-graf@amazon.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

1ae09954

12 9月, 2020 1 次提交

KVM: MIPS: Change the definition of kvm type · 15e9e35c

由 Huacai Chen 提交于 9月 10, 2020

MIPS defines two kvm types:

 #define KVM_VM_MIPS_TE          0
 #define KVM_VM_MIPS_VZ          1

In Documentation/virt/kvm/api.rst it is said that "You probably want to
use 0 as machine type", which implies that type 0 be the "automatic" or
"default" type. And, in user-space libvirt use the null-machine (with
type 0) to detect the kvm capability, which returns "KVM not supported"
on a VZ platform.

I try to fix it in QEMU but it is ugly:
https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05629.html

And Thomas Huth suggests me to change the definition of kvm type:
https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg03281.html

So I define like this:

 #define KVM_VM_MIPS_AUTO        0
 #define KVM_VM_MIPS_VZ          1
 #define KVM_VM_MIPS_TE          2

Since VZ and TE cannot co-exists, using type 0 on a TE platform will
still return success (so old user-space tools have no problems on new
kernels); the advantage is that using type 0 on a VZ platform will not
return failure. So, the only problem is "new user-space tools use type
2 on old kernels", but if we treat this as a kernel bug, we can backport
this patch to old stable kernels.
Signed-off-by: NHuacai Chen <chenhc@lemote.com>
Message-Id: <1599734031-28746-1-git-send-email-chenhc@lemote.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

15e9e35c

21 8月, 2020 1 次提交

arm64/x86: KVM: Introduce steal-time cap · 004a0124

由 Andrew Jones 提交于 8月 04, 2020

arm64 requires a vcpu fd (KVM_HAS_DEVICE_ATTR vcpu ioctl) to probe
support for steal-time. However this is unnecessary, as only a KVM
fd is required, and it complicates userspace (userspace may prefer
delaying vcpu creation until after feature probing). Introduce a cap
that can be checked instead. While x86 can already probe steal-time
support with a kvm fd (KVM_GET_SUPPORTED_CPUID), we add the cap there
too for consistency.
Signed-off-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Reviewed-by: NSteven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20200804170604.42662-7-drjones@redhat.com

004a0124

11 7月, 2020 1 次提交

KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support · 3edd6839

由 Mohammed Gamal 提交于 7月 10, 2020

This patch adds a new capability KVM_CAP_SMALLER_MAXPHYADDR which
allows userspace to query if the underlying architecture would
support GUEST_MAXPHYADDR < HOST_MAXPHYADDR and hence act accordingly
(e.g. qemu can decide if it should warn for -cpu ..,phys-bits=X)

The complications in this patch are due to unexpected (but documented)
behaviour we see with NPF vmexit handling in AMD processor.  If
SVM is modified to add guest physical address checks in the NPF
and guest #PF paths, we see the followning error multiple times in
the 'access' test in kvm-unit-tests:

            test pte.p pte.36 pde.p: FAIL: pte 2000021 expected 2000001
            Dump mapping: address: 0x123400000000
            ------L4: 24c3027
            ------L3: 24c4027
            ------L2: 24c5021
            ------L1: 1002000021

This is because the PTE's accessed bit is set by the CPU hardware before
the NPF vmexit. This is handled completely by hardware and cannot be fixed
in software.

Therefore, availability of the new capability depends on a boolean variable
allow_smaller_maxphyaddr which is set individually by VMX and SVM init
routines. On VMX it's always set to true, on SVM it's only set to true
when NPT is not enabled.

CC: Tom Lendacky <thomas.lendacky@amd.com>
CC: Babu Moger <babu.moger@amd.com>
Signed-off-by: NMohammed Gamal <mgamal@redhat.com>
Message-Id: <20200710154811.418214-10-mgamal@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

3edd6839

09 7月, 2020 1 次提交

kvm: x86: Add "last CPU" to some KVM_EXIT information · 1aa561b1

由 Jim Mattson 提交于 6月 03, 2020

More often than not, a failed VM-entry in an x86 production
environment is induced by a defective CPU. To help identify the bad
hardware, include the id of the last logical CPU to run a vCPU in the
information provided to userspace on a KVM exit for failed VM-entry or
for KVM internal errors not associated with emulation. The presence of
this additional information is indicated by a new capability,
KVM_CAP_LAST_CPU.
Signed-off-by: NJim Mattson <jmattson@google.com>
Reviewed-by: NOliver Upton <oupton@google.com>
Reviewed-by: NPeter Shier <pshier@google.com>
Message-Id: <20200603235623.245638-5-jmattson@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

1aa561b1

23 6月, 2020 1 次提交

s390/kvm: diagnose 0x318 sync and reset · 23a60f83

由 Collin Walling 提交于 6月 22, 2020

DIAGNOSE 0x318 (diag318) sets information regarding the environment
the VM is running in (Linux, z/VM, etc) and is observed via
firmware/service events.

This is a privileged s390x instruction that must be intercepted by
SIE. Userspace handles the instruction as well as migration. Data
is communicated via VCPU register synchronization.

The Control Program Name Code (CPNC) is stored in the SIE block. The
CPNC along with the Control Program Version Code (CPVC) are stored
in the kvm_vcpu_arch struct.

This data is reset on load normal and clear resets.
Signed-off-by: NCollin Walling <walling@linux.ibm.com>
Reviewed-by: NJanosch Frank <frankja@linux.ibm.com>
Acked-by: NCornelia Huck <cohuck@redhat.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200622154636.5499-3-walling@linux.ibm.com
[borntraeger@de.ibm.com: fix sync_reg position]
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

23a60f83

01 6月, 2020 3 次提交

x86/kvm/hyper-v: Add support for synthetic debugger interface · f97f5a56

由 Jon Doron 提交于 5月 29, 2020

Add support for Hyper-V synthetic debugger (syndbg) interface.
The syndbg interface is using MSRs to emulate a way to send/recv packets
data.

The debug transport dll (kdvm/kdnet) will identify if Hyper-V is enabled
and if it supports the synthetic debugger interface it will attempt to
use it, instead of trying to initialize a network adapter.
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NJon Doron <arilou@gmail.com>
Message-Id: <20200529134543.1127440-4-arilou@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f97f5a56

x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit · f7d31e65

由 Jon Doron 提交于 4月 24, 2020

The problem the patch is trying to address is the fact that 'struct
kvm_hyperv_exit' has different layout on when compiling in 32 and 64 bit
modes.

In 64-bit mode the default alignment boundary is 64 bits thus
forcing extra gaps after 'type' and 'msr' but in 32-bit mode the
boundary is at 32 bits thus no extra gaps.

This is an issue as even when the kernel is 64 bit, the userspace using
the interface can be both 32 and 64 bit but the same 32 bit userspace has
to work with 32 bit kernel.

The issue is fixed by forcing the 64 bit layout, this leads to ABI
change for 32 bit builds and while we are obviously breaking '32 bit
userspace with 32 bit kernel' case, we're fixing the '32 bit userspace
with 64 bit kernel' one.

As the interface has no (known) users and 32 bit KVM is rather baroque
nowadays, this seems like a reasonable decision.
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NJon Doron <arilou@gmail.com>
Message-Id: <20200424113746.3473563-2-arilou@gmail.com>
Reviewed-by: NRoman Kagan <rvkagan@yandex-team.ru>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f7d31e65

KVM: x86: announce KVM_FEATURE_ASYNC_PF_INT · 72de5fa4

由 Vitaly Kuznetsov 提交于 5月 25, 2020

Introduce new capability to indicate that KVM supports interrupt based
delivery of 'page ready' APF events. This includes support for both
MSR_KVM_ASYNC_PF_INT and MSR_KVM_ASYNC_PF_ACK.
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200525144125.143875-8-vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

72de5fa4

25 4月, 2020 1 次提交

kvm: add capability for halt polling · acd05785

由 David Matlack 提交于 4月 17, 2020

KVM_CAP_HALT_POLL is a per-VM capability that lets userspace
control the halt-polling time, allowing halt-polling to be tuned or
disabled on particular VMs.

With dynamic halt-polling, a VM's VCPUs can poll from anywhere from
[0, halt_poll_ns] on each halt. KVM_CAP_HALT_POLL sets the
upper limit on the poll time.
Signed-off-by: NDavid Matlack <dmatlack@google.com>
Signed-off-by: NJon Cargille <jcargill@google.com>
Reviewed-by: NJim Mattson <jmattson@google.com>
Message-Id: <20200417221446.108733-1-jcargill@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

acd05785

21 4月, 2020 1 次提交

docs: fix broken references for ReST files that moved around · 3ecad8c2

由 Mauro Carvalho Chehab 提交于 4月 14, 2020

Some broken references happened due to shifting files around
and ReST renames. Those can't be auto-fixed by the script,
so let's fix them manually.
Signed-off-by: NMauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: NCorentin Labbe <clabbe.montjoie@gmail.com>
Link: https://lore.kernel.org/r/64773a12b4410aaf3e3be89e3ec7e34de2484eea.1586881715.git.mchehab+huawei@kernel.orgSigned-off-by: NJonathan Corbet <corbet@lwn.net>

3ecad8c2

26 3月, 2020 1 次提交

KVM: PPC: Book3S HV: Add a capability for enabling secure guests · 9a5788c6

由 Paul Mackerras 提交于 3月 19, 2020

At present, on Power systems with Protected Execution Facility
hardware and an ultravisor, a KVM guest can transition to being a
secure guest at will.  Userspace (QEMU) has no way of knowing
whether a host system is capable of running secure guests.  This
will present a problem in future when the ultravisor is capable of
migrating secure guests from one host to another, because
virtualization management software will have no way to ensure that
secure guests only run in domains where all of the hosts can
support secure guests.

This adds a VM capability which has two functions: (a) userspace
can query it to find out whether the host can support secure guests,
and (b) userspace can enable it for a guest, which allows that
guest to become a secure guest.  If userspace does not enable it,
KVM will return an error when the ultravisor does the hypercall
that indicates that the guest is starting to transition to a
secure guest.  The ultravisor will then abort the transition and
the guest will terminate.
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NRam Pai <linuxram@us.ibm.com>

9a5788c6

17 3月, 2020 1 次提交

KVM: x86: enable dirty log gradually in small chunks · 3c9bd400

由 Jay Zhou 提交于 2月 27, 2020

It could take kvm->mmu_lock for an extended period of time when
enabling dirty log for the first time. The main cost is to clear
all the D-bits of last level SPTEs. This situation can benefit from
manual dirty log protect as well, which can reduce the mmu_lock
time taken. The sequence is like this:

1. Initialize all the bits of the dirty bitmap to 1 when enabling
   dirty log for the first time
2. Only write protect the huge pages
3. KVM_GET_DIRTY_LOG returns the dirty bitmap info
4. KVM_CLEAR_DIRTY_LOG will clear D-bit for each of the leaf level
   SPTEs gradually in small chunks

Under the Intel(R) Xeon(R) Gold 6152 CPU @ 2.10GHz environment,
I did some tests with a 128G windows VM and counted the time taken
of memory_global_dirty_log_start, here is the numbers:

VM Size        Before    After optimization
128G           460ms     10ms
Signed-off-by: NJay Zhou <jianjay.zhou@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

3c9bd400

28 2月, 2020 4 次提交

KVM: s390: protvirt: introduce and enable KVM_CAP_S390_PROTECTED · 13da9ae1

由 Christian Borntraeger 提交于 2月 18, 2020

Now that everything is in place, we can announce the feature.
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>

13da9ae1

KVM: s390: protvirt: UV calls in support of diag308 0, 1 · e0d2773d

由 Janosch Frank 提交于 5月 09, 2019

diag 308 subcode 0 and 1 require several KVM and Ultravisor interactions.
Specific to these "soft" reboots are

* The "unshare all" UVC
* The "prepare for reset" UVC
Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>
Acked-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

e0d2773d

KVM: S390: protvirt: Introduce instruction data area bounce buffer · 19e12277

由 Janosch Frank 提交于 4月 02, 2019

Now that we can't access guest memory anymore, we have a dedicated
satellite block that's a bounce buffer for instruction data.

We re-use the memop interface to copy the instruction data to / from
userspace. This lets us re-use a lot of QEMU code which used that
interface to make logical guest memory accesses which are not possible
anymore in protected mode anyway.
Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

19e12277

KVM: s390: protvirt: Add initial vm and cpu lifecycle handling · 29b40f10

由 Janosch Frank 提交于 9月 30, 2019

This contains 3 main changes:
1. changes in SIE control block handling for secure guests
2. helper functions for create/destroy/unpack secure guests
3. KVM_S390_PV_COMMAND ioctl to allow userspace dealing with secure
machines
Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

29b40f10

31 1月, 2020 1 次提交

KVM: s390: Add new reset vcpu API · 7de3f142

由 Janosch Frank 提交于 1月 31, 2020

The architecture states that we need to reset local IRQs for all CPU
resets. Because the old reset interface did not support the normal CPU
reset we never did that on a normal reset.

Let's implement an interface for the missing normal and clear resets
and reset all local IRQs, registers and control structures as stated
in the architecture.

Userspace might already reset the registers via the vcpu run struct,
but as we need the interface for the interrupt clearing part anyway,
we implement the resets fully and don't rely on userspace to reset the
rest.
Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20200131100205.74720-4-frankja@linux.ibm.comSigned-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

7de3f142

28 11月, 2019 1 次提交

KVM: PPC: Book3S HV: Support reset of secure guest · 22945688

由 Bharata B Rao 提交于 11月 25, 2019

Add support for reset of secure guest via a new ioctl KVM_PPC_SVM_OFF.
This ioctl will be issued by QEMU during reset and includes the
the following steps:

- Release all device pages of the secure guest.
- Ask UV to terminate the guest via UV_SVM_TERMINATE ucall
- Unpin the VPA pages so that they can be migrated back to secure
  side when guest becomes secure again. This is required because
  pinned pages can't be migrated.
- Reinit the partition scoped page tables

After these steps, guest is ready to issue UV_ESM call once again
to switch to secure mode.
Signed-off-by: NBharata B Rao <bharata@linux.ibm.com>
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
	[Implementation of uv_svm_terminate() and its call from
	guest shutdown path]
Signed-off-by: NRam Pai <linuxram@us.ibm.com>
	[Unpinning of VPA pages]
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

22945688

22 10月, 2019 1 次提交

KVM: arm64: Provide VCPU attributes for stolen time · 58772e9a

由 Steven Price 提交于 10月 21, 2019

Allow user space to inform the KVM host where in the physical memory
map the paravirtualized time structures should be located.

User space can set an attribute on the VCPU providing the IPA base
address of the stolen time structure for that VCPU. This must be
repeated for every VCPU in the VM.

The address is given in terms of the physical address visible to
the guest and must be 64 byte aligned. The guest will discover the
address via a hypercall.
Signed-off-by: NSteven Price <steven.price@arm.com>
Signed-off-by: NMarc Zyngier <maz@kernel.org>

58772e9a

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功