Commits · 13da9ae1cdbf1ec4ea36b7612e606681c27cca13 · openeuler / Kernel

28 Feb, 2020 · 1 commit

KVM: s390: protvirt: disallow one_reg · 68cf7b1f

Committed by Janosch Frank on Jun 14, 2019

A lot of the registers are controlled by the Ultravisor and never
visible to KVM. Some fields in the sie control block are overlayed, like
gbea. As no known userspace uses the ONE_REG interface on s390 if sync
regs are available, no functionality is lost if it is disabled for
protected guests.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

13 Feb, 2020 · 1 commit

docs: kvm: Convert api.txt to ReST format · 106ee47d

Committed by Mauro Carvalho Chehab on Feb 10, 2020

Convert the api.txt document to ReST format while trying to keep
its format as close as possible to the authors' intent, and
avoid adding unneeded markup.

- Use document title and chapter markups;
- Convert tables;
- Add markups for literal blocks;
- use :field: for field descriptions;
- Add blank lines and adjust indentation
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

31 Jan, 2020 · 1 commit

KVM: s390: Add new reset vcpu API · 7de3f142

Committed by Janosch Frank on Jan 31, 2020

The architecture states that we need to reset local IRQs for all CPU
resets. Because the old reset interface did not support the normal CPU
reset we never did that on a normal reset.

Let's implement an interface for the missing normal and clear resets
and reset all local IRQs, registers and control structures as stated
in the architecture.

Userspace might already reset the registers via the vcpu run struct,
but as we need the interface for the interrupt clearing part anyway,
we implement the resets fully and don't rely on userspace to reset the
rest.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20200131100205.74720-4-frankja@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

23 Jan, 2020 · 1 commit

arm64: KVM: Add UAPI notes for swapped registers · 290a6bb0

Committed by Andrew Jones on Jan 20, 2020

Two UAPI system register IDs do not derive their values from the
ARM system register encodings. This is because their values were
accidentally swapped. As the IDs are API, they cannot be changed.
Add WARNING notes to point them out.
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Andrew Jones <drjones@redhat.com>
[maz: turned XXX into WARNING]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200120130825.28838-1-drjones@redhat.com

30 Nov, 2019 · 1 commit

Documentation: kvm: Fix mention to number of ioctls classes · 80b10aa9

Committed by Wainer dos Santos Moschetta on Nov 29, 2019

api.txt says that KVM ioctls belong to three classes, but in
reality there are four. Fix this, and stop counting the categories
to avoid such outdated information in the future.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

28 Nov, 2019 · 1 commit

KVM: PPC: Book3S HV: Support reset of secure guest · 22945688

Committed by Bharata B Rao on Nov 25, 2019

Add support for reset of secure guest via a new ioctl KVM_PPC_SVM_OFF.
This ioctl will be issued by QEMU during reset and includes the
following steps:

- Release all device pages of the secure guest.
- Ask UV to terminate the guest via UV_SVM_TERMINATE ucall
- Unpin the VPA pages so that they can be migrated back to secure
  side when guest becomes secure again. This is required because
  pinned pages can't be migrated.
- Reinit the partition scoped page tables

After these steps, guest is ready to issue UV_ESM call once again
to switch to secure mode.
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
	[Implementation of uv_svm_terminate() and its call from
	guest shutdown path]
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
	[Unpinning of VPA pages]
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

22 Oct, 2019 · 2 commits

KVM: arm/arm64: Allow user injection of external data aborts · da345174

Committed by Christoffer Dall on Oct 11, 2019

In some scenarios, such as buggy guest or incorrect configuration of the
VMM and firmware description data, userspace will detect a memory access
to a portion of the IPA, which is not mapped to any MMIO region.

For this purpose, the appropriate action is to inject an external abort
to the guest.  The kernel already has functionality to inject an
external abort, but we need to wire up a signal from user space that
lets user space tell the kernel to do this.

It turns out, we already have the set event functionality which we can
perfectly reuse for this.
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: Allow reporting non-ISV data aborts to userspace · c726200d

Committed by Christoffer Dall on Oct 11, 2019

For a long time, if a guest accessed memory outside of a memslot using
any of the load/store instructions in the architecture which doesn't
supply decoding information in the ESR_EL2 (the ISV bit is not set), the
kernel would print the following message and terminate the VM as a
result of returning -ENOSYS to userspace:

  load/store instruction decoding not implemented

The reason behind this message is that KVM assumes that all accesses
outside a memslot are MMIO accesses which should be handled by
userspace, and we originally expected to eventually implement some sort
of decoding of load/store instructions where the ISV bit was not set.

However, it turns out that many of the instructions which don't provide
decoding information on abort are not safe to use for MMIO accesses, and
the remaining few that would potentially make sense to use on MMIO
accesses, such as those with register writeback, are not used in
practice.  It also turns out that fetching an instruction from guest
memory can be a pretty horrible affair, involving stopping all CPUs on
SMP systems, handling multiple corner cases of address translation in
software, and more.  It doesn't appear likely that we'll ever implement
this in the kernel.

What is much more common is that a user has misconfigured his/her guest
and is actually not accessing an MMIO region, but just hitting some
random hole in the IPA space.  In this scenario, the error message above
is almost misleading and has led to a great deal of confusion over the
years.

It is, nevertheless, ABI to userspace, and we therefore need to
introduce a new capability that userspace explicitly enables to change
behavior.

This patch introduces KVM_CAP_ARM_NISV_TO_USER (NISV meaning Non-ISV)
which does exactly that, and introduces a new exit reason to report the
event to userspace.  User space can then emulate an exception to the
guest, restart the guest, suspend the guest, or take any other
appropriate action as per the policy of the running system.
Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

21 Oct, 2019 · 1 commit

KVM: PPC: Report single stepping capability · 1a9167a2

Committed by Fabiano Rosas on Jun 19, 2019

When calling the KVM_SET_GUEST_DEBUG ioctl, userspace might request
the next instruction to be single stepped via the
KVM_GUESTDBG_SINGLESTEP control bit of the kvm_guest_debug structure.

This patch adds the KVM_CAP_PPC_GUEST_DEBUG_SSTEP capability in order
to inform userspace about the state of single stepping support.

We currently don't have support for guest single stepping implemented
in Book3S HV so the capability is only present for Book3S PR and
BookE.
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

24 Sep, 2019 · 1 commit

KVM/Hyper-V: Add new KVM capability KVM_CAP_HYPERV_DIRECT_TLBFLUSH · 344c6c80

Committed by Tianyu Lan on Aug 22, 2019

The Hyper-V direct TLB flush function should only be enabled for
guests that use Hyper-V hypercalls exclusively. A userspace
hypervisor (e.g., QEMU) can disable KVM identification in CPUID
and expose only Hyper-V identification to ensure this precondition
holds. Add a new KVM capability, KVM_CAP_HYPERV_DIRECT_TLBFLUSH,
that lets userspace enable the Hyper-V direct TLB flush function;
it is disabled by default in KVM.
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

11 Sep, 2019 · 1 commit

doc: kvm: Fix return description of KVM_SET_MSRS · b274a290

Committed by Xiaoyao Li on Sep 05, 2019

Userspace can use the KVM_SET_MSRS ioctl to update a set of guest MSRs.
The ioctl sets the specified MSRs one by one. If it fails to set an MSR,
e.g. because reserved bits are set or the MSR is not supported or
emulated by KVM, it stops processing the MSR list and returns the number
of MSRs that have been set successfully.
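
The return convention described above can be illustrated with a small, self-contained sketch. It does not call the real ioctl: `struct msr_entry`, `set_one_msr()` and `set_msrs()` are hypothetical stand-ins, and the rule that indices of 0x1000 and above fail is invented purely so that a failure can be shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one list entry; not the real UAPI struct. */
struct msr_entry {
	uint32_t index;
	uint64_t data;
};

/*
 * Hypothetical per-MSR setter: returns 0 on success, non-zero on failure.
 * For illustration only, indices of 0x1000 and above are rejected.
 */
static int set_one_msr(const struct msr_entry *e)
{
	return e->index < 0x1000 ? 0 : -1;
}

/*
 * Walk the list in order, stop at the first entry that cannot be set,
 * and return how many entries were set before that point.
 */
static int set_msrs(const struct msr_entry *entries, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (set_one_msr(&entries[i]) != 0)
			break;
	}
	return (int)i;
}
```

A caller following this convention would compare the return value with the number of entries it passed in; a smaller value identifies the first entry that was rejected.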
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

09 Sep, 2019 · 1 commit

KVM: arm/arm64: vgic: Allow more than 256 vcpus for KVM_IRQ_LINE · 92f35b75

Committed by Marc Zyngier on Aug 18, 2019

While parts of the VGIC support a large number of vcpus (we
bravely allow up to 512), other parts are more limited.

One of these limits is visible in the KVM_IRQ_LINE ioctl, which
only allows 256 vcpus to be signalled when using the CPU or PPI
types. Unfortunately, we've cornered ourselves badly by allocating
all the bits in the irq field.

Since the irq_type subfield (8 bit wide) is currently only taking
the values 0, 1 and 2 (and we have been careful not to allow anything
else), let's reduce this field to only 4 bits, and allocate the
remaining 4 bits to a vcpu2_index, which acts as a multiplier:

  vcpu_id = 256 * vcpu2_index + vcpu_index

With that, and a new capability (KVM_CAP_ARM_IRQ_LINE_LAYOUT_2)
allowing this to be discovered, it becomes possible to inject
PPIs to up to 4096 vcpus. But please just don't.
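
A minimal sketch of the resulting encoding follows. The commit message only states the field widths, so the exact bit positions used here (vcpu2_index in bits 31..28, irq_type in 27..24, vcpu_index in 23..16, the interrupt number in 15..0) are an assumption, and the helper names `encode_irq()` and `decode_vcpu_id()` are illustrative rather than part of any kernel API.

```c
#include <stdint.h>

/* Assumed field positions; names are illustrative, not UAPI macros. */
#define IRQ_VCPU2_SHIFT	28	/* 4 bits: vcpu2_index (the multiplier) */
#define IRQ_TYPE_SHIFT	24	/* 4 bits: irq_type (values 0, 1, 2)    */
#define IRQ_VCPU_SHIFT	16	/* 8 bits: vcpu_index                   */

/* Pack a vcpu id of up to 4095 into the split index fields. */
static uint32_t encode_irq(uint32_t irq_type, uint32_t vcpu_id, uint32_t irq_id)
{
	uint32_t vcpu_index  = vcpu_id % 256;
	uint32_t vcpu2_index = vcpu_id / 256;

	return ((vcpu2_index & 0xf) << IRQ_VCPU2_SHIFT) |
	       ((irq_type & 0xf) << IRQ_TYPE_SHIFT) |
	       ((vcpu_index & 0xff) << IRQ_VCPU_SHIFT) |
	       (irq_id & 0xffff);
}

/* Recover vcpu_id = 256 * vcpu2_index + vcpu_index from the packed value. */
static uint32_t decode_vcpu_id(uint32_t irq)
{
	uint32_t vcpu2_index = (irq >> IRQ_VCPU2_SHIFT) & 0xf;
	uint32_t vcpu_index  = (irq >> IRQ_VCPU_SHIFT) & 0xff;

	return 256 * vcpu2_index + vcpu_index;
}
```

With 4 bits of multiplier and 8 bits of index, the largest representable id is 256 * 15 + 255 = 4095, which is where the 4096-vcpu figure comes from.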

Whilst we're there, add a clarification about the use of KVM_IRQ_LINE
on arm, which is not completely conditioned by KVM_CAP_IRQCHIP.
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

29 Aug, 2019 · 1 commit

KVM: s390: improve documentation for S390_MEM_OP · b4d863c3

Committed by Cornelia Huck on Aug 29, 2019

Explicitly specify the valid ranges for size and ar, and reword
buf requirements a bit.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lkml.kernel.org/r/20190829124746.28665-1-cohuck@redhat.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

24 Jul, 2019 · 1 commit

Documentation: move Documentation/virtual to Documentation/virt · 2f5947df

Committed by Christoph Hellwig on Jul 24, 2019

Renaming docs seems to be en vogue at the moment, so fix one of the
grossly misnamed directories.  We usually never use "virtual" as
a shortcut for virtualization in the kernel, but always virt,
as seen in the virt/ top-level directory.  Fix up the documentation
to match that.

Fixes: ed16648e ("Move kvm, uml, and lguest subdirectories under a common "virtual" directory, I.E:")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

20 Jul, 2019 · 1 commit

KVM: x86: Add fixed counters to PMU filter · 30cd8604

Committed by Eric Hankland on Jul 18, 2019

Updates KVM_CAP_PMU_EVENT_FILTER so it can also whitelist or blacklist
fixed counters.
Signed-off-by: Eric Hankland <ehankland@google.com>
[No need to check padding fields for zero. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

11 Jul, 2019 · 1 commit

KVM: x86: PMU Event Filter · 66bb8a06

Committed by Eric Hankland on Jul 10, 2019

Some events can provide a guest with information about other guests or the
host (e.g. L3 cache stats); providing the capability to restrict access
to a "safe" set of events would limit the potential for the PMU to be used
in any side channel attacks. This change introduces a new VM ioctl that
sets an event filter. If the guest attempts to program a counter for
any blacklisted or non-whitelisted event, the kernel counter won't be
created, so any RDPMC/RDMSR will show 0 instances of that event.
Signed-off-by: Eric Hankland <ehankland@google.com>
[Lots of changes. All remaining bugs are probably mine. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

19 Jun, 2019 · 1 commit

KVM: x86: Modify struct kvm_nested_state to have explicit fields for data · 6ca00dfa

Committed by Liran Alon on Jun 16, 2019

Improve the KVM_{GET,SET}_NESTED_STATE structs by detailing the format
of VMX nested state data in a struct.

In order to avoid changing the ioctl values of
KVM_{GET,SET}_NESTED_STATE, there is a need to preserve
sizeof(struct kvm_nested_state). This is done by defining the data
struct as "data.vmx[0]". It was the most elegant way I found to
preserve struct size while still keeping struct readable and easy to
maintain. It does have a misfortunate side-effect that now it has to be
accessed as "data.vmx[0]" rather than just "data.vmx".

Because we are already modifying these structs, I also modified the
following:
* Define the "format" field values as macros.
* Rename vmcs_pa to vmcs12_pa for better readability.
Signed-off-by: Liran Alon <liran.alon@oracle.com>
[Remove SVM stubs, add KVM_STATE_NESTED_VMX_VMCS12_SIZE. - Paolo]
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

18 Jun, 2019 · 1 commit

KVM: fix typo in documentation · 76e3bcdb

Committed by Dennis Restle on Apr 30, 2019

The documentation mentions a non-existing capability KVM_CAP_USER_MEM.
The right name is KVM_CAP_USER_MEMORY.
Signed-off-by: Dennis Restle <derestle@htwg-konstanz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

15 Jun, 2019 · 1 commit

docs: arm64: convert docs to ReST and rename to .rst · b693d0b3

Committed by Mauro Carvalho Chehab on Jun 12, 2019

The documentation is in a format that is very close to ReST format.

The conversion is actually:
  - add blank lines in order to identify paragraphs;
  - fixing tables markups;
  - adding some lists markups;
  - marking literal blocks;
  - adjust some title markups.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

05 Jun, 2019 · 2 commits

KVM: X86: Provide a capability to disable cstate msr read intercepts · b5170063

Committed by Wanpeng Li on May 21, 2019

Allow the guest to read the CORE cstate when host CPU power management
capabilities are exposed to the guest. The PKG cstate remains restricted so
that a guest cannot obtain whole-package information in a multi-tenant scenario.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: Documentation: Add disable pause exits to KVM_CAP_X86_DISABLE_EXITS · 8ffdaa7f

Committed by Wanpeng Li on May 21, 2019

Commit b31c114b (KVM: X86: Provide a capability to disable PAUSE intercepts)
forgot to add the KVM_X86_DISABLE_EXITS_PAUSE into api doc. This patch adds
it.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

08 May, 2019 · 1 commit

KVM: Introduce KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 · d7547c55

Committed by Peter Xu on May 08, 2019

The previous KVM_CAP_MANUAL_DIRTY_LOG_PROTECT has some problem which
blocks the correct usage from userspace.  Obsolete the old one and
introduce a new capability bit for it.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

01 May, 2019 · 3 commits

KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size · 65c4189d

Committed by Paolo Bonzini on Apr 17, 2019

If a memory slot's size is not a multiple of 64 pages (256K), then
the KVM_CLEAR_DIRTY_LOG API is unusable: clearing the final 64 pages
either requires the requested page range to go beyond memslot->npages,
or requires log->num_pages to be unaligned, and kvm_clear_dirty_log_protect
requires log->num_pages to be both in range and aligned.

To allow this case, allow log->num_pages not to be a multiple of 64 if
it ends exactly on the last page of the slot.
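
The relaxed rule can be sketched as a standalone range check. This is an illustration of the condition described above, not the kernel's actual code: `clear_range_ok()` and its parameters are hypothetical names, and the requirement that the first page be 64-aligned is an assumption that the commit message does not spell out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if [first_page, first_page + num_pages) is an acceptable
 * range to clear within a slot of slot_npages pages.
 */
static bool clear_range_ok(uint64_t first_page, uint64_t num_pages,
			   uint64_t slot_npages)
{
	/* Assumption: the start of the range must be a multiple of 64. */
	if (first_page & 63)
		return false;

	/* The range must lie entirely inside the slot. */
	if (first_page > slot_npages ||
	    num_pages > slot_npages - first_page)
		return false;

	/*
	 * An unaligned length is tolerated only when the range ends
	 * exactly on the last page of the slot.
	 */
	if ((num_pages & 63) && num_pages < slot_npages - first_page)
		return false;

	return true;
}
```

For a slot of 100 pages, the final partial chunk can now be cleared with first_page = 64 and num_pages = 36, whereas num_pages = 64 would run past the slot and num_pages = 36 at first_page = 0 would still be rejected.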
Reported-by: Peter Xu <peterx@redhat.com>
Fixes: 98938aa8 ("KVM: validate userspace input in kvm_clear_dirty_log_protect()", 2019-01-02)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Revert "KVM: doc: Document the life cycle of a VM and its resources" · 3a1e5e4a

Committed by Radim Krčmář on Apr 29, 2019

This reverts commit 919f6cd8.

The patch was applied twice.
The first commit is eca6be56.
Reported-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size · 76d58e0f

Committed by Paolo Bonzini on Apr 17, 2019

30 Apr, 2019 · 2 commits

KVM: PPC: Book3S HV: XIVE: Add get/set accessors for the VP XIVE state · e4945b9d

Committed by Cédric Le Goater on Apr 18, 2019

The state of the thread interrupt management registers needs to be
collected for migration. These registers are cached under the
'xive_saved_state.w01' field of the VCPU when the VCPU context is
pulled from the HW thread. An OPAL call retrieves the backup of the
IPB register in the underlying XIVE NVT structure and merges it in the
KVM state.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

KVM: PPC: Book3S HV: XIVE: Introduce a new capability KVM_CAP_PPC_IRQ_XIVE · eacc56bb

Committed by Cédric Le Goater on Apr 18, 2019

The user interface exposes a new capability KVM_CAP_PPC_IRQ_XIVE to
let QEMU connect the vCPU presenters to the XIVE KVM device if
required. The capability is not advertised for now as the full support
for the XIVE native exploitation mode is not yet available. When this
is the case, the capability will be advertised on PowerNV Hypervisors
only. Nested guests (pseries KVM Hypervisor) are not supported.

Internally, the interface to the new KVM device is protected with a
new interrupt mode: KVMPPC_IRQ_XIVE.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

29 Apr, 2019 · 1 commit

Documentation: kvm: fix dirty log ioctl arch lists · dbcdae18

Committed by Andrew Jones on Apr 29, 2019

KVM_GET_DIRTY_LOG is implemented by all architectures, not just x86,
and KVM_CAP_MANUAL_DIRTY_LOG_PROTECT is additionally implemented by
arm, arm64, and mips.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

24 Apr, 2019 · 2 commits

KVM: arm64: Add capability to advertise ptrauth for guest · a243c16d

Committed by Amit Daniel Kachhap on Apr 23, 2019

This patch advertises the capability of two CPU features called address
pointer authentication and generic pointer authentication. These
capabilities depend upon system support for pointer authentication and
VHE mode.

The current arm64 KVM partially implements pointer authentication, and
support for address and generic authentication is tied together. However,
separate ABI requirements for both of them are added so that any future
isolated implementation will not require any ABI changes.
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

KVM: arm64: Add userspace flag to enable pointer authentication · a22fa321

Committed by Amit Daniel Kachhap on Apr 23, 2019

Now that the building blocks of pointer authentication are present, let's
add userspace flags KVM_ARM_VCPU_PTRAUTH_ADDRESS and
KVM_ARM_VCPU_PTRAUTH_GENERIC. These flags will enable pointer
authentication for the KVM guest on a per-vcpu basis through the ioctl
KVM_ARM_VCPU_INIT.

These features allow the KVM guest to handle pointer authentication
instructions; if the flags are not set, the instructions are treated
as undefined.

The necessary documentation is added to reflect these changes.
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

19 Apr, 2019 · 4 commits

KVM: arm64: Clarify access behaviour for out-of-range SVE register slice IDs · 43b8e1f0

Committed by Dave Martin on Apr 12, 2019

The existing documentation for which SVE register slice IDs are
considered out-of-range, and what happens when userspace tries to
access them, is cryptic.

This patch rewords the text with the aim of making it a bit easier to
understand.

No functional change.
Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

KVM: Clarify KVM_{SET,GET}_ONE_REG error code documentation · fe365b4e

Committed by Dave Martin on Apr 12, 2019

The current error code documentation for KVM_GET_ONE_REG and
KVM_SET_ONE_REG could be read as implying that all architectures
implement these error codes, or that KVM guarantees which error
code is returned in a particular situation.

Because this is not really the case, this patch waters down the
documentation explicitly to remove such guarantees.

EPERM is marked as arm64-specific, since for now arm64 really is
the only architecture that yields this error code for the
finalization-required case.  Keeping this as a distinct error code
is useful however for debugging due to the statefulness of the API
in this instance.

No functional change.
Suggested-by: Andrew Jones <drjones@redhat.com>
Fixes: 395f562f ("KVM: Document errors for KVM_GET_ONE_REG and KVM_SET_ONE_REG")
Fixes: 50036ad0 ("KVM: arm64/sve: Document KVM API extensions for SVE")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

KVM: Clarify capability requirements for KVM_ARM_VCPU_FINALIZE · 9df2d660

Committed by Dave Martin on Apr 12, 2019

Userspace is only supposed to use KVM_ARM_VCPU_FINALIZE when there
is some vcpu feature that can actually be finalized.

This means that documenting KVM_ARM_VCPU_FINALIZE as available or
not depending on the capabilities present is not helpful.

This patch amends the documentation to describe availability in
terms of which capability is required for each finalizable feature
instead.

In any case, userspace sees the same error (EINVAL) regardless of
whether the given feature is not present or KVM_ARM_VCPU_FINALIZE
is not implemented at all.

No functional change.
Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

KVM: arm64/sve: Simplify KVM_REG_ARM64_SVE_VLS array sizing · 4bd774e5

Committed by Dave Martin on Apr 11, 2019

A complicated DIV_ROUND_UP() expression is currently written out
explicitly in multiple places in order to specify the size of the
bitmap exchanged with userspace to represent the value of the
KVM_REG_ARM64_SVE_VLS pseudo-register.

Userspace currently has no direct way to work this out either: for
documentation purposes, the size is just quoted as 8 u64s.

To make this more intuitive, this patch replaces these with a
single define, which is also exported to userspace as
KVM_ARM64_SVE_VLS_WORDS.

Since the number of words in a bitmap is just the index of the last
word used + 1, this patch expresses the bound that way instead.
This should make it clearer what is being expressed.

For userspace convenience, the minimum and maximum possible vector
lengths relevant to the KVM ABI are exposed to UAPI as
KVM_ARM64_SVE_VQ_MIN, KVM_ARM64_SVE_VQ_MAX.  Since the only direct
use for these at present is manipulation of KVM_REG_ARM64_SVE_VLS,
no corresponding _VL_ macros are defined.  They could be added
later if a need arises.
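
The "index of the last word used + 1" sizing can be sketched as follows. The macros below are local stand-ins that mirror the names in the commit message without the KVM_ARM64_ prefix, and the values 1 and 512 for the minimum and maximum VQ are assumptions chosen to be consistent with the "8 u64s" quoted above; `vls_set_vq()` and `vls_has_vq()` are hypothetical helpers.

```c
#include <stdint.h>

/* Local stand-ins; the values are assumptions, not taken from the UAPI. */
#define SVE_VQ_MIN	1
#define SVE_VQ_MAX	512

/* Number of 64-bit words = index of the last word used + 1. */
#define SVE_VLS_WORDS	((SVE_VQ_MAX - SVE_VQ_MIN) / 64 + 1)

/* Mark vector quantum vq (SVE_VQ_MIN..SVE_VQ_MAX) as supported. */
static void vls_set_vq(uint64_t vls[SVE_VLS_WORDS], unsigned int vq)
{
	unsigned int bit = vq - SVE_VQ_MIN;

	vls[bit / 64] |= (uint64_t)1 << (bit % 64);
}

/* Test whether vector quantum vq is marked as supported. */
static int vls_has_vq(const uint64_t vls[SVE_VLS_WORDS], unsigned int vq)
{
	unsigned int bit = vq - SVE_VQ_MIN;

	return (int)((vls[bit / 64] >> (bit % 64)) & 1);
}
```

Under these assumed bounds, the highest bit index is 511, which lands in word 7, so the bitmap needs 7 + 1 = 8 words.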

Since use of DIV_ROUND_UP() was the only reason for including
<linux/kernel.h> in guest.c, this patch also removes that #include.
Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

16 Apr, 2019 · 1 commit

kvm: move KVM_CAP_NR_MEMSLOTS to common code · c110ae57

Committed by Paolo Bonzini on Mar 28, 2019

All architectures except MIPS were defining it in the same way,
and memory slots are handled entirely by common code so there
is no point in keeping the definition per-architecture.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

29 Mar, 2019 · 5 commits

KVM: arm64/sve: Document KVM API extensions for SVE · 50036ad0

Committed by Dave Martin on Sep 28, 2018

This patch adds sections to the KVM API documentation describing
the extensions for supporting the Scalable Vector Extension (SVE)
in guests.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

KVM: Document errors for KVM_GET_ONE_REG and KVM_SET_ONE_REG · 395f562f

Committed by Dave Martin on Jan 15, 2019

KVM_GET_ONE_REG and KVM_SET_ONE_REG return some error codes that
are not documented (but hopefully not surprising either).  To give
an indication of what these may mean, this patch adds brief
documentation.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

KVM: Documentation: Document arm64 core registers in detail · fd3bc912

Committed by Dave Martin on Sep 28, 2018

Since the sizes of individual members of the core arm64
registers vary, the list of register encodings that make sense is
not a simple linear sequence.

To clarify which encodings to use, this patch adds a brief list
to the documentation.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Documentation: kvm: clarify KVM_SET_USER_MEMORY_REGION · e2788c4a

Committed by Paolo Bonzini on Mar 28, 2019

The documentation does not mention how to delete a slot; add the
information.
Reported-by: Nathaniel McCallum <npmccallum@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: doc: Document the life cycle of a VM and its resources · 919f6cd8

Committed by Sean Christopherson on Feb 15, 2019

The series to add memcg accounting to KVM allocations[1] states:

There are many KVM kernel memory allocations which are tied to the
life of the VM process and should be charged to the VM process's
cgroup.

While it is correct to account KVM kernel allocations to the cgroup of
the process that created the VM, it's technically incorrect to state
that the KVM kernel memory allocations are tied to the life of the VM
process. This is because the VM itself, i.e. struct kvm, is not tied to
the life of the process which created it, rather it is tied to the life
of its associated file descriptor. In other words, kvm_destroy_vm() is
not invoked until fput() decrements its associated file's refcount to
zero. A simple example is to fork() in Qemu and have the child sleep
indefinitely; kvm_destroy_vm() isn't called until Qemu closes its file
descriptor *and* the rogue child is killed.

The allocations are guaranteed to be *accounted* to the process which
created the VM, but only because KVM's per-{VM,vCPU} ioctls reject the
ioctl() with -EIO if kvm->mm != current->mm. I.e. the child can keep
the VM "alive" but can't do anything useful with its reference.

Note that because 'struct kvm' also holds a reference to the mm_struct
of its owner, the above behavior also applies to userspace allocations.

Given that mucking with a VM's file descriptor can lead to subtle and
undesirable behavior, e.g. memcg charges persisting after a VM is shut
down, explicitly document a VM's lifecycle and its impact on the VM's
resources.

Alternatively, KVM could aggressively free resources when the creating
process exits, e.g. via mmu_notifier->release(). However, mmu_notifier
isn't guaranteed to be available, and freeing resources when the creator
exits is likely to be error prone and fragile as KVM would need to
ensure that it only freed resources that are truly out of reach. In
practice, the existing behavior shouldn't be problematic as a properly
configured system will prevent a child process from being moved out of
the appropriate cgroup hierarchy, i.e. prevent hiding the process from
the OOM killer, and will prevent an unprivileged user from being able to
hold a reference to struct kvm via another method, e.g. debugfs.

[1] https://patchwork.kernel.org/patch/10806707/
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
