- 22 10月, 2020 2 次提交
-
-
由 Paolo Bonzini 提交于
Allowing userspace to intercept reads to x2APIC MSRs when APICV is fully enabled for the guest simply can't work. But more in general, the LAPIC could be set to in-kernel after the MSR filter is setup and allowing accesses by userspace would be very confusing. We could in principle allow userspace to intercept reads and writes to TPR, and writes to EOI and SELF_IPI, but while that could be made it work, it would still be silly. Cc: Alexander Graf <graf@amazon.com> Cc: Aaron Lewis <aaronlewis@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Rework the resetting of the MSR bitmap for x2APIC MSRs to ignore userspace filtering. Allowing userspace to intercept reads to x2APIC MSRs when APICV is fully enabled for the guest simply can't work; the LAPIC and thus virtual APIC is in-kernel and cannot be directly accessed by userspace. To keep things simple we will in fact forbid intercepting x2APIC MSRs altogether, independent of the default_allow setting. Cc: Alexander Graf <graf@amazon.com> Cc: Aaron Lewis <aaronlewis@google.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20201005195532.8674-3-sean.j.christopherson@intel.com> [Modified to operate even if APICv is disabled, adjust documentation. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 28 9月, 2020 3 次提交
-
-
由 Alexander Graf 提交于
It's not desireable to have all MSRs always handled by KVM kernel space. Some MSRs would be useful to handle in user space to either emulate behavior (like uCode updates) or differentiate whether they are valid based on the CPU model. To allow user space to specify which MSRs it wants to see handled by KVM, this patch introduces a new ioctl to push filter rules with bitmaps into KVM. Based on these bitmaps, KVM can then decide whether to reject MSR access. With the addition of KVM_CAP_X86_USER_SPACE_MSR it can also deflect the denied MSR events to user space to operate on. If no filter is populated, MSR handling stays identical to before. Signed-off-by: NAlexander Graf <graf@amazon.com> Message-Id: <20200925143422.21718-8-graf@amazon.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Alexander Graf 提交于
MSRs are weird. Some of them are normal control registers, such as EFER. Some however are registers that really are model specific, not very interesting to virtualization workloads, and not performance critical. Others again are really just windows into package configuration. Out of these MSRs, only the first category is necessary to implement in kernel space. Rarely accessed MSRs, MSRs that should be fine tunes against certain CPU models and MSRs that contain information on the package level are much better suited for user space to process. However, over time we have accumulated a lot of MSRs that are not the first category, but still handled by in-kernel KVM code. This patch adds a generic interface to handle WRMSR and RDMSR from user space. With this, any future MSR that is part of the latter categories can be handled in user space. Furthermore, it allows us to replace the existing "ignore_msrs" logic with something that applies per-VM rather than on the full system. That way you can run productive VMs in parallel to experimental ones where you don't care about proper MSR handling. Signed-off-by: NAlexander Graf <graf@amazon.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <20200925143422.21718-3-graf@amazon.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
We forgot to update KVM_GET_SUPPORTED_HV_CPUID's documentation in api.rst when SynDBG leaves were added. While on it, fix 'KVM_GET_SUPPORTED_CPUID' copy-paste error. Fixes: f97f5a56 ("x86/kvm/hyper-v: Add support for synthetic debugger interface") Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200924145757.1035782-2-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 14 9月, 2020 1 次提交
-
-
由 Collin Walling 提交于
Documentation for the s390 DIAGNOSE 0x318 instruction handling. Signed-off-by: NCollin Walling <walling@linux.ibm.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/kvm/20200625150724.10021-2-walling@linux.ibm.com/ Message-Id: <20200625150724.10021-2-walling@linux.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 21 8月, 2020 2 次提交
-
-
由 Andrew Jones 提交于
arm64 requires a vcpu fd (KVM_HAS_DEVICE_ATTR vcpu ioctl) to probe support for steal-time. However this is unnecessary, as only a KVM fd is required, and it complicates userspace (userspace may prefer delaying vcpu creation until after feature probing). Introduce a cap that can be checked instead. While x86 can already probe steal-time support with a kvm fd (KVM_GET_SUPPORTED_CPUID), we add the cap there too for consistency. Signed-off-by: NAndrew Jones <drjones@redhat.com> Signed-off-by: NMarc Zyngier <maz@kernel.org> Reviewed-by: NSteven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20200804170604.42662-7-drjones@redhat.com
-
由 Andrew Jones 提交于
In preparation for documenting a new capability let's fix up the formatting of the current ones. Signed-off-by: NAndrew Jones <drjones@redhat.com> Signed-off-by: NMarc Zyngier <maz@kernel.org> Reviewed-by: NSteven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20200804170604.42662-6-drjones@redhat.com
-
- 22 7月, 2020 1 次提交
-
-
由 Athira Rajeev 提交于
Power ISA v3.1 has added new performance monitoring unit (PMU) special purpose registers (SPRs). They are: Monitor Mode Control Register 3 (MMCR3) Sampled Instruction Event Register A (SIER2) Sampled Instruction Event Register B (SIER3) Add support to save/restore these new SPRs while entering/exiting guest. Also include changes to support KVM_REG_PPC_MMCR3/SIER2/SIER3. Add new SPRs to KVM API documentation. Signed-off-by: NAthira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1594996707-3727-6-git-send-email-atrajeev@linux.vnet.ibm.com
-
- 13 7月, 2020 1 次提交
-
-
由 Randy Dunlap 提交于
Drop the duplicated word "struct". Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20200707180414.10467-21-rdunlap@infradead.orgSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 10 7月, 2020 1 次提交
-
-
由 Paolo Bonzini 提交于
Commit 850448f3 ("KVM: nVMX: Fix VMX preemption timer migration", 2020-06-01) accidentally broke nVMX live migration from older version by changing the userspace ABI. Restore it and, while at it, ensure that vmx->nested.has_preemption_timer_deadline is always initialized according to the KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE flag. Cc: Makarand Sonare <makarandsonare@google.com> Fixes: 850448f3 ("KVM: nVMX: Fix VMX preemption timer migration") Reviewed-by: NJim Mattson <jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 7月, 2020 2 次提交
-
-
由 Xiaoyao Li 提交于
Current implementation keeps userspace input of CPUID configuration and cpuid->nent even if kvm_update_cpuid() fails. Reset vcpu->arch.cpuid_nent to 0 for the case of failure as a simple fix. Besides, update the doc to explicitly state that if IOCTL SET_CPUID* fail KVM gives no gurantee that previous valid CPUID configuration is kept. Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com> Message-Id: <20200708065054.19713-2-xiaoyao.li@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jim Mattson 提交于
More often than not, a failed VM-entry in an x86 production environment is induced by a defective CPU. To help identify the bad hardware, include the id of the last logical CPU to run a vCPU in the information provided to userspace on a KVM exit for failed VM-entry or for KVM internal errors not associated with emulation. The presence of this additional information is indicated by a new capability, KVM_CAP_LAST_CPU. Signed-off-by: NJim Mattson <jmattson@google.com> Reviewed-by: NOliver Upton <oupton@google.com> Reviewed-by: NPeter Shier <pshier@google.com> Message-Id: <20200603235623.245638-5-jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 06 7月, 2020 1 次提交
-
-
由 Randy Dunlap 提交于
Drop multiple doubled words. Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20200703212906.30655-3-rdunlap@infradead.orgSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 01 6月, 2020 3 次提交
-
-
由 Jon Doron 提交于
Add support for Hyper-V synthetic debugger (syndbg) interface. The syndbg interface is using MSRs to emulate a way to send/recv packets data. The debug transport dll (kdvm/kdnet) will identify if Hyper-V is enabled and if it supports the synthetic debugger interface it will attempt to use it, instead of trying to initialize a network adapter. Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NJon Doron <arilou@gmail.com> Message-Id: <20200529134543.1127440-4-arilou@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Peter Shier 提交于
Add new field to hold preemption timer expiration deadline appended to struct kvm_vmx_nested_state_hdr. This is to prevent the first VM-Enter after migration from incorrectly restarting the timer with the full timer value instead of partially decayed timer value. KVM_SET_NESTED_STATE restarts timer using migrated state regardless of whether L1 sets VM_EXIT_SAVE_VMX_PREEMPTION_TIMER. Fixes: cf8b84f4 ("kvm: nVMX: Prepare for checkpointing L2 state") Signed-off-by: NPeter Shier <pshier@google.com> Signed-off-by: NMakarand Sonare <makarandsonare@google.com> Message-Id: <20200526215107.205814-2-makarandsonare@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jon Doron 提交于
The problem the patch is trying to address is the fact that 'struct kvm_hyperv_exit' has different layout on when compiling in 32 and 64 bit modes. In 64-bit mode the default alignment boundary is 64 bits thus forcing extra gaps after 'type' and 'msr' but in 32-bit mode the boundary is at 32 bits thus no extra gaps. This is an issue as even when the kernel is 64 bit, the userspace using the interface can be both 32 and 64 bit but the same 32 bit userspace has to work with 32 bit kernel. The issue is fixed by forcing the 64 bit layout, this leads to ABI change for 32 bit builds and while we are obviously breaking '32 bit userspace with 32 bit kernel' case, we're fixing the '32 bit userspace with 64 bit kernel' one. As the interface has no (known) users and 32 bit KVM is rather baroque nowadays, this seems like a reasonable decision. Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NJon Doron <arilou@gmail.com> Message-Id: <20200424113746.3473563-2-arilou@gmail.com> Reviewed-by: NRoman Kagan <rvkagan@yandex-team.ru> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 16 5月, 2020 1 次提交
-
-
由 Keqian Zhu 提交于
There is already support of enabling dirty log gradually in small chunks for x86 in commit 3c9bd400 ("KVM: x86: enable dirty log gradually in small chunks"). This adds support for arm64. x86 still writes protect all huge pages when DIRTY_LOG_INITIALLY_ALL_SET is enabled. However, for arm64, both huge pages and normal pages can be write protected gradually by userspace. Under the Huawei Kunpeng 920 2.6GHz platform, I did some tests on 128G Linux VMs with different page size. The memory pressure is 127G in each case. The time taken of memory_global_dirty_log_start in QEMU is listed below: Page Size Before After Optimization 4K 650ms 1.8ms 2M 4ms 1.8ms 1G 2ms 1.8ms Besides the time reduction, the biggest improvement is that we will minimize the performance side effect (because of dissolving huge pages and marking memslots dirty) on guest after enabling dirty log. Signed-off-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200413122023.52583-1-zhukeqian1@huawei.com
-
- 05 5月, 2020 1 次提交
-
-
由 Joshua Abraham 提交于
The KVM_KVMCLOCK_CTRL ioctl signals to supported KVM guests that the hypervisor has paused it. Update the documentation to reflect that the guest is notified by this API. Signed-off-by: NJoshua Abraham <sinisterpatrician@gmail.com> Link: https://lore.kernel.org/r/20200501223624.GA25826@josh-ZenBookSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 25 4月, 2020 1 次提交
-
-
由 David Matlack 提交于
KVM_CAP_HALT_POLL is a per-VM capability that lets userspace control the halt-polling time, allowing halt-polling to be tuned or disabled on particular VMs. With dynamic halt-polling, a VM's VCPUs can poll from anywhere from [0, halt_poll_ns] on each halt. KVM_CAP_HALT_POLL sets the upper limit on the poll time. Signed-off-by: NDavid Matlack <dmatlack@google.com> Signed-off-by: NJon Cargille <jcargill@google.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <20200417221446.108733-1-jcargill@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 3月, 2020 1 次提交
-
-
由 Christian Borntraeger 提交于
If a signal is pending we might return -ENOMEM instead of -EINTR. We should propagate the proper error during KSM unsharing. unmerge_ksm_pages returns -ERESTARTSYS on signal_pending. This gets translated by entry.S to -EINTR. It is important to get this error code so that userspace can retry. To make this clearer we also add -EINTR to the documentation of the PV_ENABLE call, which calls unmerge_ksm_pages. Fixes: 3ac8e380 ("s390/mm: disable KSM for storage key enabled pages") Reviewed-by: NJanosch Frank <frankja@linux.vnet.ibm.com> Reported-by: NMarc Hartmayer <mhartmay@linux.ibm.com> Tested-by: NMarc Hartmayer <mhartmay@linux.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 26 3月, 2020 1 次提交
-
-
由 Paul Mackerras 提交于
At present, on Power systems with Protected Execution Facility hardware and an ultravisor, a KVM guest can transition to being a secure guest at will. Userspace (QEMU) has no way of knowing whether a host system is capable of running secure guests. This will present a problem in future when the ultravisor is capable of migrating secure guests from one host to another, because virtualization management software will have no way to ensure that secure guests only run in domains where all of the hosts can support secure guests. This adds a VM capability which has two functions: (a) userspace can query it to find out whether the host can support secure guests, and (b) userspace can enable it for a guest, which allows that guest to become a secure guest. If userspace does not enable it, KVM will return an error when the ultravisor does the hypercall that indicates that the guest is starting to transition to a secure guest. The ultravisor will then abort the transition and the guest will terminate. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Reviewed-by: NRam Pai <linuxram@us.ibm.com>
-
- 17 3月, 2020 2 次提交
-
-
由 Sean Christopherson 提交于
Remove the code for handling stateful CPUID 0x2 and mark the associated flags as deprecated. WARN if host CPUID 0x2.0.AL > 1, i.e. if by some miracle a host with stateful CPUID 0x2 is encountered. No known CPU exists that supports hardware accelerated virtualization _and_ a stateful CPUID 0x2. Barring an extremely contrived nested virtualization scenario, stateful CPUID support is dead code. Suggested-by: NVitaly Kuznetsov <vkuznets@redhat.com> Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jay Zhou 提交于
It could take kvm->mmu_lock for an extended period of time when enabling dirty log for the first time. The main cost is to clear all the D-bits of last level SPTEs. This situation can benefit from manual dirty log protect as well, which can reduce the mmu_lock time taken. The sequence is like this: 1. Initialize all the bits of the dirty bitmap to 1 when enabling dirty log for the first time 2. Only write protect the huge pages 3. KVM_GET_DIRTY_LOG returns the dirty bitmap info 4. KVM_CLEAR_DIRTY_LOG will clear D-bit for each of the leaf level SPTEs gradually in small chunks Under the Intel(R) Xeon(R) Gold 6152 CPU @ 2.10GHz environment, I did some tests with a 128G windows VM and counted the time taken of memory_global_dirty_log_start, here is the numbers: VM Size Before After optimization 128G 460ms 10ms Signed-off-by: NJay Zhou <jianjay.zhou@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 28 2月, 2020 2 次提交
-
-
由 Janosch Frank 提交于
Add documentation for KVM_CAP_S390_PROTECTED capability and the KVM_S390_PV_COMMAND ioctl. Signed-off-by: NJanosch Frank <frankja@linux.ibm.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Janosch Frank 提交于
A lot of the registers are controlled by the Ultravisor and never visible to KVM. Some fields in the sie control block are overlayed, like gbea. As no known userspace uses the ONE_REG interface on s390 if sync regs are available, no functionality is lost if it is disabled for protected guests. Signed-off-by: NJanosch Frank <frankja@linux.ibm.com> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 25 2月, 2020 1 次提交
-
-
由 Christian Borntraeger 提交于
We also need to rstify the new ioctls that we added in parallel to the rstification of the kvm docs. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 13 2月, 2020 1 次提交
-
-
由 Mauro Carvalho Chehab 提交于
convert api.txt document to ReST format while trying to keep its format as close as possible with the authors intent, and avoid adding uneeded markups. - Use document title and chapter markups; - Convert tables; - Add markups for literal blocks; - use :field: for field descriptions; - Add blank lines and adjust indentation Signed-off-by: NMauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 31 1月, 2020 1 次提交
-
-
由 Janosch Frank 提交于
The architecture states that we need to reset local IRQs for all CPU resets. Because the old reset interface did not support the normal CPU reset we never did that on a normal reset. Let's implement an interface for the missing normal and clear resets and reset all local IRQs, registers and control structures as stated in the architecture. Userspace might already reset the registers via the vcpu run struct, but as we need the interface for the interrupt clearing part anyway, we implement the resets fully and don't rely on userspace to reset the rest. Signed-off-by: NJanosch Frank <frankja@linux.ibm.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NThomas Huth <thuth@redhat.com> Link: https://lore.kernel.org/r/20200131100205.74720-4-frankja@linux.ibm.comSigned-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 23 1月, 2020 1 次提交
-
-
由 Andrew Jones 提交于
Two UAPI system register IDs do not derive their values from the ARM system register encodings. This is because their values were accidentally swapped. As the IDs are API, they cannot be changed. Add WARNING notes to point them out. Suggested-by: NMarc Zyngier <maz@kernel.org> Signed-off-by: NAndrew Jones <drjones@redhat.com> [maz: turned XXX into WARNING] Signed-off-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200120130825.28838-1-drjones@redhat.com
-
- 30 11月, 2019 1 次提交
-
-
In api.txt it is said that KVM ioctls belong to three classes but in reality it is four. Fixed this, but do not count categories anymore to avoid such as outdated information in the future. Signed-off-by: NWainer dos Santos Moschetta <wainersm@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 28 11月, 2019 1 次提交
-
-
由 Bharata B Rao 提交于
Add support for reset of secure guest via a new ioctl KVM_PPC_SVM_OFF. This ioctl will be issued by QEMU during reset and includes the the following steps: - Release all device pages of the secure guest. - Ask UV to terminate the guest via UV_SVM_TERMINATE ucall - Unpin the VPA pages so that they can be migrated back to secure side when guest becomes secure again. This is required because pinned pages can't be migrated. - Reinit the partition scoped page tables After these steps, guest is ready to issue UV_ESM call once again to switch to secure mode. Signed-off-by: NBharata B Rao <bharata@linux.ibm.com> Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> [Implementation of uv_svm_terminate() and its call from guest shutdown path] Signed-off-by: NRam Pai <linuxram@us.ibm.com> [Unpinning of VPA pages] Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 22 10月, 2019 2 次提交
-
-
由 Christoffer Dall 提交于
In some scenarios, such as buggy guest or incorrect configuration of the VMM and firmware description data, userspace will detect a memory access to a portion of the IPA, which is not mapped to any MMIO region. For this purpose, the appropriate action is to inject an external abort to the guest. The kernel already has functionality to inject an external abort, but we need to wire up a signal from user space that lets user space tell the kernel to do this. It turns out, we already have the set event functionality which we can perfectly reuse for this. Signed-off-by: NChristoffer Dall <christoffer.dall@arm.com> Signed-off-by: NMarc Zyngier <maz@kernel.org>
-
由 Christoffer Dall 提交于
For a long time, if a guest accessed memory outside of a memslot using any of the load/store instructions in the architecture which doesn't supply decoding information in the ESR_EL2 (the ISV bit is not set), the kernel would print the following message and terminate the VM as a result of returning -ENOSYS to userspace: load/store instruction decoding not implemented The reason behind this message is that KVM assumes that all accesses outside a memslot is an MMIO access which should be handled by userspace, and we originally expected to eventually implement some sort of decoding of load/store instructions where the ISV bit was not set. However, it turns out that many of the instructions which don't provide decoding information on abort are not safe to use for MMIO accesses, and the remaining few that would potentially make sense to use on MMIO accesses, such as those with register writeback, are not used in practice. It also turns out that fetching an instruction from guest memory can be a pretty horrible affair, involving stopping all CPUs on SMP systems, handling multiple corner cases of address translation in software, and more. It doesn't appear likely that we'll ever implement this in the kernel. What is much more common is that a user has misconfigured his/her guest and is actually not accessing an MMIO region, but just hitting some random hole in the IPA space. In this scenario, the error message above is almost misleading and has led to a great deal of confusion over the years. It is, nevertheless, ABI to userspace, and we therefore need to introduce a new capability that userspace explicitly enables to change behavior. This patch introduces KVM_CAP_ARM_NISV_TO_USER (NISV meaning Non-ISV) which does exactly that, and introduces a new exit reason to report the event to userspace. User space can then emulate an exception to the guest, restart the guest, suspend the guest, or take any other appropriate action as per the policy of the running system. Reported-by: NHeinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: NChristoffer Dall <christoffer.dall@arm.com> Reviewed-by: NAlexander Graf <graf@amazon.com> Signed-off-by: NMarc Zyngier <maz@kernel.org>
-
- 21 10月, 2019 1 次提交
-
-
由 Fabiano Rosas 提交于
When calling the KVM_SET_GUEST_DEBUG ioctl, userspace might request the next instruction to be single stepped via the KVM_GUESTDBG_SINGLESTEP control bit of the kvm_guest_debug structure. This patch adds the KVM_CAP_PPC_GUEST_DEBUG_SSTEP capability in order to inform userspace about the state of single stepping support. We currently don't have support for guest single stepping implemented in Book3S HV so the capability is only present for Book3S PR and BookE. Signed-off-by: NFabiano Rosas <farosas@linux.ibm.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 24 9月, 2019 1 次提交
-
-
由 Tianyu Lan 提交于
Hyper-V direct tlb flush function should be enabled for guest that only uses Hyper-V hypercall. User space hypervisor(e.g, Qemu) can disable KVM identification in CPUID and just exposes Hyper-V identification to make sure the precondition. Add new KVM capability KVM_CAP_ HYPERV_DIRECT_TLBFLUSH for user space to enable Hyper-V direct tlb function and this function is default to be disabled in KVM. Signed-off-by: NTianyu Lan <Tianyu.Lan@microsoft.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 11 9月, 2019 1 次提交
-
-
由 Xiaoyao Li 提交于
Userspace can use ioctl KVM_SET_MSRS to update a set of MSRs of guest. This ioctl set specified MSRs one by one. If it fails to set an MSR, e.g., due to setting reserved bits, the MSR is not supported/emulated by KVM, etc..., it stops processing the MSR list and returns the number of MSRs have been set successfully. Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 9月, 2019 1 次提交
-
-
由 Marc Zyngier 提交于
While parts of the VGIC support a large number of vcpus (we bravely allow up to 512), other parts are more limited. One of these limits is visible in the KVM_IRQ_LINE ioctl, which only allows 256 vcpus to be signalled when using the CPU or PPI types. Unfortunately, we've cornered ourselves badly by allocating all the bits in the irq field. Since the irq_type subfield (8 bit wide) is currently only taking the values 0, 1 and 2 (and we have been careful not to allow anything else), let's reduce this field to only 4 bits, and allocate the remaining 4 bits to a vcpu2_index, which acts as a multiplier: vcpu_id = 256 * vcpu2_index + vcpu_index With that, and a new capability (KVM_CAP_ARM_IRQ_LINE_LAYOUT_2) allowing this to be discovered, it becomes possible to inject PPIs to up to 4096 vcpus. But please just don't. Whilst we're there, add a clarification about the use of KVM_IRQ_LINE on arm, which is not completely conditionned by KVM_CAP_IRQCHIP. Reported-by: NZenghui Yu <yuzenghui@huawei.com> Reviewed-by: NEric Auger <eric.auger@redhat.com> Reviewed-by: NZenghui Yu <yuzenghui@huawei.com> Signed-off-by: NMarc Zyngier <maz@kernel.org>
-
- 29 8月, 2019 1 次提交
-
-
由 Cornelia Huck 提交于
Explicitly specify the valid ranges for size and ar, and reword buf requirements a bit. Signed-off-by: NCornelia Huck <cohuck@redhat.com> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Link: https://lkml.kernel.org/r/20190829124746.28665-1-cohuck@redhat.comSigned-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 24 7月, 2019 1 次提交
-
-
由 Christoph Hellwig 提交于
Renaming docs seems to be en vogue at the moment, so fix on of the grossly misnamed directories. We usually never use "virtual" as a shortcut for virtualization in the kernel, but always virt, as seen in the virt/ top-level directory. Fix up the documentation to match that. Fixes: ed16648e ("Move kvm, uml, and lguest subdirectories under a common "virtual" directory, I.E:") Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-