- 13 4月, 2017 10 次提交
-
-
由 David Hildenbrand 提交于
I don't see any reason any more for this lock, seemed to be used to protect removal of kvm->arch.vpic / kvm->arch.vioapic when already partially inititalized, now access is properly protected using kvm->arch.irqchip_mode and this shouldn't be necessary anymore. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 David Hildenbrand 提交于
Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 David Hildenbrand 提交于
When handling KVM_GET_IRQCHIP, we already check irqchip_kernel(), which implies a fully inititalized ioapic. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 David Hildenbrand 提交于
Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 David Hildenbrand 提交于
Let's just use kvm->arch.vioapic directly. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 David Hildenbrand 提交于
It seemed like a nice idea to encapsulate access to kvm->arch.vpic. But as the usage is already mixed, internal locks are taken outside of i8259.c and grepping for "vpic" only is much easier, let's just get rid of pic_irqchip(). Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 David Hildenbrand 提交于
KVM_IRQCHIP_KERNEL implies a fully inititalized ioapic, while kvm->arch.vioapic might temporarily be set but invalidated again if e.g. setting of default routing fails when setting KVM_CREATE_IRQCHIP. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 David Hildenbrand 提交于
Let's avoid checking against kvm->arch.vpic. We have kvm->arch.irqchip_mode for that now. KVM_IRQCHIP_KERNEL implies a fully inititalized pic, while kvm->arch.vpic might temporarily be set but invalidated again if e.g. kvm_ioapic_init() fails when setting KVM_CREATE_IRQCHIP. Although current users seem to be fine, this avoids future bugs. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 David Hildenbrand 提交于
Let's replace the checks for pic_in_kernel() and ioapic_in_kernel() by checks against irqchip_mode. Also make sure that creation of any route is only possible if we have an lapic in kernel (irqchip_in_kernel()) or if we are currently inititalizing the irqchip. This is necessary to switch pic_in_kernel() and ioapic_in_kernel() to irqchip_mode, too. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 David Hildenbrand 提交于
Let's add a new mode and set it while we create the irqchip via KVM_CREATE_IRQCHIP and KVM_CAP_SPLIT_IRQCHIP. This mode will be used later to test if adding routes (in kvm_set_routing_entry()) is already allowed. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 07 4月, 2017 16 次提交
-
-
由 Jim Mattson 提交于
The userspace exception injection API and code path are entirely unprepared for exceptions that might cause a VM-exit from L2 to L1, so the best course of action may be to simply disallow this for now. 1. The API provides no mechanism for userspace to specify the new DR6 bits for a #DB exception or the new CR2 value for a #PF exception. Presumably, userspace is expected to modify these registers directly with KVM_SET_SREGS before the next KVM_RUN ioctl. However, in the event that L1 intercepts the exception, these registers should not be changed. Instead, the new values should be provided in the exit_qualification field of vmcs12 (Intel SDM vol 3, section 27.1). 2. In the case of a userspace-injected #DB, inject_pending_event() clears DR7.GD before calling vmx_queue_exception(). However, in the event that L1 intercepts the exception, this is too early, because DR7.GD should not be modified by a #DB that causes a VM-exit directly (Intel SDM vol 3, section 27.1). 3. If the injected exception is a #PF, nested_vmx_check_exception() doesn't properly check whether or not L1 is interested in the associated error code (using the #PF error code mask and match fields from vmcs12). It may either return 0 when it should call nested_vmx_vmexit() or vice versa. 4. nested_vmx_check_exception() assumes that it is dealing with a hardware-generated exception intercept from L2, with some of the relevant details (the VM-exit interruption-information and the exit qualification) live in vmcs02. For userspace-injected exceptions, this is not the case. 5. prepare_vmcs12() assumes that when its exit_intr_info argument specifies valid information with a valid error code that it can VMREAD the VM-exit interruption error code from vmcs02. For userspace-injected exceptions, this is not the case. Signed-off-by: NJim Mattson <jmattson@google.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 David Hildenbrand 提交于
If we already entered/are about to enter SMM, don't allow switching to INIT/SIPI_RECEIVED, otherwise the next call to kvm_apic_accept_events() will report a warning. Same applies if we are already in MP state INIT_RECEIVED and SMM is requested to be turned on. Refuse to set the VCPU events in this case. Fixes: cd7764fe ("KVM: x86: latch INITs while in system management mode") Cc: stable@vger.kernel.org # 4.2+ Reported-by: NDmitry Vyukov <dvyukov@google.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Paolo Bonzini 提交于
Its value has never changed; we might as well make it part of the ABI instead of using the return value of KVM_CHECK_EXTENSION(KVM_CAP_COALESCED_MMIO). Because PPC does not always make MMIO available, the code has to be made dependent on CONFIG_KVM_MMIO rather than KVM_COALESCED_MMIO_PAGE_OFFSET. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Paolo Bonzini 提交于
Remove code from architecture files that can be moved to virt/kvm, since there is already common code for coalesced MMIO. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> [Removed a pointless 'break' after 'return'.] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Paolo Bonzini 提交于
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NJim Mattson <jmattson@google.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Paolo Bonzini 提交于
In order to simplify adding exit reasons in the future, the array of exit reason names is now also sorted by exit reason code. Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Paolo Bonzini 提交于
Now use bit 6 of EPTP to optionally enable A/D bits for EPTP. Another thing to change is that, when EPT accessed and dirty bits are not in use, VMX treats accesses to guest paging structures as data reads. When they are in use (bit 6 of EPTP is set), they are treated as writes and the corresponding EPT dirty bit is set. The MMU didn't know this detail, so this patch adds it. We also have to fix up the exit qualification. It may be wrong because KVM sets bit 6 but the guest might not. L1 emulates EPT A/D bits using write permissions, so in principle it may be possible for EPT A/D bits to be used by L1 even though not available in hardware. The problem is that guest page-table walks will be treated as reads rather than writes, so they would not cause an EPT violation. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> [Fixed typo in walk_addr_generic() comment and changed bit clear + conditional-set pattern in handle_ept_violation() to conditional-clear] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Paolo Bonzini 提交于
This prepares the MMU paging code for EPT accessed and dirty bits, which can be enabled optionally at runtime. Code that updates the accessed and dirty bits will need a pointer to the struct kvm_mmu. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Paolo Bonzini 提交于
handle_ept_violation is checking for "guest-linear-address invalid" + "not a paging-structure walk". However, _all_ EPT violations without a valid guest linear address are paging structure walks, because those EPT violations happen when loading the guest PDPTEs. Therefore, the check can never be true, and even if it were, KVM doesn't care about the guest linear address; it only uses the guest *physical* address VMCS field. So, remove the check altogether. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NJim Mattson <jmattson@google.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Paolo Bonzini 提交于
Large pages at the PDPE level can be emulated by the MMU, so the bit can be set unconditionally in the EPT capabilities MSR. The same is true of 2MB EPT pages, though all Intel processors with EPT in practice support those. Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Legacy device assignment has been deprecated since 4.2 (released 1.5 years ago). VFIO is better and everyone should have switched to it. If they haven't, this should convince them. :) Reviewed-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Virtual NMIs are only missing in Prescott and Yonah chips. Both are obsolete for virtualization usage---Yonah is 32-bit only even---so drop vNMI emulation. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Borislav Petkov 提交于
MCG_CAP[63:9] bits are reserved on AMD. However, on an AMD guest, this MSR returns 0x100010a. More specifically, bit 24 is set, which is simply wrong. That bit is MCG_SER_P and is present only on Intel. Thus, clean up the reserved bits in order not to confuse guests. Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Joerg Roedel <joro@8bytes.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 David Hildenbrand 提交于
Let's combine it in a single function vmx_switch_vmcs(). Signed-off-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NJim Mattson <jmattson@google.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Jim Mattson 提交于
According to the Intel SDM, volume 3, section 28.3.2: Creating and Using Cached Translation Information, "No linear mappings are used while EPT is in use." INVEPT will invalidate both the guest-physical mappings and the combined mappings in the TLBs and paging-structure caches, so an INVVPID is superfluous. Signed-off-by: NJim Mattson <jmattson@google.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Yi Min Zhao 提交于
Introduce a cap to enable AIS facility bit, and add documentation for this capability. Signed-off-by: NYi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: NFei Li <sherrylf@linux.vnet.ibm.com> Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 06 4月, 2017 3 次提交
-
-
由 Yi Min Zhao 提交于
Inject adapter interrupts on a specified adapter which allows to retrieve the adapter flags, e.g. if the adapter is subject to AIS facility or not. And add documentation for this interface. For adapters subject to AIS, handle the airq injection suppression for a given ISC according to the interruption mode: - before injection, if NO-Interruptions Mode, just return 0 and suppress, otherwise, allow the injection. - after injection, if SINGLE-Interruption Mode, change it to NO-Interruptions Mode to suppress the following interrupts. Besides, add tracepoint for suppressed airq and AIS mode transitions. Signed-off-by: NYi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: NFei Li <sherrylf@linux.vnet.ibm.com> Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Fei Li 提交于
Provide an interface for userspace to modify AIS (adapter-interruption-suppression) mode state, and add documentation for the interface. Allowed target modes are ALL-Interruptions mode and SINGLE-Interruption mode. We introduce the 'simm' and 'nimm' fields in kvm_s390_float_interrupt to store interruption modes for each ISC. Each bit in 'simm' and 'nimm' targets to one ISC, and collaboratively indicate three modes: ALL-Interruptions, SINGLE-Interruption and NO-Interruptions. This interface can initiate most transitions between the states; transition from SINGLE-Interruption to NO-Interruptions via adapter interrupt injection will be introduced in a following patch. The meaningful combinations are as follows: interruption mode | simm bit | nimm bit ------------------|----------|---------- ALL | 0 | 0 SINGLE | 1 | 0 NO | 1 | 1 Besides, add tracepoint to track AIS mode transitions. Co-Authored-By: NYi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: NYi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: NFei Li <sherrylf@linux.vnet.ibm.com> Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Fei Li 提交于
In order to properly implement adapter-interruption suppression, we need a way for userspace to specify which adapters are subject to suppression. Let's convert the existing (and unused) 'pad' field into a 'flags' field and define a flag value for suppressible adapters. Besides, add documentation for the interface. Signed-off-by: NFei Li <sherrylf@linux.vnet.ibm.com> Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 28 3月, 2017 11 次提交
-
-
由 James Hogan 提交于
Properly implement emulation of the TLBR instruction for Trap & Emulate. This instruction reads the TLB entry pointed at by the CP0_Index register into the other TLB registers, which may have the side effect of changing the current ASID. Therefore abstract the CP0_EntryHi and ASID changing code into a common function in the process. A comment indicated that Linux doesn't use TLBR, which is true during normal use, however dumping of the TLB does use it (for example with the relatively recent 'x' magic sysrq key), as does a wired TLB entries test case in my KVM tests. Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Acked-by: NRalf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
-
由 James Hogan 提交于
Octeon III has VZ ASE support, so allow KVM to be enabled on Octeon CPUs as it should now be functional. Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
-
由 James Hogan 提交于
Octeon III implements a read-only guest CP0_PRid register, so add cases to the KVM register access API for Octeon to ensure the correct value is read and writes are ignored. Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
-
由 James Hogan 提交于
Octeon III doesn't implement the optional GuestCtl0.CG bit to allow guest mode to execute virtual address based CACHE instructions, so implement emulation of a few important ones specifically for Octeon III in response to a GPSI exception. Currently the main reason to perform these operations is for icache synchronisation, so they are implemented as a simple icache flush with local_flush_icache_range(). Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
-
由 James Hogan 提交于
Set up hardware virtualisation on Octeon III cores, configuring guest interrupt routing and carving out half of the root TLB for guest use, restoring it back again afterwards. We need to be careful to inhibit TLB shutdown machine check exceptions while invalidating guest TLB entries, since TLB invalidation is not available so guest entries must be invalidated by setting them to unique unmapped addresses, which could conflict with mappings set by the guest or root if recently repartitioned. Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
-
由 James Hogan 提交于
Octeon CPUs don't report the correct dcache line size in CP0_Config1.DL, so encode the correct value for the guest CP0_Config1.DL based on cpu_dcache_line_size(). Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
-
由 James Hogan 提交于
When TLB entries are invalidated in the presence of a virtually tagged icache, such as that found on Octeon CPUs, flush the icache so that we don't get a reserved instruction exception even though the TLB mapping is removed. Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
-
由 James Hogan 提交于
Cache management is implemented separately for Cavium Octeon CPUs, so r4k_blast_[id]cache aren't available. Instead for Octeon perform a local icache flush using local_flush_icache_range(), and for other platforms which don't use c-r4k.c use __flush_cache_all() / flush_icache_all(). Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
-
由 James Hogan 提交于
Add accessors for some VZ related Cavium Octeon III specific COP0 registers, along with field definitions. These will mostly be used by KVM to set up interrupt routing and partition the TLB between root and guest. Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Acked-by: NRalf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
-
由 James Hogan 提交于
Create a trace event for guest mode changes, and enable VZ's GuestCtl0.MC bit after the trace event is enabled to trap all guest mode changes. The MC bit causes Guest Hardware Field Change (GHFC) exceptions whenever a guest mode change occurs (such as an exception entry or return from exception), so we need to handle this exception now. The MC bit is only enabled when restoring register state, so enabling the trace event won't take immediate effect. Tracing guest mode changes can be particularly handy when trying to work out what a guest OS gets up to before something goes wrong, especially if the problem occurs as a result of some previous guest userland exception which would otherwise be invisible in the trace. Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
-
由 James Hogan 提交于
Transfer timer state to the VZ guest context (CP0_GTOffset & guest CP0_Count) when entering guest mode, enabling direct guest access to it, and transfer back to soft timer when saving guest register state. This usually allows guest code to directly read CP0_Count (via MFC0 and RDHWR) and read/write CP0_Compare, without trapping to the hypervisor for it to emulate the guest timer. Writing to CP0_Count or CP0_Cause.DC is much less common and still triggers a hypervisor GPSI exception, in which case the timer state is transferred back to an hrtimer before emulating the write. We are careful to prevent small amounts of drift from building up due to undeterministic time intervals between reading of the ktime and reading of CP0_Count. Some drift is expected however, since the system clocksource may use a different timer to the local CP0_Count timer used by VZ. This is permitted to prevent guest CP0_Count from appearing to go backwards. Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
-