- 08 8月, 2017 2 次提交
-
-
由 Longpeng(Mike) 提交于
This implements the kvm_arch_vcpu_in_kernel() for ARM, and adjusts the calls to kvm_vcpu_on_spin(). Signed-off-by: NLongpeng(Mike) <longpeng2@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Longpeng(Mike) 提交于
If a vcpu exits due to request a user mode spinlock, then the spinlock-holder may be preempted in user mode or kernel mode. (Note that not all architectures trap spin loops in user mode, only AMD x86 and ARM/ARM64 currently do). But if a vcpu exits in kernel mode, then the holder must be preempted in kernel mode, so we should choose a vcpu in kernel mode as a more likely candidate for the lock holder. This introduces kvm_arch_vcpu_in_kernel() to decide whether the vcpu is in kernel-mode when it's preempted. kvm_vcpu_on_spin's new argument says the same of the spinning VCPU. Signed-off-by: NLongpeng(Mike) <longpeng2@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 03 8月, 2017 2 次提交
-
-
由 Christoffer Dall 提交于
There is a small chance that the compiler could generate separate loads for the dist->propbaser which could be modified from another CPU. As we want to make sure we atomically update the entire value, and don't race with other updates, guarantee that the cmpxchg operation compares against the original value. Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Paolo Bonzini 提交于
During teardown, accesses to memslots and buses are using rcu_dereference_protected with an always-true condition because these accesses are done outside the usual mutexes. This is because the last reference is gone and there cannot be any concurrent modifications, but rcu_dereference_protected is ugly and unobvious. Instead, check the refcount in kvm_get_bus and __kvm_memslots. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 27 7月, 2017 1 次提交
-
-
由 Claudio Imbrenda 提交于
Simplify and improve the code so that the PID is always available in the uevent even when debugfs is not available. This adds a userspace_pid field to struct kvm, as per Radim's suggestion, so that the PID can be retrieved on destruction too. Acked-by: NJanosch Frank <frankja@linux.vnet.ibm.com> Fixes: 286de8f6 ("KVM: trigger uevents when creating or destroying a VM") Signed-off-by: NClaudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 25 7月, 2017 3 次提交
-
-
由 Suzuki K Poulose 提交于
The mmu_notifier_release() callback of KVM triggers cleaning up the stage2 page table on kvm-arm. However there could be other notifier callbacks in parallel with the mmu_notifier_release(), which could cause the call backs ending up in an empty stage2 page table. Make sure we check it for all the notifier callbacks. Cc: stable@vger.kernel.org Fixes: commit 293f2936 ("kvm-arm: Unmap shadow pagetables properly") Reported-by: NAlex Graf <agraf@suse.de> Reviewed-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Andrew Jones 提交于
kvm_pmu_overflow_set() is called from perf's interrupt handler, making the call of kvm_vgic_inject_irq() from it introduced with "KVM: arm/arm64: PMU: remove request-less vcpu kick" a really bad idea, as it's quite easy to try and retake a lock that the interrupted context is already holding. The fix is to use a vcpu kick, leaving the interrupt injection to kvm_pmu_sync_hwstate(), like it was doing before the refactoring. We don't just revert, though, because before the kick was request-less, leaving the vcpu exposed to the request-less vcpu kick race, and also because the kick was used unnecessarily from register access handlers. Reviewed-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NAndrew Jones <drjones@redhat.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Shanker Donthineni 提交于
Commit 0e4e82f1 ("KVM: arm64: vgic-its: Enable ITS emulation as a virtual MSI controller") tried to advertise KVM_CAP_MSI_DEVID, but the code logic was not updating the dist->msis_require_devid field correctly. If hypervisor tool creates the ITS device after VGIC initialization then we don't advertise KVM_CAP_MSI_DEVID capability. Update the field msis_require_devid to true inside vgic_its_create() to fix the issue. Fixes: 0e4e82f1 ("vgic-its: Enable ITS emulation as a virtual MSI controller") Signed-off-by: NShanker Donthineni <shankerd@codeaurora.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 13 7月, 2017 1 次提交
-
-
由 Claudio Imbrenda 提交于
This patch adds a few lines to the KVM common code to fire a KOBJ_CHANGE uevent whenever a KVM VM is created or destroyed. The event carries five environment variables: CREATED indicates how many times a new VM has been created. It is useful for example to trigger specific actions when the first VM is started COUNT indicates how many VMs are currently active. This can be used for logging or monitoring purposes PID has the pid of the KVM process that has been started or stopped. This can be used to perform process-specific tuning. STATS_PATH contains the path in debugfs to the directory with all the runtime statistics for this VM. This is useful for performance monitoring and profiling. EVENT described the type of event, its value can be either "create" or "destroy" Specific udev rules can be then set up in userspace to deal with the creation or destruction of VMs as needed. Signed-off-by: NClaudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 10 7月, 2017 1 次提交
-
-
由 Paolo Bonzini 提交于
The uniprocessor version of smp_call_function_many does not evaluate all of its argument, and the compiler emits a warning about "wait" being unused. This breaks the build on architectures for which "-Werror" is enabled by default. Work around it by moving the invocation of smp_call_function_many to its own inline function. Reported-by: NPaul Mackerras <paulus@ozlabs.org> Cc: stable@vger.kernel.org Fixes: 7a97cec2Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 07 7月, 2017 4 次提交
-
-
由 Christian Borntraeger 提交于
we access the memslots array via srcu. Mark it as such and use the right access functions also for the freeing of memory slots. Found by sparse: ./include/linux/kvm_host.h:565:16: error: incompatible types in comparison expression (different address spaces) Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Christian Borntraeger 提交于
mark kvm->busses as rcu protected and use the correct access function everywhere. found by sparse virt/kvm/kvm_main.c:3490:15: error: incompatible types in comparison expression (different address spaces) virt/kvm/kvm_main.c:3509:15: error: incompatible types in comparison expression (different address spaces) virt/kvm/kvm_main.c:3561:15: error: incompatible types in comparison expression (different address spaces) virt/kvm/kvm_main.c:3644:15: error: incompatible types in comparison expression (different address spaces) Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Christian Borntraeger 提交于
irq routing is rcu protected. Use the proper access functions. Found by sparse virt/kvm/irqchip.c:233:13: warning: incorrect type in assignment (different address spaces) virt/kvm/irqchip.c:233:13: expected struct kvm_irq_routing_table *old virt/kvm/irqchip.c:233:13: got struct kvm_irq_routing_table [noderef] <asn:4>*irq_routing Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Christian Borntraeger 提交于
We do use rcu to protect the pid pointer. Mark it as such and adopt all code to use the proper access methods. This was detected by sparse. "virt/kvm/kvm_main.c:2248:15: error: incompatible types in comparison expression (different address spaces)" Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 29 6月, 2017 2 次提交
-
-
由 Alex Williamson 提交于
At the point where the kvm-vfio pseudo device wants to release its vfio group reference, we can't always acquire a new reference to make that happen. The group can be in a state where we wouldn't allow a new reference to be added. This new helper function allows a caller to match a file to a group to facilitate this. Given a file and group, report if they match. Thus the caller needs to already have a group reference to match to the file. This allows the deletion of a group without acquiring a new reference. Signed-off-by: NAlex Williamson <alex.williamson@redhat.com> Reviewed-by: NEric Auger <eric.auger@redhat.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Tested-by: NEric Auger <eric.auger@redhat.com> Cc: stable@vger.kernel.org
-
由 Alex Williamson 提交于
Unset-KVM and decrement-assignment only when we find the group in our list. Otherwise we can get out of sync if the user triggers this for groups that aren't currently on our list. Signed-off-by: NAlex Williamson <alex.williamson@redhat.com> Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NEric Auger <eric.auger@redhat.com> Tested-by: NEric Auger <eric.auger@redhat.com> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org
-
- 27 6月, 2017 2 次提交
-
-
由 Paolo Bonzini 提交于
The call to kvm_put_kvm was removed from error handling in commit 506cfba9 ("KVM: don't use anon_inode_getfd() before possible failures"), but it is _not_ a memory leak. Reuse Al's explanation to avoid that someone else makes the same mistake. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Roman Storozhenko 提交于
Replaces "S_IRUGO | S_IWUSR" with 0644. The reason is that symbolic permissions considered harmful: https://lwn.net/Articles/696229/Signed-off-by: NRoman Storozhenko <romeusmeister@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 23 6月, 2017 2 次提交
-
-
由 Tyler Baicar 提交于
Currently external aborts are unsupported by the guest abort handling. Add handling for SEAs so that the host kernel reports SEAs which occur in the guest kernel. When an SEA occurs in the guest kernel, the guest exits and is routed to kvm_handle_guest_abort(). Prior to this patch, a print message of an unsupported FSC would be printed and nothing else would happen. With this patch, the code gets routed to the APEI handling of SEAs in the host kernel to report the SEA information. Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Acked-by: NMarc Zyngier <marc.zyngier@arm.com> Acked-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 James Morse 提交于
Once we enable ARCH_SUPPORTS_MEMORY_FAILURE on arm64, notifications for broken memory can call memory_failure() in mm/memory-failure.c to offline pages of memory, possibly signalling user space processes and notifying all the in-kernel users. memory_failure() has two modes, early and late. Early is used by machine-managers like Qemu to receive a notification when a memory error is notified to the host. These can then be relayed to the guest before the affected page is accessed. To enable this, the process must set PR_MCE_KILL_EARLY in PR_MCE_KILL_SET using the prctl() syscall. Once the early notification has been handled, nothing stops the machine-manager or guest from accessing the affected page. If the machine-manager does this the page will fail to be mapped and SIGBUS will be sent. This patch adds the equivalent path for when the guest accesses the page, sending SIGBUS to the machine-manager. These two signals can be distinguished by the machine-manager using their si_code: BUS_MCEERR_AO for 'action optional' early notifications, and BUS_MCEERR_AR for 'action required' synchronous/late notifications. Do as x86 does, and deliver the SIGBUS when we discover pfn == KVM_PFN_ERR_HWPOISON. Use the hugepage size as si_addr_lsb if this vma was allocated as a hugepage. Transparent hugepages will be split by memory_failure() before we see them here. Cc: Punit Agrawal <punit.agrawal@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 20 6月, 2017 1 次提交
-
-
由 Ingo Molnar 提交于
Rename: wait_queue_t => wait_queue_entry_t 'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue", but in reality it's a queue *entry*. The 'real' queue is the wait queue head, which had to carry the name. Start sorting this out by renaming it to 'wait_queue_entry_t'. This also allows the real structure name 'struct __wait_queue' to lose its double underscore and become 'struct wait_queue_entry', which is the more canonical nomenclature for such data types. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 15 6月, 2017 19 次提交
-
-
由 Hu Huajun 提交于
When reading the cntpct_el0 in guest with VHE (Virtual Host Extension) enabled in host, the "Unsupported guest sys_reg access" error reported. The reason is cnthctl_el2.EL1PCTEN is not enabled, which is expected to be done in kvm_timer_init_vhe(). The problem is kvm_timer_init_vhe is called by cpu_init_hyp_mode, and which is called when VHE is disabled. This patch remove the incorrect call to kvm_timer_init_vhe() from cpu_init_hyp_mode(), and calls kvm_timer_init_vhe() to enable cnthctl_el2.EL1PCTEN in cpu_hyp_reinit(). Fixes: 488f94d7 ("KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems") Cc: stable@vger.kernel.org Signed-off-by: NHu Huajun <huhuajun@huawei.com> Reviewed-by: NChristoffer Dall <cdall@linaro.org> Acked-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Mark Rutland 提交于
Per ARM DDI 0487B.a, the registers are named ICC_IGRPEN*_EL1 rather than ICC_GRPEN*_EL1. Correct our mnemonics and comments to match, before we add more GICv3 register definitions. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: kvmarm@lists.cs.columbia.edu Acked-by: NChristoffer Dall <cdall@linaro.org> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
A write-to-read-only GICv3 access should UNDEF at EL1. But since we're in complete paranoia-land with broken CPUs, let's assume the worse and gracefully handle the case. Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Reviewed-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
A read-from-write-only GICv3 access should UNDEF at EL1. But since we're in complete paranoia-land with broken CPUs, let's assume the worse and gracefully handle the case. Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Reviewed-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
In order to facilitate debug, let's log which class of GICv3 system registers are trapped. Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Acked-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
Now that we're able to safely handle common sysreg access, let's give the user the opportunity to enable it by passing a specific command-line option (vgic_v3.common_trap). Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Acked-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
Add a handler for reading/writing the guest's view of the ICC_PMR_EL1 register, which is located in the ICH_VMCR_EL2.VPMR field. Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Acked-by: NChristoffer Dall <cdall@linaro.org> Reviewed-by: NEric Auger <eric.auger@redhat.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
Add a handler for reading/writing the guest's view of the ICV_CTLR_EL1 register. only EOIMode and CBPR are of interest here, as all the other bits directly come from ICH_VTR_EL2 and are Read-Only. Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Acked-by: NChristoffer Dall <cdall@linaro.org> Reviewed-by: NEric Auger <eric.auger@redhat.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
Add a handler for reading the guest's view of the ICV_RPR_EL1 register, returning the highest active priority. Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Acked-by: NChristoffer Dall <cdall@linaro.org> Reviewed-by: NEric Auger <eric.auger@redhat.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
Add a handler for writing the guest's view of the ICC_DIR_EL1 register, performing the deactivation of an interrupt if EOImode is set ot 1. Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Reviewed-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 David Daney 提交于
Some Cavium Thunder CPUs suffer a problem where a KVM guest may inadvertently cause the host kernel to quit receiving interrupts. Use the Group-0/1 trapping in order to deal with it. [maz]: Adapted patch to the Group-0/1 trapping, reworked commit log Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Reviewed-by: NEric Auger <eric.auger@redhat.com> Signed-off-by: NDavid Daney <david.daney@cavium.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
Now that we're able to safely handle Group-0 sysreg access, let's give the user the opportunity to enable it by passing a specific command-line option (vgic_v3.group0_trap). Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
In order to be able to trap Group-0 GICv3 system registers, we need to set ICH_HCR_EL2.TALL0 begore entering the guest. This is conditionnaly done after having restored the guest's state, and cleared on exit. Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Acked-by: NChristoffer Dall <cdall@linaro.org> Reviewed-by: NEric Auger <eric.auger@redhat.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
A number of Group-0 registers can be handled by the same accessors as that of Group-1, so let's add the required system register encodings and catch them in the dispatching function. Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Acked-by: NChristoffer Dall <cdall@linaro.org> Reviewed-by: NEric Auger <eric.auger@redhat.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
Add a handler for reading/writing the guest's view of the ICC_IGRPEN0_EL1 register, which is located in the ICH_VMCR_EL2.VENG0 field. Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Reviewed-by: NEric Auger <eric.auger@redhat.com> Reviewed-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
Add a handler for reading/writing the guest's view of the ICC_BPR0_EL1 register, which is located in the ICH_VMCR_EL2.BPR0 field. Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Reviewed-by: NEric Auger <eric.auger@redhat.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Reviewed-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
Now that we're able to safely handle Group-1 sysreg access, let's give the user the opportunity to enable it by passing a specific command-line option (vgic_v3.group1_trap). Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Acked-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
In order to be able to trap Group-1 GICv3 system registers, we need to set ICH_HCR_EL2.TALL1 before entering the guest. This is conditionally done after having restored the guest's state, and cleared on exit. Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Acked-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-
由 Marc Zyngier 提交于
Add a handler for reading the guest's view of the ICV_HPPIR1_EL1 register. This is a simple parsing of the available LRs, extracting the highest available interrupt. Tested-by: NAlexander Graf <agraf@suse.de> Acked-by: NDavid Daney <david.daney@cavium.com> Reviewed-by: NEric Auger <eric.auger@redhat.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Reviewed-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NChristoffer Dall <cdall@linaro.org>
-