- 18 6月, 2015 1 次提交
-
-
由 Marc Zyngier 提交于
Back in the days, vgic.c used to have an intimate knowledge of the actual GICv2. These days, this has been abstracted away into hardware-specific backends. Remove the now useless arm-gic.h #include directive, making it clear that GICv2 specific code doesn't belong here. Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 17 6月, 2015 9 次提交
-
-
由 Marc Zyngier 提交于
Commit fd1d0ddf (KVM: arm/arm64: check IRQ number on userland injection) rightly limited the range of interrupts userspace can inject in a guest, but failed to consider the (unlikely) case where a guest is configured with 1024 interrupts. In this case, interrupts ranging from 1020 to 1023 are unuseable, as they have a special meaning for the GIC CPU interface. Make sure that these number cannot be used as an IRQ. Also delete a redundant (and similarily buggy) check in kvm_set_irq. Reported-by: NPeter Maydell <peter.maydell@linaro.org> Cc: Andre Przywara <andre.przywara@arm.com> Cc: <stable@vger.kernel.org> # 4.1, 4.0, 3.19, 3.18 Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Marc Zyngier 提交于
The GIC Hypervisor Configuration Register is used to enable the delivery of virtual interupts to a guest, as well as to define in which conditions maintenance interrupts are delivered to the host. This register doesn't contain any information that we need to read back (the EOIcount is utterly useless for us). So let's save ourselves some cycles, and not save it before writing zero to it. Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Marc Zyngier 提交于
If a GICv3-enabled guest tries to configure Group0, we print a warning on the console (because we don't support Group0 interrupts). This is fairly pointless, and would allow a guest to spam the console. Let's just drop the warning. Acked-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Lorenzo Pieralisi 提交于
According to the PSCI specification and the SMC/HVC calling convention, PSCI function_ids that are not implemented must return NOT_SUPPORTED as return value. Current KVM implementation takes an unhandled PSCI function_id as an error and injects an undefined instruction into the guest if PSCI implementation is called with a function_id that is not handled by the resident PSCI version (ie it is not implemented), which is not the behaviour expected by a guest when calling a PSCI function_id that is not implemented. This patch fixes this issue by returning NOT_SUPPORTED whenever the kvm PSCI call is executed for a function_id that is not implemented by the PSCI kvm layer. Cc: <stable@vger.kernel.org> # 3.18+ Cc: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Alex Bennée 提交于
The elr_el2 and spsr_el2 registers in fact contain the processor state before entry into EL2. In the case of guest state it could be in either el0 or el1. Signed-off-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Kim Phillips 提交于
The KVM-VFIO device is used by the QEMU VFIO device. It is used to record the list of in-use VFIO groups so that KVM can manipulate them. Signed-off-by: NKim Phillips <kim.phillips@linaro.org> Signed-off-by: NEric Auger <eric.auger@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Christoffer Dall 提交于
Until now we have been calling kvm_guest_exit after re-enabling interrupts when we come back from the guest, but this has the unfortunate effect that CPU time accounting done in the context of timer interrupts occurring while the guest is running doesn't properly notice that the time since the last tick was spent in the guest. Inspired by the comment in the x86 code, move the kvm_guest_exit() call below the local_irq_enable() call and change __kvm_guest_exit() to kvm_guest_exit(), because we are now calling this function with interrupts enabled. We have to now explicitly disable preemption and not enable preemption before we've called kvm_guest_exit(), since otherwise we could be preempted and everything happening before we eventually get scheduled again would be accounted for as guest time. At the same time, move the trace_kvm_exit() call outside of the atomic section, since there is no reason for us to do that with interrupts disabled. Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Tiejun Chen 提交于
We already check KVM_CAP_IRQFD in generic once enable CONFIG_HAVE_KVM_IRQFD, kvm_vm_ioctl_check_extension_generic() | + switch (arg) { + ... + #ifdef CONFIG_HAVE_KVM_IRQFD + case KVM_CAP_IRQFD: + #endif + ... + return 1; + ... + } | + kvm_vm_ioctl_check_extension() So its not necessary to check this in arch again, and also fix one typo, s/emlation/emulation. Signed-off-by: NTiejun Chen <tiejun.chen@intel.com> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
-
由 Marc Zyngier 提交于
On VM entry, we disable access to the VFP registers in order to perform a lazy save/restore of these registers. On VM exit, we restore access, test if we did enable them before, and save/restore the guest/host registers if necessary. In this sequence, the FPEXC register is always accessed, irrespective of the trapping configuration. If the guest didn't touch the VFP registers, then the HCPTR access has now enabled such access, but we're missing a barrier to ensure architectural execution of the new HCPTR configuration. If the HCPTR access has been delayed/reordered, the subsequent access to FPEXC will cause a trap, which we aren't prepared to handle at all. The same condition exists when trapping to enable VFP for the guest. The fix is to introduce a barrier after enabling VFP access. In the vmexit case, it can be relaxed to only takes place if the guest hasn't accessed its view of the VFP registers, making the access to FPEXC safe. The set_hcptr macro is modified to deal with both vmenter/vmexit and vmtrap operations, and now takes an optional label that is branched to when the guest hasn't touched the VFP registers. Reported-by: NVikram Sethi <vikrams@codeaurora.org> Cc: stable@kernel.org # v3.9+ Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 10 6月, 2015 2 次提交
-
-
由 Andre Przywara 提交于
Commit 47a98b15 ("arm/arm64: KVM: support for un-queuing active IRQs") introduced handling of the GICD_I[SC]ACTIVER registers, but only for the GICv2 emulation. For the sake of completeness and as this is a pre-requisite for save/restore of the GICv3 distributor state, we should also emulate their handling in the distributor and redistributor frames of an emulated GICv3. Acked-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NAndre Przywara <andre.przywara@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Firo Yang 提交于
No need to cast the void pointer returned by kmalloc() in arch/arm/kvm/mmu.c::kvm_alloc_stage2_pgd(). Signed-off-by: NFiro Yang <firogm@gmail.com> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 20 5月, 2015 16 次提交
-
-
由 Paolo Bonzini 提交于
gfn_to_pfn_async is used in just one place, and because of x86-specific treatment that place will need to look at the memory slot. Hence inline it into try_async_pf and export __gfn_to_pfn_memslot. The patch also switches the subsequent call to gfn_to_pfn_prot to use __gfn_to_pfn_memslot. This is a small optimization. Finally, remove the now-unused async argument of __gfn_to_pfn. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The argument to KVM_GET_DIRTY_LOG is a memslot id; it may not match the position in the memslots array, which is sorted by gfn. Cc: stable@vger.kernel.org Reviewed-by: NJames Hogan <james.hogan@imgtec.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
CR0.CD and CR0.NW are not used by shadow page table so that need not adjust mmu if these two bit are changed Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Currently, whenever guest MTRR registers are changed kvm_mmu_reset_context is called to switch to the new root shadow page table, however, it's useless since: 1) the cache type is not cached into shadow page's attribute so that the original root shadow page will be reused 2) the cache type is set on the last spte, that means we should sync the last sptes when MTRR is changed This patch fixs this issue by drop all the spte in the gfn range which is being updated by MTRR Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
There are some bugs in current get_mtrr_type(); 1: bit 1 of mtrr_state->enabled is corresponding bit 11 of IA32_MTRR_DEF_TYPE MSR which completely control MTRR's enablement that means other bits are ignored if it is cleared 2: the fixed MTRR ranges are controlled by bit 0 of mtrr_state->enabled (bit 10 of IA32_MTRR_DEF_TYPE) 3: if MTRR is disabled, UC is applied to all of physical memory rather than mtrr_state->def_type Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: NWanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Split kvm_unmap_rmapp and introduce kvm_zap_rmapp which will be used in the later patch Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
slot_handle_level and its helper functions are ready now, use them to clean up the code Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
There are several places walking all rmaps for the memslot so that introduce common functions to cleanup the code Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
It's used to abstract the code from kvm_handle_hva_range and it will be used by later patch Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
It's used to walk all the sptes on the rmap to clean up the code Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
This reverts commit ff7bbb9c. Sasha Levin is seeing odd jump in time values during boot of a KVM guest: [...] [ 0.000000] tsc: Detected 2260.998 MHz processor [3376355.247558] Calibrating delay loop (skipped) preset value.. [...] and bisected them to this commit. Reported-by: NSasha Levin <sasha.levin@oracle.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
KVM may turn a user page to a kernel page when kernel writes a readonly user page if CR0.WP = 1. This shadow page entry will be reused after SMAP is enabled so that kernel is allowed to access this user page Fix it by setting SMAP && !CR0.WP into shadow page's role and reset mmu once CR4.SMAP is updated Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nadav Amit 提交于
When a REP-string is executed in 64-bit mode with an address-size prefix, ECX/EDI/ESI are used as counter and pointers. When ECX is initially zero, Intel CPUs clear the high 32-bits of RCX, and recent Intel CPUs update the high bits of the pointers in MOVS/STOS. This behavior is specific to Intel according to few experiments. As one may guess, this is an undocumented behavior. Yet, it is observable in the guest, since at least VMX traps REP-INS/OUTS even when ECX=0. Note that VMware appears to get it right. The behavior can be observed using the following code: #include <stdio.h> #define LOW_MASK (0xffffffff00000000ull) #define ALL_MASK (0xffffffffffffffffull) #define TEST(opcode) \ do { \ asm volatile(".byte 0xf2 \n\t .byte 0x67 \n\t .byte " opcode "\n\t" \ : "=S"(s), "=c"(c), "=D"(d) \ : "S"(ALL_MASK), "c"(LOW_MASK), "D"(ALL_MASK)); \ printf("opcode %s rcx=%llx rsi=%llx rdi=%llx\n", \ opcode, c, s, d); \ } while(0) void main() { unsigned long long s, d, c; iopl(3); TEST("0x6c"); TEST("0x6d"); TEST("0x6e"); TEST("0x6f"); TEST("0xa4"); TEST("0xa5"); TEST("0xa6"); TEST("0xa7"); TEST("0xaa"); TEST("0xab"); TEST("0xae"); TEST("0xaf"); } Signed-off-by: NNadav Amit <namit@cs.technion.ac.il> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nadav Amit 提交于
When REP-string instruction is preceded with an address-size prefix, ECX/EDI/ESI are used as the operation counter and pointers. When they are updated, the high 32-bits of RCX/RDI/RSI are cleared, similarly to the way they are updated on every 32-bit register operation. Fix it. Signed-off-by: NNadav Amit <namit@cs.technion.ac.il> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nadav Amit 提交于
If the host sets hardware breakpoints to debug the guest, and a task-switch occurs in the guest, the architectural DR7 will not be updated. The effective DR7 would be updated instead. This fix puts the DR7 update during task-switch emulation, so it now uses the standard DR setting mechanism instead of the one that was previously used. As a bonus, the update of DR7 will now be effective for AMD as well. Signed-off-by: NNadav Amit <namit@cs.technion.ac.il> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 11 5月, 2015 1 次提交
-
-
由 Paolo Bonzini 提交于
Merge tag 'kvm-s390-next-20150508' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixes and features for 4.2 (kvm/next) Mostly a bunch of fixes, reworks and optimizations for s390. There is one new feature (EDAT-2 inside the guest), which boils down to 2GB pages.
-
- 08 5月, 2015 11 次提交
-
-
由 David Hildenbrand 提交于
Our implementation will never trigger interception code 12 as the responsible setting is never enabled - and never will be. The handler is dead code. Let's get rid of it. Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 David Hildenbrand 提交于
This patch factors out the search for a floating irq destination VCPU as well as the kicking of the found VCPU. The search is optimized in the following ways: 1. stopped VCPUs can't take any floating interrupts, so try to find an operating one. We have to take care of the special case where all VCPUs are stopped and we don't have any valid destination. 2. use online_vcpus, not KVM_MAX_VCPU. This speeds up the search especially if KVM_MAX_VCPU is increased one day. As these VCPU objects are initialized prior to increasing online_vcpus, we can be sure that they exist. Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: NDominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: NJens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Jens Freimann 提交于
We can avoid checking guest control registers and guest PSW as well as all the masking and calculations on the interrupt masks when no interrupts are pending. Also, the check for IRQ_PEND_COUNT can be removed, because we won't enter the while loop if no interrupts are pending and invalid interrupt types can't be injected. Signed-off-by: NJens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: NDominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Christian Borntraeger 提交于
Some updates to the control blocks need to be done in a way that ensures that no CPU is within SIE. Provide wrappers around the s390_vcpu_block functions and adopt the TOD migration code to update in a guaranteed fashion. Also rename these functions to have the kvm_s390_ prefix as everything else. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
-
由 Christian Borntraeger 提交于
exit_sie_sync is used to kick CPUs out of SIE and prevent reentering at any point in time. This is used to reload the prefix pages and to set the IBS stuff in a way that guarantees that after this function returns we are no longer in SIE. All current users trigger KVM requests. The request must be set before we block the CPUs to avoid races. Let's make this implicit by adding the request into a new function kvm_s390_sync_requests that replaces exit_sie_sync and split out s390_vcpu_block and s390_vcpu_unblock, that can be used to keep CPUs out of SIE independent of requests. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
-
由 Guenther Hutzl 提交于
1. Enable EDAT2 in the list of KVM facilities 2. Handle 2G frames in pfmf instruction If we support EDAT2, we may enable handling of 2G frames if not in 24 bit mode. 3. Enable EDAT2 in sie_block If the EDAT2 facility is available we enable GED2 mode control in the sie_block. Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NGuenther Hutzl <hutzl@linux.vnet.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Guenther Hutzl 提交于
We should only enable EDAT1 for the guest if the host actually supports it and the cpu model for the guest has EDAT-1 enabled. Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NGuenther Hutzl <hutzl@linux.vnet.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Christian Borntraeger 提交于
The fast path for a sie exit is that no kvm reqest is pending. Make an early check to skip all single bit checks. Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 David Hildenbrand 提交于
Commit ea5f4969 ("KVM: s390: only one external call may be pending at a time") introduced a bug on machines that don't have SIGP interpretation facility installed. The injection of an external call will now always fail with -EBUSY (if none is already pending). This leads to the following symptoms: - An external call will be injected but with the wrong "src cpu id", as this id will not be remembered. - The target vcpu will not be woken up, therefore the guest will hang if it cannot deal with unexpected failures of the SIGP EXTERNAL CALL instruction. - If an external call is already pending, -EBUSY will not be reported. Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NJens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v4.0 Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Paolo Bonzini 提交于
smep_andnot_wp is initialized in kvm_init_shadow_mmu and shadow pages should not be reused for different values of it. Thus, it has to be added to the mask in kvm_mmu_pte_write. Reviewed-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Current permission check assumes that RSVD bit in PFEC is always zero, however, it is not true since MMIO #PF will use it to quickly identify MMIO access Fix it by clearing the bit if walking guest page table is needed Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-