- 15 9月, 2015 3 次提交
-
-
由 Jason Wang 提交于
This patch factors out core eventfd assign/deassign logic and leaves the argument checking and bus index selection to callers. Cc: stable@vger.kernel.org Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jason Wang 提交于
We only want zero length mmio eventfd to be registered on KVM_FAST_MMIO_BUS. So check this explicitly when arg->len is zero to make sure this. Cc: stable@vger.kernel.org Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wei Yang 提交于
After 'commit 0b8ba4a2 ("KVM: fix checkpatch.pl errors in kvm/coalesced_mmio.h")', the declaration of the two function will exceed 80 characters. This patch reduces the TAPs to make each line in 80 characters. Signed-off-by: NWei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 14 9月, 2015 1 次提交
-
-
由 Wanpeng Li 提交于
If there is already some polling ongoing, it's impossible to disable the polling, since as soon as somebody sets halt_poll_ns to 0, polling will never stop, as grow and shrink are only handled if halt_poll_ns is != 0. This patch fix it by reset vcpu->halt_poll_ns in order to stop polling when polling is disabled. Reported-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 11 9月, 2015 1 次提交
-
-
由 Vladimir Davydov 提交于
In the scope of the idle memory tracking feature, which is introduced by the following patch, we need to clear the referenced/accessed bit not only in primary, but also in secondary ptes. The latter is required in order to estimate wss of KVM VMs. At the same time we want to avoid flushing tlb, because it is quite expensive and it won't really affect the final result. Currently, there is no function for clearing pte young bit that would meet our requirements, so this patch introduces one. To achieve that we have to add a new mmu-notifier callback, clear_young, since there is no method for testing-and-clearing a secondary pte w/o flushing tlb. The new method is not mandatory and currently only implemented by KVM. Signed-off-by: NVladimir Davydov <vdavydov@parallels.com> Reviewed-by: NAndres Lagar-Cavilla <andreslc@google.com> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Greg Thelen <gthelen@google.com> Cc: Michel Lespinasse <walken@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 9月, 2015 1 次提交
-
-
由 Sudip Mukherjee 提交于
We were taking the exit path after checking ue->flags and return value of setup_routing_entry(), but 'e' was not freed incase of a failure. Signed-off-by: NSudip Mukherjee <sudip@vectorindia.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 06 9月, 2015 3 次提交
-
-
由 Wanpeng Li 提交于
Tracepoint for dynamic halt_pool_ns, fired on every potential change. Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
There is a downside of always-poll since poll is still happened for idle vCPUs which can waste cpu usage. This patchset add the ability to adjust halt_poll_ns dynamically, to grow halt_poll_ns when shot halt is detected, and to shrink halt_poll_ns when long halt is detected. There are two new kernel parameters for changing the halt_poll_ns: halt_poll_ns_grow and halt_poll_ns_shrink. no-poll always-poll dynamic-poll ----------------------------------------------------------------------- Idle (nohz) vCPU %c0 0.15% 0.3% 0.2% Idle (250HZ) vCPU %c0 1.1% 4.6%~14% 1.2% TCP_RR latency 34us 27us 26.7us "Idle (X) vCPU %c0" is the percent of time the physical cpu spent in c0 over 60 seconds (each vCPU is pinned to a pCPU). (nohz) means the guest was tickless. (250HZ) means the guest was ticking at 250HZ. The big win is with ticking operating systems. Running the linux guest with nohz=off (and HZ=250), we save 3.4%~12.8% CPUs/second and get close to no-polling overhead levels by using the dynamic-poll. The savings should be even higher for higher frequency ticks. Suggested-by: NDavid Matlack <dmatlack@google.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> [Simplify the patch. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Change halt_poll_ns into per-VCPU variable, seeded from module parameter, to allow greater flexibility. Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 04 9月, 2015 2 次提交
-
-
由 Christoffer Dall 提交于
Provide a better quality of implementation and be architecture compliant on ARMv7 for the architected timer by resetting the CNTV_CTL to 0 on reset of the timer. This change alone fixes the UEFI reset issue reported by Laszlo back in February. Cc: Laszlo Ersek <lersek@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Drew Jones <drjones@redhat.com> Cc: Wei Huang <wei@redhat.com> Cc: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Christoffer Dall 提交于
We currently set the physical active state only when we *inject* a new pending virtual interrupt, but this is actually not correct, because we could have been preempted and run something else on the system that resets the active state to clear. This causes us to run the VM with the timer set to fire, but without setting the physical active state. The solution is to always check the LR configurations, and we if have a mapped interrupt in the LR in either the pending or active state (virtual), then set the physical active state. Acked-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 12 8月, 2015 7 次提交
-
-
由 Marc Zyngier 提交于
In order to remove the crude hack where we sneak the masked bit into the timer's control register, make use of the phys_irq_map API control the active state of the interrupt. This causes some limited changes to allow for potential error propagation. Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Marc Zyngier 提交于
Virtual interrupts mapped to a HW interrupt should only be triggered from inside the kernel. Otherwise, you could end up confusing the kernel (and the GIC's) state machine. Rearrange the injection path so that kvm_vgic_inject_irq is used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is used for mapped interrupts. The latter should only be called from inside the kernel (timer, irqfd). Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Marc Zyngier 提交于
In order to control the active state of an interrupt, introduce a pair of accessors allowing the state to be set/queried. This only affects the logical state, and the HW state will only be applied at world-switch time. Acked-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Marc Zyngier 提交于
To allow a HW interrupt to be injected into a guest, we lookup the guest virtual interrupt in the irq_phys_map list, and if we have a match, encode both interrupts in the LR. We also mark the interrupt as "active" at the host distributor level. On guest EOI on the virtual interrupt, the host interrupt will be deactivated. Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Marc Zyngier 提交于
In order to be able to feed physical interrupts to a guest, we need to be able to establish the virtual-physical mapping between the two worlds. The mappings are kept in a set of RCU lists, indexed by virtual interrupts. Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Marc Zyngier 提交于
We only set the irq_queued flag for level interrupts, meaning that "!vgic_irq_is_queued(vcpu, irq)" is a good enough predicate for all interrupts. This will allow us to inject edge HW interrupts, for which the state ACTIVE+PENDING is not allowed. Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Marc Zyngier 提交于
Now that struct vgic_lr supports the LR_HW bit and carries a hwirq field, we can encode that information into the list registers. This patch provides implementations for both GICv2 and GICv3. Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 30 7月, 2015 1 次提交
-
-
由 Paolo Bonzini 提交于
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 29 7月, 2015 1 次提交
-
-
由 Paolo Bonzini 提交于
This is another remnant of ia64 support. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 10 7月, 2015 1 次提交
-
-
由 Paolo Bonzini 提交于
If there are no assigned devices, the guest PAT are not providing any useful information and can be overridden to writeback; VMX always does this because it has the "IPAT" bit in its extended page table entries, but SVM does not have anything similar. Hook into VFIO and legacy device assignment so that they provide this information to KVM. Reviewed-by: NAlex Williamson <alex.williamson@redhat.com> Tested-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 04 7月, 2015 1 次提交
-
-
由 Peter Zijlstra 提交于
Commit 1cde2930 ("sched/preempt: Add static_key() to preempt_notifiers") had two problems. First, the preempt-notifier API needs to sleep with the addition of the static_key, we do however need to hold off preemption while modifying the preempt notifier list, otherwise a preemption could observe an inconsistent list state. KVM correctly registers and unregisters preempt notifiers with preemption disabled, so the sleep caused dmesg splats. Second, KVM registers and unregisters preemption notifiers very often (in vcpu_load/vcpu_put). With a single uniprocessor guest the static key would move between 0 and 1 continuously, hitting the slow path on every userspace exit. To fix this, wrap the static_key inc/dec in a new API, and call it from KVM. Fixes: 1cde2930 ("sched/preempt: Add static_key() to preempt_notifiers") Reported-by: NPontus Fuchs <pontus.fuchs@gmail.com> Reported-by: NTakashi Iwai <tiwai@suse.de> Tested-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 19 6月, 2015 3 次提交
-
-
由 Kevin Mulvey 提交于
Tabs rather than spaces Signed-off-by: NKevin Mulvey <kmulvey@linux.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Kevin Mulvey 提交于
fix brace spacing Signed-off-by: NKevin Mulvey <kmulvey@linux.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Joerg Roedel 提交于
The allocation size of the kvm_irq_routing_table depends on the number of irq routing entries because they are all allocated with one kzalloc call. When the irq routing table gets bigger this requires high order allocations which fail from time to time: qemu-kvm: page allocation failure: order:4, mode:0xd0 This patch fixes this issue by breaking up the allocation of the table and its entries into individual kzalloc calls. These could all be satisfied with order-0 allocations, which are less likely to fail. The downside of this change is the lower performance, because of more calls to kzalloc. But given how often kvm_set_irq_routing is called in the lifetime of a guest, it doesn't really matter much. Signed-off-by: NJoerg Roedel <jroedel@suse.de> [Avoid sparse warning through rcu_access_pointer. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 18 6月, 2015 1 次提交
-
-
由 Marc Zyngier 提交于
Back in the days, vgic.c used to have an intimate knowledge of the actual GICv2. These days, this has been abstracted away into hardware-specific backends. Remove the now useless arm-gic.h #include directive, making it clear that GICv2 specific code doesn't belong here. Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 17 6月, 2015 2 次提交
-
-
由 Marc Zyngier 提交于
Commit fd1d0ddf (KVM: arm/arm64: check IRQ number on userland injection) rightly limited the range of interrupts userspace can inject in a guest, but failed to consider the (unlikely) case where a guest is configured with 1024 interrupts. In this case, interrupts ranging from 1020 to 1023 are unuseable, as they have a special meaning for the GIC CPU interface. Make sure that these number cannot be used as an IRQ. Also delete a redundant (and similarily buggy) check in kvm_set_irq. Reported-by: NPeter Maydell <peter.maydell@linaro.org> Cc: Andre Przywara <andre.przywara@arm.com> Cc: <stable@vger.kernel.org> # 4.1, 4.0, 3.19, 3.18 Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Marc Zyngier 提交于
If a GICv3-enabled guest tries to configure Group0, we print a warning on the console (because we don't support Group0 interrupts). This is fairly pointless, and would allow a guest to spam the console. Let's just drop the warning. Acked-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 12 6月, 2015 1 次提交
-
-
由 Marc Zyngier 提交于
So far, we configured the world-switch by having a small array of pointers to the save and restore functions, depending on the GIC used on the platform. Loading these values each time is a bit silly (they never change), and it makes sense to rely on the instruction patching instead. This leads to a nice cleanup of the code. Acked-by: NWill Deacon <will.deacon@arm.com> Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 10 6月, 2015 1 次提交
-
-
由 Andre Przywara 提交于
Commit 47a98b15 ("arm/arm64: KVM: support for un-queuing active IRQs") introduced handling of the GICD_I[SC]ACTIVER registers, but only for the GICv2 emulation. For the sake of completeness and as this is a pre-requisite for save/restore of the GICv3 distributor state, we should also emulate their handling in the distributor and redistributor frames of an emulated GICv3. Acked-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NAndre Przywara <andre.przywara@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 05 6月, 2015 2 次提交
-
-
由 Paolo Bonzini 提交于
Only two ioctls have to be modified; the address space id is placed in the higher 16 bits of their slot id argument. As of this patch, no architecture defines more than one address space; x86 will be the first. Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
We need to hide SMRAM from guests not running in SMM. Therefore, all uses of kvm_read_guest* and kvm_write_guest* must be changed to use different address spaces, depending on whether the VCPU is in system management mode. We need to introduce a new family of functions for this purpose. For now, the VCPU-based functions have the same behavior as the existing per-VM ones, they just accept a different type for the first argument. Later however they will be changed to use one of many "struct kvm_memslots" stored in struct kvm, through an architecture hook. VM-based functions will unconditionally use the first memslots pointer. Whenever possible, this patch introduces slot-based functions with an __ prefix, with two wrappers for generic and vcpu-based actions. The exceptions are kvm_read_guest and kvm_write_guest, which are copied into the new functions kvm_vcpu_read_guest and kvm_vcpu_write_guest. Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 28 5月, 2015 4 次提交
-
-
由 Paolo Bonzini 提交于
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Most of the function that wrap it can be rewritten without it, except for gfn_to_pfn_prot. Just inline it into gfn_to_pfn_prot, and rewrite the other function on top of gfn_to_pfn_memslot*. Reviewed-by: NRadim Krcmar <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The memory slot is already available from gfn_to_memslot_dirty_bitmap. Isn't it a shame to look it up again? Plus, it makes gfn_to_page_many_atomic agnostic of multiple VCPU address spaces. Reviewed-by: NRadim Krcmar <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
This lets the function access the new memory slot without going through kvm_memslots and id_to_memslot. It will simplify the code when more than one address space will be supported. Unfortunately, the "const"ness of the new argument must be casted away in two places. Fixing KVM to accept const struct kvm_memory_slot pointers would require modifications in pretty much all architectures, and is left for later. Reviewed-by: NRadim Krcmar <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 5月, 2015 4 次提交
-
-
由 Paolo Bonzini 提交于
Prepare for the case of multiple address spaces. Reviewed-by: NRadim Krcmar <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Architecture-specific helpers are not supposed to muck with struct kvm_userspace_memory_region contents. Add const to enforce this. In order to eliminate the only write in __kvm_set_memory_region, the cleaning of deleted slots is pulled up from update_memslots to __kvm_set_memory_region. Reviewed-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Reviewed-by: NRadim Krcmar <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
kvm_memslots provides lockdep checking. Use it consistently instead of explicit dereferencing of kvm->memslots. Reviewed-by: NRadim Krcmar <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
kvm_alloc_memslots is extracted out of previously scattered code that was in kvm_init_memslots_id and kvm_create_vm. kvm_free_memslot and kvm_free_memslots are new names of kvm_free_physmem and kvm_free_physmem_slot, but they also take an explicit pointer to struct kvm_memslots. This will simplify the transition to multiple address spaces, each represented by one pointer to struct kvm_memslots. Reviewed-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Reviewed-by: NRadim Krcmar <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-