- 12 10月, 2017 7 次提交
-
-
由 Radim Krčmář 提交于
Our routines look at tscdeadline and period when deciding state of a timer. The timer is disarmed when switching between TSC deadline and other modes, so we should set everything to disarmed state. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Reviewed-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
preemption timer only looks at tscdeadline and could inject already disarmed timer. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Reviewed-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
0 should disable the timer, but start_hv_timer will recognize it as an expired timer instead. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Reviewed-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
The description in the Intel SDM of how the divide configuration register is used: "The APIC timer frequency will be the processor's bus clock or core crystal clock frequency divided by the value specified in the divide configuration register." Observation of baremetal shown that when the TDCR is change, the TMCCT does not change or make a big jump in value, but the rate at which it count down change. The patch update the emulation to APIC timer to so that a change to the divide configuration would be reflected in the value of the counter and when the next interrupt is triggered. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> [Fixed some whitespace and added a check for negative delta and running timer. - Radim] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
If we take TSC-deadline mode timer out of the picture, the Intel SDM does not say that the timer is disable when the timer mode is change, either from one-shot to periodic or vice versa. After this patch, the timer is no longer disarmed on change of mode, so the counter (TMCCT) keeps counting down. So what does a write to LVTT changes ? On baremetal, the change of mode is probably taken into account only when the counter reach 0. When this happen, LVTT is use to figure out if the counter should restard counting down from TMICT (so periodic mode) or stop counting (if one-shot mode). This patch is based on observation of the behavior of the APIC timer on baremetal as well as check that they does not go against the description written in the Intel SDM. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> [Fixed rate limiting of periodic timer.] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
Extract the logic of limit lapic periodic timer frequency to a new function, this function will be used by later patches. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
SDM 10.5.4.1 TSC-Deadline Mode mentioned that "Transitioning between TSC-Deadline mode and other timer modes also disarms the timer". So the APIC Timer Initial Count Register for one-shot/periodic mode should be reset. This patch do it. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> [Removed unnecessary definition of APIC_LVT_TIMER_MASK.] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 15 9月, 2017 1 次提交
-
-
由 Davidlohr Bueso 提交于
A comment might serve future readers. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 07 8月, 2017 1 次提交
-
-
由 Ladi Prosek 提交于
It has been experimentally confirmed that supporting these two MSRs is one of the necessary conditions for nested Hyper-V to use the TSC page. Modern Windows guests are noticeably slower when they fall back to reading timestamps from the HV_X64_MSR_TIME_REF_COUNT MSR instead of using the TSC page. The newly supported MSRs are advertised with the AccessFrequencyRegs partition privilege flag and CPUID.40000003H:EDX[8] "Support for determining timer frequencies is available" (both outside of the scope of this KVM patch). Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NLadi Prosek <lprosek@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 7月, 2017 1 次提交
-
-
由 Wanpeng Li 提交于
Preempt can occur in the preemption timer expiration handler: CPU0 CPU1 preemption timer vmexit handle_preemption_timer(vCPU0) kvm_lapic_expired_hv_timer hv_timer_is_use == true sched_out sched_in kvm_arch_vcpu_load kvm_lapic_restart_hv_timer restart_apic_timer start_hv_timer already-expired timer or sw timer triggerd in the window start_sw_timer cancel_hv_timer /* back in kvm_lapic_expired_hv_timer */ cancel_hv_timer WARN_ON(!apic->lapic_timer.hv_timer_in_use); ==> Oops This can be reproduced if CONFIG_PREEMPT is enabled. ------------[ cut here ]------------ WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1563 kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm] CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G OE 4.13.0-rc2+ #16 RIP: 0010:kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm] Call Trace: handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xb8/0xd70 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm] ? kvm_arch_vcpu_load+0x47/0x230 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 ------------[ cut here ]------------ WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1498 cancel_hv_timer.isra.40+0x4f/0x60 [kvm] CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G W OE 4.13.0-rc2+ #16 RIP: 0010:cancel_hv_timer.isra.40+0x4f/0x60 [kvm] Call Trace: kvm_lapic_expired_hv_timer+0x3e/0xb0 [kvm] handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xb8/0xd70 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm] ? kvm_arch_vcpu_load+0x47/0x230 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 This patch fixes it by making the caller of cancel_hv_timer, start_hv_timer and start_sw_timer be in preemption-disabled regions, which trivially avoid any reentrancy issue with preempt notifier. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> [Add more WARNs. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 30 6月, 2017 3 次提交
-
-
由 Wanpeng Li 提交于
If the TSC deadline timer is programmed really close to the deadline or even in the past, the computation in vmx_set_hv_timer will program the absolute target tsc value to vmcs preemption timer field w/ delta == 0, then plays a vmentry and an upcoming vmx preemption timer fire vmexit dance, the lapic timer injection is delayed due to this duration. Actually the lapic timer which is emulated by hrtimer can handle this correctly. This patch fixes it by firing the lapic timer and injecting a timer interrupt immediately during the next vmentry if the TSC deadline timer is programmed really close to the deadline or even in the past. This saves ~300 cycles on the tsc_deadline_timer test of apic.flat. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Move the code to cancel the hv timer into the caller, just before it starts the hrtimer. Check availability of the hv timer in start_hv_timer. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
There are many cases in which the hv timer must be canceled. Split out a new function to avoid duplication. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 5月, 2017 1 次提交
-
-
由 Jan H. Schönherr 提交于
Intel SDM says, that at most one LAPIC should be configured with ExtINT delivery. KVM configures all LAPICs this way. This causes pic_unlock() to kick the first available vCPU from the internal KVM data structures. If this vCPU is not the BSP, but some not-yet-booted AP, the BSP may never realize that there is an interrupt. Fix that by enabling ExtINT delivery only for the BSP. This allows booting a Linux guest without a TSC in the above situation. Otherwise the BSP gets stuck in calibrate_delay_converge(). Signed-off-by: NJan H. Schönherr <jschoenh@amazon.de> Reviewed-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 5月, 2017 1 次提交
-
-
由 Wanpeng Li 提交于
Preemption can occur during cancel preemption timer, and there will be inconsistent status in lapic, vmx and vmcs field. CPU0 CPU1 preemption timer vmexit handle_preemption_timer(vCPU0) kvm_lapic_expired_hv_timer vmx_cancel_hv_timer vmx->hv_deadline_tsc = -1 vmcs_clear_bits /* hv_timer_in_use still true */ sched_out sched_in kvm_arch_vcpu_load vmx_set_hv_timer write vmx->hv_deadline_tsc vmcs_set_bits /* back in kvm_lapic_expired_hv_timer */ hv_timer_in_use = false ... vmx_vcpu_run vmx_arm_hv_run write preemption timer deadline spurious preemption timer vmexit handle_preemption_timer(vCPU0) kvm_lapic_expired_hv_timer WARN_ON(!apic->lapic_timer.hv_timer_in_use); This can be reproduced sporadically during boot of L2 on a preemptible L1, causing a splat on L1. WARNING: CPU: 3 PID: 1952 at arch/x86/kvm/lapic.c:1529 kvm_lapic_expired_hv_timer+0xb5/0xd0 [kvm] CPU: 3 PID: 1952 Comm: qemu-system-x86 Not tainted 4.12.0-rc1+ #24 RIP: 0010:kvm_lapic_expired_hv_timer+0xb5/0xd0 [kvm] Call Trace: handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xc9/0x15f0 [kvm_intel] ? lock_acquire+0xdb/0x250 ? lock_acquire+0xdb/0x250 ? kvm_arch_vcpu_ioctl_run+0xdf3/0x1ce0 [kvm] kvm_arch_vcpu_ioctl_run+0xe55/0x1ce0 [kvm] kvm_vcpu_ioctl+0x384/0x7b0 [kvm] ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm] ? __fget+0xf3/0x210 do_vfs_ioctl+0xa4/0x700 ? __fget+0x114/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x8f/0x750 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL64_slow_path+0x25/0x25 This patch fixes it by disabling preemption while cancelling preemption timer. This way cancel_hv_timer is atomic with respect to kvm_arch_vcpu_load. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 5月, 2017 1 次提交
-
-
由 Michal Hocko 提交于
Patch series "kvmalloc", v5. There are many open coded kmalloc with vmalloc fallback instances in the tree. Most of them are not careful enough or simply do not care about the underlying semantic of the kmalloc/page allocator which means that a) some vmalloc fallbacks are basically unreachable because the kmalloc part will keep retrying until it succeeds b) the page allocator can invoke a really disruptive steps like the OOM killer to move forward which doesn't sound appropriate when we consider that the vmalloc fallback is available. As it can be seen implementing kvmalloc requires quite an intimate knowledge if the page allocator and the memory reclaim internals which strongly suggests that a helper should be implemented in the memory subsystem proper. Most callers, I could find, have been converted to use the helper instead. This is patch 6. There are some more relying on __GFP_REPEAT in the networking stack which I have converted as well and Eric Dumazet was not opposed [2] to convert them as well. [1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org [2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com This patch (of 9): Using kmalloc with the vmalloc fallback for larger allocations is a common pattern in the kernel code. Yet we do not have any common helper for that and so users have invented their own helpers. Some of them are really creative when doing so. Let's just add kv[mz]alloc and make sure it is implemented properly. This implementation makes sure to not make a large memory pressure for > PAGE_SZE requests (__GFP_NORETRY) and also to not warn about allocation failures. This also rules out the OOM killer as the vmalloc is a more approapriate fallback than a disruptive user visible action. This patch also changes some existing users and removes helpers which are specific for them. In some cases this is not possible (e.g. ext4_kvmalloc, libcfs_kvzalloc) because those seems to be broken and require GFP_NO{FS,IO} context which is not vmalloc compatible in general (note that the page table allocation is GFP_KERNEL). Those need to be fixed separately. While we are at it, document that __vmalloc{_node} about unsupported gfp mask because there seems to be a lot of confusion out there. kvmalloc_node will warn about GFP_KERNEL incompatible (which are not superset) flags to catch new abusers. Existing ones would have to die slowly. [sfr@canb.auug.org.au: f2fs fixup] Link: http://lkml.kernel.org/r/20170320163735.332e64b7@canb.auug.org.au Link: http://lkml.kernel.org/r/20170306103032.2540-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Andreas Dilger <adilger@dilger.ca> [ext4 part] Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: John Hubbard <jhubbard@nvidia.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 5月, 2017 1 次提交
-
-
由 Paolo Bonzini 提交于
This reverts commit bbd64115. I've been sitting on this revert for too long and it unfortunately missed 4.11. It's also the reason why I haven't merged ring-based dirty tracking for 4.12. Using kvm_vcpu_memslots in kvm_gfn_to_hva_cache_init and kvm_vcpu_write_guest_offset_cached means that the MSR value can now be used to access SMRAM, simply by making it point to an SMRAM physical address. This is problematic because it lets the guest OS overwrite memory that it shouldn't be able to touch. Cc: stable@vger.kernel.org Fixes: bbd64115Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 17 2月, 2017 1 次提交
-
-
由 Cao, Lei 提交于
Provide versions of struct gfn_to_hva_cache functions that take vcpu as a parameter instead of struct kvm. The existing functions are not needed anymore, so delete them. This allows dirty pages to be logged in the vcpu dirty ring, instead of the global dirty ring, for ring-based dirty memory tracking. Signed-off-by: NLei Cao <lei.cao@stratus.com> Message-Id: <CY1PR08MB19929BD2AC47A291FD680E83F04F0@CY1PR08MB1992.namprd08.prod.outlook.com> Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 15 2月, 2017 5 次提交
-
-
由 Paolo Bonzini 提交于
Since bf9f6ac8 ("KVM: Update Posted-Interrupts Descriptor when vCPU is blocked", 2015-09-18) the posted interrupt descriptor is checked unconditionally for PIR.ON. Therefore we don't need KVM_REQ_EVENT to trigger the scan and, if NMIs or SMIs are not involved, we can avoid the complicated event injection path. Calling kvm_vcpu_kick if PIR.ON=1 is also useless, though it has been there since APICv was introduced. However, without the KVM_REQ_EVENT safety net KVM needs to be much more careful about races between vmx_deliver_posted_interrupt and vcpu_enter_guest. First, the IPI for posted interrupts may be issued between setting vcpu->mode = IN_GUEST_MODE and disabling interrupts. If that happens, kvm_trigger_posted_interrupt returns true, but smp_kvm_posted_intr_ipi doesn't do anything about it. The guest is entered with PIR.ON, but the posted interrupt IPI has not been sent and the interrupt is only delivered to the guest on the next vmentry (if any). To fix this, disable interrupts before setting vcpu->mode. This ensures that the IPI is delayed until the guest enters non-root mode; it is then trapped by the processor causing the interrupt to be injected. Second, the IPI may be issued between kvm_x86_ops->sync_pir_to_irr(vcpu) and vcpu->mode = IN_GUEST_MODE. In this case, kvm_vcpu_kick is called but it (correctly) doesn't do anything because it sees vcpu->mode == OUTSIDE_GUEST_MODE. Again, the guest is entered with PIR.ON but no posted interrupt IPI is pending; this time, the fix for this is to move the RVI update after IN_GUEST_MODE. Both issues were mostly masked by the liberal usage of KVM_REQ_EVENT, though the second could actually happen with VT-d posted interrupts. In both race scenarios KVM_REQ_EVENT would cancel guest entry, resulting in another vmentry which would inject the interrupt. This saves about 300 cycles on the self_ipi_* tests of vmexit.flat. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Calls to apic_find_highest_irr are scanning IRR twice, once in vmx_sync_pir_from_irr and once in apic_search_irr. Change sync_pir_from_irr to get the new maximum IRR from kvm_apic_update_irr; now that it does the computation, it can also do the RVI write. In order to avoid complications in svm.c, make the callback optional. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Add return value to __kvm_apic_update_irr/kvm_apic_update_irr. Move vmx_sync_pir_to_irr around. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Pending interrupts might be in the PI descriptor when the LAPIC is restored from an external state; we do not want them to be injected. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 12 1月, 2017 1 次提交
-
-
由 David Matlack 提交于
KVM's lapic emulation uses static_key_deferred (apic_{hw,sw}_disabled). These are implemented with delayed_work structs which can still be pending when the KVM module is unloaded. We've seen this cause kernel panics when the kvm_intel module is quickly reloaded. Use the new static_key_deferred_flush() API to flush pending updates on module unload. Signed-off-by: NDavid Matlack <dmatlack@google.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 1月, 2017 7 次提交
-
-
由 Paolo Bonzini 提交于
On interrupt delivery the PPR can only grow (except for auto-EOI), so it is impossible that non-auto-EOI interrupt delivery results in KVM_REQ_EVENT. We can therefore use __apic_update_ppr. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
On PPR update, we set KVM_REQ_EVENT unconditionally anytime PPR is lowered. But we can take into account IRR here already. Reviewed-by: NRoman Kagan <rkagan@virtuozzo.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
PPR needs to be updated whenever on every IRR read because we may have missed TPR writes that _increased_ PPR. However, these writes need not generate KVM_REQ_EVENT, because either KVM_REQ_EVENT has been set already in __apic_accept_irq, or we are going to process the interrupt right away. Reviewed-by: NRoman Kagan <rkagan@virtuozzo.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Since we're already in VCPU context, all we have to do here is recompute the PPR value. That will in turn generate a KVM_REQ_EVENT if necessary. Reviewed-by: NRoman Kagan <rkagan@virtuozzo.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
LAPIC after reset is in xAPIC mode, which poses a problem for hotplug of VCPUs with high APIC ID, because reset VCPU is waiting for INIT/SIPI, but there is no way to uniquely address it using xAPIC. From many possible options, we chose the one that also works on real hardware: accepting interrupts addressed to LAPIC's x2APIC ID even in xAPIC mode. KVM intentionally differs from real hardware, because real hardware (Knights Landing) does just "x2apic_id & 0xff" to decide whether to accept the interrupt in xAPIC mode and it can deliver one interrupt to more than one physical destination, e.g. 0x123 to 0x123 and 0x23. Fixes: 682f732e ("KVM: x86: bump MAX_VCPUS to 288") Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
Slow path tried to prevent IPIs from x2APIC VCPUs from being delivered to xAPIC VCPUs and vice-versa. Make slow path behave like fast path, which never distinguished that. Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
There were three calls sites: - recalculate_apic_map and kvm_apic_match_physical_addr, where it would only complicate implementation of x2APIC hotplug; - in apic_debug, where it was still somewhat preserved, but keeping the old function just for apic_debug was not worth it Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 12月, 2016 1 次提交
-
-
由 Thomas Gleixner 提交于
ktime_set(S,N) was required for the timespec storage type and is still useful for situations where a Seconds and Nanoseconds part of a time value needs to be converted. For anything where the Seconds argument is 0, this is pointless and can be replaced with a simple assignment. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
-
- 25 11月, 2016 1 次提交
-
-
由 Radim Krčmář 提交于
Cluster xAPIC delivery incorrectly assumed that dest_id <= 0xff. With enabled KVM_X2APIC_API_USE_32BIT_IDS in KVM_CAP_X2APIC_API, a userspace can send an interrupt with dest_id that results in out-of-bounds access. Found by syzkaller: BUG: KASAN: slab-out-of-bounds in kvm_irq_delivery_to_apic_fast+0x11fa/0x1210 at addr ffff88003d9ca750 Read of size 8 by task syz-executor/22923 CPU: 0 PID: 22923 Comm: syz-executor Not tainted 4.9.0-rc4+ #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [...] Call Trace: [...] __dump_stack lib/dump_stack.c:15 [...] dump_stack+0xb3/0x118 lib/dump_stack.c:51 [...] kasan_object_err+0x1c/0x70 mm/kasan/report.c:156 [...] print_address_description mm/kasan/report.c:194 [...] kasan_report_error mm/kasan/report.c:283 [...] kasan_report+0x231/0x500 mm/kasan/report.c:303 [...] __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:329 [...] kvm_irq_delivery_to_apic_fast+0x11fa/0x1210 arch/x86/kvm/lapic.c:824 [...] kvm_irq_delivery_to_apic+0x132/0x9a0 arch/x86/kvm/irq_comm.c:72 [...] kvm_set_msi+0x111/0x160 arch/x86/kvm/irq_comm.c:157 [...] kvm_send_userspace_msi+0x201/0x280 arch/x86/kvm/../../../virt/kvm/irqchip.c:74 [...] kvm_vm_ioctl+0xba5/0x1670 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3015 [...] vfs_ioctl fs/ioctl.c:43 [...] do_vfs_ioctl+0x18c/0x1040 fs/ioctl.c:679 [...] SYSC_ioctl fs/ioctl.c:694 [...] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:685 [...] entry_SYSCALL_64_fastpath+0x1f/0xc2 Reported-by: NDmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: e45115b6 ("KVM: x86: use physical LAPIC array for logical x2APIC") Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 22 11月, 2016 1 次提交
-
-
由 Jim Mattson 提交于
From the Intel SDM, volume 3, section 10.4.3, "Enabling or Disabling the Local APIC," When IA32_APIC_BASE[11] is 0, the processor is functionally equivalent to an IA-32 processor without an on-chip APIC. The CPUID feature flag for the APIC (see Section 10.4.2, "Presence of the Local APIC") is also set to 0. Signed-off-by: NJim Mattson <jmattson@google.com> [Changed subject tag from nVMX to x86.] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 03 11月, 2016 6 次提交
-
-
由 Paolo Bonzini 提交于
On some benchmarks (e.g. netperf with ioeventfd disabled), APICv posted interrupts turn out to be slower than interrupt injection via KVM_REQ_EVENT. This patch optimizes a bit the IRR update, avoiding expensive atomic operations in the common case where PI.ON=0 at vmentry or the PIR vector is mostly zero. This saves at least 20 cycles (1%) per vmexit, as measured by kvm-unit-tests' inl_from_qemu test (20 runs): | enable_apicv=1 | enable_apicv=0 | mean stdev | mean stdev ----------|-----------------|------------------ before | 5826 32.65 | 5765 47.09 after | 5809 43.42 | 5777 77.02 Of course, any change in the right column is just placebo effect. :) The savings are bigger if interrupts are frequent. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The base clock for the LAPIC timer is always CLOCK_MONOTONIC. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Most windows guests still utilize APIC Timer periodic/oneshot mode instead of tsc-deadline mode, and the APIC Timer periodic/oneshot mode are still emulated by high overhead hrtimer on host. This patch converts the expected expire time of the periodic/oneshot mode to guest deadline tsc in order to leverage VMX preemption timer logic for APIC Timer tsc-deadline mode. After each preemption timer vmexit preemption timer is restarted to emulate LVTT current-count register is automatically reloaded from the initial-count register when the count reaches 0. This patch reduces ~5600 cycles for each APIC Timer periodic mode operation virtualization. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Yunhong Jiang <yunhong.jiang@intel.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> [Squashed with my fixes that were reviewed-by Paolo.] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
Rename start/cancel_hv_tscdeadline to start/cancel_hv_timer since they will handle both APIC Timer periodic/oneshot mode and tsc-deadline mode. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Yunhong Jiang <yunhong.jiang@intel.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
Introdce kvm_get_lapic_target_expiration_tsc() to get APIC Timer target deadline tsc. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Yunhong Jiang <yunhong.jiang@intel.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
Check apic_lvtt_tscdeadline() mode directly instead of apic_lvtt_oneshot() and apic_lvtt_period() to guarantee the timer is in tsc-deadline mode when rdmsr MSR_IA32_TSCDEADLINE. Suggested-by: NRadim Krčmář <rkrcmar@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Yunhong Jiang <yunhong.jiang@intel.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-