- 04 12月, 2014 1 次提交
-
-
由 Radim Krčmář 提交于
While fixing an x2apic bug, 17d68b76 KVM: x86: fix guest-initiated crash with x2apic (CVE-2013-6376) we've made only one cluster available. This means that the amount of logically addressible x2APICs was reduced to 16 and VCPUs kept overwriting themselves in that region, so even the first cluster wasn't set up correctly. This patch extends x2APIC support back to the logical_map's limit, and keeps the CVE fixed as messages for non-present APICs are dropped. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 03 11月, 2014 4 次提交
-
-
由 Radim Krčmář 提交于
We mirror a subset of these registers in separate variables. Using them directly should be faster. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
APIC-write VM exits are "trap-like": they save CS:RIP values for the instruction after the write, and more importantly, the handler will already see the new value in the virtual-APIC page. This means that apic_reg_write cannot use kvm_apic_get_reg to omit timer cancelation when mode changes. timer_mode_mask shouldn't be changing as it depends on cpuid. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
APIC-write VM exits are "trap-like": they save CS:RIP values for the instruction after the write, and more importantly, the handler will already see the new value in the virtual-APIC page. This caused a bug if you used KVM_SET_IRQCHIP to set the SW-enabled bit in the SPIV register. The chain of events is as follows: * When the irqchip is added to the destination VM, the apic_sw_disabled static key is incremented (1) * When the KVM_SET_IRQCHIP ioctl is invoked, it is decremented (0) * When the guest disables the bit in the SPIV register, e.g. as part of shutdown, apic_set_spiv does not notice the change and the static key is _not_ incremented. * When the guest is destroyed, the static key is decremented (-1), resulting in this trace: WARNING: at kernel/jump_label.c:81 __static_key_slow_dec+0xa6/0xb0() jump label: negative count! [<ffffffff816bf898>] dump_stack+0x19/0x1b [<ffffffff8107c6f1>] warn_slowpath_common+0x61/0x80 [<ffffffff8107c76c>] warn_slowpath_fmt+0x5c/0x80 [<ffffffff811931e6>] __static_key_slow_dec+0xa6/0xb0 [<ffffffff81193226>] static_key_slow_dec_deferred+0x16/0x20 [<ffffffffa0637698>] kvm_free_lapic+0x88/0xa0 [kvm] [<ffffffffa061c63e>] kvm_arch_vcpu_uninit+0x2e/0xe0 [kvm] [<ffffffffa05ff301>] kvm_vcpu_uninit+0x21/0x40 [kvm] [<ffffffffa067cec7>] vmx_free_vcpu+0x47/0x70 [kvm_intel] [<ffffffffa061bc50>] kvm_arch_vcpu_free+0x50/0x60 [kvm] [<ffffffffa061ca22>] kvm_arch_destroy_vm+0x102/0x260 [kvm] [<ffffffff810b68fd>] ? synchronize_srcu+0x1d/0x20 [<ffffffffa06030d1>] kvm_put_kvm+0xe1/0x1c0 [kvm] [<ffffffffa06036f8>] kvm_vcpu_release+0x18/0x20 [kvm] [<ffffffff81215c62>] __fput+0x102/0x310 [<ffffffff81215f4e>] ____fput+0xe/0x10 [<ffffffff810ab664>] task_work_run+0xb4/0xe0 [<ffffffff81083944>] do_exit+0x304/0xc60 [<ffffffff816c8dfc>] ? _raw_spin_unlock_irq+0x2c/0x50 [<ffffffff810fd22d>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff8108432c>] do_group_exit+0x4c/0xc0 [<ffffffff810843b4>] SyS_exit_group+0x14/0x20 [<ffffffff816d33a9>] system_call_fastpath+0x16/0x1b Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nadav Amit 提交于
KVM does not deliver x2APIC broadcast messages with physical mode. Intel SDM (10.12.9 ICR Operation in x2APIC Mode) states: "A destination ID value of FFFF_FFFFH is used for broadcast of interrupts in both logical destination and physical destination modes." In addition, the local-apic enables cluster mode broadcast. As Intel SDM 10.6.2.2 says: "Broadcast to all local APICs is achieved by setting all destination bits to one." This patch enables cluster mode broadcast. The fix tries to combine broadcast in different modes through a unified code. One rare case occurs when the source of IPI has its APIC disabled. In such case, the source can still issue IPIs, but since the source is not obliged to have the same LAPIC mode as the enabled ones, we cannot rely on it. Since it is a rare case, it is unoptimized and done on the slow-path. Signed-off-by: NNadav Amit <namit@cs.technion.ac.il> Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Reviewed-by: NWanpeng Li <wanpeng.li@linux.intel.com> [As per Radim's review, use unsigned int for X2APIC_BROADCAST, return bool from kvm_apic_broadcast. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 1月, 2014 1 次提交
-
-
由 Jan Kiszka 提交于
Check for invalid state transitions on guest-initiated updates of MSR_IA32_APICBASE. This address both enabling of the x2APIC when it is not supported and all invalid transitions as described in SDM section 10.12.5. It also checks that no reserved bit is set in APICBASE by the guest. Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> [Use cpuid_maxphyaddr instead of guest_cpuid_get_phys_bits. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 13 12月, 2013 1 次提交
-
-
由 Andy Honig 提交于
In kvm_lapic_sync_from_vapic and kvm_lapic_sync_to_vapic there is the potential to corrupt kernel memory if userspace provides an address that is at the end of a page. This patches concerts those functions to use kvm_write_guest_cached and kvm_read_guest_cached. It also checks the vapic_address specified by userspace during ioctl processing and returns an error to userspace if the address is not a valid GPA. This is generally not guest triggerable, because the required write is done by firmware that runs before the guest. Also, it only affects AMD processors and oldish Intel that do not have the FlexPriority feature (unless you disable FlexPriority, of course; then newer processors are also affected). Fixes: b93463aa ('KVM: Accelerated apic support') Reported-by: NAndrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Signed-off-by: NAndrew Honig <ahonig@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 6月, 2013 1 次提交
-
-
由 Gleb Natapov 提交于
This reverts most of the f1ed0450. After the commit kvm_apic_set_irq() no longer returns accurate information about interrupt injection status if injection is done into disabled APIC. RTC interrupt coalescing tracking relies on the information to be accurate and cannot recover if it is not. Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 14 5月, 2013 1 次提交
-
-
由 Jan Kiszka 提交于
Since the arrival of posted interrupt support we can no longer guarantee that coalesced IRQs are always reported to the IRQ source. Moreover, accumulated APIC timer events could cause a busy loop when a VCPU should rather be halted. The consensus is to remove coalesced tracking from the LAPIC. Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Acked-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 17 4月, 2013 2 次提交
-
-
由 Yang Zhang 提交于
Only deliver the posted interrupt when target vcpu is running and there is no previous interrupt pending in pir. Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com> Reviewed-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Yang Zhang 提交于
We already know the trigger mode of a given interrupt when programming the ioapice entry. So it's not necessary to set it in each interrupt delivery. Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com> Reviewed-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 16 4月, 2013 2 次提交
-
-
由 Yang Zhang 提交于
restore rtc_status from migration or save/restore Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com> Reviewed-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Yang Zhang 提交于
Add a new parameter to know vcpus who received the interrupt. Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com> Reviewed-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 07 4月, 2013 1 次提交
-
-
由 Yang Zhang 提交于
For a given vcpu, kvm_apic_match_dest() will tell you whether the vcpu in the destination list quickly. Drop kvm_calculate_eoi_exitmap() and use kvm_apic_match_dest() instead. Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 13 3月, 2013 1 次提交
-
-
由 Jan Kiszka 提交于
A VCPU sending INIT or SIPI to some other VCPU races for setting the remote VCPU's mp_state. When we were unlucky, KVM_MP_STATE_INIT_RECEIVED was overwritten by kvm_emulate_halt and, thus, got lost. This introduces APIC events for those two signals, keeping them in kvm_apic until kvm_apic_accept_events is run over the target vcpu context. kvm_apic_has_events reports to kvm_arch_vcpu_runnable if there are pending events, thus if vcpu blocking should end. The patch comes with the side effect of effectively obsoleting KVM_MP_STATE_SIPI_RECEIVED. We still accept it from user space, but immediately translate it to KVM_MP_STATE_INIT_RECEIVED + KVM_APIC_SIPI. The vcpu itself will no longer enter the KVM_MP_STATE_SIPI_RECEIVED state. That also means we no longer exit to user space after receiving a SIPI event. Furthermore, we already reset the VCPU on INIT, only fixing up the code segment later on when SIPI arrives. Moreover, we fix INIT handling for the BSP: it never enter wait-for-SIPI but directly starts over on INIT. Tested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 29 1月, 2013 3 次提交
-
-
由 Yang Zhang 提交于
Virtual interrupt delivery avoids KVM to inject vAPIC interrupts manually, which is fully taken care of by the hardware. This needs some special awareness into existing interrupr injection path: - for pending interrupt, instead of direct injection, we may need update architecture specific indicators before resuming to guest. - A pending interrupt, which is masked by ISR, should be also considered in above update action, since hardware will decide when to inject it at right time. Current has_interrupt and get_interrupt only returns a valid vector from injection p.o.v. Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NKevin Tian <kevin.tian@intel.com> Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Yang Zhang 提交于
basically to benefit from apicv, we need to enable virtualized x2apic mode. Currently, we only enable it when guest is really using x2apic. Also, clear MSR bitmap for corresponding x2apic MSRs when guest enabled x2apic: 0x800 - 0x8ff: no read intercept for apicv register virtualization, except APIC ID and TMCCT which need software's assistance to get right value. Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NKevin Tian <kevin.tian@intel.com> Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Yang Zhang 提交于
- APIC read doesn't cause VM-Exit - APIC write becomes trap-like Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NKevin Tian <kevin.tian@intel.com> Signed-off-by: NYang Zhang <yang.z.zhang@intel.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 20 9月, 2012 1 次提交
-
-
由 Gleb Natapov 提交于
Most interrupt are delivered to only one vcpu. Use pre-build tables to find interrupt destination instead of looping through all vcpus. In case of logical mode loop only through vcpus in a logical cluster irq is sent to. Signed-off-by: NGleb Natapov <gleb@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 09 8月, 2012 1 次提交
-
-
由 Gleb Natapov 提交于
For apic_set_spiv() to track APIC SW state correctly it needs to see previous and next values of the spurious vector register, but currently memset() overwrite the old value before apic_set_spiv() get a chance to do tracking. Fix it by calling apic_set_spiv() before overwriting old value. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 07 8月, 2012 2 次提交
-
-
由 Gleb Natapov 提交于
Those functions are used during interrupt injection. When inlined they become nops on the fast path. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Gleb Natapov 提交于
Usually all APICs are HW enabled so the check can be optimized out. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 01 8月, 2012 2 次提交
-
-
由 Avi Kivity 提交于
'reinject' is never initialized 't_ops' only serves as indirection to lapic_is_periodic; call that directly instead 'kvm' is never used 'vcpu' can be derived via container_of Remove these fields. Signed-off-by: NAvi Kivity <avi@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Avi Kivity 提交于
kvm_timer_fn(), the sole inhabitant of timer.c, is only used by lapic.c. Move it there to make it easier to hack on it. struct kvm_timer is a thin wrapper around hrtimer, and only adds obfuscation. Move near its two users (with different names) to prepare for simplification. Signed-off-by: NAvi Kivity <avi@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 25 6月, 2012 3 次提交
-
-
由 Michael S. Tsirkin 提交于
Implementation of PV EOI using shared memory. This reduces the number of exits an interrupt causes as much as by half. The idea is simple: there's a bit, per APIC, in guest memory, that tells the guest that it does not need EOI. We set it before injecting an interrupt and clear before injecting a nested one. Guest tests it using a test and clear operation - this is necessary so that host can detect interrupt nesting - and if set, it can skip the EOI MSR. There's a new MSR to set the address of said register in guest memory. Otherwise not much changed: - Guest EOI is not required - Register is tested & ISR is automatically cleared on exit For testing results see description of previous patch 'kvm_para: guest side for eoi avoidance'. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Michael S. Tsirkin 提交于
We perform ISR lookups twice: during interrupt injection and on EOI. Typical workloads only have a single bit set there. So we can avoid ISR scans by 1. counting bits as we set/clear them in ISR 2. on set, caching the injected vector number 3. on clear, invalidating the cache The real purpose of this is enabling PV EOI which needs to quickly validate the vector. But non PV guests also benefit: with this patch, and without interrupt nesting, apic_find_highest_isr will always return immediately without scanning ISR. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Michael S. Tsirkin 提交于
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 27 12月, 2011 1 次提交
-
-
由 Avi Kivity 提交于
Needed to deliver performance monitoring interrupts. Signed-off-by: NAvi Kivity <avi@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 05 10月, 2011 1 次提交
-
-
由 Liu, Jinsong 提交于
This patch emulate lapic tsc deadline timer for guest: Enumerate tsc deadline timer capability by CPUID; Enable tsc deadline timer mode by lapic MMIO; Start tsc deadline timer by WRMSR; [jan: use do_div()] [avi: fix for !irqchip_in_kernel()] [marcelo: another fix for !irqchip_in_kernel()] Signed-off-by: NLiu, Jinsong <jinsong.liu@intel.com> Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 26 9月, 2011 1 次提交
-
-
由 Kevin Tian 提交于
Instruction emulation for EOI writes can be skipped, since sane guest simply uses MOV instead of string operations. This is a nice improvement when guest doesn't support x2apic or hyper-V EOI support. a single VM bandwidth is observed with ~8% bandwidth improvement (7.4Gbps->8Gbps), by saving ~5% cycles from EOI emulation. Signed-off-by: NKevin Tian <kevin.tian@intel.com> <Based on earlier work from>: Signed-off-by: NEddie Dong <eddie.dong@intel.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 18 3月, 2011 1 次提交
-
-
由 Takuya Yoshikawa 提交于
Access to this page is mostly done through the regs member which holds the address to this page. The exceptions are in vmx_vcpu_reset() and kvm_free_lapic() and these both can easily be converted to using regs. Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 01 3月, 2010 1 次提交
-
-
由 Gleb Natapov 提交于
Implement HYPER-V apic MSRs. Spec defines three MSRs that speed-up access to EOI/TPR/ICR apic registers for PV guests. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NVadim Rozenfeld <vrozenfe@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 10 9月, 2009 3 次提交
-
-
由 Gleb Natapov 提交于
This patch implements MSR interface to local apic as defines by x2apic Intel specification. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Gleb Natapov 提交于
Directed EOI is specified by x2APIC, but is available even when lapic is in xAPIC mode. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Gleb Natapov 提交于
Most of the time IRR is empty, so instead of scanning the whole IRR on each VM entry keep a variable that tells us if IRR is not empty. IRR will have to be scanned twice on each IRQ delivery, but this is much more rare than VM entry. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 10 6月, 2009 4 次提交
-
-
由 Gleb Natapov 提交于
Deliver interrupt during destination matching loop. Signed-off-by: NGleb Natapov <gleb@redhat.com> Acked-by: NXiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Gleb Natapov 提交于
Use kvm_apic_match_dest() in kvm_get_intr_delivery_bitmask() instead of duplicating the same code. Use kvm_get_intr_delivery_bitmask() in apic_send_ipi() to figure out ipi destination instead of reimplementing the logic. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Gleb Natapov 提交于
Get rid of ioapic_inj_irq() and ioapic_inj_nmi() functions. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Marcelo Tosatti 提交于
Hide the internals of vcpu awakening / injection from the in-kernel emulated timers. This makes future changes in this logic easier and decreases the distance to more generic timer handling. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 15 2月, 2009 1 次提交
-
-
由 Marcelo Tosatti 提交于
Simplify LAPIC TMCCT calculation by using hrtimer provided function to query remaining time until expiration. Fixes host hang with nested ESX. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-