提交 · 80bc97f2d84dbec6df6b36d6bbc0715c14f80c61 · openeuler / Kernel

16 5月, 2020 2 次提交

KVM: X86: TSCDEADLINE MSR emulation fastpath · ae95f566

由 Wanpeng Li 提交于 4月 28, 2020

This patch implements a fast path for emulation of writes to the TSCDEADLINE
MSR.  Besides shortcutting various housekeeping tasks in the vCPU loop,
the fast path can also deliver the timer interrupt directly without going
through KVM_REQ_PENDING_TIMER because it runs in vCPU context.
Tested-by: NHaiwei Li <lihaiwei@tencent.com>
Cc: Haiwei Li <lihaiwei@tencent.com>
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Message-Id: <1588055009-12677-7-git-send-email-wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ae95f566

KVM: x86: introduce kvm_can_use_hv_timer · 199a8b84

由 Paolo Bonzini 提交于 5月 05, 2020

Replace the ad hoc test in vmx_set_hv_timer with a test in the caller,
start_hv_timer. This test is not Intel-specific and would be duplicated
when introducing the fast path for the TSC deadline MSR.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

199a8b84

14 5月, 2020 1 次提交

kvm: Replace vcpu->swait with rcuwait · da4ad88c

由 Davidlohr Bueso 提交于 4月 23, 2020

The use of any sort of waitqueue (simple or regular) for
wait/waking vcpus has always been an overkill and semantically
wrong. Because this is per-vcpu (which is blocked) there is
only ever a single waiting vcpu, thus no need for any sort of
queue.

As such, make use of the rcuwait primitive, with the following
considerations:

  - rcuwait already provides the proper barriers that serialize
  concurrent waiter and waker.

  - Task wakeup is done in rcu read critical region, with a
  stable task pointer.

  - Because there is no concurrency among waiters, we need
  not worry about rcuwait_wait_event() calls corrupting
  the wait->task. As a consequence, this saves the locking
  done in swait when modifying the queue. This also applies
  to per-vcore wait for powerpc kvm-hv.

The x86 tscdeadline_latency test mentioned in 8577370f
("KVM: Use simple waitqueue for vcpu->wq") shows that, on avg,
latency is reduced by around 15-20% with this change.

Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-mips@vger.kernel.org
Reviewed-by: NMarc Zyngier <maz@kernel.org>
Signed-off-by: NDavidlohr Bueso <dbueso@suse.de>
Message-Id: <20200424054837.5138-6-dave@stgolabs.net>
[Avoid extra logic changes. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

da4ad88c

16 4月, 2020 1 次提交

KVM: x86: Return updated timer current count register from KVM_GET_LAPIC · 24647e0a

由 Peter Shier 提交于 10月 10, 2018

kvm_vcpu_ioctl_get_lapic (implements KVM_GET_LAPIC ioctl) does a bulk copy
of the LAPIC registers but must take into account that the one-shot and
periodic timer current count register is computed upon reads and is not
present in register state. When restoring LAPIC state (e.g. after
migration), restart timers from their their current count values at time of
save.

Note: When a one-shot timer expires, the code in arch/x86/kvm/lapic.c does
not zero the value of the LAPIC initial count register (emulating HW
behavior). If no other timer is run and pending prior to a subsequent
KVM_GET_LAPIC call, the returned register set will include the expired
one-shot initial count. On a subsequent KVM_SET_LAPIC call the code will
see a non-zero initial count and start a new one-shot timer using the
expired timer's count. This is a prior existing bug and will be addressed
in a separate patch. Thanks to jmattson@google.com for this find.
Signed-off-by: NPeter Shier <pshier@google.com>
Reviewed-by: NJim Mattson <jmattson@google.com>
Reviewed-by: NWanpeng Li <wanpengli@tencent.com>
Message-Id: <20181010225653.238911-1-pshier@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

24647e0a

07 4月, 2020 1 次提交

KVM: X86: Filter out the broadcast dest for IPI fastpath · 4064a4c6

由 Wanpeng Li 提交于 4月 02, 2020

Except destination shorthand, a destination value 0xffffffff is used to
broadcast interrupts, let's also filter out this for single target IPI
fastpath.
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Message-Id: <1585815626-28370-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4064a4c6

31 3月, 2020 1 次提交

KVM: x86: Copy kvm_x86_ops by value to eliminate layer of indirection · afaf0b2f

由 Sean Christopherson 提交于 3月 21, 2020

Replace the kvm_x86_ops pointer in common x86 with an instance of the
struct to save one pointer dereference when invoking functions. Copy the
struct by value to set the ops during kvm_init().

Arbitrarily use kvm_x86_ops.hardware_enable to track whether or not the
ops have been initialized, i.e. a vendor KVM module has been loaded.
Suggested-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200321202603.19355-7-sean.j.christopherson@intel.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

afaf0b2f

26 3月, 2020 1 次提交

KVM: X86: Micro-optimize IPI fastpath delay · d5361678

由 Wanpeng Li 提交于 3月 26, 2020

This patch optimizes the virtual IPI fastpath emulation sequence:

write ICR2                          send virtual IPI
read ICR2                           write ICR2
send virtual IPI         ==>        write ICR
write ICR

We can observe ~0.67% performance improvement for IPI microbenchmark
(https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/)
on Skylake server.
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Message-Id: <1585189202-1708-4-git-send-email-wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d5361678

24 3月, 2020 1 次提交

KVM: LAPIC: Also cancel preemption timer when disarm LAPIC timer · 94be4b85

由 Wanpeng Li 提交于 3月 24, 2020

The timer is disarmed when switching between TSC deadline and other modes,
we should set everything to disarmed state, however, LAPIC timer can be
emulated by preemption timer, it still works if vmx->hv_deadline_timer is
not -1. This patch also cancels preemption timer when disarm LAPIC timer.
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Message-Id: <1585031530-19823-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

94be4b85

23 3月, 2020 1 次提交

KVM: LAPIC: Mark hrtimer for period or oneshot mode to expire in hard interrupt context · edec6e01

由 He Zhe 提交于 3月 20, 2020

apic->lapic_timer.timer was initialized with HRTIMER_MODE_ABS_HARD but
started later with HRTIMER_MODE_ABS, which may cause the following warning
in PREEMPT_RT kernel.

WARNING: CPU: 1 PID: 2957 at kernel/time/hrtimer.c:1129 hrtimer_start_range_ns+0x348/0x3f0
CPU: 1 PID: 2957 Comm: qemu-system-x86 Not tainted 5.4.23-rt11 #1
Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1a 09/18/2018
RIP: 0010:hrtimer_start_range_ns+0x348/0x3f0
Code: 4d b8 0f 94 c1 0f b6 c9 e8 35 f1 ff ff 4c 8b 45
      b0 e9 3b fd ff ff e8 d7 3f fa ff 48 98 4c 03 34
      c5 a0 26 bf 93 e9 a1 fd ff ff <0f> 0b e9 fd fc ff
      ff 65 8b 05 fa b7 90 6d 89 c0 48 0f a3 05 60 91
RSP: 0018:ffffbc60026ffaf8 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff9d81657d4110 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000006cc7987bcf RDI: ffff9d81657d4110
RBP: ffffbc60026ffb58 R08: 0000000000000001 R09: 0000000000000010
R10: 0000000000000000 R11: 0000000000000000 R12: 0000006cc7987bcf
R13: 0000000000000000 R14: 0000006cc7987bcf R15: ffffbc60026d6a00
FS: 00007f401daed700(0000) GS:ffff9d81ffa40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000ffffffff CR3: 0000000fa7574000 CR4: 00000000003426e0
Call Trace:
? kvm_release_pfn_clean+0x22/0x60 [kvm]
start_sw_timer+0x85/0x230 [kvm]
? vmx_vmexit+0x1b/0x30 [kvm_intel]
kvm_lapic_switch_to_sw_timer+0x72/0x80 [kvm]
vmx_pre_block+0x1cb/0x260 [kvm_intel]
? vmx_vmexit+0xf/0x30 [kvm_intel]
? vmx_vmexit+0x1b/0x30 [kvm_intel]
? vmx_vmexit+0xf/0x30 [kvm_intel]
? vmx_vmexit+0x1b/0x30 [kvm_intel]
? vmx_vmexit+0xf/0x30 [kvm_intel]
? vmx_vmexit+0x1b/0x30 [kvm_intel]
? vmx_vmexit+0xf/0x30 [kvm_intel]
? vmx_vmexit+0xf/0x30 [kvm_intel]
? vmx_vmexit+0x1b/0x30 [kvm_intel]
? vmx_vmexit+0xf/0x30 [kvm_intel]
? vmx_vmexit+0x1b/0x30 [kvm_intel]
? vmx_vmexit+0xf/0x30 [kvm_intel]
? vmx_vmexit+0x1b/0x30 [kvm_intel]
? vmx_vmexit+0xf/0x30 [kvm_intel]
? vmx_vmexit+0x1b/0x30 [kvm_intel]
? vmx_vmexit+0xf/0x30 [kvm_intel]
? vmx_sync_pir_to_irr+0x9e/0x100 [kvm_intel]
? kvm_apic_has_interrupt+0x46/0x80 [kvm]
kvm_arch_vcpu_ioctl_run+0x85b/0x1fa0 [kvm]
? _raw_spin_unlock_irqrestore+0x18/0x50
? _copy_to_user+0x2c/0x30
kvm_vcpu_ioctl+0x235/0x660 [kvm]
? rt_spin_unlock+0x2c/0x50
do_vfs_ioctl+0x3e4/0x650
? __fget+0x7a/0xa0
ksys_ioctl+0x67/0x90
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x4d/0x120
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f4027cc54a7
Code: 00 00 90 48 8b 05 e9 59 0c 00 64 c7 00 26 00 00
      00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00
      00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff
      73 01 c3 48 8b 0d b9 59 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007f401dae9858 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00005558bd029690 RCX: 00007f4027cc54a7
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000d
RBP: 00007f4028b72000 R08: 00005558bc829ad0 R09: 00000000ffffffff
R10: 00005558bcf90ca0 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 00005558bce1c840
--[ end trace 0000000000000002 ]--
Signed-off-by: NHe Zhe <zhe.he@windriver.com>
Message-Id: <1584687967-332859-1-git-send-email-zhe.he@windriver.com>
Reviewed-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

edec6e01

17 3月, 2020 2 次提交

KVM: LAPIC: Recalculate apic map in batch · 4abaffce

由 Wanpeng Li 提交于 2月 26, 2020

In the vCPU reset and set APIC_BASE MSR path, the apic map will be recalculated
several times, each time it will consume 10+ us observed by ftrace in my
non-overcommit environment since the expensive memory allocate/mutex/rcu etc
operations. This patch optimizes it by recaluating apic map in batch, I hope
this can benefit the serverless scenario which can frequently create/destroy
VMs.

Before patch:

kvm_lapic_reset  ~27us

After patch:

kvm_lapic_reset  ~14us

Observed by ftrace, improve ~48%.
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4abaffce

KVM: apic: remove unused function apic_lvt_vector() · 999eabcc

由 Miaohe Lin 提交于 2月 13, 2020

The function apic_lvt_vector() is unused now, remove it.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

999eabcc

22 2月, 2020 2 次提交

KVM: apic: avoid calculating pending eoi from an uninitialized val · 23520b2d

由 Miaohe Lin 提交于 2月 21, 2020

When pv_eoi_get_user() fails, 'val' may remain uninitialized and the return
value of pv_eoi_get_pending() becomes random. Fix the issue by initializing
the variable.
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

23520b2d

KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1 · 91a5f413

由 Vitaly Kuznetsov 提交于 2月 20, 2020

Even when APICv is disabled for L1 it can (and, actually, is) still
available for L2, this means we need to always call
vmx_deliver_nested_posted_interrupt() when attempting an interrupt
delivery.
Suggested-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

91a5f413

13 2月, 2020 1 次提交

KVM: apic: reuse smp_wmb() in kvm_make_request() · 331ca0f8

由 Miaohe Lin 提交于 2月 07, 2020

kvm_make_request() provides smp_wmb() so pending_events changes are
guaranteed to be visible.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

331ca0f8

05 2月, 2020 1 次提交

kvm: lapic: Introduce APICv update helper function · b26a695a

由 Suravee Suthikulpanit 提交于 11月 14, 2019

Re-factor code into a helper function for setting lapic parameters when
activate/deactivate APICv, and export the function for subsequent usage.
Signed-off-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b26a695a

28 1月, 2020 3 次提交

Adding 'else' to reduce checking. · 17ac43a8

由 Haiwei Li 提交于 1月 16, 2020

These two conditions are in conflict, adding 'else' to reduce checking.
Signed-off-by: NHaiwei Li <lihaiwei@tencent.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

17ac43a8

KVM: apic: short-circuit kvm_apic_accept_pic_intr() when pic intr is accepted · 3ce4dc17

由 Miaohe Lin 提交于 1月 18, 2020

Short-circuit kvm_apic_accept_pic_intr() when pic intr is accepted, there
is no need to proceed further. Also remove unnecessary var r.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

3ce4dc17

KVM: x86: Protect kvm_lapic_reg_write() from Spectre-v1/L1TF attacks · 4bf79cb0

由 Marios Pomonis 提交于 12月 11, 2019

This fixes a Spectre-v1/L1TF vulnerability in kvm_lapic_reg_write().
This function contains index computations based on the
(attacker-controlled) MSR number.

Fixes: 0105d1a5 ("KVM: x2apic interface to lapic")
Signed-off-by: NNick Finco <nifi@google.com>
Signed-off-by: NMarios Pomonis <pomonis@google.com>
Reviewed-by: NAndrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Reviewed-by: NJim Mattson <jmattson@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4bf79cb0

21 1月, 2020 2 次提交

KVM: Fix some grammar mistakes · 00116795

由 Miaohe Lin 提交于 12月 11, 2019

Fix some grammar mistakes in the comments.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

00116795

KVM: Fix some comment typos and missing parentheses · 67b0ae43

由 Miaohe Lin 提交于 12月 11, 2019

Fix some typos and add missing parentheses in the comments.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

67b0ae43

09 1月, 2020 3 次提交

KVM: X86: Fix callers of kvm_apic_match_dest() to use correct macros · 5c69d5c1

由 Peter Xu 提交于 12月 04, 2019

Callers of kvm_apic_match_dest() should always pass in APIC_DEST_*
macros for either dest_mode and short_hand parameters.  Fix up all the
callers of kvm_apic_match_dest() that are not following the rule.

Since at it, rename the parameter from short_hand to shorthand in
kvm_apic_match_dest(), as suggested by Vitaly.
Reported-by: NSean Christopherson <sean.j.christopherson@intel.com>
Reported-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPeter Xu <peterx@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5c69d5c1

KVM: X86: Drop KVM_APIC_SHORT_MASK and KVM_APIC_DEST_MASK · ac8ef992

由 Peter Xu 提交于 12月 04, 2019

We have both APIC_SHORT_MASK and KVM_APIC_SHORT_MASK defined for the
shorthand mask.  Similarly, we have both APIC_DEST_MASK and
KVM_APIC_DEST_MASK defined for the destination mode mask.

Drop the KVM_APIC_* macros and replace the only user of them to use
the APIC_DEST_* macros instead.  At the meantime, move APIC_SHORT_MASK
and APIC_DEST_MASK from lapic.c to lapic.h.
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPeter Xu <peterx@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ac8ef992

KVM: X86: Fix kvm_bitmap_or_dest_vcpus() to use irq shorthand · b4b29636

由 Peter Xu 提交于 12月 04, 2019

The 3rd parameter of kvm_apic_match_dest() is the irq shorthand,
rather than the irq delivery mode.

Fixes: 7ee30bc1 ("KVM: x86: deliver KVM IOAPIC scan request to target vCPUs")
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPeter Xu <peterx@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b4b29636

15 11月, 2019 3 次提交

KVM: x86: deliver KVM IOAPIC scan request to target vCPUs · 7ee30bc1

由 Nitesh Narayan Lal 提交于 11月 07, 2019

In IOAPIC fixed delivery mode instead of flushing the scan
requests to all vCPUs, we should only send the requests to
vCPUs specified within the destination field.

This patch introduces kvm_get_dest_vcpus_mask() API which
retrieves an array of target vCPUs by using
kvm_apic_map_get_dest_lapic() and then based on the
vcpus_idx, it sets the bit in a bitmap. However, if the above
fails kvm_get_dest_vcpus_mask() finds the target vCPUs by
traversing all available vCPUs. Followed by setting the
bits in the bitmap.

If we had different vCPUs in the previous request for the
same redirection table entry then bits corresponding to
these vCPUs are also set. This to done to keep
ioapic_handled_vectors synchronized.

This bitmap is then eventually passed on to
kvm_make_vcpus_request_mask() to generate a masked request
only for the target vCPUs.

This would enable us to reduce the latency overhead on isolated
vCPUs caused by the IPI to process due to KVM_REQ_IOAPIC_SCAN.
Suggested-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NNitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7ee30bc1

KVM: APIC: add helper func to remove duplicate code in kvm_pv_send_ipi · 1a686237

由 Miaohe Lin 提交于 11月 09, 2019

There are some duplicate code in kvm_pv_send_ipi when deal with ipi
bitmap. Add helper func to remove it, and eliminate odd out label,
get rid of unnecessary kvm_lapic_irq field init and so on.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

1a686237

KVM: x86: Prevent set vCPU into INIT/SIPI_RECEIVED state when INIT are latched · 27cbe7d6

由 Liran Alon 提交于 11月 11, 2019

Commit 4b9852f4 ("KVM: x86: Fix INIT signal handling in various CPU states")
fixed KVM to also latch pending LAPIC INIT event when vCPU is in VMX
operation.

However, current API of KVM_SET_MP_STATE allows userspace to put vCPU
into KVM_MP_STATE_SIPI_RECEIVED or KVM_MP_STATE_INIT_RECEIVED even when
vCPU is in VMX operation.

Fix this by introducing a util method to check if vCPU state latch INIT
signals and use it in KVM_SET_MP_STATE handler.

Fixes: 4b9852f4 ("KVM: x86: Fix INIT signal handling in various CPU states")
Reported-by: NSean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: NMihai Carabas <mihai.carabas@oracle.com>
Signed-off-by: NLiran Alon <liran.alon@oracle.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

27cbe7d6

23 10月, 2019 1 次提交

KVM: SVM: Fix potential wrong physical id in avic_handle_ldr_update · 5c94ac5d

由 Miaohe Lin 提交于 10月 18, 2019

Guest physical APIC ID may not equal to vcpu->vcpu_id in some case.
We may set the wrong physical id in avic_handle_ldr_update as we
always use vcpu->vcpu_id. Get physical APIC ID from vAPIC page
instead.
Export and use kvm_xapic_id here and in avic_handle_apic_id_update
as suggested by Vitaly.
Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5c94ac5d

26 9月, 2019 1 次提交

KVM: LAPIC: Loosen filter for adaptive tuning of lapic_timer_advance_ns · a0f0037e

由 Wanpeng Li 提交于 9月 26, 2019

5000 guest cycles delta is easy to encounter on desktop, per-vCPU
lapic_timer_advance_ns always keeps at 1000ns initial value, let's
loosen the filter a bit to let adaptive tuning make progress.
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a0f0037e

24 9月, 2019 1 次提交

KVM: LAPIC: Tune lapic_timer_advance_ns smoothly · d0f5a86a

由 Wanpeng Li 提交于 9月 17, 2019

Filter out drastic fluctuation and random fluctuation, remove
timer_advance_adjust_done altogether, the adjustment would be
continuous.
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d0f5a86a

12 9月, 2019 2 次提交

KVM: x86: Fix INIT signal handling in various CPU states · 4b9852f4

由 Liran Alon 提交于 8月 26, 2019

Commit cd7764fe ("KVM: x86: latch INITs while in system management mode")
changed code to latch INIT while vCPU is in SMM and process latched INIT
when leaving SMM. It left a subtle remark in commit message that similar
treatment should also be done while vCPU is in VMX non-root-mode.

However, INIT signals should actually be latched in various vCPU states:
(*) For both Intel and AMD, INIT signals should be latched while vCPU
is in SMM.
(*) For Intel, INIT should also be latched while vCPU is in VMX
operation and later processed when vCPU leaves VMX operation by
executing VMXOFF.
(*) For AMD, INIT should also be latched while vCPU runs with GIF=0
or in guest-mode with intercept defined on INIT signal.

To fix this:
1) Add kvm_x86_ops->apic_init_signal_blocked() such that each CPU vendor
can define the various CPU states in which INIT signals should be
blocked and modify kvm_apic_accept_events() to use it.
2) Modify vmx_check_nested_events() to check for pending INIT signal
while vCPU in guest-mode. If so, emualte vmexit on
EXIT_REASON_INIT_SIGNAL. Note that nSVM should have similar behaviour
but is currently left as a TODO comment to implement in the future
because nSVM don't yet implement svm_check_nested_events().

Note: Currently KVM nVMX implementation don't support VMX wait-for-SIPI
activity state as specified in MSR_IA32_VMX_MISC bits 6:8 exposed to
guest (See nested_vmx_setup_ctls_msrs()).
If and when support for this activity state will be implemented,
kvm_check_nested_events() would need to avoid emulating vmexit on
INIT signal in case activity-state is wait-for-SIPI. In addition,
kvm_apic_accept_events() would need to be modified to avoid discarding
SIPI in case VMX activity-state is wait-for-SIPI but instead delay
SIPI processing to vmx_check_nested_events() that would clear
pending APIC events and emulate vmexit on SIPI.
Reviewed-by: NJoao Martins <joao.m.martins@oracle.com>
Co-developed-by: NNikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: NNikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: NLiran Alon <liran.alon@oracle.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4b9852f4

KVM: LAPIC: Micro optimize IPI latency · 2b0911d1

由 Wanpeng Li 提交于 9月 05, 2019

This patch optimizes the virtual IPI emulation sequence:

write ICR2                     write ICR2
write ICR                      read ICR2
read ICR            ==>        send virtual IPI
read ICR2                      write ICR
send virtual IPI

It can reduce kvm-unit-tests/vmexit.flat IPI testing latency(from sender
send IPI to sender receive the ACK) from 3319 cycles to 3203 cycles on
SKylake server.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

2b0911d1

14 8月, 2019 1 次提交

kvm: x86: skip populating logical dest map if apic is not sw enabled · b14c876b

由 Radim Krcmar 提交于 8月 13, 2019

recalculate_apic_map does not santize ldr and it's possible that
multiple bits are set. In that case, a previous valid entry
can potentially be overwritten by an invalid one.

This condition is hit when booting a 32 bit, >8 CPU, RHEL6 guest and then
triggering a crash to boot a kdump kernel. This is the sequence of
events:
1. Linux boots in bigsmp mode and enables PhysFlat, however, it still
writes to the LDR which probably will never be used.
2. However, when booting into kdump, the stale LDR values remain as
they are not cleared by the guest and there isn't a apic reset.
3. kdump boots with 1 cpu, and uses Logical Destination Mode but the
logical map has been overwritten and points to an inactive vcpu.
Signed-off-by: NRadim Krcmar <rkrcmar@redhat.com>
Signed-off-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b14c876b

05 8月, 2019 1 次提交

KVM: LAPIC: Don't need to wakeup vCPU twice afer timer fire · a48d06f9

由 Wanpeng Li 提交于 8月 01, 2019

kvm_set_pending_timer() will take care to wake up the sleeping vCPU which
has pending timer, don't need to check this in apic_timer_expired() again.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a48d06f9

02 8月, 2019 1 次提交

KVM: LAPIC: Mark hrtimer to expire in hard interrupt context · 2c0d278f

由 Sebastian Andrzej Siewior 提交于 7月 26, 2019

On PREEMPT_RT enabled kernels unmarked hrtimers are moved into soft
interrupt expiry mode by default.

While that's not a functional requirement for the KVM local APIC timer
emulation, it's a latency issue which can be avoided by marking the timer
so hard interrupt context expiry is enforced.

No functional change.

[ tglx: Split out from larger combo patch. Add changelog. ]
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190726185753.363363474@linutronix.de

2c0d278f

20 7月, 2019 1 次提交

KVM: LAPIC: Inject timer interrupt via posted interrupt · 0c5f81da

由 Wanpeng Li 提交于 7月 06, 2019

Dedicated instances are currently disturbed by unnecessary jitter due
to the emulated lapic timers firing on the same pCPUs where the
vCPUs reside. There is no hardware virtual timer on Intel for guest
like ARM, so both programming timer in guest and the emulated timer fires
incur vmexits. This patch tries to avoid vmexit when the emulated timer
fires, at least in dedicated instance scenario when nohz_full is enabled.

In that case, the emulated timers can be offload to the nearest busy
housekeeping cpus since APICv has been found for several years in server
processors. The guest timer interrupt can then be injected via posted interrupts,
which are delivered by the housekeeping cpu once the emulated timer fires.

The host should tuned so that vCPUs are placed on isolated physical
processors, and with several pCPUs surplus for busy housekeeping.
If disabled mwait/hlt/pause vmexits keep the vCPUs in non-root mode,
~3% redis performance benefit can be observed on Skylake server, and the
number of external interrupt vmexits drops substantially. Without patch

VM-EXIT Samples Samples% Time% Min Time Max Time Avg time
EXTERNAL_INTERRUPT 42916 49.43% 39.30% 0.47us 106.09us 0.71us ( +- 1.09% )

While with patch:

VM-EXIT Samples Samples% Time% Min Time Max Time Avg time
EXTERNAL_INTERRUPT 6871 9.29% 2.96% 0.44us 57.88us 0.72us ( +- 4.02% )

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0c5f81da

18 7月, 2019 1 次提交

KVM: LAPIC: Make lapic timer unpinned · 4d151bf3

由 Wanpeng Li 提交于 7月 06, 2019

Commit 61abdbe0 ("kvm: x86: make lapic hrtimer pinned") pinned the
lapic timer to avoid to wait until the next kvm exit for the guest to
see KVM_REQ_PENDING_TIMER set. There is another solution to give a kick
after setting the KVM_REQ_PENDING_TIMER bit, make lapic timer unpinned
will be used in follow up patches.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4d151bf3

16 7月, 2019 1 次提交

kvm: x86: ioapic and apic debug macros cleanup · 0d88800d

由 Yi Wang 提交于 7月 06, 2019

The ioapic_debug and apic_debug have been not used
for years, and kvm tracepoints are enough for debugging,
so remove them as Paolo suggested.

However, there may be something wrong when pv evi get/put
user, so it's better to retain some log there.
Signed-off-by: NYi Wang <wang.yi59@zte.com.cn>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0d88800d

06 7月, 2019 1 次提交

KVM: LAPIC: Retry tune per-vCPU timer_advance_ns if adaptive tuning goes insane · 548f7fb2

由 Wanpeng Li 提交于 7月 05, 2019

Retry tune per-vCPU timer_advance_ns if adaptive tuning goes insane which
can happen sporadically in product environment.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

548f7fb2

05 7月, 2019 2 次提交

kvm: LAPIC: write down valid APIC registers · 01402cf8

由 Paolo Bonzini 提交于 7月 05, 2019

Replace a magic 64-bit mask with a list of valid registers, computing
the same mask in the end.
Suggested-by: NLiran Alon <liran.alon@oracle.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

01402cf8

KVM: LAPIC: ARBPRI is a reserved register for x2APIC · 101628de

由 Paolo Bonzini 提交于 7月 05, 2019

kvm-unit-tests were adjusted to match bare metal behavior, but KVM
itself was not doing what bare metal does; fix that.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

101628de

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功