1. 01 May 2019 · 1 commit
    • KVM: x86: Omit caching logic for always-available GPRs · de3cd117
      Authored by Sean Christopherson
      Except for RSP and RIP, which are held in VMX's VMCS, GPRs are always
      treated as "available and dirty" on both VMX and SVM, i.e. they are
      unconditionally loaded/saved immediately before/after VM-Enter/VM-Exit.
      
      Eliminating the unnecessary caching code reduces the size of KVM by a
      non-trivial amount, much of which comes from the most common code paths.
      E.g. on x86_64, kvm_emulate_cpuid() is reduced from 342 to 182 bytes and
      kvm_emulate_hypercall() from 1362 to 1143, with the total size of KVM
      dropping by ~1000 bytes.  With CONFIG_RETPOLINE=y, the numbers are even
      more pronounced, e.g.: 353->182, 1418->1172 and well over 2000 bytes.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      de3cd117
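
      As a rough illustration of what the commit removes, the self-contained C
      sketch below contrasts an "is this register cached yet?" check in front of
      every GPR read with a direct read for registers that are always reloaded
      around VM-Enter/VM-Exit. The enum, struct, and function names are simplified
      stand-ins, not the actual KVM accessors.

      #include <stdio.h>

      enum reg { REG_RAX, REG_RCX, REG_RSP, REG_RIP, NR_REGS };

      struct vcpu {
              unsigned long regs[NR_REGS];
              unsigned long regs_avail;        /* bitmap of "cached" registers */
      };

      /* Old style: every read pays for an availability check. */
      static unsigned long read_reg_cached(struct vcpu *v, enum reg r)
      {
              if (!(v->regs_avail & (1UL << r)))
                      v->regs_avail |= 1UL << r;   /* would call vendor code here */
              return v->regs[r];
      }

      /* New style for always-available GPRs: plain array access, smaller code. */
      static unsigned long read_rax(struct vcpu *v)
      {
              return v->regs[REG_RAX];
      }

      int main(void)
      {
              struct vcpu v = { .regs = { [REG_RAX] = 0x1234 } };

              printf("cached: %#lx direct: %#lx\n",
                     read_reg_cached(&v, REG_RAX), read_rax(&v));
              return 0;
      }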
  2. 19 Apr 2019 · 1 commit
    • x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012 · da66761c
      Authored by Vitaly Kuznetsov
      It was reported that with some special Multi Processor Group configuration,
      e.g:
       bcdedit.exe /set groupsize 1
       bcdedit.exe /set maxgroup on
       bcdedit.exe /set groupaware on
      for a 16-vCPU guest, WS2012 shows a BSOD on boot when the PV TLB flush
      mechanism is in use.
      
      Tracing kvm_hv_flush_tlb immediately reveals the issue:
      
       kvm_hv_flush_tlb: processor_mask 0x0 address_space 0x0 flags 0x2
      
      The only flag set in this request is HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
      however, processor_mask is 0x0 and HV_FLUSH_ALL_PROCESSORS is not specified.
      We therefore flush nothing, which is apparently not what Windows expects.
      
      The TLFS doesn't say anything about such requests, and newer Windows versions
      seem to be unaffected. This all looks like a WS2012 bug which is, however,
      easy to work around in KVM: flush everything when we see an empty flush
      request, since over-flushing doesn't hurt.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      da66761c
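
      The workaround boils down to one extra condition when deciding which vCPUs
      to flush. Below is a minimal, self-contained C sketch of that decision with
      simplified types; the structure layout and helper name are illustrative
      rather than the actual kvm_hv_flush_tlb() code (HV_FLUSH_ALL_PROCESSORS is
      bit 0 per the TLFS).

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      #define HV_FLUSH_ALL_PROCESSORS (1u << 0)

      struct hv_tlb_flush {
              uint64_t processor_mask;
              uint32_t flags;
      };

      /* Should this request be treated as "flush on all vCPUs"? */
      static bool flush_all_cpus(const struct hv_tlb_flush *req)
      {
              if (req->flags & HV_FLUSH_ALL_PROCESSORS)
                      return true;
              /*
               * WS2012 workaround: an empty processor_mask without
               * HV_FLUSH_ALL_PROCESSORS means "flush everything";
               * over-flushing is harmless.
               */
              return req->processor_mask == 0;
      }

      int main(void)
      {
              /* The buggy request from the trace: mask 0x0, flags 0x2. */
              struct hv_tlb_flush buggy = { .processor_mask = 0, .flags = 0x2 };

              printf("flush all vCPUs: %d\n", flush_all_cpus(&buggy));
              return 0;
      }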
  3. 29 Mar 2019 · 1 commit
    • x86/kvm/hyper-v: avoid spurious pending stimer on vCPU init · 013cc6eb
      Authored by Vitaly Kuznetsov
      When userspace initializes guest vCPUs it may want to zero all supported
      MSRs, including Hyper-V related ones such as HV_X64_MSR_STIMERn_CONFIG/
      HV_X64_MSR_STIMERn_COUNT. With commit f3b138c5 ("kvm/x86: Update SynIC
      timers on guest entry only") we began calling stimer_mark_pending()
      unconditionally on every config change.
      
      The issue I'm observing manifests itself as follows:
      - QEMU writes 0 to the STIMERn_{CONFIG,COUNT} MSRs; KVM marks all stimers as
        pending in stimer_pending_bitmap and arms KVM_REQ_HV_STIMER;
      - kvm_hv_has_stimer_pending() starts returning true;
      - kvm_vcpu_has_events() starts returning true;
      - kvm_arch_vcpu_runnable() starts returning true;
      - when kvm_arch_vcpu_ioctl_run() gets into
        (vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED) case:
        - kvm_vcpu_block() gets in 'kvm_vcpu_check_block(vcpu) < 0' and returns
          immediately, avoiding normal wait path;
        - -EAGAIN is returned from kvm_arch_vcpu_ioctl_run() immediately forcing
          userspace to retry.
      
      So instead of the normal wait path we get a busy loop on all secondary vCPUs
      before they get the INIT signal. This seems undesirable, especially given
      that it happens even when Hyper-V extensions are not used.
      
      Generally, it seems pointless to mark a stimer as pending in
      stimer_pending_bitmap and arm KVM_REQ_HV_STIMER, as the only thing
      kvm_hv_process_stimers() will do is clear the corresponding bit. We can
      simply not mark disabled timers as pending instead.
      
      Fixes: f3b138c5 ("kvm/x86: Update SynIC timers on guest entry only")
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      013cc6eb
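
      A minimal, self-contained sketch of the fix: only an enabled timer is marked
      pending on a config write, so zero-initializing the MSRs no longer raises a
      spurious request. Types and names below are simplified stand-ins for the
      kvm_hv_* code, not the actual implementation.

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      #define STIMER_CONFIG_ENABLE 0x1ULL     /* enable bit in the config MSR */

      struct stimer {
              uint64_t config;
              uint64_t count;
              bool pending;                   /* stand-in for stimer_pending_bitmap */
      };

      static void stimer_mark_pending(struct stimer *t)
      {
              t->pending = true;              /* would also arm KVM_REQ_HV_STIMER */
      }

      static void stimer_set_config(struct stimer *t, uint64_t config)
      {
              t->config = config;
              /* Only a timer that is actually enabled needs processing. */
              if (config & STIMER_CONFIG_ENABLE)
                      stimer_mark_pending(t);
      }

      int main(void)
      {
              struct stimer t = { 0 };

              stimer_set_config(&t, 0);       /* userspace zeroing the MSR */
              printf("pending after zero-init: %d\n", t.pending);
              return 0;
      }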
  4. 21 Feb 2019 · 1 commit
    • kvm: x86: Add memcg accounting to KVM allocations · 254272ce
      Authored by Ben Gardon
      There are many KVM kernel memory allocations which are tied to the life of
      the VM process and should be charged to the VM process's cgroup. If the
      allocations aren't tied to the process, the OOM killer will not know
      that killing the process will free the associated kernel memory.
      Add __GFP_ACCOUNT flags to many of the allocations which are not yet being
      charged to the VM process's cgroup.
      
      Tested:
      	Ran all kvm-unit-tests on a 64-bit Haswell machine; the patch
      	introduced no new failures.
      	Ran a kernel memory accounting test which creates a VM to touch
      	memory and then checks that the kernel memory allocated for the
      	process is within certain bounds.
      	With this patch we account for much more of the vmalloc and slab memory
      	allocated for the VM.
      
      There remain a few allocations which should be charged to the VM's
      cgroup but are not. In x86, they include:
      	vcpu->arch.pio_data
      These allocations are left unaccounted in this patch because they are mapped
      to userspace, and accounting them to a cgroup causes problems. This
      should be addressed in a future patch.
      Signed-off-by: Ben Gardon <bgardon@google.com>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      254272ce
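
      The mechanical change is just the GFP flag: allocations tied to the VM's
      lifetime pass GFP_KERNEL_ACCOUNT (GFP_KERNEL | __GFP_ACCOUNT) so the memory
      is charged to the allocating process's memcg. The kernel-style fragment
      below illustrates the pattern; the helper name is hypothetical and it is
      not a standalone program.

      #include <linux/slab.h>

      /* Hypothetical helper showing the allocation pattern used by the patch. */
      static void *alloc_vm_private_data(size_t size)
      {
              /*
               * Before: kzalloc(size, GFP_KERNEL). That memory was invisible to
               * memcg, so the OOM killer could not attribute it to the VM process.
               */
              return kzalloc(size, GFP_KERNEL_ACCOUNT);
      }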
  5. 26 Jan 2019 · 4 commits
  6. 15 Dec 2018 · 8 commits
  7. 17 Oct 2018 · 11 commits
  8. 06 Aug 2018 · 1 commit
    • KVM: x86: ensure all MSRs can always be KVM_GET/SET_MSR'd · 44883f01
      Authored by Paolo Bonzini
      Some of the MSRs returned by GET_MSR_INDEX_LIST currently cannot be sent back
      to KVM_GET_MSR and/or KVM_SET_MSR; either they can never be sent back, or
      they are only accepted under special conditions.  This makes the API a pain
      to use.
      
      To avoid this pain, this patch makes it so that the result of the get-list
      ioctl can always be used for host-initiated get and set.  Since we don't have
      a separate way to check for read-only MSRs, this means some Hyper-V MSRs are
      ignored when written.  Arguably they should not even be in the result of
      GET_MSR_INDEX_LIST, but I am leaving them there in case userspace is using
      the outcome of GET_MSR_INDEX_LIST to derive support for the corresponding
      Hyper-V feature.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      44883f01
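
      A self-contained sketch of the resulting behaviour: for a host-initiated
      access (KVM_SET_MSR from userspace), every index the list reports is
      accepted, and a write to an effectively read-only MSR is silently ignored
      rather than rejected. The MSR index and function names below are
      hypothetical, not the actual KVM handlers.

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      #define HV_MSR_READ_ONLY_EXAMPLE 0x4000ffffu    /* hypothetical index */

      struct msr_access {
              uint32_t index;
              uint64_t data;
              bool host_initiated;    /* true for KVM_GET_MSR/KVM_SET_MSR ioctls */
      };

      /* Returns 0 on success, non-zero if the access is rejected. */
      static int set_msr(const struct msr_access *m)
      {
              switch (m->index) {
              case HV_MSR_READ_ONLY_EXAMPLE:
                      /* Accept and ignore host-initiated writes to read-only MSRs
                       * so the result of GET_MSR_INDEX_LIST is always settable. */
                      return m->host_initiated ? 0 : 1;
              default:
                      return 1;
              }
      }

      int main(void)
      {
              struct msr_access m = {
                      .index = HV_MSR_READ_ONLY_EXAMPLE,
                      .host_initiated = true,
              };

              printf("set_msr returned %d\n", set_msr(&m));
              return 0;
      }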
  9. 26 May 2018 · 5 commits
  10. 11 May 2018 · 2 commits
  11. 29 Mar 2018 · 1 commit
  12. 24 Mar 2018 · 1 commit
  13. 17 Mar 2018 · 3 commits