Commits · eabeaaccfca0ed61b8e00a09b8cfa703c4f11b59 · openeuler / Kernel

13 Mar, 2013 2 commits

KVM: nVMX: Clean up and fix pin-based execution controls · eabeaacc

Committed by Jan Kiszka on Mar 13, 2013

Only interrupt and NMI exiting are mandatory for KVM to work and can
therefore be exposed to the guest unconditionally. Virtual NMI exiting is
optional, so we must not advertise it unless the host supports it.

Introduce the symbolic constant PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
while at it.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

eabeaacc
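
The rule in this message can be illustrated with a minimal sketch. It is not the patch itself: the helper name and the `host_allowed1` parameter are hypothetical stand-ins for whatever the host capability MSR reports, while the bit values follow the pin-based VM-execution control layout in the Intel SDM.

```c
#include <stdint.h>

/* Pin-based VM-execution control bits (Intel SDM layout). */
#define PIN_BASED_EXT_INTR_MASK             0x00000001u
#define PIN_BASED_NMI_EXITING               0x00000008u
#define PIN_BASED_VIRTUAL_NMIS              0x00000020u
/* Bits reported as 1 when the "true" capability MSRs are not used. */
#define PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR 0x00000016u

/*
 * Hypothetical helper: build the allowed-1 word of the pin-based
 * controls that is exposed to the L1 guest. host_allowed1 is assumed
 * to hold the allowed-1 bits the host CPU reports.
 */
static uint32_t nested_pinbased_ctls_high(uint32_t host_allowed1)
{
    /* Mandatory for KVM, so always offered to the guest. */
    uint32_t high = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;

    /* Optional: only advertise virtual NMIs if the host has them. */
    if (host_allowed1 & PIN_BASED_VIRTUAL_NMIS)
        high |= PIN_BASED_VIRTUAL_NMIS;

    return high | PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
}
```

With a host that lacks virtual NMIs the sketch yields 0x1f; with virtual NMIs available it yields 0x3f.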

KVM: x86: Rework INIT and SIPI handling · 66450a21

Committed by Jan Kiszka on Mar 13, 2013

A VCPU sending INIT or SIPI to some other VCPU races for setting the
remote VCPU's mp_state. When we were unlucky, KVM_MP_STATE_INIT_RECEIVED
was overwritten by kvm_emulate_halt and thus got lost.

This introduces APIC events for those two signals, keeping them in
kvm_apic until kvm_apic_accept_events is run in the target vcpu
context. kvm_apic_has_events reports to kvm_arch_vcpu_runnable whether
there are pending events, and thus whether vcpu blocking should end.

The patch comes with the side effect of effectively obsoleting
KVM_MP_STATE_SIPI_RECEIVED. We still accept it from user space, but
immediately translate it to KVM_MP_STATE_INIT_RECEIVED + KVM_APIC_SIPI.
The vcpu itself will no longer enter the KVM_MP_STATE_SIPI_RECEIVED
state. That also means we no longer exit to user space after receiving a
SIPI event.

Furthermore, we now reset the VCPU already on INIT, only fixing up the
code segment later on when SIPI arrives. Moreover, we fix INIT handling
for the BSP: it never enters wait-for-SIPI but directly starts over on
INIT.

Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

66450a21
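
The event mechanism described here can be sketched roughly as follows. This is a simplified user-space model, not the kernel code: every type and function name in it is hypothetical, and it uses C11 atomics where the kernel has its own bit operations. The point it shows is that senders only set a pending bit, while the target vcpu alone changes its own state.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical event bits, standing in for the INIT and SIPI events. */
enum { EVENT_INIT = 1u << 0, EVENT_SIPI = 1u << 1 };

enum mp_state { MP_RUNNABLE, MP_HALTED, MP_INIT_RECEIVED };

struct vcpu_sketch {
    atomic_uint pending_events;  /* written by any sender */
    enum mp_state mp_state;      /* written only by the target vcpu */
    bool is_bsp;
};

/* Sender side: record the event without touching mp_state. */
static void post_event(struct vcpu_sketch *v, unsigned int ev)
{
    atomic_fetch_or(&v->pending_events, ev);
}

/* Used to decide whether a blocked vcpu should wake up. */
static bool has_events(struct vcpu_sketch *v)
{
    return atomic_load(&v->pending_events) != 0;
}

/* Target side: consume the events in the vcpu's own context. */
static void accept_events(struct vcpu_sketch *v)
{
    unsigned int ev = atomic_exchange(&v->pending_events, 0);

    if (ev & EVENT_INIT)  /* the BSP restarts directly, APs wait for SIPI */
        v->mp_state = v->is_bsp ? MP_RUNNABLE : MP_INIT_RECEIVED;

    /* Simplification: a SIPI that arrives outside wait-for-SIPI is dropped. */
    if ((ev & EVENT_SIPI) && v->mp_state == MP_INIT_RECEIVED)
        v->mp_state = MP_RUNNABLE;
}
```

Because the halt path never races with a remote write to `mp_state` in this model, an INIT can no longer be lost the way the message describes.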

12 Mar, 2013 1 commit

KVM: x86: Drop unused return code from VCPU reset callback · 57f252f2

Committed by Jan Kiszka on Mar 12, 2013

Neither vmx nor svm nor the common part can generate an error on
kvm_vcpu_reset, so drop the return code.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

57f252f2

11 Mar, 2013 1 commit

kvm: remove cast for kmalloc return value · 0fa24ce3

Committed by Ioan Orghici on Mar 10, 2013

Signed-off-by: Ioan Orghici <ioan.orghici@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

0fa24ce3

08 Mar, 2013 2 commits

KVM: nVMX: Fix setting of CR0 and CR4 in guest mode · 1a0d74e6

Committed by Jan Kiszka on Mar 07, 2013

The logic for calculating the value with which we call kvm_set_cr0/4 was
broken (this will definitely become visible with nested unrestricted
guest mode support). Also, we performed the check regarding
CR0_ALWAYSON too early when in guest mode.

What really needs to be done on both CR0 and CR4 is to mask out L1-owned
bits and merge them in from L1's guest_cr0/4. In contrast, arch.cr0/4
and arch.cr0/4_guest_owned_bits contain the mangled L0+L1 state and are
thus not suited as input.

For both CRs, we can then apply the check against VMXON_CRx_ALWAYSON and
refuse the update if it fails. To be fully consistent, we now implement
this check for CR4 as well. For CR4, we move the check into vmx_set_cr4,
while for CR0 we keep it in handle_set_cr0. This is because the CR0
checks for vmxon vs. guest mode will diverge soon when adding
unrestricted guest mode support.

Finally, we have to set the shadow to the value L2 originally wanted to
write.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

1a0d74e6
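
The mask-and-merge step and the subsequent ALWAYSON check can be sketched as below. This is an illustration under assumptions rather than the patch: the helper names and parameters are hypothetical, and the ALWAYSON set is assumed here to be PE, NE and PG.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_PE (UINT64_C(1) << 0)
#define X86_CR0_NE (UINT64_C(1) << 5)
#define X86_CR0_PG (UINT64_C(1) << 31)

/* Assumed for this sketch: the CR0 bits that must stay set under VMX. */
#define VMXON_CR0_ALWAYSON (X86_CR0_PE | X86_CR0_NE | X86_CR0_PG)

/*
 * Hypothetical helper: take the value L2 tried to write, drop the bits
 * that L1 owns (l1_mask) and take those bits from L1's guest_cr value.
 */
static uint64_t merge_l1_owned_bits(uint64_t l2_val, uint64_t l1_guest_cr,
                                    uint64_t l1_mask)
{
    return (l2_val & ~l1_mask) | (l1_guest_cr & l1_mask);
}

/* The update is refused unless every ALWAYSON bit is set. */
static bool cr0_update_allowed(uint64_t val)
{
    return (val & VMXON_CR0_ALWAYSON) == VMXON_CR0_ALWAYSON;
}
```

For example, if L1 owns PE and PG and has them set, an L2 write of only NE merges to 0x80000021 and passes the check, whereas the same write with L1's copy cleared does not.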

KVM: nVMX: Fix content of MSR_IA32_VMX_ENTRY/EXIT_CTLS · 33fb20c3

Committed by Jan Kiszka on Mar 06, 2013

Properly set to 1 those bits that the spec demands when bit 55 of
VMX_BASIC is 0, as in our case.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

33fb20c3
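
As a rough sketch of what "set those bits to 1" means for a capability MSR: the low 32 bits list controls that must be 1, the high 32 bits list controls that may be 1. The constants below are the default1 bit sets given by the Intel SDM for the exit and entry controls; the struct and helper names are hypothetical and not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* default1 bits of the VM-exit and VM-entry controls (Intel SDM). */
#define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR  0x00036dffu
#define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ffu

/* Hypothetical view of one capability MSR. */
struct ctl_caps {
    uint32_t low;   /* bits that must be 1 */
    uint32_t high;  /* bits that may be 1 */
};

/* Report the default1 bits as both required and allowed. */
static struct ctl_caps nested_ctl_caps(uint32_t supported, uint32_t default1)
{
    struct ctl_caps caps = { .low = default1, .high = supported | default1 };
    return caps;
}

/* A control word is valid if it sets all of low and nothing outside high. */
static bool ctl_value_valid(struct ctl_caps caps, uint32_t val)
{
    return (val & caps.low) == caps.low && (val & ~caps.high) == 0;
}
```

A guest that reads such a pair and sets only the optional bits it wants, without the default1 bits, would produce an invalid control word in this model.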

06 Mar, 2013 1 commit

KVM: nVMX: Reset RFLAGS on VM-exit · c4627c72

Committed by Jan Kiszka on Mar 03, 2013

Ouch, how could this work so well so far? We need to clear RFLAGS to
the reset value as specified by the SDM. In particular, IF must be off
after VM-exit!

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

c4627c72

05 Mar, 2013 2 commits

KVM: nVMX: Fix switching of debug state · 503cd0c5

Committed by Jan Kiszka on Mar 03, 2013

First of all, do not blindly overwrite GUEST_DR7 on L2 entry. The host
may have guest debugging enabled. Then properly reset DR7 and DEBUG_CTL
on L2->L1 switch as specified in the SDM.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

503cd0c5

KVM: set_memory_region: Drop user_alloc from set_memory_region() · 47ae31e2

Committed by Takuya Yoshikawa on Feb 27, 2013

Except for ia64's stale KVM_SET_MEMORY_REGION support code, this is only
used for sanity checks in __kvm_set_memory_region(), which can easily
be changed to use the slot id instead.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

47ae31e2

28 Feb, 2013 2 commits

KVM: VMX: Pass vcpu to __vmx_complete_interrupts · 3ab66e8a

Committed by Jan Kiszka on Feb 20, 2013

Cleanup: __vmx_complete_interrupts has no use for the vmx structure.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

3ab66e8a

KVM: nVMX: Avoid one redundant vmcs_read in prepare_vmcs12 · 44ceb9d6

Committed by Jan Kiszka on Feb 20, 2013

IDT_VECTORING_INFO_FIELD was already read right after vmexit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

44ceb9d6

27 Feb, 2013 4 commits

KVM: nVMX: Use cached exit reason · 957c897e

Committed by Jan Kiszka on Feb 24, 2013

No need to re-read what vmx_vcpu_run already picked up for us.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

957c897e

KVM: nVMX: Clear segment cache after switching between L1 and L2 · 36c3cc42

Committed by Jan Kiszka on Feb 23, 2013

Switching the VMCS obviously invalidates what may have been cached about
the guest segments.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

36c3cc42

KVM: nVMX: Advertise PAUSE and WBINVD exiting support · d6851fbe

Committed by Jan Kiszka on Feb 23, 2013

These exits have no preconditions, and we already process the
corresponding reasons in nested_vmx_exit_handled correctly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

d6851fbe

KVM: VMX: Make prepare_vmcs12 and load_vmcs12_host_state static · 733568f9

Committed by Jan Kiszka on Feb 23, 2013

Both are only used locally.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

733568f9

22 Feb, 2013 2 commits

KVM: nVMX: Trap unconditionally if msr bitmap access fails · bd31a7f5

Committed by Jan Kiszka on Feb 14, 2013

This avoids basing decisions on uninitialized variables, potentially
leaking kernel data to the L1 guest.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

bd31a7f5
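
The fail-closed behaviour can be sketched as follows. This is not the kernel function: `read_guest_fn` and the two stub readers are hypothetical stand-ins for a guest-memory read that may fail, while the bitmap layout (read-low, read-high, write-low, write-high quarters of 1 KiB each) follows the Intel SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guest-memory reader: returns 0 on success. */
typedef int (*read_guest_fn)(uint64_t gpa, uint8_t *out);

/* Stub readers used only to exercise the sketch. */
static int read_guest_fail(uint64_t gpa, uint8_t *out)
{
    (void)gpa; (void)out;
    return -1;
}

static int read_guest_zero(uint64_t gpa, uint8_t *out)
{
    (void)gpa;
    *out = 0;
    return 0;
}

/* Decide whether an MSR access by L2 must be reflected to L1. */
static bool msr_exits_to_l1(read_guest_fn read_guest, uint64_t bitmap_gpa,
                            uint32_t msr, bool is_write)
{
    uint8_t byte;

    if (is_write)
        bitmap_gpa += 2048;          /* write bitmaps follow the read ones */
    if (msr >= 0xc0000000u) {
        msr -= 0xc0000000u;          /* high MSR range */
        bitmap_gpa += 1024;
    }
    if (msr >= 1024 * 8)
        return true;                 /* not covered by the bitmap */

    /* If the bitmap cannot be read, trap instead of using garbage. */
    if (read_guest(bitmap_gpa + msr / 8, &byte) != 0)
        return true;

    return ((byte >> (msr & 7)) & 1) != 0;
}
```

The key line is the failure branch: `byte` is never inspected unless the read succeeded.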

KVM: nVMX: Improve I/O exit handling · 908a7bdd

Committed by Jan Kiszka on Feb 18, 2013

This prevents trapping L2 I/O exits if L1 has neither unconditional nor
bitmap-based exiting enabled. Furthermore, it implements I/O bitmap
handling.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

908a7bdd
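
A minimal sketch of the decision described above, assuming the two I/O bitmaps are already available as plain 4 KiB arrays (in the kernel they live in guest memory). The function name and parameters are hypothetical; the split of the port space into bitmap A (0x0000-0x7fff) and bitmap B (0x8000-0xffff) follows the Intel SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decide whether an L2 port access of `size` bytes must exit to L1. */
static bool io_exits_to_l1(bool use_io_bitmaps, bool uncond_io_exiting,
                           const uint8_t *bitmap_a, const uint8_t *bitmap_b,
                           uint32_t port, unsigned int size)
{
    /* Without bitmaps, only the unconditional control matters. */
    if (!use_io_bitmaps)
        return uncond_io_exiting;

    /* Every byte of the access is checked against its bitmap bit. */
    while (size > 0) {
        const uint8_t *bitmap;

        if (port < 0x8000)
            bitmap = bitmap_a;
        else if (port < 0x10000)
            bitmap = bitmap_b;
        else
            return true;  /* access wrapped past port 0xffff */

        if (bitmap[(port & 0x7fff) / 8] & (1u << (port & 7)))
            return true;

        port++;
        size--;
    }
    return false;
}
```

Note that a two-byte access starting at an unmarked port still exits if the second port is marked.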

14 Feb, 2013 1 commit

KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr · cbd29cb6

Committed by Jan Kiszka on Feb 11, 2013

We already pass vmcs12 as argument.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

cbd29cb6

11 Feb, 2013 1 commit

KVM: VMX: disable apicv by default · 257090f7

Committed by Yang Zhang on Feb 10, 2013

Without Posted Interrupt, the current code is broken. Just disable apicv
by default until Posted Interrupt is ready.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

257090f7

07 Feb, 2013 1 commit

KVM: VMX: cleanup vmx_set_cr0(). · 5037878e

Committed by Gleb Natapov on Feb 04, 2013

When calculating hw_cr0, the current code masks out bits that should
always be on and re-adds them immediately after. Clean up the code by
masking only those bits that should be dropped from hw_cr0. This allows
us to get rid of some defines.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

5037878e
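
Why the cleanup is safe can be shown with a small sketch: clearing the always-on bits before OR-ing them back in has no effect on the result. The macro and function names below are assumptions made for illustration, as is the exact choice of always-on and always-off bits; only the CR0 bit positions are architectural.

```c
#include <stdint.h>

#define X86_CR0_PE (UINT64_C(1) << 0)
#define X86_CR0_NE (UINT64_C(1) << 5)
#define X86_CR0_WP (UINT64_C(1) << 16)
#define X86_CR0_NW (UINT64_C(1) << 29)
#define X86_CR0_CD (UINT64_C(1) << 30)
#define X86_CR0_PG (UINT64_C(1) << 31)

/* Assumed for this sketch. */
#define CR0_ALWAYS_ON  (X86_CR0_WP | X86_CR0_NE | X86_CR0_PG | X86_CR0_PE)
#define CR0_ALWAYS_OFF (X86_CR0_NW | X86_CR0_CD)

/* Old shape: mask always-on bits too, then add them back. */
static uint64_t hw_cr0_old(uint64_t cr0)
{
    return (cr0 & ~(CR0_ALWAYS_ON | CR0_ALWAYS_OFF)) | CR0_ALWAYS_ON;
}

/* New shape: mask only the bits that must be dropped. */
static uint64_t hw_cr0_new(uint64_t cr0)
{
    return (cr0 & ~CR0_ALWAYS_OFF) | CR0_ALWAYS_ON;
}
```

Both functions return the same value for any input, since `(x & ~a) | a` equals `x | a`.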

06 Feb, 2013 1 commit

KVM: VMX: disable SMEP feature when guest is in non-paging mode · c08800a5

Committed by Dongxiao Xu on Feb 04, 2013

SMEP is disabled in hardware if the CPU is in non-paging mode.
However, KVM always uses paging mode to emulate guest non-paging
mode with TDP. To emulate this behavior, SMEP needs to be manually
disabled when the guest switches to non-paging mode.

We hit an issue where an SMP Linux guest with a recent kernel (one with
SMEP support enabled, for example 3.5.3) would crash with a triple fault
when unrestricted_guest=0 was set. This is because KVM uses an identity
mapping page table to emulate non-paging mode, and that page table is
set up with the USER flag. If SMEP is still enabled in this case, the
guest hits an unhandleable page fault and then crashes.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

c08800a5
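
The adjustment can be sketched as a pure function on the control register values. The helper and its `nonpaging_emulated` parameter are hypothetical (the latter stands for "non-paging mode is being emulated with an identity-mapped page table"); the CR0.PG and CR4.SMEP bit positions are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_PG   (UINT64_C(1) << 31)
#define X86_CR4_SMEP (UINT64_C(1) << 20)

/*
 * Hypothetical helper: derive the CR4 value loaded into hardware from
 * the CR4 value the guest believes it has.
 */
static uint64_t hw_cr4_for_guest(uint64_t guest_cr0, uint64_t guest_cr4,
                                 bool nonpaging_emulated)
{
    uint64_t hw_cr4 = guest_cr4;

    /*
     * The guest has paging off, but hardware paging stays on with
     * USER-flagged identity mappings, so SMEP would fault on every
     * supervisor instruction fetch. Drop it from the hardware value.
     */
    if (nonpaging_emulated && !(guest_cr0 & X86_CR0_PG))
        hw_cr4 &= ~X86_CR4_SMEP;

    return hw_cr4;
}
```

The guest-visible CR4 is left alone in this sketch; only the value used by hardware changes, and SMEP returns once the guest enables paging.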

29 Jan, 2013 3 commits

x86, apicv: add virtual interrupt delivery support · c7c9c56c

Committed by Yang Zhang on Jan 25, 2013

Virtual interrupt delivery spares KVM from injecting vAPIC interrupts
manually; injection is fully taken care of by the hardware. This
requires some special awareness in the existing interrupt injection path:

- For a pending interrupt, instead of injecting it directly, we may need
  to update architecture-specific indicators before resuming the guest.

- A pending interrupt that is masked by ISR should also be considered
  in the above update, since the hardware will decide when to inject
  it at the right time. The current has_interrupt and get_interrupt
  only return a valid vector from the injection point of view.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

c7c9c56c
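
One such architecture-specific indicator on VMX is the requesting virtual interrupt (RVI), the low byte of the guest interrupt status field. The sketch below shows the idea only: the function names are hypothetical, and the IRR is modelled as a flat array of eight 32-bit words rather than the real APIC page layout.

```c
#include <stdint.h>

/* Return the highest pending vector in a 256-bit IRR, or -1 if none. */
static int highest_pending_vector(const uint32_t irr[8])
{
    for (int vec = 255; vec >= 0; vec--) {
        if (irr[vec / 32] & (UINT32_C(1) << (vec % 32)))
            return vec;
    }
    return -1;
}

/*
 * Store the highest pending vector in the low byte (RVI) of the guest
 * interrupt status, leaving the high byte untouched. ISR masking is
 * deliberately not considered here: the hardware decides when the
 * interrupt can actually be delivered.
 */
static uint16_t update_rvi(uint16_t intr_status, int max_irr)
{
    uint8_t rvi = max_irr < 0 ? 0 : (uint8_t)max_irr;

    return (uint16_t)((intr_status & 0xff00u) | rvi);
}
```

This mirrors the second bullet of the message: the vector is published even when it could not be injected right now.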

x86, apicv: add virtual x2apic support · 8d14695f

Committed by Yang Zhang on Jan 25, 2013

To benefit from apicv, we basically need to enable virtualized x2apic
mode. Currently, we only enable it when the guest is really using x2apic.

Also, clear the MSR bitmap for the corresponding x2apic MSRs when the
guest has enabled x2apic:
0x800 - 0x8ff: no read intercept for apicv register virtualization,
               except APIC ID and TMCCT, which need software assistance
               to get the right value.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

8d14695f
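
The bitmap rule above can be sketched as follows, assuming a 1 KiB read bitmap for the low MSR range in which a set bit means "intercept". The function names are hypothetical; the MSR numbers follow the x2APIC mapping (0x800 plus the xAPIC register offset divided by 16), which gives 0x802 for the APIC ID and 0x839 for TMCCT.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define X2APIC_MSR_FIRST 0x800u
#define X2APIC_MSR_LAST  0x8ffu
#define X2APIC_MSR_ID    0x802u  /* APIC ID */
#define X2APIC_MSR_TMCCT 0x839u  /* timer current count */

/* Start from a bitmap that intercepts every low MSR read. */
static void bitmap_intercept_all(uint8_t bitmap[1024])
{
    memset(bitmap, 0xff, 1024);
}

static void set_read_intercept(uint8_t bitmap[1024], uint32_t msr, bool on)
{
    if (on)
        bitmap[msr / 8] |= (uint8_t)(1u << (msr & 7));
    else
        bitmap[msr / 8] &= (uint8_t)~(1u << (msr & 7));
}

static bool read_intercepted(const uint8_t bitmap[1024], uint32_t msr)
{
    return (bitmap[msr / 8] >> (msr & 7)) & 1;
}

/* Let x2apic register reads through, except the two special cases. */
static void x2apic_allow_reads(uint8_t bitmap[1024])
{
    for (uint32_t msr = X2APIC_MSR_FIRST; msr <= X2APIC_MSR_LAST; msr++)
        set_read_intercept(bitmap, msr, false);

    set_read_intercept(bitmap, X2APIC_MSR_ID, true);
    set_read_intercept(bitmap, X2APIC_MSR_TMCCT, true);
}
```

MSRs outside 0x800 - 0x8ff keep whatever intercept setting they had before.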

x86, apicv: add APICv register virtualization support · 83d4c286

Committed by Yang Zhang on Jan 25, 2013

- APIC read doesn't cause VM-Exit
- APIC write becomes trap-like

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

83d4c286

24 Jan, 2013 8 commits

KVM: VMX: set vmx->emulation_required only when needed. · 14168786