Commits · 45bd07b9d5202c910b31c92bd15572b560198c26 · openeuler / Kernel

12 Jul, 2011 40 commits

KVM: MMU: make kvm_mmu_reset_context() flush the guest TLB · 45bd07b9

Committed by Avi Kivity on Jun 12, 2011

kvm_set_cr0() and kvm_set_cr4(), and possibly other functions,
assume that kvm_mmu_reset_context() flushes the guest TLB.  However,
it does not.

Fix by flushing the TLB (and syncing the new root as well).
Signed-off-by: Avi Kivity <avi@redhat.com>

45bd07b9

KVM: MMU: Adjust shadow paging to work when SMEP=1 and CR0.WP=0 · 411c588d

Committed by Avi Kivity on Jun 06, 2011

When CR0.WP=0, we sometimes map user pages as kernel pages (to allow
the kernel to write to them).  Unfortunately this also allows the kernel
to fetch from these pages, even if CR4.SMEP is set.

Adjust for this by also setting NX on the spte in these circumstances.
Signed-off-by: Avi Kivity <avi@redhat.com>

411c588d

KVM: Enable ERMS feature support for KVM · a01c8f9b

Committed by Yang, Wei on Jun 14, 2011

This patch exposes the ERMS feature to KVM guests.

ERMS (Enhanced REP MOVSB/STOSB) is the fast-strings enhancement: REP
MOVSB/STOSB attempts to move as much of the data as possible with
larger load/stores.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

a01c8f9b

KVM: Expose RDWRGSFS bit to KVM guests · 176f61da

Committed by Yang, Wei on Jun 14, 2011

This patch exposes the RDWRGSFS bit to KVM guests.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

176f61da

KVM: Add RDWRGSFS support when setting CR4 · 74dc2b4f

Committed by Yang, Wei on Jun 14, 2011

This patch adds RDWRGSFS support when setting CR4.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

74dc2b4f

KVM: Remove RDWRGSFS bit from CR4_RESERVED_BITS · d9c3476d

Committed by Yang, Wei on Jun 14, 2011

This patch removes the RDWRGSFS bit from CR4_RESERVED_BITS.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

d9c3476d

KVM: Enable DRNG feature support for KVM · 4a00efdf

Committed by Yang, Wei Y on Jun 13, 2011

This patch exposes the DRNG feature to KVM guests.

The RDRAND instruction can provide software with sequences of
random numbers generated from white noise.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

4a00efdf

KVM: fix XSAVE bit scanning (now properly) · 02668b06

Committed by Andre Przywara on Jun 10, 2011

commit 123108f1c1aafd51d6a5c79cc04d7999dd88a930 tried to fix KVM's
XSAVE valid feature scanning, but it was wrong. It was not considering
the sparse nature of this bitfield, instead reading values from
uninitialized members of the entries array.
This patch now separates subleaf indices from KVM's array indices
and fills the entry before querying its value.
This fixes AVX support in KVM guests.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

02668b06
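
The idea of keeping the subleaf index separate from the array index can be sketched as follows. This is a minimal illustration, not the kernel's code: `struct entry`, `query_size()` and `scan_subleaves()` are hypothetical names, and the sparse table inside `query_size()` (only subleaves 2 and 5 populated) is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one CPUID result slot. */
struct entry {
    uint32_t index; /* subleaf that produced this slot */
    uint32_t eax;   /* reported size; 0 means "subleaf not valid" */
};

/* Hypothetical query: a sparse space where only subleaves 2 and 5 exist. */
static uint32_t query_size(uint32_t subleaf)
{
    switch (subleaf) {
    case 2:  return 256;
    case 5:  return 64;
    default: return 0;
    }
}

/*
 * Walk the subleaf indices (idx) independently of the array index (n).
 * Each slot is filled *before* its value is inspected, and a slot is
 * only kept (n advanced) when the subleaf turned out to be valid.
 */
static size_t scan_subleaves(struct entry *out, size_t max)
{
    size_t n = 0;

    for (uint32_t idx = 1; idx < 64 && n < max; ++idx) {
        out[n].index = idx;
        out[n].eax = query_size(idx);
        if (out[n].eax == 0)
            continue; /* invalid subleaf: reuse the same slot */
        ++n;
    }
    return n;
}
```

With this shape, gaps in the subleaf space never leave holes or uninitialized slots in the output array.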

KVM: Fix KVM_ASSIGN_SET_MSIX_ENTRY documentation · 58f0964e

Committed by Jan Kiszka on Jun 11, 2011

The documented behavior did not match the implemented one (which also
never changed).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

58f0964e

KVM: Fix off-by-one in overflow check of KVM_ASSIGN_SET_MSIX_NR · 9f3191ae

Committed by Jan Kiszka on Jun 11, 2011

KVM_MAX_MSIX_PER_DEV implies that up to that many MSI-X entries can be
requested. But so far the kernel rejected a request for exactly the upper limit.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

9f3191ae
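
The off-by-one described in this commit can be illustrated with a small sketch. The limit value and both function names are hypothetical and chosen only for the example; the point is the difference between a strict and a non-strict comparison against the maximum.

```c
#include <stdbool.h>

/* Hypothetical limit for illustration; not necessarily the kernel's value. */
#define MAX_MSIX_PER_DEV 256

/* Off-by-one: a request for exactly the maximum is rejected. */
static bool msix_nr_ok_buggy(unsigned int entry_nr)
{
    return entry_nr < MAX_MSIX_PER_DEV;
}

/* Intended behavior: the upper limit itself is a valid request. */
static bool msix_nr_ok_fixed(unsigned int entry_nr)
{
    return entry_nr <= MAX_MSIX_PER_DEV;
}
```

Only the boundary value behaves differently between the two checks, which is why such bugs tend to go unnoticed until someone requests the full limit.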

KVM: Add compat ioctl for KVM_SET_SIGNAL_MASK · 1dda606c

Committed by Alexander Graf on Jun 08, 2011

KVM has an ioctl to define which signal mask should be used while running
inside VCPU_RUN. At least for big endian systems, this mask is different
on 32-bit and 64-bit systems (though the size is identical).

Add a compat wrapper that converts the mask to whatever the kernel accepts,
allowing 32-bit kvm user space to set signal masks.

This patch fixes qemu with --enable-io-thread on ppc64 hosts when running
32-bit user land.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

1dda606c

KVM: Clarify KVM_ASSIGN_PCI_DEVICE documentation · 91e3d71d

Committed by Jan Kiszka on Jun 03, 2011

Neither host_irq nor the guest_msi struct is used anymore.
Tag the former, drop the latter to avoid confusion.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

91e3d71d

KVM: Add instruction fetch checking when walking guest page table · e57d4a35

Committed by Yang, Wei Y on Jun 03, 2011

This patch adds instruction fetch checking when walking the guest page table,
to implement SMEP when emulating instead of executing natively.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Shan, Haitao <haitao.shan@intel.com>
Signed-off-by: Li, Xin <xin.li@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e57d4a35

KVM: Mask function7 ebx against host capability word9 · 611c120f

Committed by Yang, Wei Y on Jun 03, 2011

This patch masks CPUID leaf 7 ebx against host capability word9.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Shan, Haitao <haitao.shan@intel.com>
Signed-off-by: Li, Xin <xin.li@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

611c120f

KVM: Add SMEP support when setting CR4 · c68b734f

Committed by Yang, Wei Y on Jun 03, 2011

This patch adds SMEP handling when setting CR4.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Shan, Haitao <haitao.shan@intel.com>
Signed-off-by: Li, Xin <xin.li@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

c68b734f

KVM: Remove SMEP bit from CR4_RESERVED_BITS · 8d9c975f

Committed by Yang, Wei Y on Jun 03, 2011

This patch removes the SMEP bit from CR4_RESERVED_BITS.
Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
Signed-off-by: Shan, Haitao <haitao.shan@intel.com>
Signed-off-by: Li, Xin <xin.li@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8d9c975f

KVM: nVMX: Fix bug preventing more than two levels of nesting · 509c75ea

Committed by Nadav Har'El on Jun 02, 2011

The nested VMX feature is supposed to fully emulate VMX for the guest. This
(theoretically) not only allows it to run its own guests, but also
to further emulate VMX for its own guests, and allow arbitrarily deep nesting.

This patch fixes a bug (discovered by Kevin Tian) in handling a VMLAUNCH
by L2, which prevented deeper nesting.

Deeper nesting now works (I only actually tested L3), but is currently
*absurdly* slow, to the point of being unusable.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

509c75ea

KVM: Fixup documentation section numbering · 7f4382e8

Committed by Jan Kiszka on Jun 02, 2011

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

7f4382e8

KVM: x86 emulator: fold decode_cache into x86_emulate_ctxt · 9dac77fa

Committed by Avi Kivity on Jun 01, 2011

This saves a lot of pointless casts between x86_emulate_ctxt and decode_cache.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

9dac77fa

KVM: x86 emulator: rename decode_cache::eip to _eip · 36dd9bb5

Committed by Avi Kivity on Jun 01, 2011

The name eip conflicts with a field of the same name in x86_emulate_ctxt,
which we plan to fold decode_cache into.

The name _eip is unfortunate, but what's really needed is a refactoring
here, not a better name.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

36dd9bb5

KVM: VMX: Silence warning on 32-bit hosts · 2e4ce7f5

Committed by Jan Kiszka on Jun 01, 2011

a is unused now on CONFIG_X86_32.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

2e4ce7f5

KVM: x86 emulator: Use opcode::execute for CLI/STI(FA/FB) · f411e6cd

Committed by Takuya Yoshikawa on May 29, 2011

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

f411e6cd

KVM: x86 emulator: Use opcode::execute for LOOP/JCXZ · d06e03ad

Committed by Takuya Yoshikawa on May 29, 2011

  LOOP/LOOPcc      : E0-E2
  JCXZ/JECXZ/JRCXZ : E3
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

d06e03ad

KVM: x86 emulator: Clean up INT n/INTO/INT 3(CC/CD/CE) · 5c5df76b

Committed by Takuya Yoshikawa on May 29, 2011

Call emulate_int() directly to avoid spaghetti gotos.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

5c5df76b

KVM: x86 emulator: Use opcode::execute for MOV(8C/8E) · 1bd5f469

Committed by Takuya Yoshikawa on May 29, 2011

Use different functions for those which take segment register operands.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

1bd5f469

KVM: x86 emulator: Use opcode::execute for RET(C3) · ebda02c2

Committed by Takuya Yoshikawa on May 29, 2011

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

ebda02c2

KVM: x86 emulator: Use opcode::execute for XCHG(86/87) · e4f973ae

Committed by Takuya Yoshikawa on May 29, 2011

In addition, replace one "goto xchg" with an em_xchg() call.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

e4f973ae

KVM: x86 emulator: Use opcode::execute for TEST(84/85, A8/A9) · 9f21ca59

Committed by Takuya Yoshikawa on May 29, 2011

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

9f21ca59

KVM: x86 emulator: Use opcode::execute for some instructions · db5b0762

Committed by Takuya Yoshikawa on May 29, 2011

Move the following functions to the opcode tables:

  RET (Far return) : CB
  IRET             : CF
  JMP (Jump far)   : EA

  SYSCALL          : 0F 05
  CLTS             : 0F 06
  SYSENTER         : 0F 34
  SYSEXIT          : 0F 35
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

db5b0762

KVM: x86 emulator: Rename emulate_xxx() to em_xxx() · e01991e7

Committed by Takuya Yoshikawa on May 29, 2011

The next patch will change these to be called by opcode::execute.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

e01991e7

KVM: x86 emulator: Use the pointers ctxt and c consistently · 9d74191a

Committed by Takuya Yoshikawa on May 29, 2011

We should use the local variables ctxt and c when emulate_ctxt and
decode appear many times.  At least, we need to be consistent about
how we use these in a function.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

9d74191a

KVM: Document KVM_IOEVENTFD · 55399a02

Committed by Sasha Levin on May 28, 2011

Document KVM_IOEVENTFD that can be used to receive
notifications of PIO/MMIO events without triggering
an exit.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

55399a02

KVM: nVMX: Documentation · 823e3965

Committed by Nadav Har'El on May 25, 2011

This patch includes a brief introduction to the nested vmx feature in the
Documentation/kvm directory. The document also includes a copy of the
vmcs12 structure, as requested by Avi Kivity.

[marcelo: move to Documentation/virtual/kvm]
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

823e3965

KVM: nVMX: Miscellenous small corrections · 2844d849

Committed by Nadav Har'El on May 25, 2011

Small corrections of KVM (spelling, etc.) not directly related to nested VMX.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

2844d849

KVM: nVMX: Add VMX to list of supported cpuid features · 7b8050f5

Committed by Nadav Har'El on May 25, 2011

If the "nested" module option is enabled, add the "VMX" CPU feature to the
list of CPU features KVM advertises with the KVM_GET_SUPPORTED_CPUID ioctl.

Qemu uses this ioctl, and intersects KVM's list with its own list of desired
cpu features (depending on the -cpu option given to qemu) to determine the
final list of features presented to the guest.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

7b8050f5
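
The intersection described in this commit amounts to a bitwise AND of two feature masks. The sketch below is illustrative only: `guest_features()` and the macro names are hypothetical, and a single 32-bit word stands in for one CPUID register (VMX is bit 5 of CPUID.1:ECX, SSE3 is bit 0).

```c
#include <stdint.h>

/* Bit positions in CPUID.1:ECX. */
#define FEATURE_SSE3 (1u << 0)
#define FEATURE_VMX  (1u << 5)

/*
 * Hypothetical helper: a feature reaches the guest only if the
 * hypervisor advertises it AND user space asked for it.
 */
static uint32_t guest_features(uint32_t kvm_supported, uint32_t requested)
{
    return kvm_supported & requested;
}
```

Under this model, advertising VMX from the kernel side is necessary but not sufficient: the guest sees it only when the user-space CPU model also requests it.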

KVM: nVMX: Additional TSC-offset handling · 7991825b

Committed by Nadav Har'El on May 25, 2011

In the unlikely case that L1 does not capture MSR_IA32_TSC, L0 needs to
emulate this MSR write by L2 by modifying vmcs02.tsc_offset. We also need to
set vmcs12.tsc_offset, for this change to survive the next nested entry (see
prepare_vmcs02()).
Additionally, we also need to modify vmx_adjust_tsc_offset: The semantics
of this function is that the TSC of all guests on this vcpu, L1 and possibly
several L2s, needs to be adjusted. To do this, we need to adjust vmcs01's
tsc_offset (this offset will also apply to each L2 we enter). We can't set
vmcs01 now, so we have to remember this adjustment and apply it when we
later exit to L1.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

7991825b

KVM: nVMX: Further fixes for lazy FPU loading · 36cf24e0

Committed by Nadav Har'El on May 25, 2011

KVM's "Lazy FPU loading" means that sometimes L0 needs to set CR0.TS, even
if a guest didn't set it. Moreover, L0 must also trap CR0.TS changes and
NM exceptions, even if we have a guest hypervisor (L1) who didn't want these
traps. And of course, conversely: If L1 wanted to trap these events, we
must let it, even if L0 is not interested in them.

This patch fixes some existing KVM code (in update_exception_bitmap(),
vmx_fpu_activate(), vmx_fpu_deactivate()) to do the correct merging of L0's
and L1's needs. Note that handle_cr() was already fixed in the above patch,
and that new code introduced in previous patches already handles CR0
correctly (see prepare_vmcs02(), prepare_vmcs12(), and nested_vmx_vmexit()).
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

36cf24e0

KVM: nVMX: Handling of CR0 and CR4 modifying instructions · eeadf9e7

Committed by Nadav Har'El on May 25, 2011

When L2 tries to modify CR0 or CR4 (with mov or clts), and modifies a bit
which L1 asked to shadow (via CR[04]_GUEST_HOST_MASK), we already do the right
thing: we let L1 handle the trap (see nested_vmx_exit_handled_cr() in a
previous patch).
When L2 modifies bits that L1 doesn't care about, we let it think (via
CR[04]_READ_SHADOW) that it did these modifications, while only changing
(in GUEST_CR[04]) the bits that L0 doesn't shadow.

This is needed for correct handling of CR0.TS for lazy FPU loading: L0 may
want to leave TS on, while pretending to allow the guest to change it.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

eeadf9e7
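
The interplay of the guest/host mask and the read shadow mentioned in this commit can be sketched as follows. This is a simplified model of what a guest observes when it reads CR0 or CR4 under VMX, not KVM's code; `guest_visible_cr()` is a hypothetical name, and CR0.TS is bit 3.

```c
#include <stdint.h>

#define CR0_PE (1ull << 0)
#define CR0_TS (1ull << 3)

/*
 * Hypothetical model of a guest read of CR0/CR4: bits owned by the host
 * (set in guest_host_mask) come from the read shadow, all other bits
 * come from the real guest register value.
 */
static uint64_t guest_visible_cr(uint64_t guest_cr, uint64_t read_shadow,
                                 uint64_t guest_host_mask)
{
    return (guest_cr & ~guest_host_mask) | (read_shadow & guest_host_mask);
}
```

For the lazy FPU case, the host can keep TS set in the real CR0 while the shadow holds TS clear; the guest then reads TS as clear even though the hardware bit stays on.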

KVM: nVMX: Correct handling of idt vectoring info · 66c78ae4

Committed by Nadav Har'El on May 25, 2011

This patch adds correct handling of IDT_VECTORING_INFO_FIELD for the nested
case.

When a guest exits while delivering an interrupt or exception, we get this
information in IDT_VECTORING_INFO_FIELD in the VMCS. When L2 exits to L1,
there's nothing we need to do, because L1 will see this field in vmcs12, and
handle it itself. However, when L2 exits and L0 handles the exit itself and
plans to return to L2, L0 must inject this event to L2.

In the normal non-nested case, the idt_vectoring_info case is discovered after
the exit, and the decision to inject (though not the injection itself) is made
at that point. However, in the nested case a decision of whether to return
to L2 or L1 also happens during the injection phase (see the previous
patches), so in the nested case we can only decide what to do about the
idt_vectoring_info right after the injection, i.e., in the beginning of
vmx_vcpu_run, which is the first time we know for sure if we're staying in
L2.

Therefore, when we exit L2 (is_guest_mode(vcpu)), we disable the regular
vmx_complete_interrupts() code which queues the idt_vectoring_info for
injection on next entry - because such injection would not be appropriate
if we will decide to exit to L1. Rather, we just save the idt_vectoring_info
and related fields in vmcs12 (which is a convenient place to save these
fields). On the next entry in vmx_vcpu_run (*after* the injection phase,
potentially exiting to L1 to inject an event requested by user space), if
we find ourselves in L1 we don't need to do anything with those values
we saved (as explained above). But if we find that we're in L2, or rather
*still* at L2 (it's not nested_run_pending, meaning that this is the first
round of L2 running after L1 having just launched it), we need to inject
the event saved in those fields - by writing the appropriate VMCS fields.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

66c78ae4

KVM: nVMX: Correct handling of exception injection · 0b6ac343

Committed by Nadav Har'El on May 25, 2011

Similar to the previous patch, but concerning injection of exceptions rather
than external interrupts.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

0b6ac343
