Commits · 2c57ee6f924c95e4dce61ed4776fb3f62e1b9f92 · openeuler / Kernel

31 Jan, 2008 5 commits

KVM: x86 emulator: Only allow VMCALL/VMMCALL trapped by #UD · 571008da

Committed by Sheng Yang on Jan 02, 2008

When executing a test program called "crashme", we found that the KVM guest could not
survive more than ten seconds before encountering a kernel panic. The basic concept
of "crashme" is to generate random assembly code and try to execute it.

After some fixes to the emulator's instruction validity checks, we found it hard to
get the current emulator to handle invalid instructions correctly, because the
#UD trap for hypercall patching caused trouble. The problem is that if the opcode
itself is valid, but the combination of opcode and modrm_reg is invalid, and one
operand of the opcode is in memory (SrcMem or DstMem), the emulator fetches
the memory operand first rather than checking validity, and may encounter
an error there. For example, ".byte 0xfe, 0x34, 0xcd" has this problem.

In this patch, we simply check whether the invalid opcode is vmcall/vmmcall; if it
is not, we return from emulate_instruction() and inject a #UD into the guest. With
the patch, the guest ran for more than 12 hours.
Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com>
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

571008da
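
The example bytes in the commit message above can be decoded by hand. The following is a minimal sketch, not the emulator's actual code: the helper names are hypothetical, and it only assumes the standard x86 ModRM layout and the fact that opcode 0xfe (Group 4) defines only /0 (INC) and /1 (DEC).

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of an x86 ModRM byte: mod (bits 7:6), reg (bits 5:3), rm (bits 2:0). */
struct modrm {
    uint8_t mod;
    uint8_t reg;
    uint8_t rm;
};

static struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m = {
        .mod = (uint8_t)(byte >> 6),
        .reg = (uint8_t)((byte >> 3) & 7),
        .rm  = (uint8_t)(byte & 7),
    };
    return m;
}

/*
 * Opcode 0xfe is "Group 4": only /0 (INC r/m8) and /1 (DEC r/m8) are
 * defined. Any other reg value is an invalid encoding and raises #UD.
 */
static bool group4_is_valid(uint8_t modrm_byte)
{
    return decode_modrm(modrm_byte).reg <= 1;
}

/*
 * Hypothetical helper: true for the problematic case described above,
 * i.e. an invalid opcode/reg combination whose operand is in memory
 * (mod != 3), so a decoder that fetches operands first touches memory
 * before it notices the encoding is invalid.
 */
static bool invalid_with_memory_operand(uint8_t modrm_byte)
{
    struct modrm m = decode_modrm(modrm_byte);
    return m.mod != 3 && !group4_is_valid(modrm_byte);
}
```

For the bytes 0xfe, 0x34, 0xcd, the ModRM byte 0x34 decodes to mod = 0, reg = 6, rm = 4, so the encoding is invalid (reg is neither 0 nor 1) while still naming a memory operand (rm = 4 with mod != 3 means a SIB byte, here 0xcd, follows).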

KVM: MMU: Switch to mmu spinlock · aaee2c94

Committed by Marcelo Tosatti on Dec 20, 2007

Convert the synchronization of the shadow handling to a separate mmu_lock
spinlock.

Also guard fetch() by mmap_sem in read-mode to protect against alias
and memslot changes.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

aaee2c94

KVM: MMU: Concurrent guest walkers · 10589a46

Committed by Marcelo Tosatti on Dec 20, 2007

Do not hold the kvm->lock mutex across the entire page fault code;
only acquire it in places where it is necessary, such as mmu
hash list, active list, rmap and parent pte handling.

Allow concurrent guest walkers by switching walk_addr() to use
mmap_sem in read-mode.

And get rid of the lockless __gfn_to_page.

[avi: move kvm_mmu_pte_write() locking inside the function]
[avi: add locking for real mode]
[avi: fix cmpxchg locking]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

10589a46

KVM: Disable vapic support on Intel machines with FlexPriority · 774ead3a

Committed by Avi Kivity on Dec 26, 2007

FlexPriority accelerates the tpr without any patching.
Signed-off-by: Avi Kivity <avi@qumranet.com>

774ead3a

KVM: Move arch dependent files to new directory arch/x86/kvm/ · edf88417

Committed by Avi Kivity on Dec 16, 2007

This paves the way for multiple architecture support.  Note that while
ioapic.c could potentially be shared with ia64, it is also moved.
Signed-off-by: Avi Kivity <avi@qumranet.com>

edf88417

30 Jan, 2008 35 commits

KVM: VMX: Add printk_ratelimit in vmx_intr_assist · 9584bf2c

Committed by Ryan Harper on Dec 13, 2007

Add printk_ratelimit check in front of printk.  This prevents spamming
of the message during a 32-bit Ubuntu 6.06 server install.  Previously, it
would hang during the partition formatting stage.
Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

9584bf2c
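
In the kernel, this pattern is usually written as `if (printk_ratelimit()) printk(...)`. The idea behind such a guard can be sketched in plain C as follows. This is a userspace illustration only, not the kernel's implementation: the struct, the function name, and the explicit `now` parameter are all assumptions made so the behaviour is easy to follow.

```c
#include <stdbool.h>

/* Hypothetical rate limiter: at most `burst` messages per `interval`. */
struct ratelimit {
    long interval;  /* window length, in arbitrary time units */
    int  burst;     /* messages allowed per window */
    long begin;     /* start time of the current window */
    int  printed;   /* messages already emitted in this window */
};

/* Returns true if a message may be emitted at time `now`. */
static bool ratelimit_allow(struct ratelimit *rl, long now)
{
    /* Start a fresh window once the previous one has expired. */
    if (now - rl->begin >= rl->interval) {
        rl->begin = now;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    return false;  /* over budget: caller skips the message */
}
```

A caller would wrap each log statement in `if (ratelimit_allow(&rl, now))`, so a condition that fires thousands of times per second produces only a bounded number of messages instead of stalling the guest.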

KVM: Portability: Move round_robin_prev_vcpu and tss_addr to kvm_arch · bfc6d222

Committed by Zhang Xiantao on Dec 14, 2007

This patch moves two fields, round_robin_prev_vcpu and tss_addr, to kvm_arch.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

bfc6d222

KVM: Portability: Split mmu-related static inline functions to mmu.h · 1d737c8a

Committed by Zhang Xiantao on Dec 14, 2007

Since these functions need to know the details of the kvm or kvm_vcpu structure,
they can't be put in x86.h.  Create mmu.h to hold them.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

1d737c8a

KVM: Portability: Introduce kvm_vcpu_arch · ad312c7c

Committed by Zhang Xiantao on Dec 13, 2007

Move all the architecture-specific fields in kvm_vcpu into a new struct
kvm_vcpu_arch.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

ad312c7c

KVM: VMX: Avoid exit when setting cr8 if the local apic is in the kernel · e5314067

Committed by Avi Kivity on Dec 06, 2007

With apic in userspace, we must exit to userspace after a cr8 write in order
to update the tpr.  But if the apic is in the kernel, the exit is unnecessary.

Noticed by Joerg Roedel.
Signed-off-by: Avi Kivity <avi@qumranet.com>

e5314067

KVM: Use generalized exception queue for injecting #UD · 7ee5d940

Committed by Avi Kivity on Nov 25, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>

7ee5d940

KVM: Replace #GP injection by the generalized exception queue · c1a5d4f9

Committed by Avi Kivity on Nov 25, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>

c1a5d4f9

KVM: Replace page fault injection by the generalized exception queue · c3c91fee

Committed by Avi Kivity on Nov 25, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>

c3c91fee

KVM: Generalize exception injection mechanism · 298101da

Committed by Avi Kivity on Nov 25, 2007

Instead of each subarch doing its own thing, add an API for queuing an
injection, and manage failed exception injection centrally (i.e., if
an inject failed due to a shadow page fault, we need to requeue it).
Signed-off-by: Avi Kivity <avi@qumranet.com>

298101da
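
The queue-then-retry idea described in this commit can be sketched as a small state machine. This is an illustrative sketch under stated assumptions, not the KVM API: the struct, the function names, and the injector callbacks are hypothetical stand-ins for the per-subarch injection hook.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-slot exception queue. */
struct exception_queue {
    bool     pending;
    uint8_t  nr;              /* vector, e.g. 6 = #UD, 13 = #GP, 14 = #PF */
    bool     has_error_code;
    uint32_t error_code;
};

/* Queue an exception that carries no error code (such as #UD). */
static void queue_exception(struct exception_queue *q, uint8_t nr)
{
    q->pending = true;
    q->nr = nr;
    q->has_error_code = false;
    q->error_code = 0;
}

/* Queue an exception that carries an error code (such as #GP or #PF). */
static void queue_exception_e(struct exception_queue *q, uint8_t nr,
                              uint32_t error_code)
{
    q->pending = true;
    q->nr = nr;
    q->has_error_code = true;
    q->error_code = error_code;
}

/*
 * Try to deliver the pending exception through an arch-specific injector.
 * On failure the entry stays queued, so the next attempt re-injects it.
 */
static bool deliver_pending(struct exception_queue *q,
                            bool (*inject)(const struct exception_queue *))
{
    if (!q->pending)
        return false;
    if (!inject(q))
        return false;      /* still pending: requeued for the next entry */
    q->pending = false;
    return true;
}

/* Stub injectors used only to demonstrate both outcomes. */
static bool inject_fails(const struct exception_queue *q)
{
    (void)q;
    return false;
}

static bool inject_succeeds(const struct exception_queue *q)
{
    (void)q;
    return true;
}
```

Keeping the pending state in one common place means each subarch only has to supply the injector, while the retry logic lives in shared code.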

KVM: VMX: Remove the secondary execute control dependency on irqchip · 83ff3b9d

Committed by Sheng Yang on Nov 21, 2007

The state of SECONDARY_VM_EXEC_CONTROL shouldn't depend on the in-kernel IRQ chip;
this patch fixes this.
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

83ff3b9d

KVM: VMX: Force seg.base == (seg.sel << 4) in real mode · 15b00f32

Committed by Jan Kiszka on Nov 19, 2007

Ensure that segment.base == segment.selector << 4 when entering real
mode on Intel so that the CPU will not bark at us. This fixes some old
protected mode demo from http://www.x86.org/articles/pmbasics/tspec_a1_doc.htm.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

15b00f32
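
The invariant named in this commit is simple arithmetic. The sketch below only illustrates the relationship and a consistency check; the function names are hypothetical, and it makes no claim about how the patch itself adjusts the segment state.

```c
#include <stdbool.h>
#include <stdint.h>

/* In real mode, a segment's base address is its selector shifted left by 4. */
static uint32_t real_mode_segment_base(uint16_t selector)
{
    return (uint32_t)selector << 4;
}

/* True when a (selector, base) pair satisfies base == selector << 4. */
static bool real_mode_segment_is_consistent(uint16_t selector, uint32_t base)
{
    return base == real_mode_segment_base(selector);
}
```

For example, selector 0xf000 implies base 0xf0000. A segment left over from protected mode with selector 0xf000 and base 0xffff0000 would fail the check.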

KVM: Make unloading of FPU state when putting vcpu arch-independent · 9327fd11

Committed by Amit Shah on Nov 15, 2007

Instead of having each architecture do it individually, we
do this in the arch-independent code (just x86 as of now).

[avi: add svm to the mix, which was added to mainline during the
 2.6.24-rc process]
Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

9327fd11

KVM: Replace 'light_exits' stat with 'host_state_reload' · e1beb1d3

Committed by Avi Kivity on Nov 18, 2007

This is a little more accurate (since it counts actual reloads, not potential
reloads), and reverses the sense of the statistic to measure a bad event like
most of the other stats (e.g. we want to minimize all counters).
Signed-off-by: Avi Kivity <avi@qumranet.com>

e1beb1d3

KVM: VMX: Consolidate register usage in vmx_vcpu_run() · e08aa78a

Committed by Avi Kivity on Nov 15, 2007

We pass vcpu, vmx->fail, and vmx->launched to assembly code, but all three
are fields within vmx.  Consolidate by only passing in vmx and offsets for
the rest.
Signed-off-by: Avi Kivity <avi@qumranet.com>

e08aa78a
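
The "one base pointer plus offsets" approach can be illustrated in plain C with `offsetof`. This is a sketch only: the struct layouts and helper names are hypothetical simplifications, and the real code passes the offsets as immediate operands to inline assembly rather than to C helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-ins for the real structures. */
struct vcpu_sketch {
    unsigned long regs[8];
};

struct vmx_sketch {
    struct vcpu_sketch vcpu;
    uint8_t fail;
    uint8_t launched;
};

/* Read a byte-sized field given only the base pointer and its offset. */
static uint8_t load_byte_at(const void *base, size_t offset)
{
    return *((const uint8_t *)base + offset);
}

/* Write a byte-sized field given only the base pointer and its offset. */
static void store_byte_at(void *base, size_t offset, uint8_t value)
{
    *((uint8_t *)base + offset) = value;
}
```

With this shape, a caller needs a single register for the `vmx_sketch` pointer and reaches every field through a compile-time constant such as `offsetof(struct vmx_sketch, launched)`, instead of dedicating one register per field.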

KVM: Portability: Combine kvm_init and kvm_init_x86 · cb498ea2

Committed by Zhang Xiantao on Nov 14, 2007

Will be called once arch module registers itself.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

cb498ea2

KVM: VMX: wbinvd exiting · e5edaa01

Committed by Eddie Dong on Nov 11, 2007

Add wbinvd VM Exit support to prepare for pass-through
device cache emulation and also enhance real time
responsiveness.
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

e5edaa01

KVM: Fix faults during injection of real-mode interrupts · 9c8cba37

Committed by Avi Kivity on Nov 22, 2007

If vmx fails to inject a real-mode interrupt while fetching the interrupt
redirection table, it fails to record this in the vectoring information
field. So we detect this condition and do it ourselves.
Signed-off-by: Avi Kivity <avi@qumranet.com>

9c8cba37

KVM: VMX: Read & store IDT_VECTORING_INFO_FIELD · 1155f76a

Committed by Avi Kivity on Nov 22, 2007

We'll want to write to it in order to fix real-mode irq injection problems,
but it is a read-only field. Storing it in a variable solves that issue.
Signed-off-by: Avi Kivity <avi@qumranet.com>

1155f76a

KVM: VMX: Use vmx to inject real-mode interrupts · 9c5623e3

Committed by Avi Kivity on Nov 08, 2007

Instead of injecting real-mode interrupts by writing the interrupt frame into
guest memory, abuse vmx by injecting a software interrupt.  We need to
pretend the software interrupt instruction had a length > 0, so we have to
adjust rip backward.

This lets us avoid messing with writing guest memory, which is complex and also
sleeps.
Signed-off-by: Avi Kivity <avi@qumranet.com>

9c5623e3

KVM: VMX: Enable memory mapped TPR shadow (FlexPriority) · f78e0e2e

Committed by Sheng Yang on Oct 29, 2007

This patch is based on the CR8/TPR patch and enables the TPR shadow (FlexPriority)
for 32-bit Windows.  Since the TPR is accessed very frequently by 32-bit
Windows, especially SMP guests, we saw a significant performance gain with
FlexPriority enabled.
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

f78e0e2e

KVM: Move page fault processing to common code · 3067714c

Committed by Avi Kivity on Oct 28, 2007

The code that dispatches the page fault and emulates if we failed to map
is duplicated across vmx and svm. Merge it to simplify further bugfixing.
Signed-off-by: Avi Kivity <avi@qumranet.com>

3067714c

KVM: VMX: Let gcc to choose which registers to save (i386) · ff593e5a

Committed by Laurent Vivier on Oct 25, 2007

This patch lets GCC determine which registers to save when we
switch to/from a VCPU in the case of Intel i386.

* Original code saves following registers:

    eax, ebx, ecx, edx, edi, esi, ebp (using popa)

* Patched code:

  - informs GCC that we modify following registers
    using the clobber description:

    ebx, edi, rsi

  - doesn't save eax because it is an output operand (vmx->fail)

  - cannot put ecx in clobber description because it is an input operand,
    but as we modify it and we want to keep its value (vcpu), we must
    save it (pop/push)

  - ebp is saved (pop/push) because GCC seems to ignore its use in the clobber
    description.

  - edx is saved (pop/push) because it is reserved by GCC (REGPARM) and
    cannot be put in the clobber description.

  - line "mov (%%esp), %3 \n\t" has been removed because %3
    is ecx and ecx is restored just after.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>

ff593e5a

KVM: VMX: Let gcc to choose which registers to save (x86_64) · c2036300

Committed by Laurent Vivier on Oct 25, 2007

This patch lets GCC determine which registers to save when we
switch to/from a VCPU in the case of Intel x86_64.

* Original code saves following registers:

    rax, rbx, rcx, rdx, rsi, rdi, rbp,
    r8, r9, r10, r11, r12, r13, r14, r15

* Patched code:

  - informs GCC that we modify following registers
    using the clobber description:

    rbx, rdi, rsi,
    r8, r9, r10, r11, r12, r13, r14, r15

  - doesn't save rax because it is an output operand (vmx->fail)

  - cannot put rcx in clobber description because it is an input operand,
    but as we modify it and we want to keep its value (vcpu), we must
    save it (pop/push)

  - rbp is saved (pop/push) because GCC seems to ignore its use in the clobber
    description.

  - rdx is saved (pop/push) because it is reserved by GCC (REGPARM) and
    cannot be put in the clobber description.

  - line "mov (%%rsp), %3 \n\t" has been removed because %3
    is rcx and rcx is restored just after.

  - line ASM_VMX_VMWRITE_RSP_RDX() is moved out of the ifdef/else/endif
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>

c2036300

KVM: Add ioctl to tss address from userspace, · cbc94022

Committed by Izik Eidus on Oct 25, 2007

Currently kvm has a wart in that it requires three extra pages for use
as a tss when emulating real mode on Intel.  This patch moves the allocation
internally, only requiring userspace to tell us where in the physical address
space we can place the tss.
Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

cbc94022

KVM: Move vmx_vcpu_reset() out of vmx_vcpu_setup() · e00c8cf2

Committed by Avi Kivity on Oct 21, 2007

Split guest reset code out of vmx_vcpu_setup().  Besides being cleaner, this
moves the realmode tss setup (which can sleep) outside vmx_vcpu_setup()
(which is executed with preemption disabled).

[izik: remove unused variable]
Signed-off-by: Avi Kivity <avi@qumranet.com>

e00c8cf2

KVM: Portability: Split kvm_vcpu into arch dependent and independent parts (part 1) · 34c16eec

Committed by Zhang Xiantao on Oct 20, 2007

First step to split kvm_vcpu.  Currently, we just use a macro to define
the common fields in kvm_vcpu for all archs, and each arch needs to define
its own kvm_vcpu struct.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

34c16eec

KVM: Move apic timer interrupt backlog processing to common code · ab6ef34b

Committed by Avi Kivity on Oct 16, 2007

Besides the obvious goodness of making code more common, this prevents
a livelock with the next patch which moves interrupt injection out of the
critical section.
Signed-off-by: Avi Kivity <avi@qumranet.com>

ab6ef34b

KVM: CodingStyle cleanup · d77c26fc

Committed by Mike Day on Oct 08, 2007

Signed-off-by: Mike D. Day <ncmike@ncultra.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>

d77c26fc

KVM: Hoist kvm_create_lapic() into kvm_vcpu_init() · 76fafa5e

Committed by Rusty Russell on Oct 08, 2007

Move kvm_create_lapic() into kvm_vcpu_init(), rather than having svm
and vmx do it.  And make it return the error rather than a fairly
random -ENOMEM.

This also solves the problem that neither svm.c nor vmx.c actually
handles the error path properly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>

76fafa5e

KVM: Add general accessors to read and write guest memory · 195aefde

Committed by Izik Eidus on Oct 01, 2007

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

195aefde

KVM: VMX: Simplify vcpu_clear() · f566e09f

Committed by Avi Kivity on Sep 30, 2007

Now that smp_call_function_single() knows how to call a function on the
current cpu, there's no need to check explicitly.
Signed-off-by: Avi Kivity <avi@qumranet.com>

f566e09f

KVM: VMX: Don't clear the vmcs if the vcpu is not loaded on any processor · eae5ecb5

Committed by Avi Kivity on Sep 30, 2007

Noted by Eddie Dong.
Signed-off-by: Avi Kivity <avi@qumranet.com>

eae5ecb5

KVM: Allow not-present guest page faults to bypass kvm · c7addb90

Committed by Avi Kivity on Sep 16, 2007

There are two classes of page faults trapped by kvm:
 - host page faults, where the fault is needed to allow kvm to install
   the shadow pte or update the guest accessed and dirty bits
 - guest page faults, where the guest has faulted and kvm simply injects
   the fault back into the guest to handle

The second class, guest page faults, is pure overhead.  We can eliminate
some of it on vmx using the following evil trick:
 - when we set up a shadow page table entry, if the corresponding guest pte
   is not present, set up the shadow pte as not present
 - if the guest pte _is_ present, mark the shadow pte as present but also
   set one of the reserved bits in the shadow pte
 - tell the vmx hardware not to trap faults which have the present bit clear

With this, normal page-not-present faults go directly to the guest,
bypassing kvm entirely.

Unfortunately, this trick only works on Intel hardware, as AMD lacks a
way to discriminate among page faults based on error code.  It is also
a little risky since it uses reserved bits which might become unreserved
in the future, so a module parameter is provided to disable it.
Signed-off-by: Avi Kivity <avi@qumranet.com>

c7addb90
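
The filtering described in this commit relies on the VMX page-fault error-code mask and match controls. The sketch below models that rule as documented for VMX: a page fault exits to the hypervisor if and only if the comparison result equals the page-fault bit of the exception bitmap. The function and macro names are hypothetical, and the mask/match values used in the usage note are an assumption about how the trick would be configured.

```c
#include <stdbool.h>
#include <stdint.h>

#define PFERR_PRESENT_MASK (1u << 0)  /* fault on a present page */
#define PFERR_WRITE_MASK   (1u << 1)  /* fault caused by a write */
#define PFERR_RSVD_MASK    (1u << 3)  /* reserved bit set in a paging entry */

/*
 * Model of the VMX rule: with the #PF bit set in the exception bitmap,
 * a fault causes a VM exit iff (error_code & mask) == match; with the
 * bit clear, it causes a VM exit iff they differ.
 */
static bool page_fault_causes_vmexit(bool pf_bit_in_exception_bitmap,
                                     uint32_t pfec_mask, uint32_t pfec_match,
                                     uint32_t error_code)
{
    bool matches = (error_code & pfec_mask) == pfec_match;
    return matches == pf_bit_in_exception_bitmap;
}
```

With mask = match = `PFERR_PRESENT_MASK`, a fault whose present bit is clear never exits and is delivered straight to the guest, while a fault on a shadow pte marked present with a reserved bit set reports the present bit and still traps to kvm. With mask = match = 0, every page fault exits, which corresponds to the behaviour with the trick disabled.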

KVM: VMX: Further reduce efer reloads · 51c6cf66

Committed by Avi Kivity on Aug 29, 2007

KVM avoids reloading the efer msr when the difference between the guest
and host values consists of the long mode bits (which are switched by
hardware) and the NX bit (which is emulated by the KVM MMU).

This patch also allows KVM to ignore SCE (syscall enable) when the guest
is running in 32-bit mode.  This is because the syscall instruction is
not available in 32-bit mode on Intel processors, so the SCE bit is
effectively meaningless.
Signed-off-by: Avi Kivity <avi@qumranet.com>

51c6cf66
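
The reload decision described in this commit amounts to comparing the two EFER values under a mask of ignorable bits. The following is a minimal sketch of that comparison, not the actual vmx.c code: the function name and the `guest_in_long_mode` parameter are assumptions, while the bit positions are the architectural EFER layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFER_SCE (1ull << 0)   /* syscall enable */
#define EFER_LME (1ull << 8)   /* long mode enable */
#define EFER_LMA (1ull << 10)  /* long mode active */
#define EFER_NX  (1ull << 11)  /* no-execute enable */

/*
 * Returns true if guest and host EFER differ in a bit that matters.
 * LME/LMA are switched by hardware and NX is emulated by the MMU, so
 * they are always ignored; SCE is ignored only for a 32-bit guest.
 */
static bool efer_needs_reload(uint64_t guest_efer, uint64_t host_efer,
                              bool guest_in_long_mode)
{
    uint64_t ignore = EFER_LME | EFER_LMA | EFER_NX;

    if (!guest_in_long_mode)
        ignore |= EFER_SCE;

    return ((guest_efer ^ host_efer) & ~ignore) != 0;
}
```

For a 64-bit host running a 32-bit guest whose EFER is zero, every differing bit falls inside the ignore mask, so no reload is needed. A long-mode guest with SCE clear on a host with SCE set still requires one.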

KVM: Call x86_decode_insn() only when needed · 3427318f

Committed by Laurent Vivier on Sep 18, 2007

Move emulate_ctxt to kvm_vcpu to keep emulate context when we exit from kvm
module. Call x86_decode_insn() only when needed. Modify x86_emulate_insn() to
not modify the context if it must be re-entered.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>

3427318f
