提交 · 03b82a30ea8b26199901b219848d706dbd70c609 · openanolis / cloud-kernel

25 4月, 2010 17 次提交

KVM: x86: Do not return soft events in vcpu_events · 03b82a30

由 Jan Kiszka 提交于 2月 15, 2010

To avoid that user space migrates a pending software exception or
interrupt, mask them out on KVM_GET_VCPU_EVENTS. Without this, user
space would try to reinject them, and we would have to reconstruct the
proper instruction length for VMX event injection. Now the pending event
will be reinjected via executing the triggering instruction again.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

03b82a30

KVM: SVM: Fix wrong interrupt injection in enable_irq_windows · 8fe54654

由 Joerg Roedel 提交于 2月 19, 2010

The nested_svm_intr() function does not execute the vmexit
anymore. Therefore we may still be in the nested state after
that function ran. This patch changes the nested_svm_intr()
function to return wether the irq window could be enabled.

Cc: stable@kernel.org
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

8fe54654

KVM: drop unneeded kvm_run check in emulate_instruction() · 112592da

由 Gleb Natapov 提交于 2月 21, 2010

vcpu->run is initialized on vcpu creation and can never be NULL
here.
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

112592da

KVM: SVM: Remove newlines from nested trace points · 052ce621

由 Joerg Roedel 提交于 2月 19, 2010

The tracing infrastructure adds its own newlines. Remove
them from the trace point printk format strings.
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

052ce621

KVM: SVM: Make lazy FPU switching work with nested svm · 66a562f7

由 Joerg Roedel 提交于 2月 19, 2010

The new lazy fpu switching code may disable cr0 intercepts
when running nested. This is a bug because the nested
hypervisor may still want to intercept cr0 which will break
in this situation. This patch fixes this issue and makes
lazy fpu switching working with nested svm.
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

66a562f7

KVM: SVM: Activate nested state only when guest state is complete · 06fc7772

由 Joerg Roedel 提交于 2月 19, 2010

Certain functions called during the emulated world switch
behave differently when the vcpu is running nested. This is
not the expected behavior during a world switch emulation.
This patch ensures that the nested state is activated only
if the vcpu is completly in nested state.
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

06fc7772

KVM: SVM: Don't sync nested cr8 to lapic and back · 88ab24ad

由 Joerg Roedel 提交于 2月 19, 2010

This patch makes syncing of the guest tpr to the lapic
conditional on !nested. Otherwise a nested guest using the
TPR could freeze the guest.
Another important change this patch introduces is that the
cr8 intercept bits are no longer ORed at vmrun emulation if
the guest sets VINTR_MASKING in its VMCB. The reason is that
nested cr8 accesses need alway be handled by the nested
hypervisor because they change the shadow version of the
tpr.

Cc: stable@kernel.org
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

88ab24ad

KVM: SVM: Fix nested msr intercept handling · 4c7da8cb

由 Joerg Roedel 提交于 2月 19, 2010

The nested_svm_exit_handled_msr() function maps only one
page of the guests msr permission bitmap. This patch changes
the code to use kvm_read_guest to fix the bug.

Cc: stable@kernel.org
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

4c7da8cb

KVM: SVM: Annotate nested_svm_map with might_sleep() · 6c3bd3d7

由 Joerg Roedel 提交于 2月 19, 2010

The nested_svm_map() function can sleep and must not be
called from atomic context. So annotate that function.
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

6c3bd3d7

KVM: SVM: Sync all control registers on nested vmexit · cdbbdc12

由 Joerg Roedel 提交于 2月 19, 2010

Currently the vmexit emulation does not sync control
registers were the access is typically intercepted by the
nested hypervisor. But we can not count on that intercepts
to sync these registers too and make the code
architecturally more correct.

Cc: stable@kernel.org
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

cdbbdc12

KVM: SVM: Fix schedule-while-atomic on nested exception handling · b8e88bc8

由 Joerg Roedel 提交于 2月 19, 2010

Move the actual vmexit routine out of code that runs with
irqs and preemption disabled.

Cc: stable@kernel.org
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

b8e88bc8

KVM: SVM: Don't use kmap_atomic in nested_svm_map · 7597f129

由 Joerg Roedel 提交于 2月 19, 2010

Use of kmap_atomic disables preemption but if we run in
shadow-shadow mode the vmrun emulation executes kvm_set_cr3
which might sleep or fault. So use kmap instead for
nested_svm_map.

Cc: stable@kernel.org
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

7597f129

KVM: x86 emulator: Fix x86_emulate_insn() not to use the variable rc for non-X86EMUL values · 0e4176a1

由 Takuya Yoshikawa 提交于 2月 12, 2010

This patch makes non-X86EMUL_* family functions not to use
the variable rc.

Be sure that this changes nothing but makes the purpose of
the variable rc clearer.
Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: NAvi Kivity <avi@redhat.com>

0e4176a1

KVM: x86 emulator: X86EMUL macro replacements: x86_emulate_insn() and its helpers · 1b30eaa8

由 Takuya Yoshikawa 提交于 2月 12, 2010

This patch just replaces integer values used inside
x86_emulate_insn() and its helper functions to X86EMUL_*.

The purpose of this is to make it clear what will happen
when the variable rc is compared to X86EMUL_* at the end
of x86_emulate_insn().
Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: NAvi Kivity <avi@redhat.com>

1b30eaa8

KVM: x86 emulator: X86EMUL macro replacements: from do_fetch_insn_byte() to x86_decode_insn() · 3e2815e9

由 Takuya Yoshikawa 提交于 2月 12, 2010

This patch just replaces the integer values used inside x86's
decode functions to X86EMUL_*.

By this patch, it becomes clearer that we are using X86EMUL_*
value propagated from ops->read_std() in do_fetch_insn_byte().
Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: NAvi Kivity <avi@redhat.com>

3e2815e9

KVM: inject #UD in 64bit mode from instruction that are not valid there · 1161624f

由 Gleb Natapov 提交于 2月 11, 2010

Some instruction are obsolete in a long mode. Inject #UD.
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

1161624f

KVM: use desc_ptr struct instead of kvm private descriptor_table · 89a27f4d

由 Gleb Natapov 提交于 2月 16, 2010

x86 arch defines desc_ptr for idt/gdt pointers, no need to define
another structure in kvm code.
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

89a27f4d

21 4月, 2010 1 次提交

KVM: x86: Fix TSS size check for 16-bit tasks · e8861cfe

由 Jan Kiszka 提交于 4月 14, 2010

A 16-bit TSS is only 44 bytes long. So make sure to test for the correct
size on task switch.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

e8861cfe

20 4月, 2010 7 次提交

KVM: fix the handling of dirty bitmaps to avoid overflows · 87bf6e7d

由 Takuya Yoshikawa 提交于 4月 12, 2010

Int is not long enough to store the size of a dirty bitmap.

This patch fixes this problem with the introduction of a wrapper
function to calculate the sizes of dirty bitmaps.

Note: in mark_page_dirty(), we have to consider the fact that
  __set_bit() takes the offset as int, not long.
Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

87bf6e7d

KVM: MMU: fix kvm_mmu_zap_page() and its calling path · 77662e00

由 Xiao Guangrong 提交于 4月 16, 2010

This patch fix:

- calculate zapped page number properly in mmu_zap_unsync_children()
- calculate freeed page number properly kvm_mmu_change_mmu_pages()
- if zapped children page it shoud restart hlist walking

KVM-Stable-Tag.
Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

77662e00

KVM: VMX: Save/restore rflags.vm correctly in real mode · 78ac8b47

由 Avi Kivity 提交于 4月 08, 2010

Currently we set eflags.vm unconditionally when entering real mode emulation
through virtual-8086 mode, and clear it unconditionally when we enter protected
mode.  The means that the following sequence

  KVM_SET_REGS  (rflags.vm=1)
  KVM_SET_SREGS (cr0.pe=1)

Ends up with rflags.vm clear due to KVM_SET_SREGS triggering enter_pmode().

Fix by shadowing rflags.vm (and rflags.iopl) correctly while in real mode:
reads and writes to those bits access a shadow register instead of the actual
register.
Signed-off-by: NAvi Kivity <avi@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

78ac8b47

KVM: allow bit 10 to be cleared in MSR_IA32_MC4_CTL · 114be429

由 Andre Przywara 提交于 3月 24, 2010

There is a quirk for AMD K8 CPUs in many Linux kernels (see
arch/x86/kernel/cpu/mcheck/mce.c:__mcheck_cpu_apply_quirks()) that
clears bit 10 in that MCE related MSR. KVM can only cope with all
zeros or all ones, so it will inject a #GP into the guest, which
will let it panic.
So lets add a quirk to the quirk and ignore this single cleared bit.
This fixes -cpu kvm64 on all machines and -cpu host on K8 machines
with some guest Linux kernels.
Signed-off-by: NAndre Przywara <andre.przywara@amd.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

114be429

A
KVM: Don't spam kernel log when injecting exceptions due to bad cr writes · d6a23895
由 Avi Kivity 提交于 3月 11, 2010
```
These are guest-triggerable.
Signed-off-by: NAvi Kivity <avi@redhat.com>
```
d6a23895

KVM: SVM: Fix memory leaks that happen when svm_create_vcpu() fails · b7af4043

由 Takuya Yoshikawa 提交于 3月 09, 2010

svm_create_vcpu() does not free the pages allocated during the creation
when it fails to complete the allocations. This patch fixes it.
Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: NAvi Kivity <avi@redhat.com>

b7af4043

KVM: take srcu lock before call to complete_pio() · 7567cae1

由 Gleb Natapov 提交于 3月 09, 2010

complete_pio() may use slot table which is protected by srcu.
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: NAvi Kivity <avi@redhat.com>

7567cae1

30 3月, 2010 1 次提交

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

由 Tejun Heo 提交于 3月 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: NTejun Heo <tj@kernel.org>
Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6

01 3月, 2010 14 次提交

KVM: x86: Add KVM_CAP_X86_ROBUST_SINGLESTEP · d2be1651

由 Jan Kiszka 提交于 2月 23, 2010

This marks the guest single-step API improvement of 94fe45da and
91586a3b with a capability flag to allow reliable detection by user
space.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Cc: stable@kernel.org (2.6.33)
Signed-off-by: NAvi Kivity <avi@redhat.com>

d2be1651

KVM: VMX: Update instruction length on intercepted BP · c573cd22

由 Jan Kiszka 提交于 2月 23, 2010

We intercept #BP while in guest debugging mode. As VM exits due to
intercepted exceptions do not necessarily come with valid
idt_vectoring, we have to update event_exit_inst_len explicitly in such
cases. At least in the absence of migration, this ensures that
re-injections of #BP will find and use the correct instruction length.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Cc: stable@kernel.org (2.6.32, 2.6.33)
Signed-off-by: NAvi Kivity <avi@redhat.com>

c573cd22

KVM: Fix emulate_sys[call, enter, exit]()'s fault handling · e54cfa97

由 Takuya Yoshikawa 提交于 2月 18, 2010

This patch fixes emulate_syscall(), emulate_sysenter() and
emulate_sysexit() to handle injected faults properly.

Even though original code injects faults in these functions,
we cannot handle these unless we use the different return
value from the UNHANDLEABLE case. So this patch use X86EMUL_*
codes instead of -1 and 0 and makes x86_emulate_insn() to
handle these propagated faults.

Be sure that, in x86_emulate_insn(), goto cannot_emulate and
goto done with rc equals X86EMUL_UNHANDLEABLE have same effect.
Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

e54cfa97

KVM: Fix segment descriptor loading · c697518a

由 Gleb Natapov 提交于 2月 18, 2010

Add proper error and permission checking. This patch also change task
switching code to load segment selectors before segment descriptors, like
SDM requires, otherwise permission checking during segment descriptor
loading will be incorrect.

Cc: stable@kernel.org (2.6.33, 2.6.32)
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

c697518a

KVM: Fix load_guest_segment_descriptor() to inject page fault · 6f550484

由 Takuya Yoshikawa 提交于 2月 18, 2010

This patch injects page fault when reading descriptor in
load_guest_segment_descriptor() fails with FAULT.

Effects of this injection: This function is used by
kvm_load_segment_descriptor() which is necessary for the
following instructions:

 - mov seg,r/m16
 - jmp far
 - pop ?s

This patch makes it possible to emulate the page faults
generated by these instructions. But be sure that unless
we change the kvm_load_segment_descriptor()'s ret value
propagation this patch has no effect.
Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

6f550484

KVM: x86 emulator: Forbid modifying CS segment register by mov instruction · 8b9f4414

由 Gleb Natapov 提交于 2月 18, 2010

Inject #UD if guest attempts to do so. This is in accordance to Intel
SDM.

Cc: stable@kernel.org (2.6.33, 2.6.32)
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

8b9f4414

KVM: Convert i8254/i8259 locks to raw_spinlocks · fa8273e9

由 Thomas Gleixner 提交于 2月 17, 2010

The i8254/i8259 locks need to be real spinlocks on preempt-rt. Convert
them to raw_spinlock. No change for !RT kernels.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

fa8273e9

KVM: x86 emulator: disallow opcode 82 in 64-bit mode · e424e191

由 Gleb Natapov 提交于 2月 11, 2010

Instructions with opcode 82 are not valid in 64 bit mode.
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

e424e191

KVM: x86 emulator: code style cleanup · 1d327eac

由 Wei Yongjun 提交于 2月 11, 2010

Just remove redundant semicolon.
Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

1d327eac

KVM: x86 emulator: Add LOCK prefix validity checking · d380a5e4

由 Gleb Natapov 提交于 2月 10, 2010

Instructions which are not allowed to have LOCK prefix should
generate #UD if one is used.

[avi: fold opcode 82 fix from another patch]
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

d380a5e4

KVM: x86 emulator: Check CPL level during privilege instruction emulation · e92805ac

由 Gleb Natapov 提交于 2月 10, 2010

Add CPL checking in case emulator is tricked into emulating
privilege instruction from userspace.
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: NAvi Kivity <avi@redhat.com>

e92805ac

KVM: x86 emulator: Fix popf emulation · d4c6a154

由 Gleb Natapov 提交于 2月 10, 2010

POPF behaves differently depending on current CPU mode. Emulate correct
logic to prevent guest from changing flags that it can't change otherwise.
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: NAvi Kivity <avi@redhat.com>

d4c6a154

KVM: x86 emulator: Check IOPL level during io instruction emulation · f850e2e6

由 Gleb Natapov 提交于 2月 10, 2010

Make emulator check that vcpu is allowed to execute IN, INS, OUT,
OUTS, CLI, STI.
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: NAvi Kivity <avi@redhat.com>

f850e2e6

KVM: x86 emulator: fix memory access during x86 emulation · 1871c602

由 Gleb Natapov 提交于 2月 10, 2010

Currently when x86 emulator needs to access memory, page walk is done with
broadest permission possible, so if emulated instruction was executed
by userspace process it can still access kernel memory. Fix that by
providing correct memory access to page walker during emulation.
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: NAvi Kivity <avi@redhat.com>

1871c602

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功