Commits · 03b82a30ea8b26199901b219848d706dbd70c609 · openanolis / cloud-kernel

25 Apr, 2010 3 commits

KVM: x86: Do not return soft events in vcpu_events · 03b82a30

Authored by Jan Kiszka on Feb 15, 2010

To avoid that user space migrates a pending software exception or
interrupt, mask them out on KVM_GET_VCPU_EVENTS. Without this, user
space would try to reinject them, and we would have to reconstruct the
proper instruction length for VMX event injection. Now the pending event
will be reinjected via executing the triggering instruction again.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

03b82a30
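
The masking described in the commit above can be pictured with a small C sketch. It assumes that the software exceptions in question are #BP (vector 3, raised by INT3) and #OF (vector 4, raised by INTO). The struct and function names are illustrative stand-ins, not the kernel's own definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 vectors raised by the INT3 and INTO instructions. */
#define BP_VECTOR 3
#define OF_VECTOR 4

/* Hypothetical stand-in for the pending-exception state of a vCPU. */
struct pending_exception {
	bool pending;
	uint8_t nr;
};

/* A software exception is one raised by executing an instruction. */
static bool exception_is_soft(uint8_t nr)
{
	return nr == BP_VECTOR || nr == OF_VECTOR;
}

/*
 * Value reported to user space as "injected". Soft exceptions are hidden,
 * so after migration the guest re-executes the triggering instruction
 * instead of having the event reinjected.
 */
static bool reported_as_injected(const struct pending_exception *e)
{
	return e->pending && !exception_is_soft(e->nr);
}
```

With this filter a pending #GP (vector 13) would still be reported, while a pending #BP would not.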

KVM: drop unneeded kvm_run check in emulate_instruction() · 112592da

Authored by Gleb Natapov on Feb 21, 2010

vcpu->run is initialized on vcpu creation and can never be NULL
here.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

112592da

KVM: use desc_ptr struct instead of kvm private descriptor_table · 89a27f4d

Authored by Gleb Natapov on Feb 16, 2010

x86 arch defines desc_ptr for idt/gdt pointers, no need to define
another structure in kvm code.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

89a27f4d

21 Apr, 2010 1 commit

KVM: x86: Fix TSS size check for 16-bit tasks · e8861cfe

Authored by Jan Kiszka on Apr 14, 2010

A 16-bit TSS is only 44 bytes long. So make sure to test for the correct
size on task switch.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e8861cfe
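
To see where the 44 bytes come from, the following C sketch lays out a 16-bit TSS as 22 two-byte fields and derives the minimum descriptor limit from the TSS size. The struct and helper names are illustrative assumptions, and the 104-byte figure is the minimum size of a 32-bit TSS; this is not the kernel's actual check.

```c
#include <stdbool.h>
#include <stdint.h>

/* 16-bit TSS layout: 22 two-byte fields, 44 bytes in total. */
struct tss16 {
	uint16_t prev_task_link;
	uint16_t sp0, ss0, sp1, ss1, sp2, ss2;
	uint16_t ip, flags;
	uint16_t ax, cx, dx, bx, sp, bp, si, di;
	uint16_t es, cs, ss, ds;
	uint16_t ldt;
};

#define TSS16_SIZE 44u   /* size of a 16-bit TSS */
#define TSS32_SIZE 104u  /* minimum size of a 32-bit TSS */

/*
 * A descriptor limit is the last valid byte offset, so a TSS descriptor
 * is large enough only if limit >= size - 1.
 */
static bool tss_limit_ok(bool is_32bit, uint32_t limit)
{
	uint32_t min_limit = (is_32bit ? TSS32_SIZE : TSS16_SIZE) - 1;

	return limit >= min_limit;
}
```

Under these assumptions a limit of 0x2b is enough for a 16-bit task but too small for a 32-bit one, which needs at least 0x67.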

20 Apr, 2010 4 commits

KVM: fix the handling of dirty bitmaps to avoid overflows · 87bf6e7d

Authored by Takuya Yoshikawa on Apr 12, 2010

Int is not long enough to store the size of a dirty bitmap.

This patch fixes this problem with the introduction of a wrapper
function to calculate the sizes of dirty bitmaps.

Note: in mark_page_dirty(), we have to consider the fact that
  __set_bit() takes the offset as int, not long.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

87bf6e7d
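
A wrapper of the kind this commit describes might look like the C sketch below. It assumes a 64-bit `long` and uses a hypothetical helper name; the point is only that the size is computed and returned in a 64-bit type, so it cannot be truncated the way an `int` result could.

```c
#include <stdint.h>

/* Assumption for this sketch: bitmaps are arrays of 64-bit words. */
#define BITS_PER_WORD 64u

/*
 * Size in bytes of a dirty bitmap with one bit per page, rounded up to a
 * whole number of words. All arithmetic is done in 64 bits.
 */
static uint64_t dirty_bitmap_bytes(uint64_t npages)
{
	uint64_t bits = (npages + BITS_PER_WORD - 1) &
			~(uint64_t)(BITS_PER_WORD - 1);

	return bits / 8;
}
```

For a hypothetical slot of 2^35 pages the result is 2^32 bytes, which no longer fits in a 32-bit signed integer.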

KVM: allow bit 10 to be cleared in MSR_IA32_MC4_CTL · 114be429

Authored by Andre Przywara on Mar 24, 2010

There is a quirk for AMD K8 CPUs in many Linux kernels (see
arch/x86/kernel/cpu/mcheck/mce.c:__mcheck_cpu_apply_quirks()) that
clears bit 10 in that MCE related MSR. KVM can only cope with all
zeros or all ones, so it will inject a #GP into the guest, which
will let it panic.
So let's add a quirk to the quirk and ignore this single cleared bit.
This fixes -cpu kvm64 on all machines and -cpu host on K8 machines
with some guest Linux kernels.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

114be429
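
The "quirk to the quirk" can be expressed as a small predicate. The C sketch below is an illustration under the assumption that a write is acceptable when it is all zeros, or all ones once bit 10 is ignored; the function name is hypothetical and this is not the kernel's exact code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit that the AMD K8 quirk in guest kernels clears. */
#define MC_CTL_QUIRK_BIT (1ULL << 10)

/*
 * Accept all zeros, all ones, or all ones with only bit 10 cleared.
 * Any other value would lead to a #GP being injected into the guest.
 */
static bool mc_ctl_write_ok(uint64_t data)
{
	if (data == 0)
		return true;

	return (data | MC_CTL_QUIRK_BIT) == ~(uint64_t)0;
}
```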

KVM: Don't spam kernel log when injecting exceptions due to bad cr writes · d6a23895

Authored by Avi Kivity on Mar 11, 2010

These are guest-triggerable.
Signed-off-by: Avi Kivity <avi@redhat.com>

d6a23895

KVM: take srcu lock before call to complete_pio() · 7567cae1

Authored by Gleb Natapov on Mar 09, 2010

complete_pio() may use slot table which is protected by srcu.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>

7567cae1

30 Mar, 2010 1 commit

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

Authored by Tejun Heo on Mar 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities to include
those headers directly instead of assuming availability.  As this
conversion needs to touch a large number of source files, the following
script is used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6

01 Mar, 2010 31 commits

KVM: x86: Add KVM_CAP_X86_ROBUST_SINGLESTEP · d2be1651

Authored by Jan Kiszka on Feb 23, 2010

This marks the guest single-step API improvement of 94fe45da and
91586a3b with a capability flag to allow reliable detection by user
space.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: stable@kernel.org (2.6.33)
Signed-off-by: Avi Kivity <avi@redhat.com>

d2be1651

KVM: Fix segment descriptor loading · c697518a

Authored by Gleb Natapov on Feb 18, 2010

Add proper error and permission checking. This patch also changes the task
switching code to load segment selectors before segment descriptors, as the
SDM requires; otherwise permission checking during segment descriptor
loading will be incorrect.

Cc: stable@kernel.org (2.6.33, 2.6.32)
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

c697518a

KVM: Fix load_guest_segment_descriptor() to inject page fault · 6f550484

Authored by Takuya Yoshikawa on Feb 18, 2010

This patch injects page fault when reading descriptor in
load_guest_segment_descriptor() fails with FAULT.

Effects of this injection: This function is used by
kvm_load_segment_descriptor() which is necessary for the
following instructions:

 - mov seg,r/m16
 - jmp far
 - pop ?s

This patch makes it possible to emulate the page faults
generated by these instructions. But be sure that unless
we change the kvm_load_segment_descriptor()'s ret value
propagation this patch has no effect.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

6f550484

KVM: Convert i8254/i8259 locks to raw_spinlocks · fa8273e9

Authored by Thomas Gleixner on Feb 17, 2010

The i8254/i8259 locks need to be real spinlocks on preempt-rt. Convert
them to raw_spinlock. No change for !RT kernels.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

fa8273e9

KVM: x86 emulator: Check IOPL level during io instruction emulation · f850e2e6

Authored by Gleb Natapov on Feb 10, 2010

Make emulator check that vcpu is allowed to execute IN, INS, OUT,
OUTS, CLI, STI.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>

f850e2e6
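
As a rough illustration of the privilege check involved, the C sketch below extracts IOPL from EFLAGS (bits 12 and 13) and compares it with the current privilege level. It covers protected mode only; real mode, virtual-8086 mode, and the TSS I/O permission bitmap that IN/OUT may fall back to are deliberately left out. The function names are hypothetical and not taken from the emulator.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_IOPL_MASK  0x3000u  /* EFLAGS bits 12-13 */
#define EFLAGS_IOPL_SHIFT 12

/* Extract the I/O privilege level (0..3) from an EFLAGS value. */
static unsigned int eflags_iopl(uint32_t eflags)
{
	return (eflags & EFLAGS_IOPL_MASK) >> EFLAGS_IOPL_SHIFT;
}

/*
 * In protected mode, I/O-sensitive instructions such as CLI and STI are
 * permitted directly only when CPL <= IOPL. Returns true when the check
 * fails.
 */
static bool cpl_exceeds_iopl(unsigned int cpl, uint32_t eflags)
{
	return cpl > eflags_iopl(eflags);
}
```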

KVM: x86 emulator: fix memory access during x86 emulation · 1871c602

Authored by Gleb Natapov on Feb 10, 2010

Currently when x86 emulator needs to access memory, page walk is done with
broadest permission possible, so if emulated instruction was executed
by userspace process it can still access kernel memory. Fix that by
providing correct memory access to page walker during emulation.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>

1871c602

KVM: x86 emulator: Add Virtual-8086 mode of emulation · a0044755

Authored by Gleb Natapov on Feb 10, 2010

For some instructions CPU behaves differently for real-mode and
virtual 8086. Let emulator know which mode cpu is in, so it will
not poke into vcpu state directly.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>

a0044755

KVM: cleanup the failure path of KVM_CREATE_IRQCHIP ioctrl · 72bb2fcd

Authored by Wei Yongjun on Feb 09, 2010

If we fail to init the ioapic device or fail to set up the default irq
routing, the devices registered by kvm_create_pic() and kvm_ioapic_init()
remain registered. This patch unregisters them on the failure path.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

72bb2fcd

KVM: Remove redundant reading of rax on OUT instructions · 1976d2d2

Authored by Takuya Yoshikawa on Feb 05, 2010

kvm_emulate_pio() and complete_pio() both read out the
RAX register value and copy it to a place into which
the value read out from the port will be copied later.

This patch removes this redundancy.

/*** snippet from arch/x86/kvm/x86.c ***/
int complete_pio(struct kvm_vcpu *vcpu)
{
	...
	if (!io->string) {
		if (io->in) {
			val = kvm_register_read(vcpu, VCPU_REGS_RAX);
			memcpy(&val, vcpu->arch.pio_data, io->size);
			kvm_register_write(vcpu, VCPU_REGS_RAX, val);
		}
	...
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

1976d2d2

KVM: fix kvm_fix_hypercall() to return X86EMUL_* · 7edcface

Authored by Takuya Yoshikawa on Feb 01, 2010

This patch fixes kvm_fix_hypercall() to propagate X86EMUL_*
info generated by emulator_write_emulated() to its callers:
suggested by Marcelo.

The effect of this is x86_emulate_insn() will begin to handle
the page faults which occur in emulator_write_emulated():
this should be OK because emulator_write_emulated_onepage()
always injects page fault when emulator_write_emulated()
returns X86EMUL_PROPAGATE_FAULT.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

7edcface

KVM: fix load_guest_segment_descriptor() to return X86EMUL_* · c125c607

Authored by Takuya Yoshikawa on Feb 01, 2010

This patch fixes load_guest_segment_descriptor() to return
X86EMUL_PROPAGATE_FAULT when it tries to access the descriptor
table beyond the limit of it: suggested by Marcelo.

I have checked current callers of this helper function,
  - kvm_load_segment_descriptor()
  - kvm_task_switch()
and confirmed that this patch will change nothing in the
upper layers if we do not change the handling of this
return value from load_guest_segment_descriptor().

Next step: Although fixing the kvm_task_switch() to handle the
propagated faults properly seems difficult, and maybe not worth
it because TSS is not used commonly these days, we can fix
kvm_load_segment_descriptor(). By doing so, the injected #GP
becomes possible to be handled by the guest. The only problem
for this is how to differentiate this fault from the page faults
generated by kvm_read_guest_virt(). We may have to split this
function to achieve this goal.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

c125c607

KVM: enable PCI multiple-segments for pass-through device · ab9f4ecb

Authored by Zhai, Edwin on Jan 29, 2010

Enable optional parameter (default 0) - PCI segment (or domain) besides
BDF, when assigning PCI device to guest.
Signed-off-by: Zhai Edwin <edwin.zhai@intel.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

ab9f4ecb

KVM: mark segments accessed on HW task switch · e01c2426

Authored by Gleb Natapov on Jan 25, 2010

On HW task switch newly loaded segments should be marked as accessed.
Reported-by: Lorenzo Martignoni <martignlo@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

e01c2426

KVM: trace guest fpu loads and unloads · 0c04851c

Authored by Avi Kivity on Jan 21, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

0c04851c

KVM: Rename vcpu->shadow_efer to efer · f6801dff

Authored by Avi Kivity on Jan 21, 2010

None of the other registers have the shadow_ prefix.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

f6801dff

KVM: Add a helper for checking if the guest is in protected mode · 3eeb3288

Authored by Avi Kivity on Jan 21, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

3eeb3288

KVM: Activate fpu on clts · 6b52d186

Authored by Avi Kivity on Jan 21, 2010

Assume that if the guest executes clts, it knows what it's doing, and load the
guest fpu to prevent an #NM exception.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

6b52d186

KVM: Drop kvm_{load,put}_guest_fpu() exports · e5bb4025

Authored by Avi Kivity on Jan 21, 2010

Not used anymore.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

e5bb4025

KVM: Allow kvm_load_guest_fpu() even when !vcpu->fpu_active · 2608d7a1

Authored by Avi Kivity on Jan 21, 2010

This allows accessing the guest fpu from the instruction emulator, as well as
being symmetric with kvm_put_guest_fpu().
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

2608d7a1

KVM: x86: fix checking of cr0 validity · ab344828

Authored by Gleb Natapov on Jan 21, 2010

Move to/from Control Registers chapter of Intel SDM says. "Reserved bits
in CR0 remain clear after any load of those registers; attempts to set
them have no impact". Control Register chapter says "Bits 63:32 of CR0 are
reserved and must be written with zeros. Writing a nonzero value to any
of the upper 32 bits results in a general-protection exception, #GP(0)."

This patch tries to implement this twisted logic.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Reported-by: Lorenzo Martignoni <martignlo@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

ab344828
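
The two SDM statements quoted in this commit combine into a two-stage filter: a write with any of bits 63:32 set faults, while reserved bits in the low half are silently dropped. The C sketch below illustrates that logic with the architecturally defined CR0 flags; the function name and return convention are assumptions made for the example, not the kernel's interface.

```c
#include <stdint.h>

/* Architecturally defined CR0 flags. */
#define CR0_PE (1ULL << 0)
#define CR0_MP (1ULL << 1)
#define CR0_EM (1ULL << 2)
#define CR0_TS (1ULL << 3)
#define CR0_ET (1ULL << 4)
#define CR0_NE (1ULL << 5)
#define CR0_WP (1ULL << 16)
#define CR0_AM (1ULL << 18)
#define CR0_NW (1ULL << 29)
#define CR0_CD (1ULL << 30)
#define CR0_PG (1ULL << 31)

#define CR0_DEFINED_BITS (CR0_PE | CR0_MP | CR0_EM | CR0_TS | CR0_ET | \
			  CR0_NE | CR0_WP | CR0_AM | CR0_NW | CR0_CD | CR0_PG)

/*
 * Returns -1 when the write should raise #GP(0) (upper 32 bits non-zero).
 * Otherwise stores the value with reserved low bits cleared and returns 0.
 */
static int cr0_filter_write(uint64_t val, uint64_t *out)
{
	if (val >> 32)
		return -1;

	*out = val & CR0_DEFINED_BITS;
	return 0;
}
```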

KVM: SVM: Clean up and enhance mov dr emulation · c76de350

Authored by Jan Kiszka on Jan 20, 2010

Enhance mov dr instruction emulation used by SVM so that it properly
handles dr4/5: alias to dr6/7 if cr4.de is cleared. Otherwise return
EMULATE_FAIL which will let our only possible caller in that scenario,
ud_interception, re-inject UD.

We do not need to inject faults, SVM does this for us (exceptions take
precedence over instruction interceptions). For the same reason, the
value overflow checks can be removed.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

c76de350
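
The dr4/dr5 aliasing rule from this commit can be sketched as a small lookup. In the C example below, the function name and the use of -1 as a failure marker (standing in for EMULATE_FAIL and the subsequent #UD re-injection) are assumptions made for illustration.

```c
#include <stdbool.h>

/*
 * Map a debug register number to the register actually accessed.
 * dr4 and dr5 alias dr6 and dr7 when CR4.DE is clear; when CR4.DE is set
 * the access is invalid and -1 is returned so the caller can re-inject #UD.
 */
static int effective_dr(int dr, bool cr4_de)
{
	if (dr < 0 || dr > 7)
		return -1;

	if (dr == 4 || dr == 5)
		return cr4_de ? -1 : dr + 2;

	return dr;
}
```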

KVM: fix cleanup_srcu_struct on vm destruction · 64749204

Authored by Marcelo Tosatti on Jan 19, 2010

cleanup_srcu_struct on VM destruction remains broken:

BUG: unable to handle kernel paging request at ffffffffffffffff
IP: [<ffffffff802533d2>] srcu_read_lock+0x16/0x21
RIP: 0010:[<ffffffff802533d2>]  [<ffffffff802533d2>] srcu_read_lock+0x16/0x21
Call Trace:
 [<ffffffffa05354c4>] kvm_arch_vcpu_uninit+0x1b/0x48 [kvm]
 [<ffffffffa05339c6>] kvm_vcpu_uninit+0x9/0x15 [kvm]
 [<ffffffffa0569f7d>] vmx_free_vcpu+0x7f/0x8f [kvm_intel]
 [<ffffffffa05357b5>] kvm_arch_destroy_vm+0x78/0x111 [kvm]
 [<ffffffffa053315b>] kvm_put_kvm+0xd4/0xfe [kvm]

Move it to kvm_arch_destroy_vm.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reported-by: Jan Kiszka <jan.kiszka@siemens.com>

64749204

KVM: fix Hyper-V hypercall warnings and wrong mask value · ccd46936

Authored by Gleb Natapov on Jan 19, 2010

Fix compilation warnings and wrong mask value.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

ccd46936

KVM: Implement NotifyLongSpinWait HYPER-V hypercall · c25bc163

Authored by Gleb Natapov on Jan 17, 2010

Windows issues this hypercall after guest was spinning on a spinlock
for too many iterations.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Vadim Rozenfeld <vrozenfe@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

c25bc163

KVM: Add HYPER-V apic access MSRs · 10388a07

Authored by Gleb Natapov on Jan 17, 2010

Implement HYPER-V apic MSRs. Spec defines three MSRs that speed-up
access to EOI/TPR/ICR apic registers for PV guests.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Vadim Rozenfeld <vrozenfe@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

10388a07

KVM: Implement bare minimum of HYPER-V MSRs · 55cd8e5a

Authored by Gleb Natapov on Jan 17, 2010

Minimum HYPER-V implementation should have GUEST_OS_ID, HYPERCALL and
VP_INDEX MSRs.

[avi: fix build on i386]
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Vadim Rozenfeld <vrozenfe@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

55cd8e5a

KVM: Set cr0.et when the guest writes cr0 · f9a48e6a

Authored by Avi Kivity on Jan 06, 2010

Follow the hardware.
Signed-off-by: Avi Kivity <avi@redhat.com>

f9a48e6a

KVM: Lazify fpu activation and deactivation · 02daab21

Authored by Avi Kivity on Dec 30, 2009

Defer fpu deactivation as much as possible - if the guest fpu is loaded, keep
it loaded until the next heavyweight exit (where we are forced to unload it).
This reduces unnecessary exits.

We also defer fpu activation on clts; while clts signals the intent to use the
fpu, we can't be sure the guest will actually use it.
Signed-off-by: Avi Kivity <avi@redhat.com>

02daab21

KVM: Replace read accesses of vcpu->arch.cr0 by an accessor · 4d4ec087

Authored by Avi Kivity on Dec 29, 2009

Since we'd like to allow the guest to own a few bits of cr0 at times, we need
to know when we access those bits.
Signed-off-by: Avi Kivity <avi@redhat.com>

4d4ec087

KVM: x86: Rename gb_page_enable() to get_lpage_level() in kvm_x86_ops · 17cc3935

Authored by Sheng Yang on Jan 05, 2010

Then the callback can provide the maximum supported large page level, which
is more flexible.

Also move the gb page support into x86_64 specific.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

17cc3935

KVM: convert slots_lock to a mutex · 79fac95e

Authored by Marcelo Tosatti on Dec 23, 2009

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

79fac95e

openanolis / cloud-kernel synced successfully more than 1 year ago