Commits · 4a5f48f666ccc4ffdbc54241d9cab06806ed7922 · openeuler / raspberrypi-kernel

17 May, 2010 · 5 commits

KVM: Don't follow an atomic operation by a non-atomic one · 4a5f48f6

Committed by Avi Kivity on Mar 15, 2010

Currently emulated atomic operations are immediately followed by a non-atomic
operation, so that kvm_mmu_pte_write() can be invoked.  This updates the mmu
but undoes the whole point of doing things atomically.

Fix by only performing the atomic operation and the mmu update, and avoiding
the non-atomic write.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Make locked operations truly atomic · daea3e73

Committed by Avi Kivity on Mar 15, 2010

Once upon a time, locked operations were emulated while holding the mmu mutex.
Since mmu pages were write protected, it was safe to emulate the writes in
a non-atomic manner, since there could be no other writer, either in the
guest or in the kernel.

These days emulation takes place without holding the mmu spinlock, so the
write could be preempted by an unshadowing event, which exposes the page
to writes by the guest. This may cause corruption of guest page tables.

Fix by using an atomic cmpxchg for these operations.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
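
The fix described above replaces a plain store with a compare-and-exchange. A minimal userspace sketch of that idea follows, using C11 atomics rather than the kernel's own cmpxchg helpers; `emulate_locked_write` and its parameters are hypothetical names for illustration, not the KVM function.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the KVM code): store new_val into a guest
 * page-table entry only if the entry still holds the value the emulator
 * read earlier. Returns true when the exchange happened.
 */
static bool emulate_locked_write(_Atomic uint64_t *pte,
                                 uint64_t expected, uint64_t new_val)
{
    /*
     * One compare-and-swap: there is no window between the compare and
     * the store in which another writer could change the entry.
     */
    return atomic_compare_exchange_strong(pte, &expected, new_val);
}
```

If the entry changed in the meantime, the exchange fails and nothing is written, so the caller can retry or let the guest re-execute the instruction.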

KVM: x86: fix the error of ioctl KVM_IRQ_LINE if no irq chip · 160d2f6c

Committed by Wei Yongjun on Mar 12, 2010

If there is no irq chip in the kernel, ioctl KVM_IRQ_LINE returns -EFAULT.
But other places, such as KVM_[GET|SET]IRQCHIP, return -ENXIO.
So this patch uses -ENXIO instead of -EFAULT.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Trace exception injection · 5c1c85d0

Committed by Avi Kivity on Mar 11, 2010

Often an exception can help point out where things start to go wrong.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: cleanup kvm trace · 2ed152af

Committed by Xiao Guangrong on Mar 10, 2010

This patch does two things:

 - drops the call to tracepoint_synchronize_unregister() when the kvm
   module is unloaded, since ftrace can handle it

 - cleans up ftrace's macro

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

25 Apr, 2010 · 14 commits

KVM: move segment_base() into vmx.c · 2d49ec72

Committed by Gleb Natapov on Feb 25, 2010

segment_base() is used only by vmx so move it there.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: fix segment_base() error checking · 254d4d48

Committed by Gleb Natapov on Feb 25, 2010

Fix segment_base() to properly check for a null segment selector and
avoid accessing a NULL pointer if the ldt selector is null.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Drop kvm_get_gdt() in favor of generic linux function · d6ab1ed4

Committed by Gleb Natapov on Feb 25, 2010

Linux now has native_store_gdt() to do the same. Use it instead of
kvm local version.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86: Don't set arch.cr0 in kvm_set_cr0 · b44ea385

Committed by Joerg Roedel on Feb 24, 2010

The vcpu->arch.cr0 variable is already set in the
architecture specific set_cr0 callbacks. There is no need to
set it in the common code.
This allows the architecture code to keep the old arch.cr0
value if it wants. This is required for nested svm to decide
if a selective_cr0 exit needs to be injected.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Ignore write of hwcr.ignne · 82494028

Committed by Joerg Roedel on Feb 24, 2010

Hyper-V as a guest wants to write this bit. This patch
ignores it.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Add kvm_nested_intercepts tracepoint · 2e554e8d

Committed by Joerg Roedel on Feb 24, 2010

This patch adds a tracepoint to get information about the
most important intercept bitmasks from the nested vmcb.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86: Preserve injected TF across emulation · 83bf0002

Committed by Jan Kiszka on Feb 23, 2010

Call directly into the vendor services for getting/setting rflags in
emulate_instruction to ensure injected TF survives the emulation.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86: Drop RF manipulation for guest single-stepping · c310bac5

Committed by Jan Kiszka on Feb 23, 2010

RF is not required for injecting TF as the latter will trigger only
after an instruction execution anyway. So do not touch RF when arming or
disarming guest single-step mode.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86: Add kvm_is_linear_rip · f92653ee

Committed by Jan Kiszka on Feb 23, 2010

Based on Gleb's suggestion: Add a helper kvm_is_linear_rip that matches
a given linear RIP against the current one. Use this for guest
single-stepping, more users will follow.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
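
As a rough illustration of what matching a linear RIP means, the sketch below compares a saved linear address with the code-segment base plus the current instruction pointer. `struct vcpu_sketch` and `is_linear_rip` are hypothetical stand-ins, and the assumption that the linear RIP is CS base plus RIP is mine rather than something stated in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the vcpu state such a helper would consult. */
struct vcpu_sketch {
    uint64_t rip;      /* instruction pointer, relative to CS */
    uint64_t cs_base;  /* base address of the code segment */
};

/* True when linear_rip equals the current linear instruction pointer. */
static bool is_linear_rip(const struct vcpu_sketch *vcpu, uint64_t linear_rip)
{
    return vcpu->cs_base + vcpu->rip == linear_rip;
}
```

Comparing linear addresses rather than raw RIP values keeps the match meaningful when the code segment base is non-zero, as in real mode.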

KVM: x86: Add support for saving&restoring debug registers · a1efbe77

Committed by Jan Kiszka on Feb 15, 2010

So far user space was not able to save and restore debug registers for
migration or after reset. Plug this hole.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86: Save&restore interrupt shadow mask · 48005f64

Committed by Jan Kiszka on Feb 19, 2010

The interrupt shadow created by STI or MOV-SS-like operations is part of
the VCPU state and must be preserved across migration. Transfer it in
the spare padding field of kvm_vcpu_events.interrupt.

As a side effect we now have to make vmx_set_interrupt_shadow robust
against both shadow types being set. Give MOV SS a higher priority and
skip STI in that case to avoid that VMX throws a fault on next entry.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
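
The priority rule in the second paragraph can be sketched as a small mapping function. The constant names and bit values below are illustrative assumptions, not the kernel's definitions; the only point is that when both shadow types are requested, only the MOV SS blocking bit is passed on.

```c
/* Illustrative shadow mask bits as supplied by user space (assumed values). */
enum { SHADOW_MOV_SS = 1u << 0, SHADOW_STI = 1u << 1 };

/* Illustrative hardware blocking bits (assumed values). */
enum { HW_BLOCK_STI = 1u << 0, HW_BLOCK_MOV_SS = 1u << 1 };

/*
 * Hypothetical mapping: never report both blocking bits at once.
 * MOV SS wins when both shadow types are requested.
 */
static unsigned int shadow_to_hw_blocking(unsigned int mask)
{
    if (mask & SHADOW_MOV_SS)
        return HW_BLOCK_MOV_SS;
    if (mask & SHADOW_STI)
        return HW_BLOCK_STI;
    return 0;
}
```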

KVM: x86: Do not return soft events in vcpu_events · 03b82a30

Committed by Jan Kiszka on Feb 15, 2010

To avoid that user space migrates a pending software exception or
interrupt, mask them out on KVM_GET_VCPU_EVENTS. Without this, user
space would try to reinject them, and we would have to reconstruct the
proper instruction length for VMX event injection. Now the pending event
will be reinjected via executing the triggering instruction again.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: drop unneeded kvm_run check in emulate_instruction() · 112592da

Committed by Gleb Natapov on Feb 21, 2010

vcpu->run is initialized on vcpu creation and can never be NULL
here.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: use desc_ptr struct instead of kvm private descriptor_table · 89a27f4d

Committed by Gleb Natapov on Feb 16, 2010

x86 arch defines desc_ptr for idt/gdt pointers, no need to define
another structure in kvm code.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

21 Apr, 2010 · 1 commit

KVM: x86: Fix TSS size check for 16-bit tasks · e8861cfe

Committed by Jan Kiszka on Apr 14, 2010

A 16-bit TSS is only 44 bytes long. So make sure to test for the correct
size on task switch.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
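
A size check of this kind might look like the sketch below. The 44-byte figure comes from the commit message; the 104-byte size for a 32-bit TSS and the helper name `tss_limit_ok` are assumptions added for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    TSS16_SIZE = 44,   /* from the commit message */
    TSS32_SIZE = 104,  /* assumption: architectural 32-bit TSS size */
};

/*
 * Hypothetical check: a descriptor limit is the offset of the last valid
 * byte, so a segment holding N bytes needs limit >= N - 1.
 */
static bool tss_limit_ok(uint32_t limit, bool is_32bit)
{
    uint32_t min_size = is_32bit ? TSS32_SIZE : TSS16_SIZE;

    return limit >= min_size - 1;
}
```

Testing every TSS against the 32-bit minimum would reject a valid 16-bit TSS whose limit is exactly 43, which is the kind of case the commit addresses.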

20 Apr, 2010 · 4 commits

KVM: fix the handling of dirty bitmaps to avoid overflows · 87bf6e7d

Committed by Takuya Yoshikawa on Apr 12, 2010

Int is not long enough to store the size of a dirty bitmap.

This patch fixes this problem with the introduction of a wrapper
function to calculate the sizes of dirty bitmaps.

Note: in mark_page_dirty(), we have to consider the fact that
  __set_bit() takes the offset as int, not long.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
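
A wrapper along these lines could compute the size in `unsigned long` throughout, so the result is never narrowed to `int`. This is a sketch under my own assumptions: `dirty_bitmap_bytes` is a hypothetical name, and it takes a page count directly rather than a memory slot.

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical wrapper: bytes needed for a bitmap with one bit per page,
 * rounded up to a whole number of longs. All arithmetic stays in
 * unsigned long.
 */
static unsigned long dirty_bitmap_bytes(unsigned long npages)
{
    /* Round up without adding to npages, so large counts cannot wrap. */
    unsigned long longs = npages / BITS_PER_LONG +
                          (npages % BITS_PER_LONG != 0);

    return longs * sizeof(unsigned long);
}
```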

KVM: allow bit 10 to be cleared in MSR_IA32_MC4_CTL · 114be429

Committed by Andre Przywara on Mar 24, 2010

There is a quirk for AMD K8 CPUs in many Linux kernels (see
arch/x86/kernel/cpu/mcheck/mce.c:__mcheck_cpu_apply_quirks()) that
clears bit 10 in that MCE related MSR. KVM can only cope with all
zeros or all ones, so it will inject a #GP into the guest, which
will let it panic.
So let's add a quirk to the quirk and ignore this single cleared bit.
This fixes -cpu kvm64 on all machines and -cpu host on K8 machines
with some guest Linux kernels.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
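
The relaxed acceptance rule can be sketched as a predicate over the written value: all zeros and all ones pass, and so does all ones with only bit 10 cleared. `mc_ctl_value_ok` is a hypothetical name, and this is an illustration of the rule as described, not the KVM MSR handler.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate: accept 0, all ones, or all ones with only
 * bit 10 cleared. Anything else would be rejected.
 */
static bool mc_ctl_value_ok(uint64_t data)
{
    if (data == 0)
        return true;

    /* Forcing bit 10 on makes the quirk value look like all ones. */
    return (data | (UINT64_C(1) << 10)) == ~UINT64_C(0);
}
```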

KVM: Don't spam kernel log when injecting exceptions due to bad cr writes · d6a23895

Committed by Avi Kivity on Mar 11, 2010

These are guest-triggerable.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: take srcu lock before call to complete_pio() · 7567cae1

Committed by Gleb Natapov on Mar 09, 2010

complete_pio() may use slot table which is protected by srcu.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>

30 Mar, 2010 · 1 commit

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

Committed by Tejun Heo on Mar 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

01 Mar, 2010 · 15 commits

KVM: x86: Add KVM_CAP_X86_ROBUST_SINGLESTEP · d2be1651

Committed by Jan Kiszka on Feb 23, 2010

This marks the guest single-step API improvement of 94fe45da and
91586a3b with a capability flag to allow reliable detection by user
space.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: stable@kernel.org (2.6.33)
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Fix segment descriptor loading · c697518a

Committed by Gleb Natapov on Feb 18, 2010

Add proper error and permission checking. This patch also changes task
switching code to load segment selectors before segment descriptors, like
SDM requires, otherwise permission checking during segment descriptor
loading will be incorrect.

Cc: stable@kernel.org (2.6.33, 2.6.32)
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Fix load_guest_segment_descriptor() to inject page fault · 6f550484

Committed by Takuya Yoshikawa on Feb 18, 2010

This patch injects page fault when reading descriptor in
load_guest_segment_descriptor() fails with FAULT.

Effects of this injection: This function is used by
kvm_load_segment_descriptor() which is necessary for the
following instructions:

 - mov seg,r/m16
 - jmp far
 - pop ?s

This patch makes it possible to emulate the page faults
generated by these instructions. But be sure that unless
we change the kvm_load_segment_descriptor()'s ret value
propagation this patch has no effect.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Convert i8254/i8259 locks to raw_spinlocks · fa8273e9

Committed by Thomas Gleixner on Feb 17, 2010

The i8254/i8259 locks need to be real spinlocks on preempt-rt. Convert
them to raw_spinlock. No change for !RT kernels.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Check IOPL level during io instruction emulation · f850e2e6

Committed by Gleb Natapov on Feb 10, 2010

Make emulator check that vcpu is allowed to execute IN, INS, OUT,
OUTS, CLI, STI.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>
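
The core of such a check is a comparison between the current privilege level and the IOPL field of EFLAGS. The sketch below covers only that protected-mode comparison; the helper names are hypothetical, and the I/O permission bitmap and virtual-8086 cases that a full check would also need are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_IOPL_SHIFT 12
#define EFLAGS_IOPL_MASK  (UINT32_C(3) << EFLAGS_IOPL_SHIFT)

/* Extract the two-bit I/O privilege level from EFLAGS. */
static unsigned int eflags_iopl(uint32_t eflags)
{
    return (eflags & EFLAGS_IOPL_MASK) >> EFLAGS_IOPL_SHIFT;
}

/*
 * Hypothetical protected-mode check: IOPL-sensitive instructions are
 * allowed when CPL is numerically no greater than IOPL.
 */
static bool iopl_allows(uint32_t eflags, unsigned int cpl)
{
    return cpl <= eflags_iopl(eflags);
}
```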

KVM: x86 emulator: fix memory access during x86 emulation · 1871c602

Committed by Gleb Natapov on Feb 10, 2010

Currently when x86 emulator needs to access memory, page walk is done with
broadest permission possible, so if emulated instruction was executed
by userspace process it can still access kernel memory. Fix that by
providing correct memory access to page walker during emulation.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Add Virtual-8086 mode of emulation · a0044755

Committed by Gleb Natapov on Feb 10, 2010

For some instructions CPU behaves differently for real-mode and
virtual 8086. Let emulator know which mode cpu is in, so it will
not poke into vcpu state directly.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: cleanup the failure path of KVM_CREATE_IRQCHIP ioctrl · 72bb2fcd

Committed by Wei Yongjun on Feb 09, 2010

If we fail to init the ioapic device or fail to set up the default irq
routing, the devices registered by kvm_create_pic() and kvm_ioapic_init()
remain registered. This patch unregisters them on failure.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Remove redundant reading of rax on OUT instructions · 1976d2d2

Committed by Takuya Yoshikawa on Feb 05, 2010

kvm_emulate_pio() and complete_pio() both read out the
RAX register value and copy it to a place into which
the value read out from the port will be copied later.

This patch removes this redundancy.

/*** snippet from arch/x86/kvm/x86.c ***/
int complete_pio(struct kvm_vcpu *vcpu)
{
	...
	if (!io->string) {
		if (io->in) {
			val = kvm_register_read(vcpu, VCPU_REGS_RAX);
			memcpy(&val, vcpu->arch.pio_data, io->size);
			kvm_register_write(vcpu, VCPU_REGS_RAX, val);
		}
	...

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: fix kvm_fix_hypercall() to return X86EMUL_* · 7edcface

Committed by Takuya Yoshikawa on Feb 01, 2010

This patch fixes kvm_fix_hypercall() to propagate X86EMUL_*
info generated by emulator_write_emulated() to its callers:
suggested by Marcelo.

The effect of this is x86_emulate_insn() will begin to handle
the page faults which occur in emulator_write_emulated():
this should be OK because emulator_write_emulated_onepage()
always injects page fault when emulator_write_emulated()
returns X86EMUL_PROPAGATE_FAULT.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: fix load_guest_segment_descriptor() to return X86EMUL_* · c125c607

Committed by Takuya Yoshikawa on Feb 01, 2010

This patch fixes load_guest_segment_descriptor() to return
X86EMUL_PROPAGATE_FAULT when it tries to access the descriptor
table beyond the limit of it: suggested by Marcelo.

I have checked current callers of this helper function,
  - kvm_load_segment_descriptor()
  - kvm_task_switch()
and confirmed that this patch will change nothing in the
upper layers if we do not change the handling of this
return value from load_guest_segment_descriptor().

Next step: Although fixing the kvm_task_switch() to handle the
propagated faults properly seems difficult, and maybe not worth
it because TSS is not used commonly these days, we can fix
kvm_load_segment_descriptor(). By doing so, the injected #GP
becomes possible to be handled by the guest. The only problem
for this is how to differentiate this fault from the page faults
generated by kvm_read_guest_virt(). We may have to split this
function to achieve this goal.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: enable PCI multiple-segments for pass-through device · ab9f4ecb

Committed by Zhai, Edwin on Jan 29, 2010

Enable an optional parameter (default 0), the PCI segment (or domain),
in addition to BDF when assigning a PCI device to a guest.

Signed-off-by: Zhai Edwin <edwin.zhai@intel.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: mark segments accessed on HW task switch · e01c2426

Committed by Gleb Natapov on Jan 25, 2010

On HW task switch newly loaded segments should be marked as accessed.

Reported-by: Lorenzo Martignoni <martignlo@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: trace guest fpu loads and unloads · 0c04851c

Committed by Avi Kivity on Jan 21, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Rename vcpu->shadow_efer to efer · f6801dff

Committed by Avi Kivity on Jan 21, 2010

None of the other registers have the shadow_ prefix.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>