Commits · 1aa8ceef0312a6aae7dd863a120a55f1637b361d · openanolis / cloud-kernel

18 Mar, 2011 16 commits

KVM: fix kvmclock regression due to missing clock update · 1aa8ceef

Authored by Nikola Ciprich on Mar 09, 2011

Commit 387b9f97750444728962b236987fbe8ee8cc4f8c moved kvm_request_guest_time_update(vcpu),
breaking 32-bit SMP guests using kvm-clock. Fix this by moving the (new) clock update function
to its proper place.
Signed-off-by: Nikola Ciprich <nikola.ciprich@linuxbox.cz>
Acked-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

1aa8ceef

KVM: emulator: Fix io permission checking for 64bit guest · 5601d05b

Authored by Gleb Natapov on Mar 07, 2011

The current implementation truncates the upper 32 bits of the TR base address during the IO
permission bitmap check. This patch fixes that.
Reported-and-tested-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

5601d05b
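
The truncation described in the commit above can be illustrated outside the kernel. This is only a sketch and not the emulator's code: `io_bitmap_addr_truncated` and `io_bitmap_addr_full` are hypothetical helpers that show how holding a 64-bit TR base in a 32-bit variable drops the upper half before the IO permission bitmap offset is added.

```c
#include <stdint.h>

/* Hypothetical helper: keeps the TR base in 32 bits, losing bits 63:32. */
static uint64_t io_bitmap_addr_truncated(uint64_t tr_base, uint16_t io_bitmap_off)
{
	uint32_t base = (uint32_t)tr_base;	/* upper 32 bits are dropped here */

	return (uint64_t)base + io_bitmap_off;
}

/* Hypothetical helper: keeps the full 64-bit base, as a 64-bit guest needs. */
static uint64_t io_bitmap_addr_full(uint64_t tr_base, uint16_t io_bitmap_off)
{
	return tr_base + io_bitmap_off;
}
```

For a TR base below 4 GiB both helpers agree, which is presumably why only 64-bit guests were affected.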

KVM: MMU: move mmu pages calculated out of mmu lock · 48c0e4e9

Authored by Xiao Guangrong on Mar 04, 2011

kvm_mmu_calculate_mmu_pages needs to walk all memslots, which is protected by
kvm->slots_lock, so move it out of the mmu spinlock.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

48c0e4e9

KVM: better readability of efer_reserved_bits · 1260edbe

Authored by Lai Jiangshan on Feb 21, 2011

Use EFER_SCE, EFER_LME and EFER_LMA instead of magic numbers.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

1260edbe
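
As an illustration of the readability change above, the sketch below builds a reserved-bits mask from named EFER flags and compares it with the equivalent literal. The bit positions are the architectural ones (SCE is bit 0, LME is bit 8, LMA is bit 10). The variable names and the literal are chosen here for the example and are not quoted from the patch.

```c
#include <stdint.h>

/* Architectural bit positions of the three EFER flags named in the commit. */
#define EFER_SCE (1ULL << 0)	/* SYSCALL/SYSRET enable */
#define EFER_LME (1ULL << 8)	/* long mode enable */
#define EFER_LMA (1ULL << 10)	/* long mode active */

/* Every bit that is not one of the three flags above is treated as reserved. */
static const uint64_t efer_reserved_bits = ~(EFER_SCE | EFER_LME | EFER_LMA);

/* The same mask written as a bare literal, which is much harder to audit. */
static const uint64_t efer_reserved_bits_literal = 0xfffffffffffffafeULL;
```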

KVM: Clear async page fault hash after switching to real mode · d170c419

Authored by Lai Jiangshan on Feb 21, 2011

The hash array of async gfns may still contain some leftover gfns after
kvm_clear_async_pf_completion_queue() is called; they need to be cleared.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

d170c419

KVM: x86: Convert tsc_write_lock to raw_spinlock · 038f8c11

Authored by Jan Kiszka on Feb 04, 2011

Code under this lock requires non-preemptibility. Ensure this also holds on
-rt by converting it to a raw spinlock.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

038f8c11

KVM: remove isr_ack logic from PIC · 7049467b

Authored by Gleb Natapov on Feb 09, 2011

The isr_ack logic was added by e4825800 to avoid unnecessary IPIs. Back
then it made sense, but now the code checks that the vcpu is ready to accept
an interrupt before sending an IPI, so this logic is no longer needed. The
patch removes it.

Fixes a regression with Debian/Hurd.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Reported-and-tested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

7049467b

KVM: Convert kvm_lock to raw_spinlock · e935b837

Authored by Jan Kiszka on Feb 08, 2011

Code under this lock requires non-preemptibility. Ensure this also holds on
-rt by converting it to a raw spinlock.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e935b837

KVM: Fix race between nmi injection and enabling nmi window · f8636849

Authored by Avi Kivity on Feb 03, 2011

The interrupt injection logic looks something like

  if an nmi is pending, and nmi injection allowed
    inject nmi
  if an nmi is pending
    request exit on nmi window

The problem is that "nmi is pending" can be set asynchronously by
the PIT; if it happens to fire between the two if statements, we
will request an nmi window even though nmi injection is allowed. On
SVM, this has disastrous results, since it causes eflags.TF to be
set in random guest code.

The fix is simple: make nmi_pending synchronous using the standard
vcpu->requests mechanism; this ensures the code above is completely
synchronous wrt nmi_pending.
Signed-off-by: Avi Kivity <avi@redhat.com>

f8636849
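
The idea in the commit above, folding an asynchronously raised NMI into vcpu-local state at a single point, can be sketched in user-space C. Everything here is hypothetical (`struct vcpu_sketch`, `process_requests`, `inject_pending_event`) and only loosely mirrors the vcpu->requests mechanism the message mentions. The point is that `nmi_pending` is written only by the vcpu thread, so it cannot change between the two checks.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct vcpu_sketch {
	atomic_bool nmi_request;	/* set asynchronously, e.g. by a timer */
	bool nmi_pending;		/* only touched by the vcpu thread */
	bool nmi_injected;
	bool nmi_window_requested;
};

/* Fold asynchronous requests into vcpu-local state at one well-defined point. */
static void process_requests(struct vcpu_sketch *v)
{
	if (atomic_exchange(&v->nmi_request, false))
		v->nmi_pending = true;
}

static void inject_pending_event(struct vcpu_sketch *v, bool nmi_allowed)
{
	process_requests(v);

	if (v->nmi_pending && nmi_allowed) {
		v->nmi_pending = false;
		v->nmi_injected = true;
	}
	/* nmi_pending cannot have been set behind our back since the check above. */
	if (v->nmi_pending)
		v->nmi_window_requested = true;
}
```

A request that arrives after `process_requests()` simply stays in `nmi_request` until the next pass, instead of being observed by only the second check.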

KVM: Drop ad-hoc vendor specific instruction restriction · 4005996e

Authored by Avi Kivity on Feb 01, 2011

Use the new support in the emulator, and drop the ad-hoc code in x86.c.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

4005996e

KVM: Drop bogus x86_decode_insn() error check · 3e909439

Authored by Avi Kivity on Feb 01, 2011

x86_decode_insn() doesn't return X86EMUL_* values, so the check
for X86EMUL_PROPOGATE_FAULT will always fail. There is a proper
check later on, so there is no need for a replacement for this
code.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

3e909439

KVM: x86: release kvmclock page on reset · 12f9a48f

Authored by Glauber Costa on Feb 01, 2011

Currently, when a vcpu is reset, the kvmclock page keeps being written to.
This is wrong and inconsistent: a cpu reset should take it to its
initial state.
Signed-off-by: Glauber Costa <glommer@redhat.com>
CC: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

12f9a48f

KVM: x86: handle guest access to BBL_CR_CTL3 MSR · 91c9c3ed

Authored by john cooper on Jan 21, 2011

A correction to Intel cpu model CPUID data (patch queued)
caused winxp to BSOD when booted with a Penryn model.
This was traced to the CPUID "model" field correction from
6 -> 23 (as is proper for a Penryn class of cpu).  Only in
this case does the problem surface.

The cause for this failure is winxp accessing the BBL_CR_CTL3
MSR which is unsupported by current kvm, appears to be a
legacy MSR not fully characterized yet existing in current
silicon, and is apparently carried forward in MSR space to
accommodate vintage code as here.  It is not yet conclusive
whether this MSR implements any of its legacy functionality
or is just an ornamental dud for compatibility.  While I
found no silicon version specific documentation link to
this MSR, a general description exists in Intel's developer's
reference which agrees with the functional behavior of
other bootloader/kernel code I've examined accessing
BBL_CR_CTL3.  Regrettably winxp appears to be setting bit #19
called out as "reserved" in the above document.

So to minimally accommodate this MSR, kvm msr get will provide
the equivalent mock data and kvm msr write will simply toss the
guest passed data without interpretation.  While this treatment
of BBL_CR_CTL3 addresses the immediate problem, the approach may
be modified pending clarification from Intel.
Signed-off-by: john cooper <john.cooper@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

91c9c3ed
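
The minimal handling described in the commit above (reads return fixed mock data, writes are accepted and discarded) might look roughly like the sketch below. The function names are hypothetical, and `BBL_CR_CTL3_MOCK` is a placeholder rather than the value the patch returns. The MSR index 0x11e is the one Linux defines as `MSR_IA32_BBL_CR_CTL3`.

```c
#include <stdint.h>

#define MSR_IA32_BBL_CR_CTL3	0x0000011e
/* Placeholder for illustration only; not the value used by the patch. */
#define BBL_CR_CTL3_MOCK	0x0ULL

/* Hypothetical read handler: returns 0 on success, 1 for an unhandled MSR. */
static int msr_get_sketch(uint32_t msr, uint64_t *data)
{
	switch (msr) {
	case MSR_IA32_BBL_CR_CTL3:
		*data = BBL_CR_CTL3_MOCK;	/* fixed mock data, no real state */
		return 0;
	default:
		return 1;
	}
}

/* Hypothetical write handler: accepts the write but ignores the value. */
static int msr_set_sketch(uint32_t msr, uint64_t data)
{
	switch (msr) {
	case MSR_IA32_BBL_CR_CTL3:
		(void)data;			/* discarded without interpretation */
		return 0;
	default:
		return 1;
	}
}
```

Because the written value is never stored, a guest that sets the reserved bit #19 still reads back the mock data afterwards.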

KVM: Add "exiting guest mode" state · 6b7e2d09

Authored by Xiao Guangrong on Jan 12, 2011

Currently we keep track of only two states: guest mode and host
mode. This patch adds an "exiting guest mode" state that tells
us that an IPI will happen soon, so unless we need to wait for the
IPI, we can avoid it completely.

Also
1: There is no need to read/write ->mode atomically in the vcpu's thread

2: Reorganize struct kvm_vcpu to put ->mode and ->requests
   in the same cache line explicitly
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

6b7e2d09
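
One possible reading of the three-state scheme above is sketched below with C11 atomics. It is not the patch's code. The enum values, `struct vcpu_mode_sketch` and `kick_needs_ipi` are assumptions made for the example: a kicker sends an IPI only if it is the one that moves the vcpu from "in guest mode" to "exiting guest mode", and later kickers see that an exit is already under way.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed state names for this sketch. */
enum vcpu_mode { OUTSIDE_GUEST_MODE, IN_GUEST_MODE, EXITING_GUEST_MODE };

struct vcpu_mode_sketch {
	_Atomic int mode;
};

/* Returns true if the caller should send an IPI to force a guest exit. */
static bool kick_needs_ipi(struct vcpu_mode_sketch *v)
{
	int expected = IN_GUEST_MODE;

	/*
	 * Only the transition IN_GUEST_MODE -> EXITING_GUEST_MODE triggers an
	 * IPI. If the vcpu is outside guest mode or already exiting, the
	 * exchange fails and no IPI is needed in this simplified model.
	 */
	return atomic_compare_exchange_strong(&v->mode, &expected,
					      EXITING_GUEST_MODE);
}
```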

KVM: x86: Remove user space triggerable MCE error message · 9ca52318

Authored by Jan Kiszka on Jan 15, 2011

This case is a pure user space error that we do not need to record. Moreover,
it can be misused to flood the kernel log. Remove it.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

9ca52318

KVM: fix rcu usage warning in kvm_arch_vcpu_ioctl_set_sregs() · 63f42e02

Authored by Xiao Guangrong on Jan 12, 2011

Fix:

[ 1001.499596] ===================================================
[ 1001.499599] [ INFO: suspicious rcu_dereference_check() usage. ]
[ 1001.499601] ---------------------------------------------------
[ 1001.499604] include/linux/kvm_host.h:301 invoked rcu_dereference_check() without protection!
	......
[ 1001.499636] Pid: 6035, comm: qemu-system-x86 Not tainted 2.6.37-rc6+ #62
[ 1001.499638] Call Trace:
[ 1001.499644]  [] lockdep_rcu_dereference+0x9d/0xa5
[ 1001.499653]  [] gfn_to_memslot+0x8d/0xc8 [kvm]
[ 1001.499661]  [] gfn_to_hva+0x16/0x3f [kvm]
[ 1001.499669]  [] kvm_read_guest_page+0x1e/0x5e [kvm]
[ 1001.499681]  [] kvm_read_guest_page_mmu+0x53/0x5e [kvm]
[ 1001.499699]  [] load_pdptrs+0x3f/0x9c [kvm]
[ 1001.499705]  [] ? vmx_set_cr0+0x507/0x517 [kvm_intel]
[ 1001.499717]  [] kvm_arch_vcpu_ioctl_set_sregs+0x1f3/0x3c0 [kvm]
[ 1001.499727]  [] kvm_vcpu_ioctl+0x6a5/0xbc5 [kvm]
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

63f42e02

12 Jan, 2011 24 commits

KVM: Initialize fpu state in preemptible context · e5c30142

Authored by Avi Kivity on Jan 11, 2011

init_fpu() (which is indirectly called by the fpu switching code) assumes
it is in process context. Rather than making init_fpu() use an atomic
allocation, which can cause a task to be killed, make sure the fpu is
already initialized when we enter the run loop.

KVM-Stable-Tag.
Reported-and-tested-by: Kirill A. Shutemov <kas@openvz.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e5c30142

KVM: Fetch guest cr3 from hardware on demand · aff48baa

Authored by Avi Kivity on Dec 05, 2010

Instead of syncing the guest cr3 on every exit, which is expensive on vmx
with ept enabled, sync it only on demand.

[sheng: fix incorrect cr3 seen by Windows XP]
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

aff48baa

KVM: Replace reads of vcpu->arch.cr3 by an accessor · 9f8fe504

Authored by Avi Kivity on Dec 05, 2010

This allows us to keep cr3 in the VMCS, later on.
Signed-off-by: Avi Kivity <avi@redhat.com>

9f8fe504

KVM: SVM: copy instruction bytes from VMCB · dc25e89e

Authored by Andre Przywara on Dec 21, 2010

In case of a nested page fault or an intercepted #PF, newer SVM
implementations provide a copy of the faulting instruction bytes
in the VMCB.
Use these bytes to feed the instruction emulator and avoid the costly
guest instruction fetch in this case.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

dc25e89e

KVM: cleanup emulate_instruction · 51d8b661

Authored by Andre Przywara on Dec 21, 2010

emulate_instruction had many callers, but only one used all
parameters. One parameter was unused, another one is now
hidden by a wrapper function (required for a future addition
anyway), so most callers now use a shorter parameter list.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

51d8b661

KVM: move complete_insn_gp() into x86.c · db8fcefa

Authored by Andre Przywara on Dec 21, 2010

Move the complete_insn_gp() helper function out of the VMX part
into the generic x86 part to make it usable by SVM.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

db8fcefa

KVM: x86: fix CR8 handling · eea1cff9

Authored by Andre Przywara on Dec 21, 2010

The handling of CR8 writes in KVM is currently somewhat cumbersome.
This patch makes it look like the other CR register handlers
and fixes a possible issue in VMX, where the RIP would be incremented
despite an injected #GP.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

eea1cff9

KVM: Take missing slots_lock for kvm_io_bus_unregister_dev() · 175504cd

Authored by Takuya Yoshikawa on Dec 16, 2010

In KVM_CREATE_IRQCHIP, kvm_io_bus_unregister_dev() is called without taking
slots_lock in the error handling path.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

175504cd

KVM: return true when user space query KVM_CAP_USER_NMI extension · a355c85c

Authored by Lai Jiangshan on Dec 14, 2010

Userspace may check this extension at runtime.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

a355c85c

KVM: Correct kvm_pio tracepoint count field · 61cfab2e

Authored by Avi Kivity on Dec 13, 2010

Currently, we record '1' for count regardless of the real count. Fix.
Signed-off-by: Avi Kivity <avi@redhat.com>

61cfab2e

KVM: MMU: retry #PF for softmmu · fb67e14f

Authored by Xiao Guangrong on Dec 07, 2010

Retry #PF for softmmu only when the current vcpu has the same cr3 as at the time
the #PF occurred.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

fb67e14f

KVM: X86: Don't report L2 emulation failures to user-space · fc3a9157

Authored by Joerg Roedel on Nov 29, 2010

This patch prevents emulation failures that result
from emulating an instruction for an L2 guest from
being reported to userspace.
Without this patch a malicious L2 guest would be able to
kill the L1 by triggering a race condition between a vmexit
and the instruction emulator.
With this patch the L2 will most likely only kill itself in
this situation.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

fc3a9157

KVM: Pull extra page fault information into struct x86_exception · 6389ee94

Authored by Avi Kivity on Nov 29, 2010

Currently page fault cr2 and nesting information are carried outside
the fault data structure. Instead they are placed in the vcpu struct,
which results in confusion as global variables are manipulated instead
of passing parameters.

Fix this issue by adding address and nested fields to struct x86_exception,
so this struct can carry all information associated with a fault.
Signed-off-by: Avi Kivity <avi@redhat.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

6389ee94
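
To make the change above concrete, here is a rough sketch of a fault structure that carries the faulting address and the nesting flag with it, rather than leaving them in per-vcpu state. The field and helper names (`struct x86_exception_sketch`, `make_page_fault`) are chosen for this example and are not copied from the kernel. Only the idea of adding "address" and "nested" fields comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define PF_VECTOR 14	/* x86 page fault exception vector */

/* Sketch of a self-contained fault description. */
struct x86_exception_sketch {
	uint8_t vector;
	bool error_code_valid;
	uint16_t error_code;
	bool nested;		/* fault occurred in the nested translation */
	uint64_t address;	/* faulting address (what would go into cr2) */
};

/* Hypothetical helper: build a page fault that carries all of its context. */
static struct x86_exception_sketch
make_page_fault(uint64_t address, uint16_t error_code, bool nested)
{
	struct x86_exception_sketch e = {
		.vector = PF_VECTOR,
		.error_code_valid = true,
		.error_code = error_code,
		.nested = nested,
		.address = address,
	};

	return e;
}
```

A caller that receives such a value can decide how to inject the fault without consulting any side channel in the vcpu structure.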

KVM: Push struct x86_exception info the various gva_to_gpa variants · ab9ae313

Authored by Avi Kivity on Nov 22, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

ab9ae313

KVM: x86 emulator: make emulator memory callbacks return full exception · bcc55cba

Authored by Avi Kivity on Nov 22, 2010

This way, they can return #GP, not just #PF.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

bcc55cba

KVM: x86 emulator: introduce struct x86_exception to communicate faults · da9cb575

Authored by Avi Kivity on Nov 22, 2010

Introduce a structure that can contain an exception to be passed back
to the main kvm code.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

da9cb575

KVM: Mask KVM_GET_SUPPORTED_CPUID data with Linux cpuid info · 945ee35e

Authored by Avi Kivity on Nov 09, 2010

This allows Linux to mask cpuid bits if, for example, nx is enabled on only
some cpus.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

945ee35e

KVM: MMU: fix apf prefault if nested guest is enabled · c4806acd

Authored by Xiao Guangrong on Nov 12, 2010

If an apf is generated in the L2 guest and is completed in the L1 guest, it will
be prefaulted in the L1 guest's mmu context.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

c4806acd

KVM: MMU: clear apfs if page state is changed · e5f3f027

Authored by Xiao Guangrong on Nov 12, 2010

If CR0.PG is changed, the page fault can't be avoided when the prefault address
is accessed later.

This also fixes a bug: a #PF raised with paging enabled could be retried in a
paging-disabled context if the mmu is a shadow page mmu.

This idea is from Gleb Natapov.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e5f3f027

KVM: Clean up vm creation and release · d89f5eff

Authored by Jan Kiszka on Nov 09, 2010

IA64 support forces us to abstract the allocation of the kvm structure.
But instead of mixing this up with arch-specific initialization and
doing the same on destruction, split both steps. This allows moving
generic destruction calls into generic code.

It also fixes error clean-up on failures of kvm_create_vm for IA64.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

d89f5eff

KVM: avoid unnecessary wait for a async pf · e6d53e3b

Authored by Xiao Guangrong on Nov 01, 2010

The current code checks async pf completion outside of the wait context,
like this:

	if (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
	    !vcpu->arch.apf.halted)
		r = vcpu_enter_guest(vcpu);
	else {
		......
		kvm_vcpu_block(vcpu)
		 ^- waiting until 'async_pf.done' is not empty
	}

	kvm_check_async_pf_completion(vcpu)
	 ^- delete list from async_pf.done

So, if we check async pf completion first, it can be blocked at
kvm_vcpu_block.

Fixed by marking the vcpu as unhalted in the kvm_check_async_pf_completion()
path.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

e6d53e3b

KVM: fix searching async gfn in kvm_async_pf_gfn_slot · c7d28c24

Authored by Xiao Guangrong on Nov 01, 2010

Don't search later slots if the slot is empty.
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

c7d28c24
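
The rule in the commit above, stop probing once an empty slot is reached, is the usual termination condition for an open-addressing hash table. A minimal sketch follows. The table size, the empty marker, the hash function and the names `apf_hash` and `apf_gfn_slot` are all assumptions for illustration and are not taken from the kernel source.

```c
#include <stdint.h>

#define APF_SLOTS 8u			/* hypothetical size, power of two */
#define APF_EMPTY UINT64_MAX		/* hypothetical marker for an unused slot */

/* Hypothetical hash: low bits of the gfn. */
static uint32_t apf_hash(uint64_t gfn)
{
	return (uint32_t)(gfn & (APF_SLOTS - 1));
}

/*
 * Return the slot that holds gfn, or the first empty slot on its probe
 * sequence. Probing stops at an empty slot because, with linear probing
 * and no gaps left behind, gfn cannot be stored beyond it.
 */
static uint32_t apf_gfn_slot(const uint64_t *gfns, uint64_t gfn)
{
	uint32_t key = apf_hash(gfn);
	uint32_t i;

	for (i = 0; i < APF_SLOTS &&
		    gfns[key] != gfn && gfns[key] != APF_EMPTY; i++)
		key = (key + 1) & (APF_SLOTS - 1);

	return key;
}
```

A caller then compares `gfns[slot]` with the gfn to tell "found" from "not present, insert here".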

KVM: x86: Avoid issuing wbinvd twice · 2eec7343

Authored by Jan Kiszka on Nov 01, 2010

Micro optimization to avoid calling wbinvd twice on the CPU that has to
emulate it. As we might be preempted between smp_call_function_many and
the local wbinvd, the cache might be filled again so that real work
could be done uselessly.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

2eec7343

KVM: pre-allocate one more dirty bitmap to avoid vmalloc() · 515a0127

Authored by Takuya Yoshikawa on Oct 27, 2010

Currently x86's kvm_vm_ioctl_get_dirty_log() needs to allocate a bitmap by
vmalloc() which will be used in the next logging, and this has been having a
bad effect on VGA and live-migration: vmalloc() consumes extra systime,
triggers tlb flush, etc.

This patch resolves this issue by pre-allocating one more bitmap and switching
between two bitmaps during dirty logging.

Performance improvement:
  I measured performance for the case of VGA update by trace-cmd.
  The result was 1.5 times faster than the original one.

  In the case of live migration, the improvement ratio depends on the workload
  and the guest memory size. In general, the larger the memory size is the more
  benefits we get.

Note:
  This does not change other architectures' logic but the allocation size
  becomes twice as large. This will increase the actual memory consumption only
  when the new size changes the number of pages allocated by vmalloc().
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

515a0127
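
The double-buffering scheme in the commit above can be sketched in user space. `calloc()` stands in for vmalloc() here, and `struct dirty_log_sketch`, `dirty_log_init` and `dirty_log_switch` are hypothetical names. The sketch allocates both bitmaps once and afterwards only flips between the two halves, so reading the log needs no allocation.

```c
#include <stdlib.h>
#include <string.h>

struct dirty_log_sketch {
	unsigned long *head;	/* one allocation holding both bitmaps */
	unsigned long *active;	/* bitmap currently being written */
	size_t nlongs;		/* length of one bitmap, in longs */
};

/* Allocate both bitmaps up front; returns 0 on success, -1 on failure. */
static int dirty_log_init(struct dirty_log_sketch *d, size_t nlongs)
{
	d->head = calloc(2 * nlongs, sizeof(unsigned long));
	if (!d->head)
		return -1;
	d->active = d->head;
	d->nlongs = nlongs;
	return 0;
}

/* Switch to the other half and return the half that was active until now. */
static unsigned long *dirty_log_switch(struct dirty_log_sketch *d)
{
	unsigned long *old = d->active;
	unsigned long *next = (old == d->head) ? d->head + d->nlongs : d->head;

	memset(next, 0, d->nlongs * sizeof(unsigned long));
	d->active = next;
	return old;
}
```

The returned pointer stays valid until the next switch, which is when that half is cleared and reused.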

openanolis / cloud-kernel synced successfully about 1 year ago
