Commits · 90d83dc3d49f5101addae962ccc1b4aff66b68d8 · openanolis / cloud-kernel

17 May, 2010 8 commits

KVM: use the correct RCU API for PROVE_RCU=y · 90d83dc3

Authored by Lai Jiangshan on Apr 19, 2010

The RCU/SRCU APIs have already changed to support proving RCU usage.

I got the following dmesg output with PROVE_RCU=y because we used an incorrect API.
This patch converts rcu_dereference() to srcu_dereference() or a related API in the same family.

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
arch/x86/kvm/mmu.c:3020 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
2 locks held by qemu-system-x86/8550:
 #0:  (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa011a6ac>] kvm_set_memory_region+0x29/0x50 [kvm]
 #1:  (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<ffffffffa012262d>] kvm_arch_commit_memory_region+0xa6/0xe2 [kvm]

stack backtrace:
Pid: 8550, comm: qemu-system-x86 Not tainted 2.6.34-rc4-tip-01028-g939eab1 #27
Call Trace:
 [<ffffffff8106c59e>] lockdep_rcu_dereference+0xaa/0xb3
 [<ffffffffa012f6c1>] kvm_mmu_calculate_mmu_pages+0x44/0x7d [kvm]
 [<ffffffffa012263e>] kvm_arch_commit_memory_region+0xb7/0xe2 [kvm]
 [<ffffffffa011a5d7>] __kvm_set_memory_region+0x636/0x6e2 [kvm]
 [<ffffffffa011a6ba>] kvm_set_memory_region+0x37/0x50 [kvm]
 [<ffffffffa015e956>] vmx_set_tss_addr+0x46/0x5a [kvm_intel]
 [<ffffffffa0126592>] kvm_arch_vm_ioctl+0x17a/0xcf8 [kvm]
 [<ffffffff810a8692>] ? unlock_page+0x27/0x2c
 [<ffffffff810bf879>] ? __do_fault+0x3a9/0x3e1
 [<ffffffffa011b12f>] kvm_vm_ioctl+0x364/0x38d [kvm]
 [<ffffffff81060cfa>] ? up_read+0x23/0x3d
 [<ffffffff810f3587>] vfs_ioctl+0x32/0xa6
 [<ffffffff810f3b19>] do_vfs_ioctl+0x495/0x4db
 [<ffffffff810e6b2f>] ? fget_light+0xc2/0x241
 [<ffffffff810e416c>] ? do_sys_open+0x104/0x116
 [<ffffffff81382d6d>] ? retint_swapgs+0xe/0x13
 [<ffffffff810f3ba6>] sys_ioctl+0x47/0x6a
 [<ffffffff810021db>] system_call_fastpath+0x16/0x1b
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: prevent spurious exit to userspace during task switch emulation. · acb54517

Authored by Gleb Natapov on Apr 15, 2010

If kvm_task_switch() fails, the code exits to userspace without specifying
an exit reason, so userspace reuses the previous exit reason. Fix
this by specifying the exit reason correctly.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Push potential exception error code on task switches · e269fb21

Authored by Jan Kiszka on Apr 14, 2010

When a fault triggers a task switch, the error code, if there is one, has to
be pushed on the new task's stack. Implement the missing bits.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: move DR register access handling into generic code · 020df079

Authored by Gleb Natapov on Apr 13, 2010

Currently both SVM and VMX have their own DR handling code. Move it to
x86.c.
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: fix in/out emulation. · cf8f70bf

Authored by Gleb Natapov on Mar 18, 2010

in/out emulation is currently broken. The breakage differs depending
on where the IO device resides. If it is in userspace, the emulator reports
an emulation failure since it incorrectly interprets the kvm_emulate_pio()
return value. If the IO device is in the kernel, emulation of 'in' does
nothing: kvm_emulate_pio() stores the result directly into the vcpu
registers, and the emulator then overwrites that result during the
commit of the shadowed registers.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: VMX: change to use bool return values · 31299944

Authored by Gui Jianfeng on Mar 15, 2010

Use bool for return values, and remove some useless
bool value conversions. Thanks to Avi for pointing this out.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Use native_store_idt() instead of kvm_get_idt() · ec68798c

Authored by Wei Yongjun on Mar 05, 2010

This patch uses the generic Linux function native_store_idt()
instead of kvm_get_idt(), and also removes the now-unused
function kvm_get_idt().
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Move kvm_exit tracepoint rip reading inside tracepoint · 5bfd8b54

Authored by Avi Kivity on Mar 11, 2010

Reading rip is expensive on vmx, so move it inside the tracepoint so we only
incur the cost if tracing is enabled.
Signed-off-by: Avi Kivity <avi@redhat.com>
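
Illustration only, not part of the commit: the idea of paying for the rip read only when tracing is on can be sketched in plain C as below. All names (`sketch_tracing_enabled`, `sketch_read_rip`, `sketch_trace_exit`) are hypothetical stand-ins rather than KVM or tracepoint APIs, and a counter models the cost of the register read.

```c
#include <stdbool.h>
#include <stdint.h>

static bool sketch_tracing_enabled;   /* stand-in for "the tracepoint is active" */
static unsigned sketch_rip_reads;     /* counts the expensive reads performed */
static uint64_t sketch_last_rip;      /* last value the trace hook recorded */

/* Stand-in for an expensive register read (hypothetical fixed value). */
static uint64_t sketch_read_rip(void)
{
    sketch_rip_reads++;
    return 0x1000;
}

/* The read lives inside the trace hook, so it is skipped when tracing is off. */
static void sketch_trace_exit(void)
{
    if (!sketch_tracing_enabled)
        return;
    sketch_last_rip = sketch_read_rip();
}
```

Calling `sketch_trace_exit()` with tracing disabled leaves `sketch_rip_reads` at 0; the counter only advances once `sketch_tracing_enabled` is set.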

25 Apr, 2010 4 commits

KVM: move segment_base() into vmx.c · 2d49ec72

Authored by Gleb Natapov on Feb 25, 2010

segment_base() is used only by vmx so move it there.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Drop kvm_get_gdt() in favor of generic linux function · d6ab1ed4

Authored by Gleb Natapov on Feb 25, 2010

Linux now has native_store_gdt() to do the same. Use it instead of
the kvm local version.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86: Save&restore interrupt shadow mask · 48005f64

Authored by Jan Kiszka on Feb 19, 2010

The interrupt shadow created by STI or MOV-SS-like operations is part of
the VCPU state and must be preserved across migration. Transfer it in
the spare padding field of kvm_vcpu_events.interrupt.

As a side effect we now have to make vmx_set_interrupt_shadow robust
against both shadow types being set. Give MOV SS a higher priority and
skip STI in that case so that VMX does not throw a fault on the next entry.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
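
Illustration only, not part of the commit: a minimal sketch of the priority rule described above, where MOV SS wins and STI is dropped if both shadow types arrive together. The flag names and values here are hypothetical and chosen for this sketch; the function is not the kernel's vmx_set_interrupt_shadow.

```c
#include <stdint.h>

/* Hypothetical flag values for the two interrupt shadow types. */
#define SKETCH_SHADOW_MOV_SS 0x01u
#define SKETCH_SHADOW_STI    0x02u

/*
 * Reduce an incoming shadow mask to a combination that is safe to
 * program: unknown bits are dropped, and if both shadow types are
 * set, only MOV SS is kept.
 */
static uint32_t sketch_sanitize_shadow(uint32_t mask)
{
    mask &= SKETCH_SHADOW_MOV_SS | SKETCH_SHADOW_STI;
    if (mask & SKETCH_SHADOW_MOV_SS)
        mask &= ~SKETCH_SHADOW_STI;
    return mask;
}
```

With these values, passing both flags yields only `SKETCH_SHADOW_MOV_SS`, while a lone STI shadow passes through unchanged.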

KVM: use desc_ptr struct instead of kvm private descriptor_table · 89a27f4d

Authored by Gleb Natapov on Feb 16, 2010

The x86 arch defines desc_ptr for idt/gdt pointers, so there is no need to
define another structure in kvm code.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

20 Apr, 2010 1 commit

KVM: VMX: Save/restore rflags.vm correctly in real mode · 78ac8b47

Authored by Avi Kivity on Apr 08, 2010

Currently we set eflags.vm unconditionally when entering real mode emulation
through virtual-8086 mode, and clear it unconditionally when we enter protected
mode.  This means that the following sequence

  KVM_SET_REGS  (rflags.vm=1)
  KVM_SET_SREGS (cr0.pe=1)

ends up with rflags.vm clear due to KVM_SET_SREGS triggering enter_pmode().

Fix by shadowing rflags.vm (and rflags.iopl) correctly while in real mode:
reads and writes to those bits access a shadow register instead of the actual
register.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
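
Illustration only, not part of the commit: a userspace sketch of the shadowing scheme described above. The struct and function names are hypothetical. It assumes that, while real mode is emulated through virtual-8086 mode, the hardware copy of rflags always carries VM and IOPL=3, and that the guest's own values for those bits live in a shadow field.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_EFLAGS_IOPL (UINT64_C(3) << 12)  /* rflags bits 12-13 */
#define SKETCH_EFLAGS_VM   (UINT64_C(1) << 17)  /* rflags bit 17 */
#define SKETCH_SHADOWED    (SKETCH_EFLAGS_IOPL | SKETCH_EFLAGS_VM)

struct sketch_rflags_vcpu {
    bool real_mode;          /* real mode is being emulated through vm86 */
    uint64_t hw_rflags;      /* value the hardware register would hold */
    uint64_t shadow_rflags;  /* guest's view of the shadowed bits */
};

/* Writes store the guest's VM/IOPL bits in the shadow while in real mode. */
static void sketch_set_rflags(struct sketch_rflags_vcpu *v, uint64_t rflags)
{
    if (v->real_mode) {
        v->shadow_rflags = rflags & SKETCH_SHADOWED;
        rflags |= SKETCH_SHADOWED;   /* assumed vm86 requirements */
    }
    v->hw_rflags = rflags;
}

/* Reads return the shadowed bits instead of the hardware ones. */
static uint64_t sketch_get_rflags(const struct sketch_rflags_vcpu *v)
{
    uint64_t rflags = v->hw_rflags;

    if (v->real_mode) {
        rflags &= ~SKETCH_SHADOWED;
        rflags |= v->shadow_rflags & SKETCH_SHADOWED;
    }
    return rflags;
}

/* Leaving real mode restores the guest's own bits instead of clearing them. */
static void sketch_leave_real_mode(struct sketch_rflags_vcpu *v)
{
    uint64_t guest_view = sketch_get_rflags(v);

    v->real_mode = false;
    v->hw_rflags = guest_view;
}
```

In this model, setting rflags.vm=1 while in real mode and then leaving real mode keeps VM set, which is the sequence the commit message reports as broken before the fix.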

19 Apr, 2010 1 commit

KVM: Implement perf callbacks for guest sampling · ff9d07a0

Authored by Zhang, Yanmin on Apr 19, 2010

This patch implements the perf_guest_info_callbacks on kvm.
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

30 Mar, 2010 1 commit

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

Authored by Tejun Heo on Mar 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability.  As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surroundings.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

01 Mar, 2010 25 commits

KVM: VMX: Update instruction length on intercepted BP · c573cd22

Authored by Jan Kiszka on Feb 23, 2010

We intercept #BP while in guest debugging mode. As VM exits due to
intercepted exceptions do not necessarily come with valid
idt_vectoring, we have to update event_exit_inst_len explicitly in such
cases. At least in the absence of migration, this ensures that
re-injections of #BP will find and use the correct instruction length.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: stable@kernel.org (2.6.32, 2.6.33)
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Rename VMX_EPT_IGMT_BIT to VMX_EPT_IPAT_BIT · a19a6d11

Authored by Sheng Yang on Feb 09, 2010

Follow the new SDM, which now names the bit "Ignore PAT memory type".
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Remove redundant test in vmx_set_efer() · c45b4fd4

Authored by Julia Lawall on Feb 06, 2010

msr was tested above, so the second test is not needed.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r@
expression *x;
expression e;
identifier l;
@@

if (x == NULL || ...) {
    ... when forall
    return ...; }
... when != goto l;
    when != x = e
    when != &x
*x == NULL
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Wire up .fpu_activate() callback · ebcbab4c

Authored by Avi Kivity on Feb 07, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Remove redundant check in vm_need_virtualize_apic_accesses() · 6d3e435e

Authored by Gui Jianfeng on Jan 29, 2010

flexpriority_enabled implies cpu_has_vmx_virtualize_apic_accesses() returning
true, so we don't need this check here.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Trace failed msr reads and writes · 59200273

Authored by Avi Kivity on Jan 25, 2010

Trace MSR reads and writes that fail, and record the fact that they failed.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: VMX: Pass cr0.mp through to the guest when the fpu is active · 81231c69

Authored by Avi Kivity on Jan 24, 2010

When cr0.mp is clear, the guest doesn't expect a #NM in response to
a WAIT instruction.  Because we always keep cr0.mp set, it will get
a #NM, and potentially be confused.

Fix by keeping cr0.mp set only when the fpu is inactive, and passing
it through when active.
Reported-by: Lorenzo Martignoni <martignlo@gmail.com>
Analyzed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
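
Illustration only, not part of the commit: a minimal sketch of the rule above. The function names are hypothetical. It relies on the architectural behaviour that WAIT raises #NM only when both CR0.MP and CR0.TS are set, and it assumes the host forces both bits while the guest fpu is not loaded.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CR0_MP (UINT64_C(1) << 1)  /* CR0.MP, monitor coprocessor */
#define SKETCH_CR0_TS (UINT64_C(1) << 3)  /* CR0.TS, task switched */

/* Compute the cr0 value the hardware runs with for a given guest cr0. */
static uint64_t sketch_hw_cr0(uint64_t guest_cr0, bool fpu_active)
{
    if (!fpu_active)
        return guest_cr0 | SKETCH_CR0_TS | SKETCH_CR0_MP;  /* trap fpu use */
    return guest_cr0;  /* fpu loaded: the guest's MP value passes through */
}

/* WAIT faults with #NM only when both MP and TS are set. */
static bool sketch_wait_raises_nm(uint64_t hw_cr0)
{
    return (hw_cr0 & SKETCH_CR0_MP) && (hw_cr0 & SKETCH_CR0_TS);
}
```

With the fpu active and the guest's cr0.mp clear, `sketch_wait_raises_nm()` is false even if the guest sets cr0.ts, which matches what such a guest expects.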

KVM: Rename vcpu->shadow_efer to efer · f6801dff

Authored by Avi Kivity on Jan 21, 2010

None of the other registers have the shadow_ prefix.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Add a helper for checking if the guest is in protected mode · 3eeb3288

Authored by Avi Kivity on Jan 21, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Activate fpu on clts · 6b52d186

Authored by Avi Kivity on Jan 21, 2010

Assume that if the guest executes clts, it knows what it's doing, and load the
guest fpu to prevent an #NM exception.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: VMX: Clean up DR6 emulation · fd7373cc

Authored by Jan Kiszka on Jan 20, 2010

As we trap all debug register accesses, we do not need to switch the real
DR6 at all. Clean up update_exception_bitmap while at it, too.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: VMX: Fix emulation of DR4 and DR5 · 138ac8d8

Authored by Jan Kiszka on Jan 20, 2010

Make sure DR4 and DR5 are aliased to DR6 and DR7, respectively, if
CR4.DE is not set.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
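
Illustration only, not part of the commit: a minimal sketch of the aliasing rule above. The function name and the -1 return convention are hypothetical. The CR4.DE = 1 branch is not stated in the commit message; it follows the architectural rule that DR4/DR5 accesses raise #UD when debugging extensions are enabled.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CR4_DE (1u << 3)   /* CR4.DE, debugging extensions */

/*
 * Resolve which debug register a guest access to DR<dr> really targets.
 * Returns the effective index (0..7), or -1 when the access should
 * raise #UD instead (DR4/DR5 with CR4.DE set).
 */
static int sketch_effective_dr(int dr, uint32_t guest_cr4)
{
    assert(dr >= 0 && dr <= 7);

    if (dr == 4 || dr == 5) {
        if (guest_cr4 & SKETCH_CR4_DE)
            return -1;
        return dr + 2;   /* DR4 -> DR6, DR5 -> DR7 */
    }
    return dr;
}
```

For example, `sketch_effective_dr(4, 0)` resolves to 6 and `sketch_effective_dr(5, 0)` to 7, while the other registers map to themselves.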

KVM: VMX: Fix exceptions of mov to dr · f2483415

Authored by Jan Kiszka on Jan 20, 2010

Injecting GP without an error code is a bad idea (causes unhandled guest
exits). Moreover, we must not skip the instruction if we injected an
exception.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: VMX: Remove emulation failure report · 7062dcaa

Authored by Sheng Yang on Jan 19, 2010

As Avi noted:

>There are two problems with the kernel failure report.  First, it
>doesn't report enough data - registers, surrounding instructions, etc.
>that are needed to explain what is going on.  Second, it can flood
>dmesg, which is a pretty bad thing to do.

So we remove the emulation failure report in handle_invalid_guest_state(),
and will inspect the guest using a userspace tool in the future.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: VMX: Give the guest ownership of cr0.ts when the fpu is active · edcafe3c

Authored by Avi Kivity on Dec 30, 2009

If the guest fpu is loaded, there is nothing interesting about cr0.ts; let
the guest play with it as it will.  This makes context switches between fpu
intensive guest processes faster, as we won't trap the clts and cr0 write
instructions.

[marcelo: fix cr0 read shadow update on fpu deactivation; kills F8 install]
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Lazify fpu activation and deactivation · 02daab21

Authored by Avi Kivity on Dec 30, 2009

Defer fpu deactivation as much as possible - if the guest fpu is loaded, keep
it loaded until the next heavyweight exit (where we are forced to unload it).
This reduces unnecessary exits.

We also defer fpu activation on clts; while clts signals the intent to use the
fpu, we can't be sure the guest will actually use it.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Allow the guest to own some cr0 bits · e8467fda

Authored by Avi Kivity on Dec 29, 2009

We will use this later to give the guest ownership of cr0.ts.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Replace read accesses of vcpu->arch.cr0 by an accessor · 4d4ec087

Authored by Avi Kivity on Dec 29, 2009

Since we'd like to allow the guest to own a few bits of cr0 at times, we need
to know when we access those bits.
Signed-off-by: Avi Kivity <avi@redhat.com>
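
Illustration only, not part of the commit: a minimal sketch of why reads go through an accessor. If the guest may change some cr0 bits without trapping, the host's cached copy of those bits can be stale, so an accessor can refresh them from hardware only when a caller asks for one of them. The struct, field, and function names are hypothetical, and `hw_cr0` stands in for the value held by the hardware.

```c
#include <stdint.h>

#define SKETCH_CR0_PE (UINT64_C(1) << 0)  /* CR0.PE, protection enable */
#define SKETCH_CR0_TS (UINT64_C(1) << 3)  /* CR0.TS, task switched */

struct sketch_cr0_vcpu {
    uint64_t cached_cr0;        /* host's software copy of guest cr0 */
    uint64_t hw_cr0;            /* stand-in for the value held by hardware */
    uint64_t guest_owned_bits;  /* bits the guest can change without a trap */
    unsigned hw_reads;          /* counts refreshes from hardware */
};

/* Pull the guest-owned bits from hardware into the cached copy. */
static void sketch_refresh_cr0(struct sketch_cr0_vcpu *v)
{
    v->hw_reads++;
    v->cached_cr0 &= ~v->guest_owned_bits;
    v->cached_cr0 |= v->hw_cr0 & v->guest_owned_bits;
}

/* Accessor: refresh only when the caller asks for a guest-owned bit. */
static uint64_t sketch_read_cr0_bits(struct sketch_cr0_vcpu *v, uint64_t mask)
{
    if (mask & v->guest_owned_bits)
        sketch_refresh_cr0(v);
    return v->cached_cr0 & mask;
}
```

Reading a host-owned bit such as PE never touches hardware in this model, while reading TS (when guest-owned) triggers exactly one refresh per call.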

KVM: VMX: trace clts and lmsw instructions as cr accesses · a1f83a74

Authored by Avi Kivity on Dec 29, 2009

clts writes cr0.ts; lmsw writes cr0[0:15] - record that in ftrace.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Enable EPT 1GB page support · 878403b7

Authored by Sheng Yang on Jan 05, 2010

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Rename gb_page_enable() to get_lpage_level() in kvm_x86_ops · 17cc3935

Authored by Sheng Yang on Jan 05, 2010

Then the callback can provide the maximum supported large page level, which
is more flexible.

Also make the GB page support x86_64 specific.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Fill out ftrace exit reason strings · f4c9e87c

Authored by Avi Kivity on Dec 28, 2009

Some exit reasons missed their strings; fill out the table.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: convert slots_lock to a mutex · 79fac95e

Authored by Marcelo Tosatti on Dec 23, 2009

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: switch vcpu context to use SRCU · f656ce01

Authored by Marcelo Tosatti on Dec 23, 2009

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU update · bc6678a3

Authored by Marcelo Tosatti on Dec 23, 2009

Use two steps for memslot deletion: mark the slot invalid (which stops
instantiation of new shadow pages for that slot, but allows destruction),
then instantiate the new empty slot.

Also simplifies kvm_handle_hva locking.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
