Commits · 4b8073e467e6a66b6a5a8e799d28bc3b243c0d78 · openanolis / cloud-kernel

19 Sep, 2012 · 1 commit

arch/x86: Remove unecessary semicolons · 4b8073e4

Authored by Peter Senna Tschudin on Sep 18, 2012

Found by http://coccinelle.lip6.fr/
Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Cc: avi@redhat.com
Cc: mtosatti@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: rusty@rustcorp.com.au
Cc: masami.hiramatsu.pt@hitachi.com
Cc: suresh.b.siddha@intel.com
Cc: joerg.roedel@amd.com
Cc: agordeev@redhat.com
Cc: yinghai@kernel.org
Cc: bhelgaas@google.com
Cc: liuj97@gmail.com
Link: http://lkml.kernel.org/r/1347986174-30287-7-git-send-email-peter.senna@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

4b8073e4

10 Sep, 2012 · 1 commit

KVM: fix error paths for failed gfn_to_page() calls · 4484141a

Authored by Xiao Guangrong on Sep 07, 2012

This bug was triggered:
[ 4220.198458] BUG: unable to handle kernel paging request at fffffffffffffffe
[ 4220.203907] IP: [<ffffffff81104d85>] put_page+0xf/0x34
......
[ 4220.237326] Call Trace:
[ 4220.237361]  [<ffffffffa03830d0>] kvm_arch_destroy_vm+0xf9/0x101 [kvm]
[ 4220.237382]  [<ffffffffa036fe53>] kvm_put_kvm+0xcc/0x127 [kvm]
[ 4220.237401]  [<ffffffffa03702bc>] kvm_vcpu_release+0x18/0x1c [kvm]
[ 4220.237407]  [<ffffffff81145425>] __fput+0x111/0x1ed
[ 4220.237411]  [<ffffffff8114550f>] ____fput+0xe/0x10
[ 4220.237418]  [<ffffffff81063511>] task_work_run+0x5d/0x88
[ 4220.237424]  [<ffffffff8104c3f7>] do_exit+0x2bf/0x7ca

The test case:

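#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#define die(fmt, args...) do {	\
	printf(fmt, ##args);		\
	exit(-1);} while (0)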

static int create_vm(void)
{
	int sys_fd, vm_fd;

	sys_fd = open("/dev/kvm", O_RDWR);
	if (sys_fd < 0)
		die("open /dev/kvm fail.\n");

	vm_fd = ioctl(sys_fd, KVM_CREATE_VM, 0);
	if (vm_fd < 0)
		die("KVM_CREATE_VM fail.\n");

	return vm_fd;
}

static int create_vcpu(int vm_fd)
{
	int vcpu_fd;

	vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, 0);
	if (vcpu_fd < 0)
		die("KVM_CREATE_VCPU ioctl.\n");
	printf("Create vcpu.\n");
	return vcpu_fd;
}

static void *vcpu_thread(void *arg)
{
	int vm_fd = (int)(long)arg;

	create_vcpu(vm_fd);
	return NULL;
}

int main(int argc, char *argv[])
{
	pthread_t thread;
	int vm_fd;

	(void)argc;
	(void)argv;

	vm_fd = create_vm();
	pthread_create(&thread, NULL, vcpu_thread, (void *)(long)vm_fd);
	printf("Exit.\n");
	return 0;
}

It is caused by releasing kvm->arch.ept_identity_map_addr, which is the
error page.

The parent thread can send a KILL signal to the vcpu thread while the
parent is exiting, which stops page faulting and any resulting memory
allocation, so gfn_to_pfn/gfn_to_page may fail at this point.

Fixed by checking the page before it is used.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

4484141a
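
The faulting address in the trace above, fffffffffffffffe, is what the value -2 looks like as a 64-bit pointer, which is consistent with put_page() being handed an encoded error value instead of a real page. The following is a minimal user-space sketch of that "pointer encodes an errno" convention, assuming the top 4095 addresses are reserved for error codes. The names `addr_is_errno` and `addr_to_errno` are hypothetical and are not the kernel's own helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the highest 4095 addresses are reserved for encoded errnos. */
#define SKETCH_MAX_ERRNO 4095

/* True if the address falls in the range reserved for encoded errors. */
static bool addr_is_errno(uint64_t addr)
{
	return addr >= (uint64_t)-SKETCH_MAX_ERRNO;
}

/* Decode the (negative) errno carried by such an address. */
static int64_t addr_to_errno(uint64_t addr)
{
	return (int64_t)addr;
}
```

A caller following the "check the page before it is used" rule would test `addr_is_errno()` first and skip the release when it returns true.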

09 Sep, 2012 · 1 commit

KVM: x86: Check INVPCID feature bit in EBX of leaf 7 · 4f977045

Authored by Ren, Yongjie on Sep 07, 2012

Checks and operations on the INVPCID feature bit should use EBX
of CPUID leaf 7 instead of ECX.
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Signed-off-by: Yongjie Ren <yongjien.ren@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

4f977045
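
To illustrate the register mix-up fixed above, here is a small sketch that tests and clears the INVPCID bit in an already-fetched set of CPUID leaf 7 registers. INVPCID is reported in bit 10 of EBX for leaf 7, subleaf 0. The `cpuid_regs` struct and the helper names are hypothetical, chosen so the example does not need to execute CPUID itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register values returned by CPUID for one leaf/subleaf (illustrative type). */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* INVPCID: CPUID leaf 7, subleaf 0, EBX bit 10. */
#define LEAF7_EBX_INVPCID (UINT32_C(1) << 10)

/* Check the feature bit in EBX, not ECX. */
static bool leaf7_has_invpcid(const struct cpuid_regs *leaf7)
{
	return (leaf7->ebx & LEAF7_EBX_INVPCID) != 0;
}

/* Hide the feature from a guest by clearing the bit in EBX. */
static void leaf7_clear_invpcid(struct cpuid_regs *leaf7)
{
	leaf7->ebx &= ~LEAF7_EBX_INVPCID;
}
```

A value with bit 10 set only in ECX is deliberately not treated as INVPCID support.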

04 Sep, 2012 · 1 commit

KVM: PIC: fix use of uninitialised variable. · 749c59fd

Authored by Jamie Iles on Aug 30, 2012

Commit aea218f3 (KVM: PIC: call ack notifiers for irqs that are
dropped form irr) used an uninitialised variable to track whether an
appropriate apic had been found.  This could result in calling the ack
notifier incorrectly.

Cc: Gleb Natapov <gleb@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

749c59fd

28 Aug, 2012 · 1 commit

KVM: x86: fix KVM_GET_MSR for PV EOI · 1d92128f

Authored by Michael S. Tsirkin on Aug 26, 2012

KVM_GET_MSR was missing support for PV EOI,
which is needed for migration.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

1d92128f

23 Aug, 2012 · 1 commit

KVM: x86 emulator: use stack size attribute to mask rsp in stack ops · 5ad105e5

Authored by Avi Kivity on Aug 19, 2012

The sub-register used to access the stack (sp, esp, or rsp) is not
determined by the address size attribute like other memory references,
but by the stack segment's B bit (if not in x86_64 mode).

Fix by using the existing stack_mask() to figure out the correct mask.

This long-existing bug was exposed by a combination of a27685c3
(emulate invalid guest state by default), which causes many more
instructions to be emulated, and a seabios change (possibly a bug) which
causes the high 16 bits of esp to become polluted across calls to real
mode software interrupts.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

5ad105e5
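
The rule described above (stack pointer width comes from the stack segment's B bit, except in 64-bit mode) can be sketched as follows. This is a simplified illustration, not the emulator's actual code: the `cpu_mode` enum and the `sketch_` names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative mode split: 64-bit mode versus everything else. */
enum cpu_mode { MODE_NOT_LONG64, MODE_LONG64 };

/* Mask selecting sp, esp, or rsp for stack accesses. */
static uint64_t sketch_stack_mask(enum cpu_mode mode, bool ss_b_bit)
{
	if (mode == MODE_LONG64)
		return ~UINT64_C(0);             /* full rsp */
	return ss_b_bit ? UINT64_C(0xffffffff)   /* esp */
			: UINT64_C(0xffff);      /* sp */
}

/* Effective stack pointer after masking off bits the stack does not use. */
static uint64_t sketch_masked_sp(uint64_t rsp, enum cpu_mode mode, bool ss_b_bit)
{
	return rsp & sketch_stack_mask(mode, ss_b_bit);
}
```

With a 16-bit stack segment, stale high bits in esp (the situation the commit message attributes to the seabios change) are ignored, because only the low 16 bits take part in the address.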

22 Aug, 2012 · 1 commit

KVM: MMU: Fix mmu_shrink() so that it can free mmu pages as intended · 35f2d16b

Authored by Takuya Yoshikawa on Aug 20, 2012

Although the possible race described in

  commit 85b70591
  KVM: MMU: fix shrinking page from the empty mmu

was correct, the real cause of that issue was a more trivial bug of
mmu_shrink() introduced by

  commit 19526396
  KVM: MMU: do not iterate over all VMs in mmu_shrink()

Here is the bug:

	if (kvm->arch.n_used_mmu_pages > 0) {
		if (!nr_to_scan--)
			break;
		continue;
	}

We skip VMs whose n_used_mmu_pages is not zero and try to shrink others:
in other words we try to shrink empty ones by mistake.

This patch reverses the logic so that mmu_shrink() can free pages from
the first VM whose n_used_mmu_pages is not zero.  Note that we also add
comments explaining the role of nr_to_scan which is not practically
important now, hoping this will be improved in the future.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

35f2d16b
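
To make the inverted condition concrete, the sketch below models VM selection over a plain array and contrasts the buggy loop quoted above with a reversed version. It is a user-space approximation under stated assumptions: `struct vm`, `pick_vm_buggy`, and `pick_vm_fixed` are hypothetical stand-ins, and the real function walks a locked VM list and frees pages, which is omitted here.

```c
#include <stddef.h>

/* Hypothetical stand-in for a VM; only the counter from the commit message. */
struct vm {
	unsigned int n_used_mmu_pages;
};

/* Buggy selection: VMs that have pages are skipped, so an empty VM is chosen. */
static int pick_vm_buggy(const struct vm *vms, size_t n, int nr_to_scan)
{
	for (size_t i = 0; i < n; i++) {
		if (vms[i].n_used_mmu_pages > 0) {
			if (!nr_to_scan--)
				break;
			continue;
		}
		return (int)i;
	}
	return -1;
}

/* Reversed logic: bound the scan, skip empty VMs, pick the first with pages. */
static int pick_vm_fixed(const struct vm *vms, size_t n, int nr_to_scan)
{
	for (size_t i = 0; i < n; i++) {
		if (!nr_to_scan--)
			break;
		if (!vms[i].n_used_mmu_pages)
			continue;
		return (int)i;
	}
	return -1;
}
```

Both functions return the index of the VM that would be shrunk, or -1 if none is selected.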

05 Aug, 2012 · 1 commit

KVM: x86: update KVM_SAVE_MSRS_BEGIN to correct value · 439793d4

Authored by Gleb Natapov on Aug 01, 2012

When MSR_KVM_PV_EOI_EN was added to the msrs_to_save array,
KVM_SAVE_MSRS_BEGIN was not updated accordingly.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

439793d4

02 Aug, 2012 · 2 commits

KVM: VMX: Fix ds/es corruption on i386 with preemption · aa67f609

Authored by Avi Kivity on Aug 01, 2012

Commit b2da15ac ("KVM: VMX: Optimize %ds, %es reload") broke i386
in the following scenario:

  vcpu_load
  ...
  vmx_save_host_state
  vmx_vcpu_run
  (ds.rpl, es.rpl cleared by hardware)

  interrupt
    push ds, es  # pushes bad ds, es
    schedule
      vmx_vcpu_put
        vmx_load_host_state
          reload ds, es (with __USER_DS)
    pop ds, es  # of other thread's stack
    iret
  # other thread runs
  interrupt
    push ds, es
    schedule  # back in vcpu thread
    pop ds, es  # now with rpl=0
    iret
  ...
  vcpu_put
  resume_userspace
  iret  # clears ds, es due to mismatched rpl

(instead of resume_userspace, we might return with SYSEXIT and then
take an exception; when the exception IRETs we end up with cleared
ds, es)

Fix by avoiding the optimization on i386 and reloading ds, es on the
lightweight exit path.
Reported-by: Chris Clayron <chris2553@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

aa67f609

KVM: x86: apply kvmclock offset to guest wall clock time · 4b648665

Authored by Bruce Rogers on Jul 20, 2012

When a guest migrates to a new host, the system time difference from the
previous host is used in the updates to the kvmclock system time visible
to the guest, resulting in a continuation of correct kvmclock based guest
timekeeping.

The wall clock component of the kvmclock provided time is currently not
updated with this same time offset. Since the Linux guest caches the
wall clock based time, this discrepancy is not noticed until the guest is
rebooted. After reboot the guest's time calculations are off.

This patch adjusts the wall clock by the kvmclock_offset, resulting in
correct guest time after a reboot.

Cc: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

4b648665

26 Jul, 2012 · 1 commit

KVM: PIC: call ack notifiers for irqs that are dropped form irr · aea218f3

Authored by Gleb Natapov on Jul 26, 2012

After commit 242ec97c, PIT interrupts are no longer delivered after
PIC reset. This happens because the PIT injects an interrupt only if the
previous one was acked, but since it is dropped from irr on PIC reset, it
will never be delivered and hence never acknowledged. Fix that by calling
the ack notifier on PIC reset.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

aea218f3

21 Jul, 2012 · 1 commit

KVM: fix race with level interrupts · 1a577b72

Authored by Michael S. Tsirkin on Jul 19, 2012

When more than 1 source id is in use for the same GSI, we have the
following race in irq_states handling:

CPU 0 clears bit 0. CPU 0 reads irq_state as 0. CPU 1 sets level to 1.
CPU 1 calls kvm_ioapic_set_irq(1). CPU 0 calls kvm_ioapic_set_irq(0).
Now ioapic thinks the level is 0 but irq_state is not 0.

Fix by performing all irq_states bitmap handling under pic/ioapic lock.
This also removes the need for atomics with irq_states handling.
Reported-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

1a577b72
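
The fix above can be pictured as moving the bitmap update and the level computation into one critical section, so no other CPU can interleave between them. Below is a user-space sketch using a pthread mutex in place of the pic/ioapic lock; `struct gsi_line`, `gsi_init`, and `gsi_set_irq` are hypothetical names, and the stored `level` stands in for what the ioapic would be told.

```c
#include <pthread.h>
#include <stdbool.h>

/* One shared interrupt line with several sources (illustrative type). */
struct gsi_line {
	pthread_mutex_t lock;    /* stands in for the pic/ioapic lock */
	unsigned long irq_state; /* one bit per source id */
	int level;               /* level last reported to the simulated ioapic */
};

static void gsi_init(struct gsi_line *g)
{
	pthread_mutex_init(&g->lock, NULL);
	g->irq_state = 0;
	g->level = 0;
}

/* Update one source's bit and deliver the combined level atomically. */
static int gsi_set_irq(struct gsi_line *g, unsigned int source_id, bool level)
{
	int combined;

	pthread_mutex_lock(&g->lock);
	if (level)
		g->irq_state |= 1UL << source_id;
	else
		g->irq_state &= ~(1UL << source_id);
	combined = g->irq_state != 0;
	g->level = combined; /* delivered while still holding the lock */
	pthread_mutex_unlock(&g->lock);

	return combined;
}
```

Because the bitmap is only touched with the lock held, plain (non-atomic) bit operations are sufficient, which matches the note that the fix removes the need for atomics.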

12 Jul, 2012 · 1 commit

KVM: VMX: Implement PCID/INVPCID for guests with EPT · ad756a16

Authored by Mao, Junjie on Jul 02, 2012

This patch handles PCID/INVPCID for guests.

Process-context identifiers (PCIDs) are a facility by which a logical processor
may cache information for multiple linear-address spaces so that the processor
may retain cached information when software switches to a different linear
address space. Refer to section 4.10.1 in IA32 Intel Software Developer's Manual
Volume 3A for details.

For guests with EPT, the PCID feature is enabled and INVPCID behaves as running
natively.
For guests without EPT, the PCID feature is disabled and INVPCID triggers #UD.
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

ad756a16

11 Jul, 2012 · 9 commits

KVM: MMU: fix kvm_mmu_pagetable_walk tracepoint · 6fbc2770

Authored by Xiao Guangrong on Jun 20, 2012

The P bit of the page fault error code is missing in this tracepoint; fix it
by passing the full error code.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

6fbc2770

KVM: MMU: trace fast page fault · a72faf25

Authored by Xiao Guangrong on Jun 20, 2012

Add a tracepoint to see what happens on this path and to help us optimize it.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

a72faf25

KVM: MMU: fast path of handling guest page fault · c7ba5b48

Authored by Xiao Guangrong on Jun 20, 2012

If the present bit of the page fault error code is set, the shadow page
is populated on all levels, which means all we need to do is modify the
access bit, and that can be done outside of mmu-lock.

Currently, to keep the code simple, we only fix page faults caused by
write-protection on the fast path.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

c7ba5b48
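
One way to picture a lockless fix of a write-protect fault is a compare-and-swap that sets the writable bit only if the entry has not changed since it was read. The sketch below uses C11 atomics and is an illustration under assumptions, not the kernel's implementation: the bit positions and the name `fast_fix_write_fault` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define SKETCH_SPTE_PRESENT       (UINT64_C(1) << 0)
#define SKETCH_SPTE_WRITABLE      (UINT64_C(1) << 1)
#define SKETCH_SPTE_MMU_WRITEABLE (UINT64_C(1) << 10)

/* Try to make a present, write-protected entry writable without a lock. */
static bool fast_fix_write_fault(_Atomic uint64_t *sptep)
{
	uint64_t old = atomic_load(sptep);

	/* Only entries marked as safely writable may take the fast path. */
	if (!(old & SKETCH_SPTE_PRESENT) || !(old & SKETCH_SPTE_MMU_WRITEABLE))
		return false;
	if (old & SKETCH_SPTE_WRITABLE)
		return true; /* someone else already fixed it */

	/* Fails (returns false) if the entry changed after we read it. */
	return atomic_compare_exchange_strong(sptep, &old,
					      old | SKETCH_SPTE_WRITABLE);
}
```

When the function returns false, a caller would fall back to the slow path that takes the lock.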

KVM: MMU: introduce SPTE_MMU_WRITEABLE bit · 49fde340

Authored by Xiao Guangrong on Jun 20, 2012

This bit indicates whether the spte can be writable on the MMU, meaning
that the corresponding gpte is writable and the corresponding gfn is not
protected by shadow page protection.

In a later patch, SPTE_MMU_WRITEABLE will indicate whether the spte
can be updated locklessly.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

49fde340

KVM: MMU: fold tlb flush judgement into mmu_spte_update · 6e7d0354

Authored by Xiao Guangrong on Jun 20, 2012

mmu_spte_update() is the common function, so we can easily audit the path.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

6e7d0354

KVM: VMX: export PFEC.P bit on ept · 4f5982a5

Authored by Xiao Guangrong on Jun 20, 2012

Export the present bit of the page fault error code; a later patch
will use it.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

4f5982a5

KVM: MMU: cleanup spte_write_protect · 8e22f955

Authored by Xiao Guangrong on Jun 20, 2012

Use __drop_large_spte to clean up this function, and comment spte_write_protect.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8e22f955

KVM: MMU: abstract spte write-protect · d13bc5b5

Authored by Xiao Guangrong on Jun 20, 2012

Introduce a common function to abstract spte write-protect and
clean up the code.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

d13bc5b5

KVM: MMU: return bool in __rmap_write_protect · 2f84569f

Authored by Xiao Guangrong on Jun 20, 2012

The return value of __rmap_write_protect is either 1 or 0; use
true/false instead.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

2f84569f

09 Jul, 2012 · 18 commits

KVM: VMX: Emulate invalid guest state by default · a27685c3

Authored by Avi Kivity on Jun 12, 2012

Our emulation should be complete enough that we can emulate guests
while they are in big real mode, or in a mode transition that is not
virtualizable without unrestricted guest support.
Signed-off-by: Avi Kivity <avi@redhat.com>

a27685c3

KVM: x86 emulator: implement LTR · 80890006

Authored by Avi Kivity on Jun 13, 2012

Opcode 0F 00 /3.  Encountered during Windows XP secondary processor bringup.
Signed-off-by: Avi Kivity <avi@redhat.com>

80890006

KVM: x86 emulator: make loading TR set the busy bit · 869be99c

Authored by Avi Kivity on Jun 13, 2012

Guest software doesn't actually depend on it, but vmx will refuse us
entry if we don't.  Set the bit in both the cached segment and memory,
just to be nice.
Signed-off-by: Avi Kivity <avi@redhat.com>

869be99c

KVM: x86 emulator: make read_segment_descriptor() return the address · e919464b

Authored by Avi Kivity on Jun 13, 2012

Some operations want to modify the descriptor later on, so save the
address for future use.
Signed-off-by: Avi Kivity <avi@redhat.com>

e919464b

KVM: x86 emulator: emulate LLDT · a14e579f

Authored by Avi Kivity on Jun 13, 2012

Opcode 0F 00 /2. Used by isolinux during the protected mode transition.
Signed-off-by: Avi Kivity <avi@redhat.com>

a14e579f

KVM: x86 emulator: emulate BSWAP · 9299836e

Authored by Avi Kivity on Jun 13, 2012

Opcodes 0F C8 - 0F CF.

Used by the SeaBIOS cdrom code (though not in big real mode).
Signed-off-by: Avi Kivity <avi@redhat.com>

9299836e
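
For reference, BSWAP reverses the byte order of a 32-bit or 64-bit register, and the low three bits of the second opcode byte (C8 through CF) select the register. A minimal sketch of both pieces follows; the function names are hypothetical and the REX extension for registers 8 to 15 is left out.

```c
#include <stdint.h>

/* Register number encoded in the second opcode byte (0xC8..0xCF). */
static unsigned int bswap_reg_from_opcode(uint8_t second_byte)
{
	return second_byte & 7;
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t sketch_bswap32(uint32_t v)
{
	return (v >> 24) |
	       ((v >> 8) & UINT32_C(0x0000ff00)) |
	       ((v << 8) & UINT32_C(0x00ff0000)) |
	       (v << 24);
}

/* Reverse the eight bytes of a 64-bit value using two 32-bit swaps. */
static uint64_t sketch_bswap64(uint64_t v)
{
	return ((uint64_t)sketch_bswap32((uint32_t)v) << 32) |
	       sketch_bswap32((uint32_t)(v >> 32));
}
```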

KVM: VMX: Improve error reporting during invalid guest state emulation · de5f70e0

Authored by Avi Kivity on Jun 12, 2012

If instruction emulation fails, report it properly to userspace.
Signed-off-by: Avi Kivity <avi@redhat.com>

de5f70e0

KVM: VMX: Stop invalid guest state emulation on pending event · de87dcdd

Authored by Avi Kivity on Jun 12, 2012

Process the event, possibly injecting an interrupt, before continuing.
Signed-off-by: Avi Kivity <avi@redhat.com>

de87dcdd

KVM: x86 emulator: implement ENTER · 612e89f0

Authored by Avi Kivity on Jun 12, 2012

Opcode C8.

Only ENTER with lexical nesting depth 0 is implemented, since others are
very rare.  We'll fail emulation if nonzero lexical depth is used so data
is not corrupted.
Signed-off-by: Avi Kivity <avi@redhat.com>

612e89f0
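
With nesting depth 0, ENTER amounts to: push the frame pointer, copy the stack pointer into the frame pointer, then reserve the frame. The sketch below models that on a toy stack, assuming a 64-bit operand size and little-endian storage; `struct toy_cpu` and `emulate_enter` are hypothetical names, and nonzero nesting depths are rejected in the same spirit as the commit.

```c
#include <stdint.h>
#include <string.h>

/* Toy machine state: two registers and a small stack (illustrative). */
struct toy_cpu {
	uint64_t rsp, rbp;
	uint8_t stack[256]; /* rsp and rbp are offsets into this array */
};

/* Returns 0 on success, -1 if the form is unsupported or out of bounds. */
static int emulate_enter(struct toy_cpu *c, uint16_t frame_size, uint8_t nesting)
{
	if (nesting & 31)
		return -1; /* only lexical nesting depth 0 is handled */
	if (c->rsp > sizeof(c->stack) || c->rsp < 8 + (uint64_t)frame_size)
		return -1; /* keep the toy stack in bounds */

	c->rsp -= 8;                              /* push rbp */
	memcpy(&c->stack[c->rsp], &c->rbp, 8);
	c->rbp = c->rsp;                          /* new frame pointer */
	c->rsp -= frame_size;                     /* reserve locals */
	return 0;
}
```

Rejecting the unsupported case before touching any state mirrors the stated goal that data is not corrupted when emulation fails.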

KVM: x86 emulator: split push logic from push opcode emulation · 51ddff50

Authored by Avi Kivity on Jun 12, 2012

This allows us to reuse the code without populating ctxt->src and
overriding ctxt->op_bytes.
Signed-off-by: Avi Kivity <avi@redhat.com>

51ddff50

KVM: x86 emulator: fix byte-sized MOVZX/MOVSX · 361cad2b

Authored by Avi Kivity on Jun 11, 2012

Commit 2adb5ad9 removed ByteOp from MOVZX/MOVSX, replacing them by
SrcMem8, but neglected to fix the dependency in the emulation code
on ByteOp.  This caused the instruction not to have any effect in
some circumstances.

Fix by replacing the check for ByteOp with the equivalent src.op_bytes == 1.
Signed-off-by: Avi Kivity <avi@redhat.com>

361cad2b
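
The idea of keying the extension on the source operand's size, as opposed to an instruction-wide flag, can be sketched like this. The helper names are hypothetical; `src_bytes` plays the role of the source operand size mentioned above and is assumed to be 1 or 2.

```c
#include <stdint.h>

/* Zero-extend an 8-bit or 16-bit source, chosen by the source operand size. */
static uint64_t sketch_movzx(uint64_t src, unsigned int src_bytes)
{
	return src_bytes == 1 ? (uint8_t)src : (uint16_t)src;
}

/* Sign-extend an 8-bit or 16-bit source, chosen by the source operand size. */
static int64_t sketch_movsx(uint64_t src, unsigned int src_bytes)
{
	return src_bytes == 1 ? (int8_t)src : (int16_t)src;
}
```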

KVM: x86 emulator: emulate LAHF · 2dd7caa0

Authored by Avi Kivity on Jun 11, 2012

Opcode 9F.
Signed-off-by: Avi Kivity <avi@redhat.com>

2dd7caa0

KVM: VMX: Continue emulating after batch exhausted · 7c068e45

Authored by Avi Kivity on Jun 10, 2012

If we return early from an invalid guest state emulation loop, make
sure we return to it later if the guest state is still invalid.
Signed-off-by: Avi Kivity <avi@redhat.com>

7c068e45

KVM: VMX: Fix interrupt exit condition during emulation · bdea48e3

Authored by Avi Kivity on Jun 10, 2012

Checking EFLAGS.IF is incorrect as we might be in interrupt shadow.  If
that is the case, the main loop will notice that and not inject the interrupt,
causing an endless loop.

Fix by using vmx_interrupt_allowed() to check if we can inject an interrupt
instead.
Signed-off-by: Avi Kivity <avi@redhat.com>

bdea48e3

KVM: x86 emulator: emulate SGDT/SIDT · 96051572

Authored by Avi Kivity on Jun 10, 2012

Opcodes 0F 01 /0 and 0F 01 /1
Signed-off-by: Avi Kivity <avi@redhat.com>

96051572

KVM: Fix SS default ESP/EBP based addressing · a6e3407b

Authored by Avi Kivity on Jun 10, 2012

We correctly default to SS when BP is used as a base in 16-bit address mode,
but we don't do that for 32-bit mode.

Fix by adjusting the default to SS when either ESP or EBP is used as the base
register.
Signed-off-by: Avi Kivity <avi@redhat.com>

a6e3407b
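
The default-segment rule for 32-bit addressing can be written as a small lookup on the base register. In this sketch the enum and function names are hypothetical, and the input is assumed to be the already-decoded base register (so the ModRM special case where encoding 5 means "no base, disp32 only" has been resolved by the caller).

```c
/* Default segments relevant to this rule (illustrative subset). */
enum sketch_seg { SKETCH_SEG_DS, SKETCH_SEG_SS };

/* x86 register encodings for the two stack-related base registers. */
enum { SKETCH_REG_ESP = 4, SKETCH_REG_EBP = 5 };

/* Memory operands based on ESP or EBP default to SS; all others to DS. */
static enum sketch_seg default_seg_for_base32(int base_reg)
{
	if (base_reg == SKETCH_REG_ESP || base_reg == SKETCH_REG_EBP)
		return SKETCH_SEG_SS;
	return SKETCH_SEG_DS;
}
```

An explicit segment-override prefix, if present, would still take precedence over this default.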

KVM: x86 emulator: emulate LEAVE · f47cfa31

Authored by Avi Kivity on Jun 07, 2012

Opcode c9; used by some variants of Windows during boot, in big real mode.
Signed-off-by: Avi Kivity <avi@redhat.com>

f47cfa31

KVM: VMX: Limit iterations with emulator_invalid_guest_state · b8405c18

Authored by Avi Kivity on Jun 07, 2012

Otherwise, if the guest ends up looping, we never exit the srcu critical
section, which causes synchronize_srcu() to hang.
Signed-off-by: Avi Kivity <avi@redhat.com>

b8405c18

openanolis / cloud-kernel: last synced successfully nearly 2 years ago