- 27 April 2008, 16 commits
-
-
Committed by Joerg Roedel
Let SVM detect whether the Nested Paging feature is available on the hardware. Disable it to keep this patch series bisectable. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
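A minimal sketch of the kind of capability probe this implies, assuming the architectural SVM feature-identification leaf (CPUID 0x8000000A, where EDX bit 0 reports nested paging); the helper name is illustrative, not KVM's.

```c
/* Hedged sketch: probe the SVM nested-paging capability bit via CPUID.
 * Leaf 0x8000000A is the SVM feature-identification leaf; EDX bit 0
 * reports Nested Paging support. */
#include <stdbool.h>
#include <cpuid.h>              /* GCC/Clang helper, assumed available */

static bool svm_has_nested_paging(void)
{
        unsigned int eax, ebx, ecx, edx;

        /* __get_cpuid() returns 0 if the requested leaf is not supported */
        if (!__get_cpuid(0x8000000A, &eax, &ebx, &ecx, &edx))
                return false;

        return edx & 1;         /* EDX bit 0: nested paging available */
}
```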
-
Committed by Joerg Roedel
Moving the SVM feature detection from the each_cpu code to the hardware setup code makes it run only once. As an additional advantage, the feature check is now available earlier in the module setup process. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Joerg Roedel
This patch makes the EFER register accessible on a 32-bit KVM host. This is necessary to boot 32-bit PAE guests under SVM. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Joerg Roedel
To allow access to the EFER register in 32-bit KVM, the EFER-specific code has to be exported to the generic x86 code. This patch does this in a backwards-compatible manner. [avi: add check for EFER-less hosts] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Joerg Roedel
This patch aligns the bits the guest can set in the EFER register with the features of the host processor: it leaves EFER.NX disabled if the processor does not support it, and enables EFER.LME and EFER.LMA only for KVM on 64-bit hosts. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
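A minimal sketch of that policy, using the architectural EFER bit positions; the helper and its parameters are illustrative assumptions, not the KVM code.

```c
/* Hedged sketch: compute the EFER bits a guest may be allowed to set,
 * based on what the host actually supports. */
#include <stdbool.h>
#include <stdint.h>

#define EFER_SCE  (1ULL << 0)   /* syscall/sysret enable */
#define EFER_LME  (1ULL << 8)   /* long mode enable      */
#define EFER_LMA  (1ULL << 10)  /* long mode active      */
#define EFER_NX   (1ULL << 11)  /* no-execute enable     */

static uint64_t guest_settable_efer_bits(bool host_has_nx, bool host_is_64bit)
{
        uint64_t bits = EFER_SCE;

        if (host_has_nx)
                bits |= EFER_NX;                /* only expose NX if the host has it */
        if (host_is_64bit)
                bits |= EFER_LME | EFER_LMA;    /* long mode only on 64-bit hosts   */

        return bits;
}
```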
-
Committed by Joerg Roedel
This patch gives the SVM and VMX implementations the ability to add bits the guest can set in its EFER register. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Sheng Yang
To allow TLB entries to be retained across VM entry and VM exit, the VMM can now identify distinct address spaces through a new virtual-processor ID (VPID) field of the VMCS. [avi: drop vpid_sync_all()] [avi: add "cc" to asm constraints] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
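As an illustration of the bookkeeping such a per-vcpu tag needs, here is a minimal sketch of handing out 16-bit VPIDs from a bitmap, keeping VPID 0 reserved for the host; names and data layout are assumptions, not the actual KVM code.

```c
/* Hedged sketch: allocate distinct 16-bit VPID tags, with 0 reserved so
 * untagged (host) mappings are never confused with a guest's. */
#include <stdint.h>

#define NR_VPIDS 65536

static uint64_t vpid_bitmap[NR_VPIDS / 64] = { 1 };  /* bit 0 = reserved */

static int allocate_vpid(void)
{
        for (int vpid = 1; vpid < NR_VPIDS; vpid++) {
                uint64_t mask = 1ULL << (vpid % 64);

                if (!(vpid_bitmap[vpid / 64] & mask)) {
                        vpid_bitmap[vpid / 64] |= mask;
                        return vpid;
                }
        }
        return 0;  /* none free: run untagged and lose the TLB benefit */
}
```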
-
Committed by Avi Kivity
Currently an mmio guest pte is encoded in the shadow page table as a not-present trapping pte, with the SHADOW_IO_MARK bit set. However, nothing is ever done with this information, so maintaining it is a useless complication. This patch moves the check for mmio to before shadow ptes are instantiated, so the shadow code is never invoked for ptes that reference mmio. The code is simpler, and with future work, can be made to handle mmio concurrently. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
Add group decoding for opcodes 0x80-0x83 (immediate group 1). Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
This adds group decoding for opcode 0x0f 0x01 (group 7). Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
Add group decoding support for opcodes 0xfe (group 4) and 0xff (group 5). Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
This adds group decoding support for opcodes 0xf6 and 0xf7 (group 3). Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
This adds group decode support for opcode 0x8f (group 1A). Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
Certain x86 instructions use bits 3:5 of the byte following the opcode as an opcode extension, with the decode sometimes depending on bits 6:7 as well. Add support for this in the main decoding table rather than an ad-hoc adaptation per opcode. Signed-off-by: Avi Kivity <avi@qumranet.com>
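For reference, a minimal sketch of pulling those fields out of the ModRM byte; the structure and function names are illustrative, not the emulator's.

```c
/* Hedged sketch: the ModRM byte following a "group" opcode carries the real
 * operation in its reg field (bits 5:3); the mod field (bits 7:6) can
 * additionally distinguish memory from register forms. */
#include <stdint.h>

struct modrm_fields {
        uint8_t mod;    /* bits 7:6 - 3 means a register operand           */
        uint8_t reg;    /* bits 5:3 - opcode extension within the group    */
        uint8_t rm;     /* bits 2:0 - base register / addressing form      */
};

static struct modrm_fields decode_modrm(uint8_t modrm)
{
        struct modrm_fields f = {
                .mod = (modrm >> 6) & 0x3,
                .reg = (modrm >> 3) & 0x7,
                .rm  = modrm & 0x7,
        };
        return f;
}
```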
-
Committed by Dong, Eddie
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Dong, Eddie
A partial guest pte write will leave shadow_trap_nonpresent_pte in the spte, which generates a vmexit at the next guest access through that pte. This patch improves this by reading the full guest pte in advance and thus being able to update the spte and eliminate the vmexit. This helps pae guests, which use two 32-bit writes to set a single 64-bit pte. [truncation fix by Eric] Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
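A minimal sketch of the idea, assuming a mapped guest page and 8-byte-aligned 64-bit entries; the helper is illustrative, not the KVM function.

```c
/* Hedged sketch: when the guest updates only half of a 64-bit PTE (as PAE
 * guests do with two 32-bit writes), re-read the whole aligned entry so the
 * shadow entry can be refreshed instead of left as a trapping marker. */
#include <stdint.h>
#include <string.h>

static uint64_t read_full_guest_pte(const void *guest_page, unsigned long offset)
{
        uint64_t pte;

        offset &= ~7UL;         /* align down to the containing 64-bit entry */
        memcpy(&pte, (const uint8_t *)guest_page + offset, sizeof(pte));
        return pte;
}
```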
-
- 25 March 2008, 5 commits
-
-
Committed by Avi Kivity
While backporting 72dc67a6, a gfn_to_page() call was duplicated instead of moved (due to an unrelated patch not being present in mainline). This caused a page reference leak, resulting in a fairly massive memory leak. Fix by removing the extraneous gfn_to_page() call. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Marcelo Tosatti
init_rmode_tss was forgotten during the conversion from mmap_sem to slots_lock.
INFO: task qemu-system-x86:3748 blocked for more than 120 seconds.
Call Trace:
[<ffffffff8053d100>] __down_read+0x86/0x9e
[<ffffffff8053fb43>] do_page_fault+0x346/0x78e
[<ffffffff8053d235>] trace_hardirqs_on_thunk+0x35/0x3a
[<ffffffff8053dcad>] error_exit+0x0/0xa9
[<ffffffff8035a7a7>] copy_user_generic_string+0x17/0x40
[<ffffffff88099a8a>] :kvm:kvm_write_guest_page+0x3e/0x5f
[<ffffffff880b661a>] :kvm_intel:init_rmode_tss+0xa7/0xf9
[<ffffffff880b7d7e>] :kvm_intel:vmx_vcpu_reset+0x10/0x38a
[<ffffffff8809b9a5>] :kvm:kvm_arch_vcpu_setup+0x20/0x53
[<ffffffff8809a1e4>] :kvm:kvm_vm_ioctl+0xad/0x1cf
[<ffffffff80249dea>] __lock_acquire+0x4f7/0xc28
[<ffffffff8028fad9>] vfs_ioctl+0x21/0x6b
[<ffffffff8028fd75>] do_vfs_ioctl+0x252/0x26b
[<ffffffff8028fdca>] sys_ioctl+0x3c/0x5e
[<ffffffff8020b01b>] system_call_after_swapgs+0x7b/0x80
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Marcelo Tosatti
Do not assume that a shadow mapping will always point to the same host frame number. Fixes a crash with madvise(MADV_DONTNEED). [avi: move after first printk(), add another printk()] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
is_rmap_pte() doesn't take into account io ptes, which have the avail bit set. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
The vmx hardware state restore restores the tss selector and base address, but not its length. Usually this does not matter, since most of the tss contents are within the default length of 0x67. However, if a process is using ioperm() to grant itself I/O port permissions, an additional bitmap within the tss, but outside the default length, is consulted. The effect is that the process will receive a SIGSEGV instead of transparently accessing the port. Fix by restoring the tss length. Note that i386 had this working already. Closes bugzilla 10246. Signed-off-by: Avi Kivity <avi@qumranet.com>
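To make the failure mode concrete, here is a minimal sketch of how the I/O permission bitmap lookup depends on the TSS limit; the architectural constants follow the documented layout, while the helper and its parameters are illustrative.

```c
/* Hedged sketch: the I/O permission bitmap lives beyond the 0x67-byte
 * "default" TSS area, so a truncated TSS limit makes every bitmap lookup
 * fall outside the segment and the port access fault instead of succeed. */
#include <stdbool.h>
#include <stdint.h>

static bool io_port_allowed(const uint8_t *tss, uint32_t tss_limit,
                            uint16_t iobitmap_base, uint16_t port)
{
        uint32_t byte = iobitmap_base + port / 8;

        if (byte > tss_limit)
                return false;   /* bitmap byte outside the TSS: access denied */

        /* a clear bit in the bitmap means the port may be accessed */
        return !(tss[byte] & (1 << (port % 8)));
}
```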
-
- 04 March 2008, 7 commits
-
-
Committed by Avi Kivity
KVM tries to run as much as possible with the guest msrs loaded instead of host msrs, since switching msrs is very expensive. It also tries to minimize the number of msrs switched according to the guest mode; for example, MSR_LSTAR is needed only by long mode guests. This optimization is done by setup_msrs(). However, we must not change which msrs are switched while we are running with guest msr state:
- switch to guest msr state
- call setup_msrs(), removing some msrs from the list
- switch to host msr state, leaving a few guest msrs loaded
An easy way to trigger this is to kexec an x86_64 linux guest. Early during setup, the guest will switch EFER to not include SCE. KVM will stop saving MSR_LSTAR, and on the next msr switch it will leave the guest LSTAR loaded. The next host syscall will end up in a random location in the kernel. Fix by reloading the host msrs before changing the msr list. Signed-off-by: Avi Kivity <avi@qumranet.com>
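A minimal sketch of the ordering the fix establishes; the structures and the wrmsr/rebuild stand-ins below are assumptions for illustration, not the KVM implementation.

```c
/* Hedged sketch: restore host values for every MSR currently on the switch
 * list *before* rebuilding the list, so an entry dropped from the list can
 * never leave a stale guest value loaded in the host. */
#include <stddef.h>
#include <stdint.h>

struct msr_slot { uint32_t index; uint64_t host_value; uint64_t guest_value; };

static void write_msr(uint32_t index, uint64_t value)
{
        (void)index; (void)value;       /* stand-in for the real wrmsr */
}

static void rebuild_msr_list(struct msr_slot *list, size_t *count)
{
        (void)list; (void)count;        /* stand-in for setup_msrs() logic */
}

static void safe_setup_msrs(struct msr_slot *list, size_t *count)
{
        /* 1. put the host values back for everything currently switched */
        for (size_t i = 0; i < *count; i++)
                write_msr(list[i].index, list[i].host_value);

        /* 2. only now is it safe to drop or reorder entries */
        rebuild_msr_list(list, count);
}
```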
-
Committed by Avi Kivity
For improved concurrency, the guest walk is performed concurrently with other vcpus. This means that we need to revalidate the guest ptes once we have write-protected the guest page tables, at which point they can no longer be modified. The current code attempts to avoid this check if the shadow page table is not new, on the assumption that if it has existed before, the guest could not have modified the pte without the shadow lock. However the assumption is incorrect, as the racing vcpu could have modified the pte, then instantiated the shadow page, before our vcpu regains control:
  vcpu0                      vcpu1
  fault
  walk pte
                             modify pte
                             fault in same pagetable
                             instantiate shadow page
  lookup shadow page
  conclude it is old
  instantiate spte based on stale guest pte
We could do something clever with generation counters, but a test run by Marcelo suggests this is unnecessary and we can just do the revalidation unconditionally. The pte will be in the processor cache and the check can be quite fast. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
If the local apic initial count is zero, don't start an hrtimer with infinite frequency, locking up the host. Signed-off-by: Avi Kivity <avi@qumranet.com>
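A minimal sketch of the guard, with illustrative names; the divide factor and tick granularity parameters are placeholders, not KVM's.

```c
/* Hedged sketch: refuse to arm a periodic timer when the APIC initial count
 * is zero, since that would mean a zero period (infinite frequency). */
#include <stdbool.h>
#include <stdint.h>

static bool apic_timer_period_ns(uint32_t initial_count, uint32_t divide,
                                 uint32_t ns_per_tick, uint64_t *period_ns)
{
        if (initial_count == 0)
                return false;           /* leave the hrtimer unarmed */

        *period_ns = (uint64_t)initial_count * divide * ns_per_tick;
        return true;
}
```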
-
Committed by Marcelo Tosatti
The cr3 variable is now inside the vcpu->arch structure. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Marcelo Tosatti
alloc_apic_access_page() can sleep, while vmx_vcpu_setup is called inside a non-preemptible region. Move it after put_cpu(). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Joerg Roedel
During installation, Windows XP 64-bit wants to access the DEBUGCTL and the last branch record (LBR) MSRs. Not allowing this in KVM causes the installation to crash. This patch allows access to these MSRs and fixes the issue. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Izik Eidus
This patch replaces the mmap_sem lock for the memory slots with a new kvm private lock. It is needed because until now there were cases where kvm accessed user memory while holding the mmap semaphore. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
- 03 March 2008, 5 commits
-
-
Committed by Joerg Roedel
Injecting a #GP when accessing this MSR makes Windows crash when running some stress-test tools in KVM, so this patch emulates access to this MSR. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
One of the use cases for the supported cpuid list is to create a "greatest common denominator" of cpu capabilities in a server farm. As such, it is useful to be able to get the list without creating a virtual machine first. Since the code does not depend on the vm in any way, all that is needed is to move it to the device ioctl handler. The capability identifier is also changed so that binaries made against -rc1 will fail gracefully. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Paul Knowles
Whilst working on getting a VM to initialize into IA32e mode I found this issue: set_cr0 relies on comparing the old cr0 to the new one to work correctly. Move the assignment below the compare so it can work. Signed-off-by: Paul Knowles <paul@transitive.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
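A minimal sketch of the compare-then-assign ordering this restores; the field, bit mask, and helper names are illustrative.

```c
/* Hedged sketch: compare the old and new CR0 values *before* storing the
 * new one, otherwise the comparison only ever sees the value just written. */
#include <stdbool.h>
#include <stdint.h>

#define CR0_PG (1ULL << 31)     /* paging enable bit */

struct vcpu_regs { uint64_t cr0; };

static bool set_cr0_paging_toggled(struct vcpu_regs *v, uint64_t new_cr0)
{
        bool toggled = (v->cr0 ^ new_cr0) & CR0_PG;  /* compare first ... */

        v->cr0 = new_cr0;                            /* ... then update   */
        return toggled;
}
```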
-
Committed by Joerg Roedel
Explicitly enable the NM intercept in svm_set_cr0 if we enable TS in the guest copy of CR0 for lazy FPU switching. This fixes guest SMP with Linux under SVM. Without this patch Linux deadlocks or panics right after trying to boot the other CPUs. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Joerg Roedel
If the guest writes to cr0 and leaves the TS flag at 0 while vcpu->fpu_active is also 0, the TS flag in the guest's cr0 gets lost. This leads to corrupt FPU state and causes Windows Vista 64-bit to crash very soon after boot. This patch fixes the bug. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
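A minimal sketch of keeping the guest's own TS value while still forcing TS in the copy the hardware runs with during lazy FPU switching; the structure and helper names are illustrative, not KVM's.

```c
/* Hedged sketch: remember the guest's CR0 (including its TS choice) and only
 * force TS in the value actually programmed while the guest FPU state is not
 * loaded, so the guest's TS bit is never lost. */
#include <stdbool.h>
#include <stdint.h>

#define CR0_TS (1ULL << 3)      /* task-switched flag */

struct vcpu_fpu {
        bool     fpu_active;    /* is the guest FPU state currently loaded? */
        uint64_t guest_cr0;     /* CR0 as the guest believes it to be       */
};

static uint64_t set_guest_cr0(struct vcpu_fpu *v, uint64_t new_cr0)
{
        v->guest_cr0 = new_cr0;         /* never lose the guest's TS bit      */

        if (!v->fpu_active)
                new_cr0 |= CR0_TS;      /* force TS while the FPU is lazily out */

        return new_cr0;                 /* value to program into hardware     */
}
```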
-
- 06 February 2008, 1 commit
-
-
Committed by Andrew Morton
arch/x86/kvm/x86.c: In function 'emulator_cmpxchg_emulated':
arch/x86/kvm/x86.c:1746: warning: passing argument 2 of 'vcpu->arch.mmu.gva_to_gpa' makes integer from pointer without a cast
arch/x86/kvm/x86.c:1746: warning: 'addr' is used uninitialized in this function
Is true. Local variable `addr' shadows incoming arg `addr'. Avi is on vacation for a while, so... Cc: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 04 February 2008, 1 commit
-
-
Committed by Anthony Liguori
This patch moves virtio under the virtualization menu and changes virtio devices to not claim to be only for lguest. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 31 January 2008, 5 commits
-
-
Committed by Avi Kivity
Migrating the apic timer in the critical section is not very nice, and is absolutely horrible with the real-time port. Move migration to the regular vcpu execution path, triggered by a new bitflag. Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
When preparing to enter the guest, if an interrupt comes in while preemption is disabled but interrupts are still enabled, we miss a preemption point. Fix by explicitly checking whether we need to reschedule. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
Otherwise we re-initialize the mmu caches, which will fail since the caches are already registered, which will cause us to deinitialize said caches. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Izik Eidus
Right now rmap_remove won't set the page as dirty if the shadow pte pointed to this page had write access and then became read-only. This patch fixes that by setting the page as dirty for spte changes from write to read-only access. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
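A minimal sketch of the rule described above; the spte bit and structures are illustrative simplifications, not the KVM types.

```c
/* Hedged sketch: if a shadow PTE being removed or downgraded was writable,
 * the guest may have dirtied the backing page through it, so record that
 * before the mapping disappears. */
#include <stdbool.h>
#include <stdint.h>

#define SPTE_WRITABLE (1ULL << 1)       /* x86 PTE R/W bit */

struct backing_page { bool dirty; };

static void note_spte_drop(uint64_t old_spte, struct backing_page *page)
{
        if (old_spte & SPTE_WRITABLE)
                page->dirty = true;     /* may have been written through */
}
```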
-
Committed by Sheng Yang
When executing a test program called "crashme", we found the KVM guest could not survive more than ten seconds before encountering a kernel panic. The basic concept of "crashme" is generating random assembly code and trying to execute it. After some fixes to the emulator's instruction validity checking, we found it hard to get the current emulator to handle invalid instructions correctly, because the #UD trap used for hypercall patching caused trouble. The problem is: if the opcode itself was OK, but the combination of opcode and modrm_reg was invalid, and one operand of the opcode was memory (SrcMem or DstMem), the emulator would fetch the memory operand first rather than checking validity, and might encounter an error there. For example, ".byte 0xfe, 0x34, 0xcd" has this problem. In the patch, we simply check that if the invalid opcode wasn't vmcall/vmmcall, then return from emulate_instruction() and inject a #UD to the guest. With the patch, the guest had been running for more than 12 hours. Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com> Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
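A minimal sketch of that check, assuming the standard vmcall/vmmcall encodings (0f 01 c1 and 0f 01 d9); the helpers are illustrative, not the emulator's functions.

```c
/* Hedged sketch: if decode found the opcode/ModRM combination invalid and
 * the bytes are not the hypercall instruction being patched via #UD, bail
 * out and inject #UD instead of touching any memory operand. */
#include <stdbool.h>
#include <stdint.h>

static bool is_vmcall_or_vmmcall(const uint8_t *insn)
{
        /* 0f 01 c1 = vmcall (Intel), 0f 01 d9 = vmmcall (AMD) */
        return insn[0] == 0x0f && insn[1] == 0x01 &&
               (insn[2] == 0xc1 || insn[2] == 0xd9);
}

static bool should_inject_ud(bool decode_valid, const uint8_t *insn)
{
        return !decode_valid && !is_vmcall_or_vmmcall(insn);
}
```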
-