Commits · ed85c0685321a139cefd6622b21467643f0159e1 · openeuler / Kernel

Sep 10, 2009 · 40 commits

KVM: introduce module parameter for ignoring unknown MSRs accesses · ed85c068

Committed by Andre Przywara on Jun 25, 2009

KVM will inject a #GP into the guest if the guest tries to access unhandled
MSRs. This will crash many guests. Although the correct fix would be to
actually handle these MSRs, we introduce a runtime switchable
module param called "ignore_msrs" (defaults to 0). If this is Y, unknown
MSR reads will return 0, while MSR writes are simply dropped. In both cases
we print a message to dmesg to inform the user about that.

You can change the behaviour at any time by saying:

 # echo 1 > /sys/module/kvm/parameters/ignore_msrs

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

ed85c068
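A minimal user-space sketch of the behaviour described above, not the actual patch. Only the parameter name `ignore_msrs` comes from the commit message; the function names `unhandled_msr_read` and `unhandled_msr_write` and the return convention (1 meaning the caller would inject #GP) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the "ignore_msrs" module parameter (default off). */
static bool ignore_msrs = false;

/* Read of an MSR that has no handler.
 * Returns 0 on success, 1 if the caller should inject #GP into the guest. */
static int unhandled_msr_read(uint32_t msr, uint64_t *data)
{
    if (!ignore_msrs) {
        fprintf(stderr, "unhandled rdmsr: %#x\n", (unsigned)msr);
        return 1;
    }
    fprintf(stderr, "ignored rdmsr: %#x\n", (unsigned)msr);
    *data = 0;                      /* unknown reads yield zero */
    return 0;
}

/* Write to an MSR that has no handler; same return convention as above. */
static int unhandled_msr_write(uint32_t msr, uint64_t data)
{
    if (!ignore_msrs) {
        fprintf(stderr, "unhandled wrmsr: %#x data %#llx\n",
                (unsigned)msr, (unsigned long long)data);
        return 1;
    }
    fprintf(stderr, "ignored wrmsr: %#x data %#llx\n",
            (unsigned)msr, (unsigned long long)data);
    return 0;                       /* the value is simply dropped */
}
```

In both modes a message is logged, matching the commit's note that the user is informed through dmesg either way.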

KVM: ignore reads from AMDs C1E enabled MSR · 1fdbd48c

Committed by Andre Przywara on Jun 24, 2009

If the Linux kernel detects a C1E-capable AMD processor (K8 RevF and
higher), it will access a certain MSR on every attempt to go to halt.
Explicitly handle this read and return 0 to let KVM run a Linux guest
with the native AMD host CPU propagated to the guest.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

1fdbd48c

KVM: ignore AMDs HWCR register access to set the FFDIS bit · 8f1589d9

Committed by Andre Przywara on Jun 24, 2009

Linux tries to disable the flush filter on all AMD K8 CPUs. Since KVM
does not handle the needed MSR, the injected #GP will panic the Linux
kernel. Ignore setting of the HWCR.FFDIS bit in this MSR to let Linux
boot with an AMD K8 family guest CPU.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8f1589d9
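The idea can be sketched as a mask-then-check on the written value. This is an illustration under assumptions, not the patch itself: the bit position used for `HWCR_FFDIS` (bit 6) and the function name `hwcr_write` are assumptions, as is the return convention (1 meaning the write is still treated as unhandled).

```c
#include <stdint.h>

/* Assumed position of the flush-filter-disable bit in HWCR. */
#define HWCR_FFDIS (UINT64_C(1) << 6)

/* Returns 0 if the write can be silently accepted,
 * 1 if it sets bits that remain unhandled. */
static int hwcr_write(uint64_t data)
{
    data &= ~HWCR_FFDIS;    /* tolerate the guest setting FFDIS */
    return data != 0;       /* any other bit is still rejected */
}
```

Masking only the one known-harmless bit keeps every other HWCR write visible as an error instead of silently accepting the whole register.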

KVM: x86: missing locking in PIT/IRQCHIP/SET_BSP_CPU ioctl paths · 894a9c55

Committed by Marcelo Tosatti on Jun 23, 2009

Correct missing locking in a few places in x86's vm_ioctl handling path.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

894a9c55

KVM: Prepare memslot data structures for multiple hugepage sizes · ec04b260

Committed by Joerg Roedel on Jun 19, 2009

[avi: fix build on non-x86]

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

ec04b260

hugetlbfs: export vma_kernel_pagesize to modules · f340ca0f

Committed by Joerg Roedel on Jun 19, 2009

This function is required by KVM.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

f340ca0f

KVM: s390: Fix memslot initialization for userspace_addr != 0 · 3eea8437

Committed by Christian Borntraeger on Jun 23, 2009

Since
commit 854b5338196b1175706e99d63be43a4f8d8ab607
Author: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
    KVM: s390: streamline memslot handling

s390 uses the values of the memslot instead of doing everything in the arch
ioctl handler of the KVM_SET_USER_MEMORY_REGION. Unfortunately we failed to
set the userspace_addr of our memslot due to our s390 ifdef in
__kvm_set_memory_region.
Old s390 userspace launchers did not notice, since they started the guest at
userspace address 0.
Because of CONFIG_DEFAULT_MMAP_MIN_ADDR we now put the guest at 1M userspace,
which does not work. This patch makes sure that new.userspace_addr is set
on s390.
This fix should go in quickly. Nevertheless, looking at the code we should
clean up that ifdef in the long term. Any kernel janitors?

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3eea8437

KVM: x86 emulator: Add sysexit emulation · 4668f050

Committed by Andre Przywara on Jun 18, 2009

Handle #UD intercept of the sysexit instruction in 64bit mode returning to
32bit compat mode on an AMD host.
Set up the segment descriptors for CS and SS and the EIP/ESP registers
according to the manual.

Signed-off-by: Christoph Egger <christoph.egger@amd.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

4668f050

KVM: x86 emulator: Add sysenter emulation · 8c604352

Committed by Andre Przywara on Jun 18, 2009

Handle #UD intercept of the sysenter instruction in 32bit compat mode on
an AMD host.
Set up the segment descriptors for CS and SS and the EIP/ESP registers
according to the manual.

Signed-off-by: Christoph Egger <christoph.egger@amd.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8c604352

KVM: x86 emulator: add syscall emulation · e66bb2cc

Committed by Andre Przywara on Jun 18, 2009

Handle #UD intercept of the syscall instruction in 32bit compat mode on
an Intel host.
Set up the segment descriptors for CS and SS and the EIP/ESP registers
according to the manual. Save the RIP and EFLAGS to the correct registers.

[avi: fix build on i386 due to missing R11]

Signed-off-by: Christoph Egger <christoph.egger@amd.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e66bb2cc

KVM: x86 emulator: Prepare for emulation of syscall instructions · e99f0507

Committed by Andre Przywara on Jun 17, 2009

Add the flags needed for syscall, sysenter and sysexit to the opcode table.
Catch (but for now ignore) the opcodes in the emulation switch/case.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Christoph Egger <christoph.egger@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e99f0507

KVM: x86 emulator: Add missing EFLAGS bit definitions · b1d86143

Committed by Andre Przywara on Jun 17, 2009

Signed-off-by: Christoph Egger <christoph.egger@amd.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

b1d86143

KVM: Allow emulation of syscalls instructions on #UD · 0cb5762e

Committed by Andre Przywara on Jun 17, 2009

Add the opcodes for syscall, sysenter and sysexit to the list of instructions
handled by the undefined opcode handler.

Signed-off-by: Christoph Egger <christoph.egger@amd.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

0cb5762e

KVM: convert custom marker based tracing to event traces · 229456fc

Committed by Marcelo Tosatti on Jun 17, 2009

This allows use of the powerful ftrace infrastructure.

See Documentation/trace/ for usage information.

[avi, stephen: various build fixes]
[sheng: fix control register breakage]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

229456fc

KVM: SVM: Improve nested interrupt injection · 219b65dc

Committed by Alexander Graf on Jun 15, 2009

While trying to get Hyper-V running, I realized that the interrupt injection
mechanisms that are in place right now are not 100% correct.

This patch makes nested SVM's interrupt injection behave more like on a
real machine.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

219b65dc

KVM: SVM: Implement INVLPGA · ff092385

Committed by Alexander Graf on Jun 15, 2009

SVM adds another way to do INVLPG by ASID which Hyper-V makes use of,
so let's implement it!

For now we just do the same thing invlpg does, as asid switching
means we flush the mmu anyway. That might change one day though.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

ff092385

KVM: Implement MSRs used by Hyper-V · 3c5d0a44

Committed by Alexander Graf on Jun 15, 2009

Hyper-V uses some MSRs, some of which are actually reserved for BIOS usage.

But let's be nice today and let it have its way, because otherwise it fails
terribly.

[jaswinder: fix build for linux-next changes]

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3c5d0a44

x86: Add definition for IGNNE MSR · 0367b433

Committed by Alexander Graf on Jun 15, 2009

Hyper-V accesses MSR_IGNNE while running under KVM.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

0367b433

KVM: SVM: Don't save/restore host cr2 · b3dbf89e

Committed by Avi Kivity on Jun 16, 2009

The host never reads cr2 in process context, so we are free to clobber it. The
vmx code does this, so we can safely remove the save/restore code.

Signed-off-by: Avi Kivity <avi@redhat.com>

b3dbf89e

KVM: VMX: Only reload guest cr2 if different from host cr2 · d3edefc0

Committed by Avi Kivity on Jun 16, 2009

cr2 changes only rarely, and writing it is expensive.  Avoid the costly cr2
writes by checking if it does not already hold the desired value.

Shaves 70 cycles off the vmexit latency.

Signed-off-by: Avi Kivity <avi@redhat.com>

d3edefc0
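The compare-before-write pattern can be shown with a mock register. Everything here is hypothetical scaffolding for illustration: `hw_cr2`, `cr2_writes`, `write_cr2` and `load_guest_cr2` are invented names, and the real code operates on the hardware register rather than a variable.

```c
#include <stdint.h>

/* Mock of the hardware register plus a counter of (expensive) writes. */
static uint64_t hw_cr2;
static unsigned cr2_writes;

static void write_cr2(uint64_t val)
{
    hw_cr2 = val;
    cr2_writes++;
}

/* Load the guest's cr2 before entry; skip the write when the register
 * already holds the desired value. */
static void load_guest_cr2(uint64_t guest_cr2)
{
    if (hw_cr2 != guest_cr2)
        write_cr2(guest_cr2);
}
```

Because the value rarely changes between entries, most calls take the cheap comparison path and never reach `write_cr2`.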

KVM: Drop useless atomic test from timer function · 681405bf

Committed by Jan Kiszka on Jun 09, 2009

The current code tries to optimize the setting of
KVM_REQ_PENDING_TIMER but used atomic_inc_and_test - which always
returns true unless pending had the invalid value of -1 on entry. This
patch drops the test part, preserving the original semantics but
expressing them less confusingly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

681405bf
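A small sketch of why the test carried no information, assuming the usual Linux semantics of `atomic_inc_and_test` (increment, then report whether the new value is zero) and assuming the old code acted on the negated result. `inc_and_test` below is a hypothetical C11 stand-in, not the kernel helper.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* User-space stand-in: increment the counter, then report whether
 * the incremented value is zero. */
static bool inc_and_test(atomic_int *v)
{
    /* atomic_fetch_add returns the value before the increment. */
    return atomic_fetch_add(v, 1) + 1 == 0;
}
```

For every non-negative starting value the helper reports false, so a guard on its negation always passes; only a starting value of -1 would change the outcome, and the commit message calls that value invalid for the pending counter.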

KVM: Fix racy event propagation in timer · f7104db2

Committed by Jan Kiszka on Jun 09, 2009

Minor issue that likely had no practical relevance: the kvm timer
function so far incremented the pending counter and then may reset it
again to 1 in case reinjection was disabled. This opened a small racy
window with the corresponding VCPU loop that may have happened to run
on another (real) CPU and already consumed the value.

Fix it by skipping the increment in case pending is already > 0.
This opens a different race window, but may only rarely cause lost
events in case we do not care about them anyway (!reinject).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

f7104db2
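The fix described above can be sketched as a guarded increment. This is an assumption-laden illustration: `struct ktimer`, its fields and `timer_fire` are hypothetical names, and C11 atomics stand in for the kernel's atomic API.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct ktimer {
    atomic_int pending;   /* timer events not yet consumed by the vcpu loop */
    bool reinject;        /* whether missed events must be delivered later */
};

/* Body of the timer callback: returns true if a new event was queued. */
static bool timer_fire(struct ktimer *t)
{
    if (t->reinject || atomic_load(&t->pending) == 0) {
        atomic_fetch_add(&t->pending, 1);
        return true;
    }
    /* !reinject and an event is already pending: coalesce, never reset. */
    return false;
}
```

The check and the increment are two separate steps, which is the "different race window" the commit message accepts: a consumer can drain `pending` in between, but only in the `!reinject` case where a lost event is tolerated.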

KVM: Optimize searching for highest IRR · 33e4c686

Committed by Gleb Natapov on Jun 11, 2009

Most of the time IRR is empty, so instead of scanning the whole IRR on
each VM entry keep a variable that tells us if IRR is not empty. IRR
will have to be scanned twice on each IRQ delivery, but this is much
rarer than VM entry.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

33e4c686
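A simplified model of the optimization, assuming a 256-bit IRR stored as eight 32-bit words. The struct layout and all names (`struct lapic`, `irr_pending`, `search_irr`, `find_highest_irr`, `set_irr`, `clear_irr`) are illustrative assumptions rather than the patch's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define IRR_WORDS 8   /* 256 vectors, 32 per word */

struct lapic {
    uint32_t irr[IRR_WORDS];
    bool irr_pending;   /* false guarantees that irr is all zero */
};

/* Full scan: highest set vector, or -1 if none is set. */
static int search_irr(const struct lapic *a)
{
    for (int w = IRR_WORDS - 1; w >= 0; w--)
        for (int b = 31; b >= 0; b--)
            if (a->irr[w] & (UINT32_C(1) << b))
                return w * 32 + b;
    return -1;
}

/* Fast path for every VM entry: skip the scan when nothing is pending. */
static int find_highest_irr(const struct lapic *a)
{
    return a->irr_pending ? search_irr(a) : -1;
}

static void set_irr(struct lapic *a, uint8_t vec)
{
    a->irr[vec / 32] |= UINT32_C(1) << (vec % 32);
    a->irr_pending = true;
}

/* Clearing a vector needs a rescan to learn whether others remain. */
static void clear_irr(struct lapic *a, uint8_t vec)
{
    a->irr[vec / 32] &= ~(UINT32_C(1) << (vec % 32));
    a->irr_pending = search_irr(a) != -1;
}
```

Delivering an interrupt costs one scan to find the vector and another inside `clear_irr`, which is the "scanned twice" trade-off from the commit message; the common empty case costs a single flag test.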

KVM: Replace pending exception by PF if it happens serially · 6edf14d8

Committed by Gleb Natapov on Jun 11, 2009

Replace the previous exception with the new one in the hope that instruction
re-execution will regenerate the lost exception.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

6edf14d8

KVM: VMX: conditionally disable 2M pages · 54dee993

Committed by Marcelo Tosatti on Jun 11, 2009

Disable usage of 2M pages if VMX_EPT_2MB_PAGE_BIT (bit 16) is clear
in MSR_IA32_VMX_EPT_VPID_CAP and EPT is enabled.

[avi: s/largepages_disabled/largepages_enabled/ to avoid negative logic]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

54dee993
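The condition stated above reduces to a single predicate. In this sketch the capability value is passed in as a plain argument because user space cannot read the MSR; the function name `largepages_enabled` echoes the variable mentioned in the commit note, but its signature here is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 16 of MSR_IA32_VMX_EPT_VPID_CAP, as stated in the commit message. */
#define VMX_EPT_2MB_PAGE_BIT (UINT64_C(1) << 16)

/* ept_vpid_cap: raw value of MSR_IA32_VMX_EPT_VPID_CAP. */
static bool largepages_enabled(bool ept_enabled, uint64_t ept_vpid_cap)
{
    /* Only EPT depends on the capability bit. */
    if (ept_enabled && !(ept_vpid_cap & VMX_EPT_2MB_PAGE_BIT))
        return false;
    return true;
}
```

Expressing the result as "enabled" rather than "disabled" follows the bracketed note in the commit about avoiding negative logic.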

KVM: VMX: EPT misconfiguration handler · 68f89400

Committed by Marcelo Tosatti on Jun 11, 2009

Handler for EPT misconfiguration which checks for valid state
in the shadow pagetables, printing the spte on each level.

The separate WARN_ONs are useful for kerneloops.org.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

68f89400

KVM: MMU: add kvm_mmu_get_spte_hierarchy helper · 94d8b056

Committed by Marcelo Tosatti on Jun 11, 2009

Required by EPT misconfiguration handler.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

94d8b056

KVM: MMU: make for_each_shadow_entry aware of largepages · 4d88954d

Committed by Marcelo Tosatti on Jun 11, 2009

This way there is no need to add explicit checks in every
for_each_shadow_entry user.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

4d88954d

KVM: VMX: more MSR_IA32_VMX_EPT_VPID_CAP capability bits · e799794e

Committed by Marcelo Tosatti on Jun 11, 2009

Required for EPT misconfiguration handler.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e799794e

KVM: Move performance counter MSR access interception to generic x86 path · 71db6023

Committed by Andre Przywara on Jun 12, 2009

The performance counter MSRs are different for AMD and Intel CPUs and they
are chosen mainly by the CPUID vendor string. This patch catches writes to
all addresses (regardless of VMX/SVM path) and handles them in the generic
MSR handler routine. Writing a 0 into the event select register is something
we perfectly emulate ;-), so don't print out a warning to dmesg in this
case.
This fixes booting a 64bit Windows guest with an AMD CPUID on an Intel host.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

71db6023

KVM: MMU audit: largepage handling · 2920d728

Committed by Marcelo Tosatti on Jun 10, 2009

Make the audit code aware of largepages.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

2920d728

KVM: MMU audit: audit_mappings tweaks · 2aaf65e8

Committed by Marcelo Tosatti on Jun 10, 2009

- Fail early in case gfn_to_pfn returns a pfn for which is_error_pfn is true.
- For the pre pte write case, avoid spurious "gva is valid but spte is notrap"
  messages (the emulation code does the guest write first, so this particular
  case is OK).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

2aaf65e8

KVM: MMU audit: nontrapping ptes in nonleaf level · 48fc0317

Committed by Marcelo Tosatti on Jun 10, 2009

It is valid to set non leaf sptes as notrap.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

48fc0317

KVM: MMU audit: update audit_write_protection · e58b0f9e

Committed by Marcelo Tosatti on Jun 10, 2009

- Unsync pages contain writable sptes in the rmap.
- rmaps do not exclusively contain writable sptes anymore.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e58b0f9e

KVM: MMU audit: update count_writable_mappings / count_rmaps · 08a3732b

Committed by Marcelo Tosatti on Jun 10, 2009

Under testing, count_writable_mappings returns a value that is 2 larger
than what count_rmaps returns.

Suspicion is that either of the two functions is counting a duplicate (either
positively or negatively).

Modifying check_writable_mappings_rmap to check for rmap existence on
all present MMU pages fails to trigger an error, which should keep Avi
happy.

Also introduce mmu_spte_walk to invoke a callback on all present sptes visible
to the current vcpu, which might be useful in the future.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

08a3732b

KVM: MMU: introduce is_last_spte helper · 776e6633

Committed by Marcelo Tosatti on Jun 10, 2009

Hiding some of the last largepage / level interaction (which is useful
for gbpages and for zero based levels).

Also merge the PT_PAGE_TABLE_LEVEL clearing loop in unlink_children.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

776e6633
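One plausible shape for such a helper is sketched below. The level constants and the page-size bit are assumptions chosen to match common x86 paging conventions (level 1 is a page table, level 2 a page directory, bit 7 marks a large page); the patch itself may differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed constants, mirroring common x86 paging conventions. */
#define PT_PAGE_TABLE_LEVEL 1
#define PT_DIRECTORY_LEVEL  2
#define PT_PAGE_SIZE_MASK   (UINT64_C(1) << 7)   /* entry maps a large page */

static bool is_large_pte(uint64_t pte)
{
    return (pte & PT_PAGE_SIZE_MASK) != 0;
}

/* True when the walk ends at this entry: either the lowest level was
 * reached, or a directory entry maps a large page directly. */
static bool is_last_spte(uint64_t pte, int level)
{
    if (level == PT_PAGE_TABLE_LEVEL)
        return true;
    return level == PT_DIRECTORY_LEVEL && is_large_pte(pte);
}
```

Callers that previously combined a level comparison with a large-page check can ask one question instead, which is what leaves room for further page sizes to be added in a single place.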

KVM: Return to userspace on emulation failure · 3f5d18a9

Committed by Avi Kivity on Jun 11, 2009

Instead of mindlessly retrying to execute the instruction, report the
failure to userspace.

Signed-off-by: Avi Kivity <avi@redhat.com>

3f5d18a9

KVM: Use macro to iterate over vcpus. · 988a2cae

Committed by Gleb Natapov on Jun 09, 2009

[christian: remove unused variables on s390]

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

988a2cae

KVM: Break dependency between vcpu index in vcpus array and vcpu_id. · 73880c80

Committed by Gleb Natapov on Jun 09, 2009

Archs are free to use vcpu_id as they see fit. For x86 it is used as
vcpu's apic id. New ioctl is added to configure boot vcpu id that was
assumed to be 0 till now.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

73880c80
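A toy model of the decoupling, assuming vcpus are stored in creation order and looked up by walking the array. All names (`struct vm`, `for_each_vcpu`, `find_vcpu_by_id`, `is_bsp`, `bsp_vcpu_id`) are hypothetical and only illustrate that the array index and `vcpu_id` are independent values.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_VCPUS 4

struct vcpu {
    int vcpu_id;                      /* arch-defined, e.g. an APIC id */
};

struct vm {
    struct vcpu *vcpus[MAX_VCPUS];    /* filled in creation order */
    int online_vcpus;
    int bsp_vcpu_id;                  /* configurable; formerly assumed 0 */
};

/* Iterate by array index; the index says nothing about vcpu_id. */
#define for_each_vcpu(i, v, vm) \
    for ((i) = 0; \
         (i) < (vm)->online_vcpus && ((v) = (vm)->vcpus[(i)]) != NULL; \
         (i)++)

static struct vcpu *find_vcpu_by_id(struct vm *vm, int id)
{
    int i;
    struct vcpu *v;

    for_each_vcpu(i, v, vm)
        if (v->vcpu_id == id)
            return v;
    return NULL;
}

static bool is_bsp(const struct vm *vm, const struct vcpu *v)
{
    return v->vcpu_id == vm->bsp_vcpu_id;
}
```

With this split, a vcpu created first can carry any id, and the boot processor is whichever vcpu matches the configured `bsp_vcpu_id` rather than the one at index 0.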

KVM: Use pointer to vcpu instead of vcpu_id in timer code. · 1ed0ce00

Committed by Gleb Natapov on Jun 09, 2009

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

1ed0ce00

openeuler / Kernel · synced successfully more than 1 year ago
