Commits · 34c33d163fe509da8414a736c6328855f8c164e5 · Linux-御风守护者 / linux

24 Mar, 2009 11 commits

KVM: Drop unused evaluations from string pio handlers · 34c33d16

Authored by Jan Kiszka on Feb 08, 2009

Looks like neither the direction nor the rep prefix are used anymore.
Drop related evaluations from SVM's and VMX's I/O exit handlers.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

34c33d16

KVM: VMX: When emulating on invalid vmx state, don't return to userspace unnecessarily · 8b3079a5

Authored by Avi Kivity on Jan 05, 2009

If we aren't doing mmio there's no need to exit to userspace (which will
just be confused).
Signed-off-by: Avi Kivity <avi@redhat.com>

8b3079a5

KVM: VMX: Prevent exit handler from running if emulating due to invalid state · 10f32d84

Authored by Avi Kivity on Jan 05, 2009

If we've just emulated an instruction, we won't have any valid exit
reason and associated information.

Fix by moving the clearing of the emulation_required flag to the exit handler.
This way the exit handler can notice that we've been emulating and abort
early.
Signed-off-by: Avi Kivity <avi@redhat.com>

10f32d84

KVM: VMX: don't clobber segment AR if emulating invalid state · 9fd4a3b7

Authored by Avi Kivity on Jan 04, 2009

The unusable bit is important for determining state validity; don't
clobber it.
Signed-off-by: Avi Kivity <avi@redhat.com>

9fd4a3b7

KVM: VMX: Fix guest state validity checks · 1872a3f4

Authored by Avi Kivity on Jan 04, 2009

The vmx guest state validity checks are full of bugs.  Make them
conform to the manual.
Signed-off-by: Avi Kivity <avi@redhat.com>

1872a3f4

KVM: VMX: initialize TSC offset relative to vm creation time · 53f658b3

Authored by Marcelo Tosatti on Dec 11, 2008

VMX initializes the TSC offset for each vcpu at different times, and
also reinitializes it for vcpus other than 0 on APIC SIPI message.

This bug causes the TSCs to appear unsynchronized in the guest, even if
the host is good.

Older Linux kernels don't handle the situation very well, so
gettimeofday is likely to go backwards in time:

http://www.mail-archive.com/kvm@vger.kernel.org/msg02955.html
http://sourceforge.net/tracker/index.php?func=detail&aid=2025534&group_id=180599&atid=893831

Fix it by initializing the offset of each vcpu relative to vm creation
time, and moving it from vmx_vcpu_reset to vmx_vcpu_setup, out of the
APIC MP init path.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

53f658b3
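
The effect described in this commit message can be illustrated with a small model. This is only a sketch, assuming the usual relation guest TSC = host TSC + offset; the function names and the sample numbers are hypothetical and are not taken from the patch.

```python
# Simplified model of per-vcpu TSC offsets (illustrative names and values only).

def offset_at_vcpu_init(host_tsc_at_vcpu_init):
    """Old behaviour: the guest TSC restarts from 0 whenever a vcpu is (re)initialised."""
    return -host_tsc_at_vcpu_init


def offset_at_vm_creation(vm_creation_tsc):
    """Fixed behaviour: every vcpu derives its offset from the same VM creation time."""
    return -vm_creation_tsc


def guest_tsc(host_tsc, offset):
    """The guest observes the host TSC shifted by its vcpu's offset."""
    return host_tsc + offset


vm_creation_tsc = 1_000
vcpu_init_tsc = [1_200, 5_000]  # vcpu 1 is reinitialised later, on the SIPI
now = 10_000

# Old scheme: the two vcpus disagree about the current time.
old = [guest_tsc(now, offset_at_vcpu_init(t)) for t in vcpu_init_tsc]
# New scheme: both vcpus read the same value.
new = [guest_tsc(now, offset_at_vm_creation(vm_creation_tsc)) for _ in vcpu_init_tsc]
```

In the old scheme a task migrating from vcpu 0 to vcpu 1 would see the TSC jump backwards, which is the kind of behaviour the linked reports describe.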

KVM: x86: Wire-up hardware breakpoints for guest debugging · ae675ef0

Authored by Jan Kiszka on Dec 15, 2008

Add the remaining bits to make use of debug registers also for guest
debugging, thus enabling the use of hardware breakpoints and
watchpoints.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

ae675ef0

KVM: x86: Virtualize debug registers · 42dbaa5a

Authored by Jan Kiszka on Dec 15, 2008

So far KVM only had basic x86 debug register support, once introduced to
realize guest debugging that way. The guest itself was not able to use
those registers.

This patch now adds (almost) full support for guest self-debugging via
hardware registers. It refactors the code, moving generic parts out of
SVM (VMX was already cleaned up by the KVM_SET_GUEST_DEBUG patches), and
it ensures that the registers are properly switched between host and
guest.

This patch also prepares debug register usage by the host. The latter
will (once wired-up by the following patch) allow for hardware
breakpoints/watchpoints in guest code. If this is enabled, the guest
will only see faked debug registers without functionality, but with
content reflecting the guest's modifications.

Tested on Intel only, but SVM /should/ work as well, but who knows...

Known limitations: Trapping on tss switch won't work - most probably on
Intel.

Credits also go to Joerg Roedel - I used his once posted debugging
series as platform for this patch.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

42dbaa5a

KVM: VMX: Allow single-stepping when uninterruptible · 55934c0b

Authored by Jan Kiszka on Dec 15, 2008

When single-stepping over STI and MOV SS, we must clear the
corresponding interruptibility bits in the guest state. Otherwise
vmentry fails as it then expects bit 14 (BS) in pending debug exceptions
being set, but that's not correct for the guest debugging case.

Note that clearing those bits is safe as we check for interruptibility
based on the original state and do not inject interrupts or NMIs if
guest interruptibility was blocked.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

55934c0b
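
A minimal sketch of the bit handling this commit message describes. The two constants follow the VMX interruptibility-state layout (bit 0 for blocking by STI, bit 1 for blocking by MOV SS); the helper name is hypothetical and the real change lives in the kernel's C code.

```python
GUEST_INTR_STATE_STI = 0x1     # blocking by STI
GUEST_INTR_STATE_MOV_SS = 0x2  # blocking by MOV SS


def prepare_single_step(interruptibility):
    """Return (state to write back, whether the guest was uninterruptible).

    The blocked flag is derived from the original state, so the caller can
    still refuse to inject interrupts or NMIs even though the bits are
    cleared for the vmentry.
    """
    mask = GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS
    was_blocked = bool(interruptibility & mask)
    return interruptibility & ~mask, was_blocked
```

Other bits in the field (for example NMI blocking at bit 3) pass through unchanged in this sketch.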

KVM: New guest debug interface · d0bfb940

Authored by Jan Kiszka on Dec 15, 2008

This rips out the support for KVM_DEBUG_GUEST and introduces a new IOCTL
instead: KVM_SET_GUEST_DEBUG. The IOCTL payload consists of a generic
part, controlling the "main switch" and the single-step feature. The
arch specific part adds an x86 interface for intercepting both types of
debug exceptions separately and re-injecting them when the host was not
interested. Moreover, the foundation for guest debugging via debug
registers is laid.

To signal breakpoint events properly back to userland, an arch-specific
data block is now returned along KVM_EXIT_DEBUG. For x86, the arch block
contains the PC, the debug exception, and relevant debug registers to
tell debug events properly apart.

The availability of this new interface is signaled by
KVM_CAP_SET_GUEST_DEBUG. Empty stubs for not yet supported archs are
provided.

Note that both SVM and VTX are supported, but only the latter has been
tested so far. Based on the experience with all those VTX corner cases, I
would be fairly surprised if SVM works out of the box.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

d0bfb940
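
To make the payload layout concrete, here is a sketch that builds a KVM_SET_GUEST_DEBUG argument in memory without issuing the ioctl. The flag values and the x86 layout (a 32-bit control word, 32 bits of padding, then eight 64-bit debug registers) are written from memory of the kernel's uapi headers and should be treated as assumptions to verify against `linux/kvm.h` and the x86 `asm/kvm.h`; `pack_guest_debug` is a hypothetical helper.

```python
import struct

# Generic part: main switch and single-step (assumed values, check linux/kvm.h).
KVM_GUESTDBG_ENABLE = 0x00000001
KVM_GUESTDBG_SINGLESTEP = 0x00000002
# x86-specific part (assumed values, check asm/kvm.h).
KVM_GUESTDBG_USE_SW_BP = 0x00010000
KVM_GUESTDBG_USE_HW_BP = 0x00020000
KVM_GUESTDBG_INJECT_DB = 0x00040000
KVM_GUESTDBG_INJECT_BP = 0x00080000


def pack_guest_debug(control, debugreg=(0,) * 8):
    """Pack control, pad and eight debug registers into the assumed x86 layout."""
    if len(debugreg) != 8:
        raise ValueError("x86 payload carries exactly eight debug registers")
    return struct.pack("=II8Q", control, 0, *debugreg)


# Enable debugging, single-step, and intercept software breakpoints.
payload = pack_guest_debug(
    KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_SW_BP
)
```

A real caller would pass such a buffer to the vcpu file descriptor's ioctl after checking KVM_CAP_SET_GUEST_DEBUG.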

KVM: VMX: Support for injecting software exceptions · 8ab2d2e2

Authored by Jan Kiszka on Dec 15, 2008

VMX differentiates between processor and software generated exceptions
when injecting them into the guest. Extend vmx_queue_exception
accordingly (and refactor related constants) so that we can use this
service reliably for the new guest debugging framework.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8ab2d2e2
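
The distinction between processor-generated and software-generated exceptions shows up in the interruption-type field of the VM-entry interruption information. The sketch below assumes the encoding from the Intel manual (type in bits 10:8, with 3 for hardware exceptions and 6 for software exceptions; bit 11 for error-code delivery; bit 31 for valid) and treats #BP and #OF as the software-generated vectors. The function name is hypothetical.

```python
INTR_INFO_DELIVER_CODE_MASK = 0x800   # bit 11: push an error code
INTR_INFO_VALID_MASK = 0x80000000     # bit 31: field is valid
INTR_TYPE_HARD_EXCEPTION = 3 << 8     # processor-generated exception
INTR_TYPE_SOFT_EXCEPTION = 6 << 8     # software exception (INT3, INTO)

BP_VECTOR = 3  # breakpoint, raised by INT3
OF_VECTOR = 4  # overflow, raised by INTO


def exception_intr_info(vector, has_error_code=False):
    """Build the interruption-information word for injecting an exception."""
    info = vector | INTR_INFO_VALID_MASK
    if has_error_code:
        info |= INTR_INFO_DELIVER_CODE_MASK
    if vector in (BP_VECTOR, OF_VECTOR):
        # Software exceptions also need the instruction length on vmentry;
        # that part is omitted from this sketch.
        info |= INTR_TYPE_SOFT_EXCEPTION
    else:
        info |= INTR_TYPE_HARD_EXCEPTION
    return info
```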

15 Feb, 2009 3 commits

KVM: VMX: Flush volatile msrs before emulating rdmsr · 516a1a7e

Authored by Avi Kivity on Feb 15, 2009

Some msrs (notably MSR_KERNEL_GS_BASE) are held in the processor registers
and need to be flushed to the vcpu structure before they can be read.

This fixes cygwin longjmp() failure on Windows x64.
Signed-off-by: Avi Kivity <avi@redhat.com>

516a1a7e

KVM: x86: fix LAPIC pending count calculation · b682b814

Authored by Marcelo Tosatti on Feb 10, 2009

Simplify LAPIC TMCCT calculation by using hrtimer provided
function to query remaining time until expiration.

Fixes host hang with nested ESX.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

b682b814
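
As a rough illustration of deriving the current-count register from the time left on a timer, the sketch below converts remaining nanoseconds into timer ticks. It is a model under stated assumptions, not the kernel code: the bus cycle length of 1 ns, the wrap by the timer period, and the function name are all assumptions made for the example.

```python
APIC_BUS_CYCLE_NS = 1  # assumed length of one APIC bus cycle in nanoseconds


def lapic_current_count(remaining_ns, period_ns, divide_count):
    """Model of TMCCT: time left until expiry, expressed in divided bus cycles."""
    if period_ns == 0:
        return 0  # timer not armed
    if remaining_ns < 0:
        remaining_ns = 0  # timer already expired
    ns = remaining_ns % period_ns
    return ns // (APIC_BUS_CYCLE_NS * divide_count)
```

In the kernel the remaining time would come from the hrtimer query the commit message mentions, rather than being passed in by the caller.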

KVM: MMU: Map device MMIO as UC in EPT · 2aaf69dc

Authored by Sheng Yang on Jan 21, 2009

Software is not allowed to access device MMIO using a cacheable memory type, so
this patch limits MMIO regions to UC and WC (the guest can select WC using PAT
and PCD/PWT).
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

2aaf69dc
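
A sketch of how the memory-type choice could be encoded in an EPT entry. It assumes the layout from the Intel manual (memory type in bits 5:3, the "ignore PAT" bit, called IGMT in these commits, at bit 6) and the standard type encodings UC = 0 and WB = 6; the function name is hypothetical.

```python
VMX_EPT_MT_EPTE_SHIFT = 3      # memory type occupies bits 5:3 of the entry
VMX_EPT_IGMT_BIT = 1 << 6      # ignore the guest's PAT for this mapping
MTRR_TYPE_UNCACHABLE = 0       # UC
MTRR_TYPE_WRBACK = 6           # WB


def ept_memtype_bits(is_mmio):
    """Return the memory-type bits to OR into an EPT entry."""
    if is_mmio:
        # UC without IGMT: the guest's PAT still applies, so it can pick WC.
        return MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT
    # Ordinary guest RAM: force WB and ignore the guest's PAT.
    return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IGMT_BIT
```

Leaving IGMT clear for MMIO is what lets the guest's PAT and PCD/PWT settings take effect on top of the UC base type.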

31 Dec, 2008 20 commits

KVM: x86: Rework user space NMI injection as KVM_CAP_USER_NMI · 4531220b

Authored by Jan Kiszka on Dec 11, 2008

There is no point in doing the ready_for_nmi_injection/
request_nmi_window dance with user space. First, we don't do this for
in-kernel irqchip anyway, while the code path is the same as for user
space irqchip mode. And second, there is nothing to lose if a pending
NMI is overwritten by another one (in contrast to IRQs where we have to
save the number). Actually, there is even the risk of raising spurious
NMIs this way because the reason for the held-back NMI might already be
handled while processing the first one.

Therefore this patch creates a simplified user space NMI injection
interface, exporting it under KVM_CAP_USER_NMI and dropping the old
KVM_CAP_NMI capability. And this time we also take care to provide the
interface only on archs supporting NMIs via KVM (right now only x86).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

4531220b

KVM: VMX: Fix pending NMI-vs.-IRQ race for user space irqchip · 264ff01d

Authored by Jan Kiszka on Nov 24, 2008

As with the kernel irqchip, don't allow an NMI to stomp over an already
injected IRQ; instead wait for the IRQ injection to be completed.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

264ff01d

KVM: VMX: fix sparse warning · efff9e53

Authored by Hannes Eder on Nov 28, 2008

Impact: make global function static

  arch/x86/kvm/vmx.c:134:3: warning: symbol 'vmx_capability' was not declared. Should it be static?
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Avi Kivity <avi@redhat.com>

efff9e53

KVM: VMX: Conditionally request interrupt window after injecting irq · df203ec9

Authored by Avi Kivity on Nov 23, 2008

If we're injecting an interrupt, and another one is pending, request
an interrupt window notification so we don't have excess latency on the
second interrupt.

This shouldn't happen in practice since an EOI will be issued, giving a second
chance to request an interrupt window, but...
Signed-off-by: Avi Kivity <avi@redhat.com>

df203ec9

KVM: VMX: extract kvm_cpu_vmxoff() from hardware_disable() · 710ff4a8

Authored by Eduardo Habkost on Nov 17, 2008

Along with some comments on why it is different from the core cpu_vmxoff()
function.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

710ff4a8

KVM: VMX: move cpu_has_kvm_support() to an inline on asm/virtext.h · 6210e37b

Authored by Eduardo Habkost on Nov 17, 2008

It will be used by core code on kdump and reboot, to disable
vmx if needed.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

6210e37b

KVM: VMX: move vmx.h to include/asm · 13673a90

Authored by Eduardo Habkost on Nov 17, 2008

vmx.h will be used by core code that is independent of KVM, so I am
moving it outside the arch/x86/kvm directory.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

13673a90

KVM: VMX: Handle mmio emulation when guest state is invalid · 1d5a4d9b

Authored by Guillaume Thouvenin on Oct 29, 2008

If emulate_invalid_guest_state is enabled, the emulator is called
when guest state is invalid.  Until now, we reported an mmio failure
when emulate_instruction() returned EMULATE_DO_MMIO.  This patch adds
the case where emulate_instruction() failed and an MMIO emulation
is needed.
Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>

1d5a4d9b

KVM: allow emulator to adjust rip for emulated pio instructions · e93f36bc

Authored by Guillaume Thouvenin on Oct 28, 2008

If we call the emulator we shouldn't call skip_emulated_instruction()
in the first place, since the emulator already computes the next rip
for us. Thus we move ->skip_emulated_instruction() out of
kvm_emulate_pio() and into handle_io() (and the svm equivalent). We
also replaced "return 0" by "break" in the "do_io:" case because now
the shadow register state needs to be committed. Otherwise eip will never
be updated.
Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>

e93f36bc

KVM: VMX: Move private memory slot position · 6fe63979

Authored by Sheng Yang on Oct 16, 2008

PCI device assignment maps guest MMIO spaces as separate slots, so a device
with more than 2 MMIO spaces could overwrite the current private memslot.

This patch moves the private memory slots to the top of the userspace-visible
memory slots.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

6fe63979

KVM: Enable MTRR for EPT · 64d4d521

Authored by Sheng Yang on Oct 09, 2008

The effective memory type of EPT is the mixture of MSR_IA32_CR_PAT and memory
type field of EPT entry.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

64d4d521

KVM: VMX: Add PAT support for EPT · 468d472f

Authored by Sheng Yang on Oct 09, 2008

GUEST_PAT support is a new feature introduced by the Intel Core i7 architecture.
With it, the cpu saves/loads the guest and host PAT automatically, since the EPT
memory type in the guest depends on MSR_IA32_CR_PAT.

Also add save/restore for MSR_IA32_CR_PAT.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

468d472f

KVM: VMX: work around lacking VNMI support · 3b86cd99

Authored by Jan Kiszka on Sep 26, 2008

Older VMX supporting CPUs do not provide the "Virtual NMI" feature for
tracking the NMI-blocked state after injecting such events. For now
KVM is unable to inject NMIs on those CPUs.

Derived from Sheng Yang's suggestion to use the IRQ window notification
for detecting the end of NMI handlers, this patch implements virtual
NMI support without impact on the host's ability to receive real NMIs.
The downside is that the given approach requires some heuristics that
can cause NMI nesting in very rare corner cases.

The approach works as follows:
 - inject NMI and set a software-based NMI-blocked flag
 - arm the IRQ window start notification whenever an NMI window is
   requested
 - if the guest exits due to an opening IRQ window, clear the emulated
   NMI-blocked flag
 - if the guest net execution time with NMI-blocked but without an IRQ
   window exceeds 1 second, force NMI-blocked reset and inject anyway

This approach covers most practical scenarios:
 - succeeding NMIs are separated by at least one open IRQ window
 - the guest may spin with IRQs disabled (e.g. due to a bug), but
   leaving the NMI handler takes much less time than one second
 - the guest does not rely on strict ordering or timing of NMIs
   (would be problematic in virtualized environments anyway)

Successfully tested with the 'nmi n' monitor command, the kgdbts
testsuite on smp guests (additional patches required to add debug
register support to kvm) + the kernel's nmi_watchdog=1, and a Siemens-
specific board emulation (+ guest) that comes with its own NMI
watchdog mechanism.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3b86cd99
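
The four-step approach listed in this commit message can be modelled as a small state machine. This is a simplified sketch with hypothetical class and method names; it tracks only the software NMI-blocked flag, the IRQ window request, and the one-second timeout, and ignores everything else the real patch has to handle.

```python
class SoftVnmi:
    """Toy model of NMI blocking emulated in software."""

    NMI_BLOCKED_LIMIT_NS = 1_000_000_000  # the 1 second from the description

    def __init__(self):
        self.nmi_pending = False
        self.soft_blocked = False
        self.blocked_time_ns = 0
        self.irq_window_armed = False
        self.injected = 0

    def queue_nmi(self):
        self.nmi_pending = True

    def try_inject(self):
        """Inject a pending NMI, or arm the IRQ window if NMIs are blocked."""
        if not self.nmi_pending:
            return False
        if self.soft_blocked:
            self.irq_window_armed = True
            return False
        self.nmi_pending = False
        self.soft_blocked = True
        self.blocked_time_ns = 0
        self.injected += 1
        return True

    def on_irq_window_exit(self):
        """An open IRQ window is taken as the end of the NMI handler."""
        self.soft_blocked = False
        self.irq_window_armed = False

    def account_guest_time(self, ns):
        """Force-unblock once the guest has run too long with NMIs blocked."""
        if self.soft_blocked:
            self.blocked_time_ns += ns
            if self.blocked_time_ns > self.NMI_BLOCKED_LIMIT_NS:
                self.soft_blocked = False
```

The timeout path is where the heuristic can go wrong: if the guest really is still inside its NMI handler after one second, the next injection nests.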

KVM: VMX: Provide support for user space injected NMIs · 487b391d

Authored by Jan Kiszka on Sep 26, 2008

This patch adds the required bits to the VMX side for user space
injected NMIs. As with the preexisting in-kernel irqchip support, the
CPU must provide the "virtual NMI" feature for proper tracking of the
NMI blocking state.

Based on the original patch by Sheng Yang.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

487b391d

KVM: VMX: fix real-mode NMI support · 66a5a347

Authored by Jan Kiszka on Sep 26, 2008

Fix NMI injection in real-mode with the same pattern we perform IRQ
injection.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

66a5a347

KVM: VMX: refactor IRQ and NMI window enabling · f460ee43

Authored by Jan Kiszka on Sep 26, 2008

do_interrupt_requests and vmx_intr_assist go different ways for
achieving the same: enabling the nmi/irq window start notification.
Unify their code over enable_{irq|nmi}_window, get rid of a redundant
call to enable_intr_window instead of direct enable_nmi_window
invocation and unroll enable_intr_window for both in-kernel and user
space irq injection accordingly.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

f460ee43

KVM: VMX: refactor/fix IRQ and NMI injectability determination · 33f089ca

Authored by Jan Kiszka on Sep 26, 2008

There are currently two ways in VMX to check if an IRQ or NMI can be
injected:
 - vmx_{nmi|irq}_enabled and
 - vcpu.arch.{nmi|interrupt}_window_open.
Even worse, one test (at the end of vmx_vcpu_run) uses an inconsistent,
likely incorrect logic.

This patch consolidates and unifies the tests over
{nmi|interrupt}_window_open as cache + vmx_update_window_states
for updating the cache content.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

33f089ca

KVM: VMX: Support for NMI task gates · 60637aac

Authored by Jan Kiszka on Sep 26, 2008

Properly set GUEST_INTR_STATE_NMI and reset nmi_injected when a
task-switch vmexit happened due to a task gate being used for handling
NMIs. Also avoid the false warning about valid vectoring info in
kvm_handle_exit.

Based on original patch by Gleb Natapov.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

60637aac

KVM: VMX: Use INTR_TYPE_NMI_INTR instead of magic value · e4a41889

Authored by Jan Kiszka on Sep 26, 2008

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e4a41889

KVM: VMX: include all IRQ window exits in statistics · a26bf12a

Authored by Jan Kiszka on Sep 26, 2008

irq_window_exits only tracks IRQ window exits due to user space
requests, nmi_window_exits include all exits. The latter makes more
sense, so let's adjust irq_window_exits accounting.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

a26bf12a

23 Nov, 2008 1 commit

KVM: VMX: Fix interrupt loss during race with NMI · bd2b3ca7

Authored by Avi Kivity on Nov 20, 2008

If an interrupt cannot be injected for some reason (say, page fault
when fetching the IDT descriptor), the interrupt is marked for
reinjection.  However, if an NMI is queued at this time, the NMI
will be injected instead and the interrupt will be lost.

Fix by deferring the NMI injection until the interrupt has been
injected successfully.

Analyzed by Jan Kiszka.
Signed-off-by: Avi Kivity <avi@redhat.com>

bd2b3ca7
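
The ordering rule behind this fix can be stated as a tiny decision function. This is only a sketch of the priority the commit message describes, with a hypothetical function name and string return values chosen for readability.

```python
def next_event_to_inject(irq_needs_reinjection, nmi_pending):
    """Pick what to inject on the next guest entry.

    An interrupt that failed to deliver is reinjected first; a queued NMI
    waits until that interrupt has gone through, so neither event is dropped.
    """
    if irq_needs_reinjection:
        return "irq"
    if nmi_pending:
        return "nmi"
    return None
```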

12 Nov, 2008 1 commit

KVM: VMX: Set IGMT bit in EPT entry · 928d4bf7

Authored by Sheng Yang on Nov 06, 2008

There is a potential issue: with EPT enabled, the guest uses its page tables
without vmexits, so it would use the PAT/PCD/PWT bits to index the PAT msr for
its memory. This would be inconsistent with the host side and could cause a
host MCE due to inconsistent cache attributes.

This patch sets the IGMT bit in EPT entries to ignore the guest PAT and uses WB
as the default memory type to protect the host (note that all memory mapped by
KVM should be WB).
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

928d4bf7

15 Oct, 2008 4 commits

KVM: VMX: enable invlpg exiting if EPT is disabled · 83dbc83a

Authored by Marcelo Tosatti on Oct 07, 2008

Manually disabling EPT via module option fails to re-enable INVLPG
exiting.
Reported-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

83dbc83a

KVM: x86: trap invlpg · a7052897

Authored by Marcelo Tosatti on Sep 23, 2008

With pages out of sync invlpg needs to be trapped. For now simply nuke
the entry.

Untested on AMD.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

a7052897

KVM: switch to get_user_pages_fast · 4c2155ce

Authored by Marcelo Tosatti on Sep 16, 2008

Convert gfn_to_pfn to use get_user_pages_fast, which can do lockless
pagetable lookups on x86. Kernel compilation on 4-way guest is 3.7%
faster on VMX.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

4c2155ce

KVM: VMX: Rename IA32_FEATURE_CONTROL bits · 9ea542fa

Authored by Sheng Yang on Sep 11, 2008

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

9ea542fa
