Commits · 02f59dc9f1f51d2148d87d48f84adb455a4fd697 · openanolis / cloud-kernel

24 Oct, 2010 29 commits

KVM: MMU: Introduce init_kvm_nested_mmu() · 02f59dc9

Authored by Joerg Roedel on Sep 10, 2010

This patch introduces the init_kvm_nested_mmu() function
which is used to re-initialize the nested mmu when the l2
guest changes its paging mode.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Introduce kvm_read_nested_guest_page() · 3d06b8bf

Authored by Joerg Roedel on Sep 10, 2010

This patch introduces the kvm_read_nested_guest_page() function
which reads from the physical memory of the guest. If the
guest is running in guest-mode itself with nested paging
enabled it will read from the guest's guest physical memory
instead.
The patch also changes the code to use this function
where it is necessary.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: X86: Add kvm_read_guest_page_mmu function · ec92fe44

Authored by Joerg Roedel on Sep 10, 2010

This patch adds a function which can read from the guest's
physical memory or from the guest's guest physical memory.
This will be used in the two-dimensional page table walker.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: X86: Introduce pointer to mmu context used for gva_to_gpa · 14dfe855

Authored by Joerg Roedel on Sep 10, 2010

This patch introduces the walk_mmu pointer which points to
the mmu-context currently used for gva_to_gpa translations.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Add infrastructure for two-level page walker · c30a358d

Authored by Joerg Roedel on Sep 10, 2010

This patch introduces an mmu-callback to translate gpa
addresses in the walk_addr code. This is later used to
translate l2_gpa addresses into l1_gpa addresses.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
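
As an illustration of the callback idea described above, here is a minimal userspace sketch. It is not the kernel code: struct mmu_sketch, translate_identity(), translate_nested_offset() and walker_resolve() are hypothetical names, and the fixed-offset "nested" mapping is a toy assumption standing in for a real l2_gpa to l1_gpa walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gpa_t;

/* Hypothetical stand-in for an MMU context; not the kernel's struct kvm_mmu. */
struct mmu_sketch {
	/* Callback the walker uses to turn the gpa it sees into an l1 gpa. */
	gpa_t (*translate_gpa)(gpa_t gpa, void *opaque);
	void *opaque;
};

/* Non-nested case: the address already is an l1 gpa. */
static gpa_t translate_identity(gpa_t gpa, void *opaque)
{
	(void)opaque;
	return gpa;
}

/* Nested case, toy model only: l2_gpa -> l1_gpa is a fixed offset. */
static gpa_t translate_nested_offset(gpa_t l2_gpa, void *opaque)
{
	const gpa_t *l1_base = opaque;

	return *l1_base + l2_gpa;
}

/* The walker calls the callback before it reads a page-table entry. */
static gpa_t walker_resolve(const struct mmu_sketch *mmu, gpa_t gpa)
{
	assert(mmu->translate_gpa != NULL);
	return mmu->translate_gpa(gpa, mmu->opaque);
}
```

Because the walker only ever goes through the callback, the same walk code can serve both the ordinary and the nested case; only the installed function differs.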

KVM: MMU: Track page fault data in struct vcpu · 8df25a32

Authored by Joerg Roedel on Sep 10, 2010

This patch introduces a struct with two new fields in
vcpu_arch for x86:

	* fault.address
	* fault.error_code

This will be used to correctly propagate page faults back
into the guest when we could have either an ordinary page
fault or a nested page fault. In the case of a nested page
fault the fault-address is different from the original
address that should be walked. So we need to keep track
of the real fault-address.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86: Emulate MSR_EBC_FREQUENCY_ID · 7b914098

Authored by Jes Sorensen on Sep 09, 2010

Some operating systems store data about the host processor at the
time of installation and, when booted on a more up-to-date CPU, try
to read MSR_EBC_FREQUENCY_ID. This has been found with XP.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Fix guest kernel crash on MSR_K7_CLK_CTL · 84e0cefa

Authored by Jes Sorensen on Sep 01, 2010

MSR_K7_CLK_CTL is a no-longer-documented MSR that is only relevant
on old AMD CPU models. This change returns the value the Linux
kernel expects, so that the guest does not write the MSR back,
and it ignores all writes to the MSR.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Don't save/restore MSR_IA32_PERF_STATUS · e90aa41e

Authored by Avi Kivity on Sep 01, 2010

It is read-only; restoring it only results in annoying messages.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Fix pio trace direction · c41a15dd

Authored by Avi Kivity on Aug 30, 2010

out = write, in = read, not the other way round.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Fix build error due to 64-bit division in nsec_to_cycles() · 217fc9cf

Authored by Avi Kivity on Aug 26, 2010

Use do_div() instead.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
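
For readers unfamiliar with do_div(): in the kernel it divides a 64-bit value by a 32-bit divisor in place and returns the remainder, which avoids a plain 64-bit `/` that some 32-bit builds cannot link. The sketch below is a userspace stand-in, not the kernel macro; do_div_sketch() and nsec_to_cycles_sketch() are hypothetical names, and passing tsc_khz as a parameter is an assumption made to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000u

/*
 * Userspace stand-in for the kernel's do_div(): divides *n by a 32-bit
 * base in place and returns the remainder.
 */
static uint32_t do_div_sketch(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);
	rem = (uint32_t)(*n % base);
	*n /= base;
	return rem;
}

/* cycles = nsec * tsc_khz / 10^6; tsc_khz is passed explicitly here. */
static uint64_t nsec_to_cycles_sketch(uint64_t nsec, uint32_t tsc_khz)
{
	uint64_t ret = nsec * tsc_khz;

	do_div_sketch(&ret, USEC_PER_SEC);
	return ret;
}
```

With a 2 GHz TSC (tsc_khz = 2000000), one second of nanoseconds maps to 2000000000 cycles.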

KVM: x86 emulator: get rid of "restart" in emulation context. · d2ddd1c4

Authored by Gleb Natapov on Aug 25, 2010

x86_emulate_insn() will return 1 if instruction can be restarted
without re-entering a guest.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Fix a possible backwards warp of kvmclock · 1d5f066e

Authored by Zachary Amsden on Aug 19, 2010

Kernel time, which advances in discrete steps, may progress much slower
than TSC.  As a result, when kvmclock is adjusted to a new base, the
apparent time to the guest, which runs at a much higher, nsec scaled
rate based on the current TSC, may have already been observed to have
a larger value (kernel_ns + scaled tsc) than the value to which we are
setting it (kernel_ns + 0).

We must instead compute the clock as potentially observed by the guest
for kernel_ns to make sure it does not go backwards.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
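
The reasoning above can be sketched as a clamp: before rebasing, estimate the largest time the guest could already have observed (previous kernel_ns plus the scaled TSC delta) and never set the new base below it. This is a simplified userspace illustration, not the patch itself; struct clock_sketch, its field names, cycles_to_ns() and clamp_kernel_ns() are hypothetical, and the scaling uses plain integer arithmetic rather than the kernel's pvclock scaling.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-vcpu clock state; field names are illustrative. */
struct clock_sketch {
	uint64_t last_kernel_ns;  /* kernel time at the previous clock update */
	uint64_t tsc_timestamp;   /* TSC value at the previous clock update */
	uint64_t last_guest_tsc;  /* latest TSC value the guest may have seen */
};

/* Simplified scaling: ns = cycles * 10^6 / tsc_khz. */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t tsc_khz)
{
	assert(tsc_khz != 0);
	return cycles * 1000000ull / tsc_khz;
}

/* Return a new base that is never below what the guest may have observed. */
static uint64_t clamp_kernel_ns(const struct clock_sketch *c,
				uint64_t kernel_ns, uint32_t tsc_khz)
{
	uint64_t max_seen_ns = 0;

	if (c->tsc_timestamp && c->last_guest_tsc >= c->tsc_timestamp)
		max_seen_ns = c->last_kernel_ns +
			cycles_to_ns(c->last_guest_tsc - c->tsc_timestamp,
				     tsc_khz);

	return kernel_ns > max_seen_ns ? kernel_ns : max_seen_ns;
}
```

If the kernel clock has only advanced a little while the TSC ran ahead, the clamp returns the guest-observed value; otherwise the fresh kernel_ns is used unchanged.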

KVM: x86: Add clock sync request to hardware enable · ca84d1a2

Authored by Zachary Amsden on Aug 19, 2010

If there are active VCPUs which are marked as belonging to
a particular hardware CPU, request a clock sync for them when
enabling hardware; the TSC could be desynchronized on a newly
arriving CPU, and we need to recompute the guest's system time
relative to boot after a suspend event.

This covers both cases.

Note that it is acceptable to take the spinlock, as either
no other tasks will be running and no locks held (BSP after
resume), or other tasks will be guaranteed to drop the lock
relatively quickly (AP on CPU_STARTING).

Noting we now get clock synchronization requests for VCPUs
which are starting up (or restarting), it is tempting to
attempt to remove the arch/x86/kvm/x86.c CPU hot-notifiers
at this time, however it is not correct to do so; they are
required for systems with non-constant TSC as the frequency
may not be known immediately after the processor has started
until the cpufreq driver has had a chance to run and query
the chipset.

Updated: implement better locking semantics for hardware_enable

Removed the hack of dropping and retaking the lock by adding the
semantic that we always hold kvm_lock when hardware_enable is
called.  The one place that doesn't need to worry about it is
resume, because the spinlock won't be taken while resuming a frozen CPU.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Robust TSC compensation · 46543ba4

Authored by Zachary Amsden on Aug 19, 2010

Make the match of TSC find TSC writes that are close to each other
instead of perfectly identical; this allows the compensator to also
work in migration / suspend scenarios.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
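
A "close to each other" match can be illustrated with an absolute-difference check against a tolerance. This is only a sketch: tsc_writes_match() is a hypothetical helper, and the tolerance value is left to the caller because the commit message does not state one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when two guest TSC writes are within `tolerance` cycles of each other. */
static bool tsc_writes_match(uint64_t prev, uint64_t cur, uint64_t tolerance)
{
	uint64_t diff = cur > prev ? cur - prev : prev - cur;

	return diff <= tolerance;
}
```

An exact-equality test is the special case tolerance == 0; widening the tolerance is what lets writes replayed after migration or suspend still count as a match.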

KVM: x86: Add helper functions for time computation · 759379dd

Authored by Zachary Amsden on Aug 19, 2010

Add a helper function to compute the kernel time and convert nanoseconds
back to CPU specific cycles.  Note that these must not be called in preemptible
context, as that would mean the kernel could enter software suspend state,
which would cause non-atomic operation.

Also, convert the KVM_SET_CLOCK / KVM_GET_CLOCK ioctls to use the kernel
time helper; these should be boot-based as well.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Fix deep C-state TSC desynchronization · 48434c20

Authored by Zachary Amsden on Aug 19, 2010

When CPUs with unstable TSCs enter deep C-state, TSC may stop
running.  This causes us to require resynchronization.  Since
we can't tell when this may potentially happen, we assume the
worst by forcing re-compensation for it at every point the VCPU
task is descheduled.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Unify TSC logic · e48672fa

Authored by Zachary Amsden on Aug 19, 2010

Move the TSC control logic from the vendor backends into x86.c
by adding adjust_tsc_offset to x86 ops.  Now all TSC decisions
can be done in one place.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Warn about unstable TSC · 6755bae8

Authored by Zachary Amsden on Aug 19, 2010

If creating an SMP guest with unstable host TSC, issue a warning.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Make cpu_tsc_khz updates use local CPU · 8cfdc000

Authored by Zachary Amsden on Aug 19, 2010

This simplifies much of the init code; we can now simply always
call tsc_khz_changed, optionally passing it a new value, or letting
it figure out the existing value (while interrupts are disabled, and
thus, by inference from the rule, not raceful against CPU hotplug or
frequency updates, which will issue IPIs to the local CPU to perform
this very same task).
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: TSC reset compensation · f38e098f

Authored by Zachary Amsden on Aug 19, 2010

Attempt to synchronize TSCs which are reset to the same value.  In the
case of a reliable hardware TSC, we can just re-use the same offset, but
on non-reliable hardware, we can get closer by adjusting the offset to
match the elapsed time.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
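
The two cases in the message can be sketched as a single choice of offset. This is an assumption-laden illustration rather than the patch: pick_tsc_offset() and its parameters are hypothetical, and the conversion from elapsed nanoseconds to cycles uses simple integer arithmetic with an explicit tsc_khz.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Choose the TSC offset for a vcpu whose TSC write matched an earlier one.
 *  - reliable TSC:   re-use the offset recorded for the earlier write
 *  - unreliable TSC: take the freshly computed offset and add the cycles
 *                    that elapsed since the earlier write
 */
static int64_t pick_tsc_offset(int64_t new_offset, int64_t last_offset,
			       bool tsc_reliable, uint64_t elapsed_ns,
			       uint32_t tsc_khz)
{
	uint64_t elapsed_cycles;

	if (tsc_reliable)
		return last_offset;

	elapsed_cycles = elapsed_ns * tsc_khz / 1000000ull;
	return new_offset + (int64_t)elapsed_cycles;
}
```

On reliable hardware both vcpus end up with an identical offset; on unreliable hardware the result is only an approximation, as the commit message says.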

KVM: x86: Move TSC offset writes to common code · 99e3e30a

Authored by Zachary Amsden on Aug 19, 2010

Also, ensure that the storing of the offset and the reading of the TSC
are never preempted by taking a spinlock.  While the lock is overkill
now, it is useful later in this patch series.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Drop vm_init_tsc · ae38436b

Authored by Zachary Amsden on Aug 19, 2010

This is used only by the VMX code, and is not done properly;
if the TSC is indeed backwards, it is out of sync, and will
need proper handling in the logic at each and every CPU change.
For now, drop this test during init as misguided.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: rename x86 kvm->arch.n_alloc_mmu_pages · 39de71ec

Authored by Dave Hansen on Aug 19, 2010

arch.n_alloc_mmu_pages is a poor choice of name. This value truly
means, "the number of pages which _may_ be allocated".  But,
reading the name, "n_alloc_mmu_pages" implies "the number of allocated
mmu pages", which is dead wrong.

It's really the high watermark, so let's give it a name to match:
nr_max_mmu_pages.  This change will make the next few patches
much more obvious and easy to read.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Separate emulation context initialization in a separate function · 8ec4722d

Authored by Mohammed Gamal on Aug 16, 2010

The code for initializing the emulation context is duplicated at two
locations (emulate_instruction() and kvm_task_switch()). Move it
into a separate function and call that function from both places.
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Allow accessing IDT via emulator ops · 160ce1f1

Authored by Mohammed Gamal on Aug 04, 2010

The patch adds a new member get_idt() to x86_emulate_ops.
It also adds a function to get the idt in order to be used by the emulator.

This is needed for real mode interrupt injection and the emulation of int
instructions.
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: check io permissions only once for string pio · 4fc40f07

Authored by Gleb Natapov on Aug 02, 2010

Do not recheck io permission on every iteration.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
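
One way to picture "check once" is a flag in the emulation context that caches a successful permission check across the iterations of a string instruction. The sketch below is illustrative only: struct emul_ctxt_sketch, its fields, and the three helper functions are hypothetical, and the counter exists purely so the effect can be observed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical emulation context; not the kernel's x86_emulate_ctxt. */
struct emul_ctxt_sketch {
	bool perm_ok;      /* set once the permission check has passed */
	int perm_checks;   /* counts how often the real check ran */
};

/* Stand-in for the real (comparatively expensive) io permission check. */
static bool check_io_permission(struct emul_ctxt_sketch *c, bool port_allowed)
{
	c->perm_checks++;
	return port_allowed;
}

/* One iteration of the string instruction: reuse a cached positive result. */
static bool emulate_io_once(struct emul_ctxt_sketch *c, bool port_allowed)
{
	if (c->perm_ok)
		return true;
	if (!check_io_permission(c, port_allowed))
		return false;
	c->perm_ok = true;
	return true;
}

/* Run up to `count` iterations; returns how many completed. */
static int emulate_string_pio(struct emul_ctxt_sketch *c, int count,
			      bool port_allowed)
{
	int done = 0;

	while (done < count && emulate_io_once(c, port_allowed))
		done++;
	return done;
}
```

For an eight-iteration transfer on an allowed port, the real check runs once rather than eight times.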

KVM: x86 emulator: don't update vcpu state if instruction is restarted · e85d28f8

Authored by Gleb Natapov on Jul 29, 2010

No need to update vcpu state since the instruction is still in the
middle of emulation.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: store x86_emulate_ops in emulation context · 9aabc88f

Authored by Avi Kivity on Jul 29, 2010

It doesn't ever change, so we don't need to pass it around everywhere.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

15 Aug, 2010 1 commit

KVM: fix poison overwritten caused by using wrong xstate size · f45755b8

Authored by Xiaotian Feng on Aug 13, 2010

fpu.state is allocated from task_xstate_cachep, whose object size
is xstate_size. xstate_size is set from the cpuid instruction and is often
smaller than sizeof(struct xsave_struct). kvm uses sizeof(struct xsave_struct)
to fill fpu.state.xsave in and out; because only xstate_size bytes were
allocated for fpu.state, the kernel writes past the end of the allocation
and triggers poison/redzone/padding overwritten warnings.
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Reviewed-by: Sheng Yang <sheng@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Sheng Yang <sheng@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
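
The class of bug described above (copying sizeof(struct) bytes into a smaller allocation) can be illustrated with a copy that is bounded by the size actually allocated. This is a generic userspace sketch, not the fix from the patch; copy_bounded() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most dst_size bytes, where dst_size is the size that was really
 * allocated for dst (the role xstate_size plays in the message above),
 * even if the source structure is larger. Returns the bytes copied.
 */
static size_t copy_bounded(void *dst, size_t dst_size,
			   const void *src, size_t src_size)
{
	size_t n = dst_size < src_size ? dst_size : src_size;

	memcpy(dst, src, n);
	return n;
}
```

Bytes beyond the allocated size are left untouched, which is exactly the property the poison/redzone checks verify.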

02 Aug, 2010 5 commits

KVM: x86 emulator: fix xchg instruction emulation · c19b8bd6

Authored by Wei Yongjun on Jul 15, 2010

If the destination is a memory operand and the memory cannot
be mapped to a valid page, the emulation of xchg and of locked
instructions does not work on io regions and gets stuck in an
endless loop. We should emulate exchange as a write to fix it.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: never re-execute instruction with enabled tdp · 68be0803

Authored by Gleb Natapov on Jul 14, 2010

With tdp enabled we should get into the emulator only when emulating io, so
reexecution will always bring us back into the emulator.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Expose MCE control MSRs to userspace · 908e75f3

Authored by Avi Kivity on Jul 07, 2010

Userspace needs to reset and save/restore these MSRs.

The MCE banks are not exposed since their number varies from vcpu to vcpu.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: PIT: stop vpit before freeing irq_routing · aea924f6

Authored by Xiao Guangrong on Jul 10, 2010

Fix:
general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
......
Call Trace:
 [<ffffffffa0159bd1>] ? kvm_set_irq+0xdd/0x24b [kvm]
 [<ffffffff8106ea8b>] ? trace_hardirqs_off_caller+0x1f/0x10e
 [<ffffffff813ad17f>] ? sub_preempt_count+0xe/0xb6
 [<ffffffff8106d273>] ? put_lock_stats+0xe/0x27
...
RIP  [<ffffffffa0159c72>] kvm_set_irq+0x17e/0x24b [kvm]

This bug is triggered when the guest is shut down, because we freed
irq_routing before the pit thread stopped.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Reenter guest after emulation failure if due to access to non-mmio address · a6f177ef

Authored by Gleb Natapov on Jul 08, 2010

When shadow pages are in use, KVM sometimes tries to emulate an instruction
when it accesses a shadowed page. If emulation fails, KVM un-shadows the
page and reenters the guest to allow the vcpu to execute the instruction. If
the page is not in the shadow page hash, KVM assumes that this was an attempt
to do MMIO and reports emulation failure to userspace since there is no way
to fix the situation. This logic has a race though. If two vcpus try to write
to the same shadowed page simultaneously, both will enter the emulator, but
only one of them will find the page in the shadow page hash since the one
that finds it also removes it from there, so the other vcpu will report
failure to userspace and will abort the guest.

Fix this by checking (in addition to checking the shadowed page hash) that
the page that caused the emulation belongs to a valid memory slot. If it
does, reenter the guest to allow the vcpu to reexecute the instruction.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
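
The decision described in the fix can be sketched as two checks in sequence: reenter the guest if the page was un-shadowed, and otherwise still reenter if the faulting frame lies inside a valid memory slot. The code below is a userspace illustration under stated assumptions; struct memslot_sketch, gfn_in_memslot() and should_reenter_guest() are hypothetical names, and memory slots are modelled as a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Hypothetical memory slot: a contiguous range of guest frame numbers. */
struct memslot_sketch {
	gfn_t base_gfn;
	uint64_t npages;
};

static bool gfn_in_memslot(const struct memslot_sketch *slots, size_t n,
			   gfn_t gfn)
{
	for (size_t i = 0; i < n; i++)
		if (gfn >= slots[i].base_gfn &&
		    gfn - slots[i].base_gfn < slots[i].npages)
			return true;
	return false;
}

/*
 * After an emulation failure: reenter the guest if this vcpu un-shadowed
 * the page, or if another vcpu may already have done so, which we infer
 * from the gfn belonging to ordinary guest memory rather than MMIO.
 */
static bool should_reenter_guest(bool page_was_unshadowed,
				 const struct memslot_sketch *slots, size_t n,
				 gfn_t gfn)
{
	if (page_was_unshadowed)
		return true;
	return gfn_in_memslot(slots, n, gfn);
}
```

The second check is what closes the race: the vcpu that loses the un-shadow race no longer mistakes ordinary RAM for MMIO.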

01 Aug, 2010 5 commits

KVM: VMX: Execute WBINVD to keep data consistency with assigned devices · f5f48ee1

Authored by Sheng Yang on Jun 30, 2010

Some guest device drivers may leverage "Non-Snoop" I/O and explicitly
issue WBINVD or CLFLUSH on a RAM region. Since migration may occur before
WBINVD or CLFLUSH, we need to maintain data consistency either by:
1: flushing cache (wbinvd) when the guest is scheduled out if there is no
wbinvd exit, or
2: executing wbinvd on all dirty physical CPUs when guest wbinvd exits.
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Simplify vcpu_enter_guest() mmu reload logic slightly · 3e007509

Authored by Avi Kivity on Jun 23, 2010

No need to reload the mmu in between two different vcpu->requests checks.

kvm_mmu_reload() may trigger KVM_REQ_TRIPLE_FAULT, but that will be caught
during atomic guest entry later.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Enable AVX for guest · 6c3f6041

Authored by Sheng Yang on Jun 22, 2010

Enable Intel(R) Advanced Vector Extensions (AVX) for guest.

The detection of the AVX feature includes testing the OSXSAVE bit. When the
OSXSAVE bit is not set, AVX instructions result in UD even if AVX is
supported. So we're safe to expose AVX bits to guest directly.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Prevent internal slots from being COWed · 7ac77099

Authored by Avi Kivity on Jun 21, 2010

If a process with a memory slot is COWed, the page will change its address
(despite having an elevated reference count).  This breaks internal memory
slots which have their physical addresses loaded into vmcs registers (see
the APIC access memory slot).
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Add mini-API for vcpu->requests · a8eeb04a

Authored by Avi Kivity on May 10, 2010

Makes it a little more readable and hackable.
Signed-off-by: Avi Kivity <avi@redhat.com>
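
To show what a small API over a request bitmask might look like, here is a userspace sketch. It is an assumption about the shape of such helpers, not the code from the patch: struct vcpu_sketch, make_request_sketch() and check_request_sketch() are hypothetical, the request numbers are made up, and the bit operations here are plain (non-atomic), which kernel code would not rely on.

```c
#include <assert.h>
#include <stdbool.h>

/* Made-up request numbers for the example. */
enum { REQ_TLB_FLUSH_SKETCH = 0, REQ_CLOCK_UPDATE_SKETCH = 1 };

/* Hypothetical vcpu with a bitmask of pending requests. */
struct vcpu_sketch {
	unsigned long requests;
};

/* Mark a request as pending. */
static void make_request_sketch(int req, struct vcpu_sketch *vcpu)
{
	vcpu->requests |= 1UL << req;
}

/* Test a request and clear it if it was pending. */
static bool check_request_sketch(int req, struct vcpu_sketch *vcpu)
{
	unsigned long bit = 1UL << req;

	if (!(vcpu->requests & bit))
		return false;
	vcpu->requests &= ~bit;
	return true;
}
```

Callers then read as "make a request" and "check a request" instead of open-coded bit manipulation on vcpu->requests.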

openanolis / cloud-kernel synced successfully about 1 year ago
