Commits · e1035715ef8d3171e29f9c6aee6f40d57b3fead5 · openeuler / raspberrypi-kernel

10 Jun, 2009 · 12 commits

KVM: change the way how lowest priority vcpu is calculated · e1035715

Committed by Gleb Natapov on Mar 05, 2009

The new way does not require an additional loop over vcpus to calculate
the one with lowest priority, as one is chosen during delivery bitmap
construction.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
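The single-pass idea described in e1035715 can be sketched roughly as follows. This is a userspace illustration only: `struct sketch_vcpu`, `sketch_delivery_bitmap()` and the precomputed `matches_dest` and `priority` fields are hypothetical stand-ins, not the kernel's data structures.

```c
#include <assert.h>

#define SKETCH_MAX_VCPUS 16 /* hypothetical limit, not the kernel's KVM_MAX_VCPUS */

/* Hypothetical per-vcpu view; the real code inspects local APIC state. */
struct sketch_vcpu {
    int present;      /* slot is populated */
    int matches_dest; /* result of the destination match, assumed precomputed */
    int priority;     /* smaller value = lower priority */
};

/*
 * Build the delivery bitmap in a single pass. For lowest-priority delivery
 * the best candidate is tracked inside the same loop, so no second loop
 * over the vcpus is needed.
 */
unsigned long sketch_delivery_bitmap(const struct sketch_vcpu *vcpus, int n,
                                     int lowest_prio)
{
    unsigned long bitmap = 0;
    int lowest = -1;

    assert(n >= 0 && n <= SKETCH_MAX_VCPUS);
    for (int i = 0; i < n; i++) {
        if (!vcpus[i].present || !vcpus[i].matches_dest)
            continue;
        if (!lowest_prio)
            bitmap |= 1UL << i;            /* fixed mode: every match */
        else if (lowest < 0 || vcpus[i].priority < vcpus[lowest].priority)
            lowest = i;                    /* remember the best so far */
    }
    if (lowest_prio && lowest >= 0)
        bitmap = 1UL << lowest;            /* exactly one target */
    return bitmap;
}
```

With three vcpus where the first two match the destination, fixed delivery sets both of their bits, while lowest-priority delivery sets only the bit of the matching vcpu with the smaller `priority` value.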

KVM: consolidate ioapic/ipi interrupt delivery logic · 343f94fe

Committed by Gleb Natapov on Mar 05, 2009

Use kvm_apic_match_dest() in kvm_get_intr_delivery_bitmask() instead
of duplicating the same code. Use kvm_get_intr_delivery_bitmask() in
apic_send_ipi() to figure out ipi destination instead of reimplementing
the logic.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: APIC: kvm_apic_set_irq deliver all kinds of interrupts · 6da7e3f6

Committed by Gleb Natapov on Mar 05, 2009

Get rid of ioapic_inj_irq() and ioapic_inj_nmi() functions.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: remove call to kvm_mmu_pte_write from walk_addr · f5a1e9f8

Committed by Joerg Roedel on Mar 05, 2009

There is no reason to update the shadow pte here because the guest pte
is only changed to dirty state.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: unify part of generic timer handling · d3c7b77d

Committed by Marcelo Tosatti on Feb 23, 2009

Hide the internals of vcpu awakening / injection from the in-kernel
emulated timers. This makes future changes in this logic easier and
decreases the distance to more generic timer handling.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: PIT: remove usage of count_load_time for channel 0 · fd668423

Committed by Marcelo Tosatti on Feb 23, 2009

We can infer elapsed time from hrtimer_expires_remaining.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: PIT: remove unused scheduled variable · 5a05d545

Committed by Marcelo Tosatti on Feb 23, 2009

Unused.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86: silence preempt warning on kvm_write_guest_time · 2dea4c84

Committed by Matt T. Yourst on Feb 24, 2009

This issue just appeared in kvm-84 when running on 2.6.28.7 (x86-64)
with PREEMPT enabled.

We're getting syslog warnings like this many (but not all) times qemu
tells KVM to run the VCPU:

BUG: using smp_processor_id() in preemptible [00000000] code:
qemu-system-x86/28938
caller is kvm_arch_vcpu_ioctl_run+0x5d1/0xc70 [kvm]
Pid: 28938, comm: qemu-system-x86 2.6.28.7-mtyrel-64bit
Call Trace:
debug_smp_processor_id+0xf7/0x100
kvm_arch_vcpu_ioctl_run+0x5d1/0xc70 [kvm]
? __wake_up+0x4e/0x70
? wake_futex+0x27/0x40
kvm_vcpu_ioctl+0x2e9/0x5a0 [kvm]
enqueue_hrtimer+0x8a/0x110
_spin_unlock_irqrestore+0x27/0x50
vfs_ioctl+0x31/0xa0
do_vfs_ioctl+0x74/0x480
sys_futex+0xb4/0x140
sys_ioctl+0x99/0xa0
system_call_fastpath+0x16/0x1b

As it turns out, the call trace is messed up due to gcc's inlining, but
I isolated the problem anyway: kvm_write_guest_time() is being used in a
non-thread-safe manner on preemptable kernels.

Basically kvm_write_guest_time()'s body needs to be surrounded by
preempt_disable() and preempt_enable(), since the kernel won't let us
query any per-CPU data (indirectly using smp_processor_id()) without
preemption disabled. The attached patch fixes this issue by disabling
preemption inside kvm_write_guest_time().

[marcelo: surround only __get_cpu_var calls since the warning
is harmless]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
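The pattern discussed in 2dea4c84, bracketing only the per-CPU access with preempt_disable() and preempt_enable(), can be illustrated in userspace with stand-ins. Everything prefixed `sketch_` below is hypothetical: the counter merely imitates the kernel's preemption count, and the array imitates a per-CPU variable.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel primitives; hypothetical for this sketch. */
static int sketch_preempt_count;
static void sketch_preempt_disable(void) { sketch_preempt_count++; }
static void sketch_preempt_enable(void)  { sketch_preempt_count--; }

#define SKETCH_NR_CPUS 4
static unsigned long sketch_cpu_tsc_khz[SKETCH_NR_CPUS]; /* imitates a per-CPU variable */
static int sketch_current_cpu;                           /* imitates smp_processor_id() */

/* Per-CPU read that is only legal while preemption is disabled. */
static unsigned long sketch_this_cpu_tsc_khz(void)
{
    assert(sketch_preempt_count > 0);
    return sketch_cpu_tsc_khz[sketch_current_cpu];
}

/* Only the per-CPU access is bracketed, not the whole caller. */
unsigned long sketch_read_tsc_khz(void)
{
    unsigned long khz;

    sketch_preempt_disable();
    khz = sketch_this_cpu_tsc_khz();
    sketch_preempt_enable();
    return khz;
}
```

Calling `sketch_this_cpu_tsc_khz()` without the surrounding disable/enable pair trips the assertion, which loosely mirrors the debug_smp_processor_id() warning in the trace above.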

KVM: bit ops for deliver_bitmap · bfd349d0

Committed by Sheng Yang on Feb 11, 2009

It's also convenient when we extend KVM supported vcpu number in the future.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Update intr delivery func to accept unsigned long* bitmap · 110c2fae

Committed by Sheng Yang on Feb 11, 2009

Would be used with bit ops, and would be easily extended if KVM_MAX_VCPUS is
increased.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
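Why an `unsigned long *` bitmap scales with the vcpu count (bfd349d0 and 110c2fae) can be shown with a small userspace sketch. The `sketch_` helpers are hypothetical imitations of the kernel's bit operations, and `SKETCH_MAX_VCPUS` is an arbitrary value chosen for illustration.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITS_TO_LONGS(n) \
    (((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)
#define SKETCH_MAX_VCPUS 256 /* hypothetical; larger than one machine word */

/* Mark vcpu number nr as a delivery target. */
void sketch_set_bit(unsigned int nr, unsigned long *map)
{
    assert(nr < SKETCH_MAX_VCPUS);
    map[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* Return 1 if vcpu number nr is a delivery target, 0 otherwise. */
int sketch_test_bit(unsigned int nr, const unsigned long *map)
{
    assert(nr < SKETCH_MAX_VCPUS);
    return (int)((map[nr / SKETCH_BITS_PER_LONG] >>
                  (nr % SKETCH_BITS_PER_LONG)) & 1UL);
}
```

A caller would declare `unsigned long map[SKETCH_BITS_TO_LONGS(SKETCH_MAX_VCPUS)]`, so raising the limit only changes the array length, not the interface.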

KVM: VMX: Don't intercept MSR_KERNEL_GS_BASE · 5897297b

Committed by Avi Kivity on Feb 24, 2009

Windows 2008 accesses this MSR often on context switch intensive workloads;
since we run in guest context with the guest MSR value loaded (so swapgs can
work correctly), we can simply disable interception of rdmsr/wrmsr for this
MSR.

A complication occurs since in legacy mode, we run with the host MSR value
loaded. In this case we enable interception.  This means we need two MSR
bitmaps, one for legacy mode and one for long mode.
Signed-off-by: Avi Kivity <avi@redhat.com>
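The two-bitmap arrangement from 5897297b might look roughly like the sketch below. It assumes the standard 4 KiB VMX MSR bitmap layout, where the read bits for MSRs 0xc0000000..0xc0001fff start at offset 0x400 and the write bits at 0xc00. The struct and function names are hypothetical, and for simplicity every other MSR stays intercepted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MSR_KERNEL_GS_BASE 0xc0000102u
#define SKETCH_BITMAP_SIZE 4096u

/* Section offsets inside one 4 KiB VMX MSR bitmap page. */
#define SKETCH_READ_HIGH  0x400u /* reads of MSRs 0xc0000000..0xc0001fff */
#define SKETCH_WRITE_HIGH 0xc00u /* writes of the same range */

struct sketch_msr_bitmaps {
    uint8_t legacy[SKETCH_BITMAP_SIZE];   /* used outside long mode */
    uint8_t longmode[SKETCH_BITMAP_SIZE]; /* used while the guest is in long mode */
};

/* Clear the read and write intercept bits for one high-range MSR. */
static void sketch_disable_intercept(uint8_t *bitmap, uint32_t msr)
{
    uint32_t idx;

    assert(msr >= 0xc0000000u && msr <= 0xc0001fffu);
    idx = msr & 0x1fffu;
    bitmap[SKETCH_READ_HIGH + idx / 8] &= (uint8_t)~(1u << (idx % 8));
    bitmap[SKETCH_WRITE_HIGH + idx / 8] &= (uint8_t)~(1u << (idx % 8));
}

/* Intercept everything, then open MSR_KERNEL_GS_BASE in long mode only. */
void sketch_init_bitmaps(struct sketch_msr_bitmaps *b)
{
    memset(b->legacy, 0xff, sizeof(b->legacy));
    memset(b->longmode, 0xff, sizeof(b->longmode));
    sketch_disable_intercept(b->longmode, SKETCH_MSR_KERNEL_GS_BASE);
}

/* Pick the bitmap that matches the guest's current mode. */
const uint8_t *sketch_select_bitmap(const struct sketch_msr_bitmaps *b,
                                    int guest_in_long_mode)
{
    return guest_in_long_mode ? b->longmode : b->legacy;
}

/* Return 1 if a rdmsr of the given high-range MSR would be intercepted. */
int sketch_rdmsr_intercepted(const uint8_t *bitmap, uint32_t msr)
{
    uint32_t idx;

    assert(msr >= 0xc0000000u && msr <= 0xc0001fffu);
    idx = msr & 0x1fffu;
    return (bitmap[SKETCH_READ_HIGH + idx / 8] >> (idx % 8)) & 1;
}
```

Keeping two prebuilt pages means a mode change only swaps which bitmap is in use instead of rewriting bits.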

KVM: VMX: Don't use highmem pages for the msr and pio bitmaps · 3e7c73e9

Committed by Avi Kivity on Feb 24, 2009

Highmem pages are a pain, and saving three lowmem pages on i386 isn't worth
the extra code.
Signed-off-by: Avi Kivity <avi@redhat.com>

26 May, 2009 · 2 commits

KVM: Fix PDPTR reloading on CR4 writes · a2edf57f

Committed by Avi Kivity on May 24, 2009

The processor is documented to reload the PDPTRs while in PAE mode if any
of the CR4 bits PSE, PGE, or PAE change.  Linux relies on this
behaviour when zapping the low mappings of PAE kernels during boot.

The code already handled changes to CR4.PAE; augment it to also notice changes
to PSE and PGE.

This triggered while booting an F11 PAE kernel; the futex initialization code
runs before any CR3 reloads and writes to a NULL pointer; the futex subsystem
ended up uninitialized, killing PI futexes and pulseaudio which uses them.

Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>
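The condition described in a2edf57f can be written as a small predicate. This is a sketch under assumptions: the function name and parameters are hypothetical, while the CR4 bit positions (PSE = bit 4, PAE = bit 5, PGE = bit 7) are the architectural ones.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CR4_PSE (1u << 4)
#define SKETCH_CR4_PAE (1u << 5)
#define SKETCH_CR4_PGE (1u << 7)

/*
 * Return 1 if a CR4 write should trigger a PDPTR reload: the guest uses
 * PAE paging (paging on, not long mode, CR4.PAE set in the new value) and
 * at least one of PSE, PGE or PAE changed.
 */
int sketch_cr4_write_reloads_pdptrs(uint32_t old_cr4, uint32_t new_cr4,
                                    int paging_enabled, int long_mode)
{
    const uint32_t reload_bits =
        SKETCH_CR4_PSE | SKETCH_CR4_PGE | SKETCH_CR4_PAE;

    assert(!(long_mode && !paging_enabled)); /* long mode implies paging */
    if (!paging_enabled || long_mode || !(new_cr4 & SKETCH_CR4_PAE))
        return 0;
    return ((old_cr4 ^ new_cr4) & reload_bits) != 0;
}
```

Toggling only PGE on a PAE guest now counts as a reload trigger, which is the case the commit message says was previously missed.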

KVM: Make paravirt tlb flush also reload the PAE PDPTRs · a8cd0244

Committed by Avi Kivity on May 24, 2009

The paravirt tlb flush may be used not only to flush TLBs, but also
to reload the four page-directory-pointer-table entries, as it is used
as a replacement for reloading CR3.  Change the code to do the entire
CR3 reloading dance instead of simply flushing the TLB.

Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>

11 May, 2009 · 4 commits

KVM: SVM: Remove port 80 passthrough · 99f85a28

Committed by Avi Kivity on May 11, 2009

KVM optimizes guest port 80 accesses by passing them through to the host.
Some AMD machines die on port 80 writes, allowing the guest to hard-lock the
host.

Remove the port passthrough to avoid the problem.

Cc: stable@kernel.org
Reported-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Tested-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Make EFER reads safe when EFER does not exist · e286e86e

Committed by Avi Kivity on May 03, 2009

Some processors don't have EFER; don't oops if userspace wants us to
read EFER when we check NX.

Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Fix NX support reporting · 334b8ad7

Committed by Avi Kivity on May 03, 2009

NX support is bit 20, not bit 1.
Signed-off-by: Avi Kivity <avi@redhat.com>
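The bit position named in 334b8ad7 can be checked with a tiny helper. The sketch assumes the value passed in is EDX from CPUID leaf 0x80000001, where NX is architecturally bit 20. The helper names are hypothetical and do not reproduce the patched kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NX_BIT 20u /* CPUID.80000001H:EDX bit 20 */

/* Return 1 if the given bit is set in a 32-bit CPUID register value. */
int sketch_test_feature_bit(uint32_t reg, unsigned int bit)
{
    assert(bit < 32);
    return (int)((reg >> bit) & 1u);
}

/* Return 1 if the EDX value of leaf 0x80000001 advertises NX. */
int sketch_cpu_has_nx(uint32_t cpuid_80000001_edx)
{
    return sketch_test_feature_bit(cpuid_80000001_edx, SKETCH_NX_BIT);
}
```

Testing bit 1 instead would report NX based on an unrelated feature flag, which is the kind of misreport the one-line commit message points at.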

KVM: SVM: Fix cross vendor migration issue with unusable bit · 19bca6ab

Committed by Andre Przywara on Apr 28, 2009

AMD's VMCB does not have an explicit unusable segment descriptor field,
so we emulate it by using "not present". This has to be set up before
the fixups, because this field is used there.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

22 Apr, 2009 · 3 commits

KVM: Unregister cpufreq notifier on unload · 888d256e

Committed by Jan Kiszka on Apr 17, 2009

Properly unregister cpufreq notifier on unload if it was registered
during init.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86: release time_page on vcpu destruction · 7f1ea208

Committed by Joerg Roedel on Feb 25, 2009

Not releasing the time_page causes a leak of that page or the compound
page it is situated in.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: disable global page optimization · bf47a760

Committed by Marcelo Tosatti on Apr 05, 2009

The complexity of fixing it is not worth the gains, as discussed
in http://article.gmane.org/gmane.comp.emulators.kvm.devel/28649.

Cc: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

24 Mar, 2009 · 19 commits

KVM: VMX: Don't allow uninhibited access to EFER on i386 · 16175a79

Committed by Avi Kivity on Mar 23, 2009

vmx_set_msr() does not allow i386 guests to touch EFER, but they can still
do so through the default: label in the switch.  If they set EFER_LME, they
can oops the host.

Fix by having EFER access through the normal channel (which will check for
EFER_LME) even on i386.
Reported-and-tested-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Fix missing smp tlb flush in invlpg · 4539b358

Committed by Andrea Arcangeli on Mar 12, 2009

When kvm emulates an invlpg instruction, it can drop a shadow pte, but
leaves the guest tlbs intact.  This can cause memory corruption when
swapping out.

Without this the other cpu can still write to a freed host physical page.
tlb smp flush must happen if rmap_remove is called always before mmu_lock
is released because the VM will take the mmu_lock before it can finally add
the page to the freelist after swapout. mmu notifier makes it safe to flush
the tlb after freeing the page (otherwise it would never be safe) so we can do
a single flush for multiple sptes invalidated.

Cc: stable@kernel.org
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: fix sparse warnings: Should it be static? · cded19f3

Committed by Hannes Eder on Feb 21, 2009

Impact: Make symbols static.

Fix these sparse warnings:
arch/x86/kvm/mmu.c:992:5: warning: symbol 'mmu_pages_add' was not declared. Should it be static?
arch/x86/kvm/mmu.c:1124:5: warning: symbol 'mmu_pages_next' was not declared. Should it be static?
arch/x86/kvm/mmu.c:1144:6: warning: symbol 'mmu_pages_clear_parents' was not declared. Should it be static?
arch/x86/kvm/x86.c:2037:5: warning: symbol 'kvm_read_guest_virt' was not declared. Should it be static?
arch/x86/kvm/x86.c:2067:5: warning: symbol 'kvm_write_guest_virt' was not declared. Should it be static?
virt/kvm/irq_comm.c:220:5: warning: symbol 'setup_routing_entry' was not declared. Should it be static?
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: fix sparse warnings: context imbalance · d7364a29

Committed by Hannes Eder on Feb 21, 2009

Impact: Attribute function with __acquires(...) resp. __releases(...).

Fix these sparse warnings:
arch/x86/kvm/i8259.c:34:13: warning: context imbalance in 'pic_lock' - wrong count at exit
arch/x86/kvm/i8259.c:39:13: warning: context imbalance in 'pic_unlock' - unexpected unlock
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: is_long_mode() should check for EFER.LMA · 41d6af11

Committed by Amit Shah on Feb 28, 2008

is_long_mode currently checks the LongModeEnable bit in
EFER instead of the LongModeActive bit. This is wrong, but
we survived this till now since it wasn't triggered. This
breaks guests that go from long mode to compatibility mode.

This is noticed on a solaris guest and fixes bug #1842160
Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: VMX: Update necessary state when guest enters long mode · 401d10de

Committed by Amit Shah on Feb 20, 2009

setup_msrs() should be called when entering long mode to save the
shadow state for the 64-bit guest state.

Using vmx_set_efer() in enter_lmode() removes some duplicated code
and also ensures we call setup_msrs(). We can safely pass the value
of shadow_efer to vmx_set_efer() as no other bits in the efer change
while enabling long mode (guest first sets EFER.LME, then sets CR0.PG
which causes a vmexit where we activate long mode).

With this fix, is_long_mode() can check for EFER.LMA set instead of
EFER.LME and 5e23049e86dd298b72e206b420513dbc3a240cd9 can be reverted.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
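The distinction between EFER.LME and EFER.LMA that 41d6af11 and 401d10de rely on can be sketched as follows. The bit positions (LME = bit 8, LMA = bit 10) are architectural. The function names are hypothetical, and the activation helper only models the sequence described above (guest sets LME, then CR0.PG, after which long mode is activated).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EFER_LME (1ull << 8)  /* long mode enable: requested by the guest */
#define SKETCH_EFER_LMA (1ull << 10) /* long mode active: actually in effect */

/* Long mode is in effect only when LMA is set; LME alone is not enough. */
int sketch_is_long_mode(uint64_t efer)
{
    return (efer & SKETCH_EFER_LMA) != 0;
}

/*
 * Model of the activation step: called when the guest enables paging
 * after having set LME. Returns the EFER value with LMA added.
 */
uint64_t sketch_activate_long_mode(uint64_t efer)
{
    assert(efer & SKETCH_EFER_LME); /* LME must have been set first */
    return efer | SKETCH_EFER_LMA;
}
```

Between the write of LME and the later activation, a check based on LME would already report long mode, while the LMA-based check does not.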

KVM: MMU: Fix another largepage memory leak · c5bc2242

Committed by Joerg Roedel on Feb 19, 2009

In the paging_fetch function rmap_remove is called after setting a large
pte to non-present. This causes rmap_remove to not drop the reference to
the large page. The result is a memory leak of that page.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: set accessed bit for VMCB segment selectors · 1fbdc7a5

Committed by Andre Przywara on Jan 11, 2009

In the segment descriptor _cache_ the accessed bit is always set
(although it can be cleared in the descriptor itself). Since Intel
checks for this condition on a VMENTRY, set this bit in the AMD path
to enable cross vendor migration.

Cc: stable@kernel.org
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Acked-By: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Report IRQ injection status to userspace. · 4925663a

Committed by Gleb Natapov on Feb 04, 2009

IRQ injection status is either -1 (if no CPU was found that should
accept the interrupt, because the IRQ was masked or the ioapic was
misconfigured or ...) or >= 0, in which case the number indicates
how many CPUs the interrupt was injected into.
If the value is 0 it means that the interrupt was coalesced
and probably should be reinjected.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
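How a caller might interpret the status value from 4925663a is sketched below. The enum and function names are hypothetical. Only the meaning of -1, 0 and positive values is taken from the commit message.

```c
#include <assert.h>

enum sketch_irq_outcome {
    SKETCH_IRQ_NOT_DELIVERABLE, /* status -1: masked, misconfigured ioapic, ... */
    SKETCH_IRQ_COALESCED,       /* status 0: candidate for reinjection */
    SKETCH_IRQ_DELIVERED        /* status > 0: number of CPUs injected into */
};

/* Map the raw injection status onto the three cases described above. */
enum sketch_irq_outcome sketch_classify_irq_status(int status)
{
    assert(status >= -1); /* the commit message defines -1 or >= 0 only */
    if (status < 0)
        return SKETCH_IRQ_NOT_DELIVERABLE;
    if (status == 0)
        return SKETCH_IRQ_COALESCED;
    return SKETCH_IRQ_DELIVERED;
}
```

A userspace timer device could, for example, count `SKETCH_IRQ_COALESCED` results and schedule those ticks for reinjection.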

KVM: MMU: remove assertion in kvm_mmu_alloc_page · 452425db

Committed by Joerg Roedel on Feb 18, 2009

The assertion no longer makes sense since we don't clear page tables on
allocation; instead we clear them during prefetch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: remove redundant check in mmu_set_spte · 6bed6b9e

Committed by Joerg Roedel on Feb 18, 2009

The following code flow is unnecessary:

	if (largepage)
		was_rmapped = is_large_pte(*shadow_pte);
	else
		was_rmapped = 1;

The is_large_pte() function will always evaluate to one here because the
(largepage && !is_large_pte) case is already handled in the first
if-clause. So we can remove this check and set was_rmapped to one always
here.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Fix kvmclock on !constant_tsc boxes · c8076604

Committed by Gerd Hoffmann on Feb 04, 2009

kvmclock currently falls apart on machines without constant tsc.
This patch fixes it.  Changes:

  * keep tsc frequency in a per-cpu variable.
  * handle kvmclock update using a new request flag, thus checking
    whenever we need an update each time we enter guest context.
  * use a cpufreq notifier to track frequency changes and force
    kvmclock updates.
  * send ipis to kick cpu out of guest context if needed to make
    sure the guest doesn't see stale values.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Use kvm_mmu_page_fault() handle EPT violation mmio · 49cd7d22

Committed by Sheng Yang on Feb 11, 2009

Removed duplicated code.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Drop unused evaluations from string pio handlers · 34c33d16

Committed by Jan Kiszka on Feb 08, 2009

Looks like neither the direction nor the rep prefix are used anymore.
Drop related evaluations from SVM's and VMX's I/O exit handlers.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Add FFXSR support · 1b2fd70c

Committed by Alexander Graf on Feb 02, 2009

AMD K10 CPUs implement the FFXSR feature that gets enabled using
EFER. Let's check if the virtual CPU description includes that
CPUID feature bit and allow enabling it then.

This is required for Windows Server 2008 in Hyper-V mode.

v2 adds CPUID capability exposure
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: make irq ack notifications aware of routing table · 44882eed

Committed by Marcelo Tosatti on Jan 27, 2009

IRQ ack notifications assume an identity mapping between pin->gsi,
which might not be the case with, for example, HPET.

Translate before acking.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Gleb Natapov <gleb@redhat.com>

KVM: Userspace controlled irq routing · 399ec807

Committed by Avi Kivity on Nov 19, 2008

Currently KVM has a static routing from GSI numbers to interrupts (namely,
0-15 are mapped 1:1 to both PIC and IOAPIC, and 16-23 are mapped 1:1 to
the IOAPIC).  This is insufficient for several reasons:

- HPET requires non 1:1 mapping for the timer interrupt
- MSIs need a new method to assign interrupt numbers and dispatch them
- ACPI APIC mode needs to be able to reassign the PCI LINK interrupts to the
  ioapics

This patch implements an interrupt routing table (as a linked list, but this
can be easily changed) and a userspace interface to replace the table.  The
routing table is initialized according to the current hardwired mapping.
Signed-off-by: Avi Kivity <avi@redhat.com>
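The default table that 399ec807 says mirrors the old hardwired mapping could be built along these lines. This is a sketch under assumptions: it uses a flat array rather than the linked list mentioned in the commit, treats the PIC as a single chip with pin equal to GSI, and all `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_irqchip { SKETCH_PIC, SKETCH_IOAPIC };

/* One routing entry: a GSI is delivered to a pin on an interrupt chip. */
struct sketch_route {
    unsigned int gsi;
    enum sketch_irqchip chip;
    unsigned int pin;
};

/*
 * Fill 'out' with the default routes: GSIs 0-15 go 1:1 to both the PIC
 * and the IOAPIC, GSIs 16-23 go 1:1 to the IOAPIC only.
 * Returns the number of entries written.
 */
size_t sketch_default_routes(struct sketch_route *out, size_t cap)
{
    size_t n = 0;

    for (unsigned int gsi = 0; gsi < 24; gsi++) {
        if (gsi < 16) {
            assert(n < cap);
            out[n++] = (struct sketch_route){ gsi, SKETCH_PIC, gsi };
        }
        assert(n < cap);
        out[n++] = (struct sketch_route){ gsi, SKETCH_IOAPIC, gsi };
    }
    return n;
}

/* Count how many entries a GSI fans out to. */
size_t sketch_count_routes(const struct sketch_route *tbl, size_t n,
                           unsigned int gsi)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        if (tbl[i].gsi == gsi)
            hits++;
    return hits;
}
```

Replacing the table from userspace then amounts to supplying a different set of entries, for example one where a timer GSI maps to a different IOAPIC pin.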

KVM: x86: Fix typos and whitespace errors · 19355475

Committed by Amit Shah on Jan 14, 2009

Some typos, comments, whitespace errors corrected in the cpuid code
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Only enable cr4_pge role in shadow mode · 5a41accd

Committed by Avi Kivity on Jan 11, 2009

Two dimensional paging is only confused by it.
Signed-off-by: Avi Kivity <avi@redhat.com>