Commits · 722c05f2192070bac0208b2c16ce13929b32d92f · openanolis / cloud-kernel

20 Jul, 2008 9 commits

KVM: MMU: Fix potential race setting upper shadow ptes on nonpae hosts · 722c05f2

Authored by Avi Kivity on Jul 13, 2008

The direct mapped shadow code (used for real mode and two dimensional paging)
sets upper-level ptes using direct assignment rather than calling
set_shadow_pte().  A nonpae host will split this into two writes, which opens
up a race if another vcpu accesses the same memory area.

Fix by calling set_shadow_pte() instead of assigning directly.

Noticed by Izik Eidus.
Signed-off-by: Avi Kivity <avi@qumranet.com>

722c05f2
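
The race described in 722c05f2 comes from a 64-bit pte being stored as two 32-bit halves. Below is a minimal user-space sketch of that idea, assuming C11 atomics as a stand-in for the kernel's `set_shadow_pte()`. All `demo_` names are hypothetical and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one shadow pte slot (not the kernel's type). */
typedef _Atomic uint64_t demo_spte_t;

/*
 * Models what another vcpu could observe if a plain 64-bit assignment is
 * split into two 32-bit stores and only the low half has landed so far.
 */
static uint64_t demo_torn_value(uint64_t old_val, uint64_t new_val)
{
    return (old_val & 0xffffffff00000000ULL) |
           (new_val & 0x00000000ffffffffULL);
}

/* Illustrative analogue of set_shadow_pte(): a single indivisible store. */
static void demo_set_spte(demo_spte_t *sptep, uint64_t val)
{
    assert(sptep != NULL);
    atomic_store_explicit(sptep, val, memory_order_release);
}

/* Readers see either the old or the new value, never a mix of halves. */
static uint64_t demo_read_spte(demo_spte_t *sptep)
{
    assert(sptep != NULL);
    return atomic_load_explicit(sptep, memory_order_acquire);
}
```

`demo_torn_value()` only illustrates the intermediate value a concurrent reader might see. The atomic store is what rules that value out.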

KVM: MMU: improve invalid shadow root page handling · 376c53c2

Authored by Marcelo Tosatti on Jul 10, 2008

Harden kvm_mmu_zap_page() against invalid root pages that
had been shadowed from memslots that are gone.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

376c53c2

KVM: mmu_shrink: kvm_mmu_zap_page requires slots_lock to be held · 5a4c9288

Authored by Marcelo Tosatti on Jul 03, 2008

kvm_mmu_zap_page() needs slots lock held (rmap_remove->gfn_to_memslot,
for example).

Since kvm_lock spinlock is held in mmu_shrink(), do a non-blocking
down_read_trylock().

Untested.
Signed-off-by: Avi Kivity <avi@qumranet.com>

5a4c9288

KVM: MMU: Fix printk format · db475c39

Authored by Avi Kivity on Jun 22, 2008

Signed-off-by: Avi Kivity <avi@qumranet.com>

db475c39

KVM: MMU: When debug is enabled, make it a run-time parameter · 6ada8cca

Authored by Avi Kivity on Jun 22, 2008

Signed-off-by: Avi Kivity <avi@qumranet.com>

6ada8cca

KVM: MMU: Avoid page prefetch on SVM · 131d8279

Authored by Avi Kivity on May 29, 2008

SVM cannot benefit from page prefetching since guest page fault bypass
cannot be made to work there.  Avoid accessing the guest page table in
this case.
Signed-off-by: Avi Kivity <avi@qumranet.com>

131d8279

KVM: MMU: Move nonpaging_prefetch_page() · d761a501

Authored by Avi Kivity on May 29, 2008

In preparation for next patch. No code change.
Signed-off-by: Avi Kivity <avi@qumranet.com>

d761a501

KVM: MMU: Fix false flooding when a pte points to page table · 1b7fcd32

Authored by Avi Kivity on May 15, 2008

The KVM MMU tries to detect when a speculative pte update is not actually
used by demand fault, by checking the accessed bit of the shadow pte. If
the shadow pte has not been accessed, we deem that page table flooded and
remove the shadow page table, allowing further pte updates to proceed
without emulation.

However, if the pte itself points at a page table and is only used for write
operations, the accessed bit will never be set since all access will happen
through the emulator.

This is exactly what happens with kscand on old (2.4.x) HIGHMEM kernels.
The kernel points a kmap_atomic() pte at a page table, and then
proceeds with read-modify-write operations to look at the dirty and accessed
bits. We get a false flood trigger on the kmap ptes, which results in the
mmu spending all its time setting up and tearing down shadows.

Fix by setting the shadow accessed bit on emulated accesses.
Signed-off-by: Avi Kivity <avi@qumranet.com>

1b7fcd32

KVM: add statics were possible, function definition in lapic.h · 8b2cf73c

Authored by Harvey Harrison on Apr 27, 2008

Noticed by sparse:
arch/x86/kvm/vmx.c:1583:6: warning: symbol 'vmx_disable_intercept_for_msr' was not declared. Should it be static?
arch/x86/kvm/x86.c:3406:5: warning: symbol 'kvm_task_switch_16' was not declared. Should it be static?
arch/x86/kvm/x86.c:3429:5: warning: symbol 'kvm_task_switch_32' was not declared. Should it be static?
arch/x86/kvm/mmu.c:1968:6: warning: symbol 'kvm_mmu_remove_one_alloc_mmu_page' was not declared. Should it be static?
arch/x86/kvm/mmu.c:2014:6: warning: symbol 'mmu_destroy_caches' was not declared. Should it be static?
arch/x86/kvm/lapic.c:862:5: warning: symbol 'kvm_lapic_get_base' was not declared. Should it be static?
arch/x86/kvm/i8254.c:94:5: warning: symbol 'pit_get_gate' was not declared. Should it be static?
arch/x86/kvm/i8254.c:196:5: warning: symbol '__pit_timer_fn' was not declared. Should it be static?
arch/x86/kvm/i8254.c:561:6: warning: symbol '__inject_pit_timer_intr' was not declared. Should it be static?
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

8b2cf73c

24 Jun, 2008 3 commits

KVM: MMU: Fix oops on guest userspace access to guest pagetable · 6bf6a953

Authored by Avi Kivity on Jun 12, 2008

KVM has a heuristic to unshadow guest pagetables when userspace accesses
them, on the assumption that most guests do not allow userspace to access
pagetables directly. Unfortunately, in addition to unshadowing the pagetables,
it also oopses.

This never triggers on ordinary guests since sane OSes will clear the
pagetables before assigning them to userspace, which will trigger the flood
heuristic, unshadowing the pagetables before the first userspace access. One
particular guest, though (Xenner) will run the kernel in userspace, triggering
the oops.  Since the heuristic is incorrect in this case, we can simply
remove it.
Signed-off-by: Avi Kivity <avi@qumranet.com>

6bf6a953

KVM: MMU: large page update_pte issue with non-PAE 32-bit guests (resend) · 30945387

Authored by Marcelo Tosatti on Jun 11, 2008

kvm_mmu_pte_write() does not handle 32-bit non-PAE large page backed
guests properly. It will instantiate two 2MB sptes pointing to the same
physical 2MB page when a guest large pte update is trapped.

Instead of duplicating code to handle this, disallow directory level
updates to happen through kvm_mmu_pte_write(), so the two 2MB sptes
emulating one guest 4MB pte can be correctly created by the page fault
handling path.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

30945387

KVM: MMU: Fix rmap_write_protect() hugepage iteration bug · 6597ca09

Authored by Marcelo Tosatti on Jun 08, 2008

rmap_next() does not work correctly after rmap_remove(), as it expects
the rmap chains not to change during iteration.  Fix (for now) by restarting
iteration from the beginning.
Signed-off-by: Avi Kivity <avi@qumranet.com>

6597ca09
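
The "restart from the beginning" approach in 6597ca09 can be sketched on a plain singly linked list. This is an illustration only, assuming a simplified chain layout. `struct demo_rmap_entry` and `demo_remove_writable()` are hypothetical names and do not reflect the kernel's rmap data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical chain entry; the real rmap layout is different. */
struct demo_rmap_entry {
    bool writable;
    struct demo_rmap_entry *next;
};

/*
 * Unlink every writable entry. After each removal the chain has changed,
 * so the walk restarts from the head instead of trusting a saved cursor.
 * Entries are only unlinked, not freed; the caller still owns them.
 */
static int demo_remove_writable(struct demo_rmap_entry **head)
{
    struct demo_rmap_entry **link = head;
    int removed = 0;

    assert(head != NULL);
    while (*link) {
        struct demo_rmap_entry *e = *link;

        if (e->writable) {
            *link = e->next;   /* the chain is modified here */
            e->next = NULL;
            removed++;
            link = head;       /* restart iteration from the beginning */
            continue;
        }
        link = &e->next;
    }
    return removed;
}
```

Restarting costs extra passes over the chain but is simple to reason about, which matches the commit's "for now" wording.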

07 Jun, 2008 2 commits

KVM: MMU: Fix is_empty_shadow_page() check · 3c915510

Authored by Avi Kivity on May 20, 2008

The check is only looking at one of two possible empty ptes.
Signed-off-by: Avi Kivity <avi@qumranet.com>

3c915510

KVM: MMU: reschedule during shadow teardown · 8d2d73b9

Authored by Avi Kivity on Jun 04, 2008

Shadows for large guests can take a long time to tear down, so reschedule
occasionally to avoid softlockup warnings.
Signed-off-by: Avi Kivity <avi@qumranet.com>

8d2d73b9

23 May, 2008 1 commit

namespacecheck: automated fixes · 2ddfd20e

Authored by Ingo Molnar on May 22, 2008

Signed-off-by: Ingo Molnar <mingo@elte.hu>

2ddfd20e

04 May, 2008 6 commits

KVM: MMU: Allow more than PAGES_PER_HPAGE write protections per large page · 93df7663

Authored by Avi Kivity on May 02, 2008

nonpae guests can call rmap_write_protect twice per page (for page tables)
or four times per page (for page directories), triggering a bogus warning.

Remove the warning.
Signed-off-by: Avi Kivity <avi@qumranet.com>

93df7663

KVM: VMX: Enable EPT feature for KVM · 1439442c

Authored by Sheng Yang on Apr 28, 2008

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

1439442c

KVM: MMU: Remove #ifdef CONFIG_X86_64 to support 4 level EPT · 1ac593c9

Authored by Sheng Yang on Apr 25, 2008

Currently EPT level is 4 for both pae and x86_64. The patch removes the #ifdef
for alloc root_hpa and free root_hpa to support EPT.
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

1ac593c9

KVM: MMU: Add EPT support · 7b52345e

Authored by Sheng Yang on Apr 25, 2008

Enable kvm_set_spte() to generate EPT entries.
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

7b52345e

KVM: Add kvm_x86_ops get_tdp_level() · 67253af5

Authored by Sheng Yang on Apr 25, 2008

The function get_tdp_level() provides the number of tdp levels for EPT and
NPT rather than the NPT specific macro.
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

67253af5

KVM: MMU: Move some definitions to a header file · 8c6d6adc

Authored by Sheng Yang on Apr 25, 2008

Move some definitions to mmu.h in order to allow building common table
entries between EPT and non-EPT.
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

8c6d6adc

27 Apr, 2008 19 commits

KVM: MMU: kvm_pv_mmu_op should not take mmap_sem · 960b3991

Authored by Marcelo Tosatti on Apr 16, 2008

kvm_pv_mmu_op should not take mmap_sem. All gfn_to_page() callers down
in the MMU processing will take it if necessary, so as it is it can
deadlock.

Apparently a leftover from the days before slots_lock.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

960b3991

KVM: MMU: Don't assume struct page for x86 · 35149e21

Authored by Anthony Liguori on Apr 02, 2008

This patch introduces a gfn_to_pfn() function and corresponding functions like
kvm_release_pfn_dirty().  Using these new functions, we can modify the x86
MMU to no longer assume that it can always get a struct page for any given gfn.

We don't want to eliminate gfn_to_page() entirely because a number of places
assume they can do gfn_to_page() and then kmap() the results.  When we support
IO memory, gfn_to_page() will fail for IO pages although gfn_to_pfn() will
succeed.

This does not implement support for avoiding reference counting for reserved
RAM or for IO memory.  However, it should make those things pretty
straightforward.

Since we're only introducing new common symbols, I don't think it will break
the non-x86 architectures but I haven't tested those.  I've tested Intel,
AMD, NPT, and hugetlbfs with Windows and Linux guests.

[avi: fix overflow when shifting left pfns by adding casts]
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

35149e21
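
The bracketed note in 35149e21 about adding casts when shifting pfns left can be illustrated as follows. This is a sketch under stated assumptions: a 32-bit `demo_pfn_t` models `unsigned long` on a 32-bit host, and the `demo_`/`DEMO_` names are hypothetical rather than the kernel's own.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12

static_assert(DEMO_PAGE_SHIFT < 32, "shift must fit a 32-bit operand");

/* Assumption: a 32-bit pfn type, as 'unsigned long' would be on i386. */
typedef uint32_t demo_pfn_t;

/* Shift happens in 32 bits, so the high address bits are lost. */
static uint64_t demo_pfn_to_phys_truncating(demo_pfn_t pfn)
{
    return pfn << DEMO_PAGE_SHIFT;
}

/* Widen first, then shift: the full physical address survives. */
static uint64_t demo_pfn_to_phys(demo_pfn_t pfn)
{
    return (uint64_t)pfn << DEMO_PAGE_SHIFT;
}
```

With 4 KiB pages, frame number `0x100000` sits exactly at 4 GiB, which is the first address the uncast version cannot represent.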

KVM: MMU: prepopulate guest pages after write-protecting · bed1d1df

Authored by Marcelo Tosatti on Apr 04, 2008

Zdenek reported a bug where a looping "dmsetup status" eventually hangs
on SMP guests.

The problem is that kvm_mmu_get_page() prepopulates the shadow MMU
before write protecting the guest page tables. By doing so, it leaves a
window open where the guest can mark a pte as present while the host has
shadow cached such pte as "notrap". Accesses to such address will fault
in the guest without the host having a chance to fix the situation.

Fix by moving the write protection before the pte prefetch.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

bed1d1df

KVM: MMU: Only mark_page_accessed() if the page was accessed by the guest · fcd6dbac

Authored by Avi Kivity on Apr 03, 2008

If the accessed bit is not set, the guest has never accessed this page
(at least through this spte), so there's no need to mark the page
accessed.  This provides more accurate data for the eviction algorithm.

Noted by Andrea Arcangeli.
Signed-off-by: Avi Kivity <avi@qumranet.com>

fcd6dbac

KVM: MMU: allow the vm to shrink the kvm mmu shadow caches · 3ee16c81

Authored by Izik Eidus on Mar 30, 2008

Allow the Linux memory manager to reclaim memory in the kvm shadow cache.
Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

3ee16c81

KVM: MMU: unify slots_lock usage · 3200f405

Authored by Marcelo Tosatti on Mar 29, 2008

Unify slots_lock acquisition around vcpu_run(). This is simpler and less
error-prone.

Also fix some callsites that were not grabbing the lock properly.

[avi: drop slots_lock while in guest mode to avoid holding the lock
      for indefinite periods]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

3200f405

KVM: MMU: Introduce and use spte_to_page() · 0b49ea86

Authored by Avi Kivity on Mar 23, 2008

Encapsulate the pte mask'n'shift in a function.
Signed-off-by: Avi Kivity <avi@qumranet.com>

0b49ea86
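
The "mask'n'shift" that 0b49ea86 encapsulates can be sketched as below. This is a simplified illustration that stops at the page frame number rather than returning a `struct page`. The bit layout (physical address in bits 12..51) is an assumption, and `DEMO_ADDR_MASK` and `demo_spte_to_pfn()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12
/* Assumed layout: physical address held in bits 12..51 of the pte. */
#define DEMO_ADDR_MASK \
    (((1ULL << 52) - 1) & ~((1ULL << DEMO_PAGE_SHIFT) - 1))

static_assert(DEMO_ADDR_MASK == 0x000ffffffffff000ULL,
              "mask should cover bits 12..51");

/* Mask off flag bits, then shift the address down to a frame number. */
static uint64_t demo_spte_to_pfn(uint64_t spte)
{
    return (spte & DEMO_ADDR_MASK) >> DEMO_PAGE_SHIFT;
}
```

Keeping the mask and shift in one helper means call sites cannot get one half of the pair wrong.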

KVM: MMU: fix dirty bit setting when removing write permissions · 855149aa

Authored by Izik Eidus on Mar 20, 2008

When mmu_set_spte() checks whether a page related to an spte should be released
as dirty or clean, it checks whether the shadow pte was writable. However, if
rmap_write_protect() is called, shadow ptes that were writable can become
read-only, and mmu_set_spte() will therefore release the pages as clean.

This patch fixes the issue by marking the page as dirty inside
rmap_write_protect().
Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

855149aa

KVM: MMU: Set the accessed bit on non-speculative shadow ptes · 947da538

Authored by Avi Kivity on Mar 18, 2008

If we populate a shadow pte due to a fault (and not speculatively due to a
pte write) then we can set the accessed bit on it, as we know it will be
set immediately on the next guest instruction.  This saves a read-modify-write
operation.
Signed-off-by: Avi Kivity <avi@qumranet.com>

947da538

KVM: MMU: hypercall based pte updates and TLB flushes · 2f333bcb

Authored by Marcelo Tosatti on Feb 22, 2008

Hypercall based pte updates are faster than faults, and also allow use
of the lazy MMU mode to batch operations.

Don't report the feature if two dimensional paging is enabled.

[avi:
 - one mmu_op hypercall instead of one per op
 - allow 64-bit gpa on hypercall
 - don't pass host errors (-ENOMEM) to guest]

[akpm: warning fix on i386]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>

2f333bcb

KVM: replace remaining __FUNCTION__ occurances · b8688d51

Authored by Harvey Harrison on Mar 03, 2008

__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

b8688d51
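
The replacement made in b8688d51 relies on `__func__`, the predefined identifier standardized in C99. A minimal sketch of a log helper that uses it, where `demo_format_log()` and the message text are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Formats "<function name>: zap" into buf and returns the length that
 * snprintf reports. __func__ expands to the enclosing function's name.
 */
static int demo_format_log(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s: zap", __func__);
}
```

Because `__func__` is part of the language standard, the same source builds with compilers that do not provide the `__FUNCTION__` extension.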

KVM: MMU: large page support · 05da4558

Authored by Marcelo Tosatti on Feb 23, 2008

Create large page mappings if the guest PTEs are marked as such and
the underlying memory is hugetlbfs backed.  If the largepage contains
write-protected pages, a large pte is not used.

Gives a consistent 2% improvement for data copies on ram mounted
filesystem, without NPT/EPT.

Anthony measures a 4% improvement on 4-way kernbench, with NPT.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

05da4558

KVM: MMU: ignore zapped root pagetables · 2e53d63a

Authored by Marcelo Tosatti on Feb 20, 2008

Mark zapped root pagetables as invalid and ignore such pages during lookup.

This is a problem with the cr3-target feature, where a zapped root table fools
the faulting code into creating a read-only mapping. The result is a lockup
if the instruction can't be emulated.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

2e53d63a

KVM: MMU: add TDP support to the KVM MMU · fb72d167

Authored by Joerg Roedel on Feb 07, 2008

This patch contains the changes to the KVM MMU necessary for support of the
Nested Paging feature in AMD Barcelona and Phenom Processors.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

fb72d167

KVM: MMU: make the __nonpaging_map function generic · 4d9976bb

Authored by Joerg Roedel on Feb 07, 2008

The mapping function for the nonpaging case in the softmmu does basically the
same as required for Nested Paging. Make this function generic so it can be
used for both.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

4d9976bb

KVM: export information about NPT to generic x86 code · 18552672

Authored by Joerg Roedel on Feb 07, 2008

The generic x86 code has to know if the specific implementation uses Nested
Paging. In the generic code Nested Paging is called Two Dimensional Paging
(TDP) to avoid confusion with (future) TDP implementations of other vendors.
This patch exports the availability of TDP to the generic x86 code.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

18552672

KVM: MMU: Decouple mmio from shadow page tables · d196e343

Authored by Avi Kivity on Jan 24, 2008

Currently an mmio guest pte is encoded in the shadow pagetable as a
not-present trapping pte, with the SHADOW_IO_MARK bit set. However
nothing is ever done with this information, so maintaining it is a
useless complication.

This patch moves the check for mmio to before shadow ptes are instantiated,
so the shadow code is never invoked for ptes that reference mmio. The code
is simpler, and with future work, can be made to handle mmio concurrently.
Signed-off-by: Avi Kivity <avi@qumranet.com>

d196e343

KVM: MMU: Simplify hash table indexing · 1ae0a13d

Authored by Dong, Eddie on Jan 07, 2008

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

1ae0a13d

KVM: MMU: Update shadow ptes on partial guest pte writes · 489f1d65

Authored by Dong, Eddie on Jan 07, 2008

A partial guest pte write will leave shadow_trap_nonpresent_pte
in spte, which generates a vmexit at the next guest access through that pte.

This patch improves this by reading the full guest pte in advance and thus
being able to update the spte and eliminate the vmexit.

This helps pae guests which use two 32-bit writes to set a single 64-bit pte.

[truncation fix by Eric]
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

489f1d65
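
The pae case that 489f1d65 mentions, where a guest sets one 64-bit pte with two 32-bit writes, can be modeled as below. This sketch only shows how the full pte value evolves after each half write, which is the value the commit says is read in advance. `demo_merge_half_write()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static_assert(sizeof(uint64_t) == 2 * sizeof(uint32_t),
              "a 64-bit pte is modeled as two 32-bit halves");

/*
 * Returns the full 64-bit guest pte after a 32-bit write to one half.
 * high_half selects bits 32..63; otherwise bits 0..31 are replaced.
 */
static uint64_t demo_merge_half_write(uint64_t gpte, bool high_half,
                                      uint32_t val)
{
    if (high_half)
        return (gpte & 0x00000000ffffffffULL) | ((uint64_t)val << 32);
    return (gpte & 0xffffffff00000000ULL) | val;
}
```

After the first half write the merged value is already a complete 64-bit pte, so a shadow entry can be derived from it without waiting for a later fault.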

openanolis / cloud-kernel synced successfully about 1 year ago
