- 27 December 2011, 14 commits
-
-
Submitted by Xiao Guangrong
Introduce the KVM_MEM_SLOTS_NUM macro to replace KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS. Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
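A minimal sketch of the macro, assuming illustrative slot counts (the real values are defined in the KVM architecture headers):

```c
/* Illustrative values only; the real counts live in the arch headers. */
#define KVM_MEMORY_SLOTS      32   /* user-visible memory slots */
#define KVM_PRIVATE_MEM_SLOTS  4   /* slots reserved for internal use */

/* One macro for the total, so array sizes and loop bounds no longer
 * repeat the sum by hand. */
#define KVM_MEM_SLOTS_NUM (KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)

struct memslots_example {
    struct { unsigned long base_gfn, npages; } slots[KVM_MEM_SLOTS_NUM];
};
```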
-
Submitted by Takuya Yoshikawa
Currently, write-protecting a slot needs to walk all the shadow pages and check the ones which have a pte mapping a page in it. The walk is overly heavy when there are not many dirty pages in that slot, and checking the shadow pages results in unwanted cache pollution. To mitigate this problem, we use rmap_write_protect() and check only the sptes which can be reached from gfns marked in the dirty bitmap when the number of dirty pages is less than the number of shadow pages. This criterion is reasonable in its meaning and worked well in our tests: write protection became several times faster than before when the ratio of dirty pages was low, and was not worse even when the ratio was near the criterion. Note that the locking for this write protection becomes fine grained; the reason why this is safe is described in the comments. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
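A rough sketch of the strategy selection under these assumptions; the types and helpers below are simplified stand-ins rather than the actual KVM internals:

```c
/* Simplified stand-in declarations for illustration only. */
struct kvm;
extern unsigned long shadow_page_count(struct kvm *kvm);           /* hypothetical */
extern void rmap_write_protect(struct kvm *kvm, unsigned long gfn);
extern void write_protect_all_shadow_pages(struct kvm *kvm);       /* hypothetical */

/* Pick the cheaper strategy: with few dirty pages, touch only the sptes
 * reachable from the gfns marked in the dirty bitmap. */
static void write_protect_slot(struct kvm *kvm, unsigned long base_gfn,
                               const unsigned long *dirty_bitmap,
                               unsigned long npages, unsigned long nr_dirty)
{
    const unsigned long bits = 8 * sizeof(unsigned long);

    if (nr_dirty < shadow_page_count(kvm)) {
        for (unsigned long i = 0; i < npages; i++)
            if (dirty_bitmap[i / bits] & (1UL << (i % bits)))
                rmap_write_protect(kvm, base_gfn + i);
    } else {
        /* Many dirty pages: one pass over all shadow pages is cheaper. */
        write_protect_all_shadow_pages(kvm);
    }
}
```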
-
Submitted by Takuya Yoshikawa
rmap_write_protect() calls gfn_to_rmap() for each level with the gfn fixed. This results in calling gfn_to_memslot() repeatedly with that gfn. This patch introduces __gfn_to_rmap(), which takes the slot as an argument, to avoid this. This is also needed for the following dirty logging optimization. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
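A sketch of the hoisted lookup; the prototypes are simplified stand-ins for illustration and do not match the kernel's exact signatures:

```c
/* Simplified stand-in declarations. */
struct kvm;
struct kvm_memory_slot;
extern struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm,
                                              unsigned long gfn);
extern unsigned long *__gfn_to_rmap(unsigned long gfn, int level,
                                    struct kvm_memory_slot *slot);

/* Look the slot up once; every level then reuses it via __gfn_to_rmap()
 * instead of repeating the gfn_to_memslot() search per level. */
static unsigned long *gfn_to_rmap(struct kvm *kvm, unsigned long gfn,
                                  int level)
{
    struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);

    return __gfn_to_rmap(gfn, level, slot);
}
```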
-
Submitted by Takuya Yoshikawa
Remove redundant checks and use the is_large_pte() macro. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Chris Wright
The host-side pv mmu support has been marked for feature removal in January 2011. It is not in use, is slower than shadow or hardware-assisted paging, and is a maintenance burden. It is November 2011; time to remove it. Signed-off-by: Chris Wright <chrisw@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
Detecting write-flooding does not work well. When we handle a page write, if the last speculative spte is not accessed, we treat the page as write-flooded. However, we create speculative sptes on many paths, such as pte prefetch and page sync, which means the last speculative spte may not point to the written page, and the written page may still be accessed via other sptes; so depending on the Accessed bit of the last speculative spte is not enough. Instead of detecting whether the page is accessed, we can detect whether the spte is accessed after it is written: if the spte is not accessed but is written frequently, we treat the page as not being a page table, or as not having been used for a long time. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
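A minimal sketch of the heuristic, assuming a made-up counter field and threshold; the real bookkeeping is attached to the shadow page structures:

```c
#include <stdbool.h>

#define WRITE_FLOOD_THRESHOLD 3        /* illustrative threshold */

struct shadow_page_example {
    unsigned int write_flooding_count; /* hypothetical field name */
};

/* The guest wrote into the gfn backing this shadow page: if that keeps
 * happening without the page ever being used for translation, it is
 * probably no longer a page table and can be zapped. */
static bool note_guest_write(struct shadow_page_example *sp)
{
    return ++sp->write_flooding_count >= WRITE_FLOOD_THRESHOLD;
}

/* A page fault walked through this shadow page: it is in active use,
 * so reset the flood counter. */
static void note_shadow_page_used(struct shadow_page_example *sp)
{
    sp->write_flooding_count = 0;
}
```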
-
Submitted by Xiao Guangrong
Sometimes we modify only a single byte of a pte to update a status bit; for example, clear_bit is used to clear the r/w bit in the Linux kernel, and the 'andb' instruction is used in this function. In this case, kvm_mmu_pte_write treats it as a misaligned access, and the shadow page table is zapped. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
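A standalone illustration of the access pattern in question: the guest clears one pte bit with a single-byte read-modify-write, exactly the kind of short write that a misalignment check keyed to full-pte-sized updates would flag:

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t pte = 0x80000000000000e7ULL;  /* example pte value */
    uint8_t *bytes = (uint8_t *)&pte;      /* x86 is little-endian:
                                              bytes[0] is the low byte */

    /* Clear the R/W bit (bit 1) the way an 'andb' would: a one-byte
     * update of the low byte, not an aligned 4- or 8-byte write. */
    bytes[0] &= (uint8_t)~(1u << 1);

    printf("pte after 1-byte update: %#llx\n",
           (unsigned long long)pte);       /* 0x80000000000000e5 */
    return 0;
}
```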
-
Submitted by Xiao Guangrong
kvm_mmu_pte_write is too long; split it up for better readability. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
In kvm_mmu_pte_write, we do not need to allocate a shadow page, so calling kvm_mmu_free_some_pages there is unnecessary. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
Fast-prefetch the spte for the unsync shadow page on the invlpg path. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
Directly use mmu_page_zap_pte to zap the spte in FNAME(invlpg), and remove the code duplicated between FNAME(invlpg) and FNAME(sync_page). Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
In the current code, the accessed bit is always set when a page fault occurs, so there is no need to set it on the pte write path. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
If the emulation is caused by #PF and the instruction is not a page-table-writing instruction, the VM-exit was caused by shadow page protection, so we can zap the shadow page and retry the instruction directly. The idea is from Avi. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
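A hedged sketch of the decision only; the predicate and helper below are hypothetical names standing in for the real emulation-context checks:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real checks. */
extern bool insn_writes_page_tables(const void *emul_ctx);
extern bool unprotect_shadow_page_at(unsigned long fault_gpa);

/* Return true if we may skip emulation: unprotect (zap) the shadow page
 * that caused the write fault and let the guest re-execute the insn. */
static bool retry_instead_of_emulate(bool write_fault_on_shadowed_page,
                                     unsigned long fault_gpa,
                                     const void *emul_ctx)
{
    if (!write_fault_on_shadowed_page)
        return false;
    if (insn_writes_page_tables(emul_ctx))
        return false;   /* must emulate: the write updates a page table */

    return unprotect_shadow_page_at(fault_gpa);
}
```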
-
Submitted by Xiao Guangrong
kvm_mmu_pte_write is unsafe since we need to allocate pte_list_desc objects in the function when sptes are prefetched. Unfortunately, we cannot know how many sptes need to be prefetched on this path, which means we can run out of free pte_list_desc objects in the cache and trigger the BUG_ON(). Also, some paths do not fill the cache, such as emulation of the INS instruction, which does not trigger a page fault. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
- 26 September 2011, 2 commits
-
-
Submitted by Avi Kivity
Architecturally, PDPTEs are cached in the PDPTRs when CR3 is reloaded. On SVM it is not possible to implement this, but on VMX it is possible, and it was indeed implemented until nested SVM changed this to unconditionally read PDPTEs dynamically. This has a noticeable impact when running PAE guests. Fix by changing the MMU to read PDPTRs from the cache, falling back to reading from memory for the nested MMU. Signed-off-by: Avi Kivity <avi@redhat.com> Tested-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Submitted by Zhao Jin
__update_clear_spte_slow should return the original spte, while the current code returns the low half of the original spte combined with the high half of the new spte. Signed-off-by: Zhao Jin <cronozhj@gmail.com> Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
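A simplified standalone model (not the kernel code) showing why the ordering of the half-word reads matters and what the corrected function returns:

```c
#include <stdint.h>

/* A 64-bit spte stored as two 32-bit halves, as on a 32-bit host. */
union split_spte {
    struct {
        uint32_t low;
        uint32_t high;
    };
    uint64_t whole;
};

static uint64_t update_clear_spte_slow(union split_spte *sptep,
                                       uint64_t new_spte)
{
    union split_spte orig, new_val;

    new_val.whole = new_spte;

    orig.low    = sptep->low;
    sptep->low  = new_val.low;   /* drop the present bit first       */
    orig.high   = sptep->high;   /* read the ORIGINAL high half ...  */
    sptep->high = new_val.high;  /* ... before it gets overwritten   */

    return orig.whole;           /* the original spte, not a mixture */
}
```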
-
- 24 July 2011, 16 commits
-
-
Submitted by Xiao Guangrong
Add tracepoints to trace mmio page faults. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
The idea is from Avi: | We could cache the result of a miss in an spte by using a reserved bit, and | checking the page fault error code (or seeing if we get an ept violation or | ept misconfiguration), so if we get repeated mmio on a page, we don't need to | search the slot list/tree. | (https://lkml.org/lkml/2011/2/22/221) When the page fault is caused by mmio, we cache the info in the shadow page table and also set the reserved bits in the shadow page table, so if the mmio happens again we can quickly identify it and emulate it directly. Searching for an mmio gfn in the memslots is heavy since we need to walk all memslots; this feature reduces that cost and also avoids walking the guest page table for the soft mmu. [jan: fix operator precedence issue] Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
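A sketch of the caching trick with a made-up bit layout; the real mask and shift values are defined by the shadow pte format in the kernel:

```c
#include <stdint.h>
#include <stdbool.h>

#define MMIO_SPTE_MASK  (1ULL << 51)   /* illustrative reserved bit */
#define MMIO_GFN_SHIFT  12
#define MMIO_ACC_MASK   0x7ULL         /* illustrative access bits  */

/* Cache a missed (mmio) gfn directly in the spte using reserved bits,
 * so the next fault on it is recognized without a memslot search.
 * Assumes the gfn is small enough to fit below the reserved bit. */
static uint64_t mark_mmio_spte(uint64_t gfn, uint64_t access)
{
    return MMIO_SPTE_MASK | (gfn << MMIO_GFN_SHIFT) | (access & MMIO_ACC_MASK);
}

static bool is_mmio_spte(uint64_t spte)
{
    return (spte & MMIO_SPTE_MASK) != 0;
}

static uint64_t mmio_spte_gfn(uint64_t spte)
{
    return (spte & ~(MMIO_SPTE_MASK | MMIO_ACC_MASK)) >> MMIO_GFN_SHIFT;
}
```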
-
Submitted by Xiao Guangrong
Reorganize it to make good use of the cache. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
Use RCU to protect shadow page tables that are about to be freed, so we can safely walk them; this should run fast and is needed by mmio page fault handling. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
Now the spte only goes from nonpresent to present or from present to nonpresent, so we can use the same trick as the Linux kernel to set/clear the spte non-atomically. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
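A simplified model of the write ordering, assuming the present bit lives in the low 32-bit half (as it does for x86 ptes); this is a sketch of the trick, not the kernel implementation:

```c
#include <stdint.h>

union split_spte {
    struct { uint32_t low; uint32_t high; };
    uint64_t whole;
};

#define WMB() __atomic_thread_fence(__ATOMIC_RELEASE)

/* nonpresent -> present: publish the high half first, so any walker
 * that observes the present bit also observes a complete spte. */
static void spte_set(volatile union split_spte *sptep, uint64_t new_spte)
{
    union split_spte v = { .whole = new_spte };

    sptep->high = v.high;
    WMB();
    sptep->low = v.low;          /* present bit becomes visible last */
}

/* present -> nonpresent: retract the present bit (low half) first. */
static void spte_clear(volatile union split_spte *sptep, uint64_t new_spte)
{
    union split_spte v = { .whole = new_spte };

    sptep->low = v.low;          /* present bit disappears first */
    WMB();
    sptep->high = v.high;
}
```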
-
Submitted by Xiao Guangrong
Introduce some interfaces to modify the spte as the Linux kernel does:
- mmu_spte_clear_track_bits: set the spte from present to nonpresent, and track the status bits (accessed/dirty) of the spte
- mmu_spte_clear_no_track: the same as mmu_spte_clear_track_bits, except it does not track the status bits
- mmu_spte_set: set the spte from nonpresent to present
- mmu_spte_update: only update the status bits
Setting an spte from present to present is not allowed now; later we can drop the atomic operation for X86_32 hosts, which is the preparatory work for getting sptes on X86_32 hosts out of the mmu lock. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
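A sketch of the interface shape only, to make the division of labor concrete; the exact prototypes and return types in the kernel differ:

```c
#include <stdint.h>
#include <stdbool.h>

typedef uint64_t u64;

/* nonpresent -> present: install a brand-new translation. */
extern void mmu_spte_set(u64 *sptep, u64 new_spte);

/* present -> present is not allowed; only the status bits may change. */
extern void mmu_spte_update(u64 *sptep, u64 new_spte);

/* present -> nonpresent, reporting whether accessed/dirty were set so
 * the caller can propagate them. */
extern bool mmu_spte_clear_track_bits(u64 *sptep);

/* present -> nonpresent without caring about the status bits. */
extern void mmu_spte_clear_no_track(u64 *sptep);
```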
-
Submitted by Xiao Guangrong
Introduce handle_abnormal_pfn to handle a fault pfn on the page fault path, and introduce mmu_invalid_pfn to handle a fault pfn on the prefetch path. This is preparatory work for mmio page fault support. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
If the page fault is caused by mmio, the gfn cannot be found in the memslots, and 'bad_pfn' is returned on the gfn_to_hva path, so we can use 'bad_pfn' to identify the mmio page fault. Also, to clarify the meaning of an mmio pfn, we return the fault page instead of the bad page when the gfn is not allowed to be prefetched. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
The idea is from Avi: | Maybe it's time to kill off bypass_guest_pf=1. It's not as effective as | it used to be, since unsync pages always use shadow_trap_nonpresent_pte, | and since we convert between the two nonpresent_ptes during sync and unsync. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
Split kvm_mmu_free_page into kvm_mmu_isolate_page and kvm_mmu_free_page: the former removes the page from the cache under the mmu lock, and the latter frees the page table outside the mmu lock. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
Move the counting of used shadow pages from the commit path to the prepare path to reduce TLB flushes on some paths. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
If 'pt_write' is true, we need to emulate the fault. In a later patch we will need to emulate the fault even when it is not a pt_write event, so rename it to better fit its meaning. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
If the dirty bit is not set, we can make the pte access read-only to avoid handling the dirty bit everywhere. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
If the page fault is caused by mmio, we can cache the mmio info; later, we do not need to walk the guest page table and can quickly know it is an mmio fault while we emulate the mmio instruction. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
Set the slot bitmap only if the spte is present. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Xiao Guangrong
Properly check the last mapping, and do not walk to the next level if the last spte is reached. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
- 12 July 2011, 8 commits
-
-
Submitted by Marcelo Tosatti
This reverts commit bee931d31e588b8eb86b7edee32fac2d16930cd7. The TLB flush should be done lazily during guest entry, in kvm_mmu_load(). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Submitted by Avi Kivity
kvm_set_cr0() and kvm_set_cr4(), and possibly other functions, assume that kvm_mmu_reset_context() flushes the guest TLB. However, it does not. Fix by flushing the TLB (and syncing the new root as well). Signed-off-by: Avi Kivity <avi@redhat.com>
-
Submitted by Avi Kivity
When CR0.WP=0, we sometimes map user pages as kernel pages (to allow the kernel to write to them). Unfortunately this also allows the kernel to fetch from these pages, even if CR4.SMEP is set. Adjust for this by also setting NX on the spte in these circumstances. Signed-off-by: Avi Kivity <avi@redhat.com>
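A hedged sketch with example bit positions (the real masks come from the spte format): when CR0.WP=0 forces a user page to be mapped as a kernel page so the kernel can write it, also mark it non-executable so the fetch protection SMEP expects is not silently lost:

```c
#include <stdint.h>
#include <stdbool.h>

#define SPTE_USER_MASK (1ULL << 2)    /* example user/supervisor bit */
#define SPTE_NX_MASK   (1ULL << 63)   /* example no-execute bit      */

static uint64_t adjust_spte_for_cr0_wp0(uint64_t spte, bool cr0_wp,
                                        bool kernel_writes_user_page)
{
    if (!cr0_wp && kernel_writes_user_page) {
        spte &= ~SPTE_USER_MASK;      /* map it as a kernel page ...  */
        spte |= SPTE_NX_MASK;         /* ... but forbid insn fetches  */
    }
    return spte;
}
```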
-
Submitted by Xiao Guangrong
Introduce drop_parent_pte to remove the rmap of the parent pte and clear the parent pte. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Submitted by Xiao Guangrong
Clean up the operation duplicated between kvm_mmu_page_unlink_children and mmu_pte_write_zap_pte. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Submitted by Xiao Guangrong
Parent pte rmap and page rmap are very similar, so use the same algorithm for both. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Submitted by Xiao Guangrong
Abstract the rmap operations into spte_list so that we can use them for the reverse mapping of parent ptes in a later patch. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Submitted by Xiao Guangrong
Simply return from the kvm_mmu_pte_write path if no shadow page is write-protected; then we avoid walking all shadow pages and holding the mmu lock. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
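A minimal sketch of the fast path, with an illustrative counter name for the number of write-protected (shadowed) gfns:

```c
#include <stdbool.h>

struct kvm_arch_example {
    unsigned int indirect_shadow_pages;  /* write-protected gfns (illustrative) */
};

/* Guest pte write: if nothing is write-protected there is no shadow
 * page to update, so skip taking the mmu lock and walking shadow pages. */
static bool pte_write_needs_shadow_update(const struct kvm_arch_example *arch)
{
    if (arch->indirect_shadow_pages == 0)
        return false;                    /* fast path: return immediately */

    return true;                         /* slow path: lock and walk      */
}
```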
-