- 01 August 2010, 38 commits
-
-
Committed by Joerg Roedel
This patch converts unnecessary divide and modulo operations in the KVM large page related code into logical operations. This allows gfn_t to be converted to u64 without breaking 32-bit builds. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
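A minimal sketch of the idea, with illustrative names and constants rather than the exact kernel code (assuming a 2MB large page made of 512 4K pages):

```c
#include <stdint.h>

typedef uint64_t gfn_t;
#define HPAGE_GFN_SHIFT 9   /* 512 small pages per 2MB large page */

/* Before: divide/modulo on a 64-bit gfn; on 32-bit builds, generic 64-bit
 * division can require libgcc helpers the kernel does not provide. */
static gfn_t hpage_index_div(gfn_t gfn)   { return gfn / 512; }
static gfn_t hpage_offset_mod(gfn_t gfn)  { return gfn % 512; }

/* After: the same results using only logical operations, safe for a u64
 * gfn_t even on 32-bit builds. */
static gfn_t hpage_index_shift(gfn_t gfn) { return gfn >> HPAGE_GFN_SHIFT; }
static gfn_t hpage_offset_mask(gfn_t gfn) { return gfn & ((1ULL << HPAGE_GFN_SHIFT) - 1); }
```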
-
Committed by Xiao Guangrong
While we sync many unsync sps at one time (in mmu_sync_children()), we may map an spte writable, which is dangerous if one unsync sp's mapped gfn is another unsync page's gfn. For example: SP1.pte[0] = P, SP2.gfn's pfn = P [SP1.pte[0] maps SP2.gfn's pfn]. First we write-protect SP1 and SP2, but both are still unsync sps. Then we sync SP1 first; it detects that SP1.pte[0].gfn has only one unsync sp (namely SP2), so it maps the spte writable. But we plan to sync SP2 soon, and at that point SP2->unsync is no longer reliable, since when we later sync SP2, SP2->gfn is already writable. The final result: SP2 is a synced page, yet SP2.gfn is writable. This bug can corrupt the guest's page tables. Fix it by marking the mapping read-only if the mapped gfn has shadow pages. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity
Makes it a little more readable and hackable. Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Avi Kivity
As advertised in feature-removal-schedule.txt. Equivalent support is provided by overlapping memory regions. Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
When we mark the parent's unsync_child_bitmap and the parent is already unsynced, there is no need to walk its parents; this avoids some unnecessary work. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Xiao Guangrong
In the current code, some pages' unsync_child_bitmap is not cleared completely in mmu_sync_children(); for example, if two PDPEs share one PDT, one PDPE's unsync_child_bitmap is not cleared. Currently this does no harm beyond a little extra overhead, but it is preparatory work for a later patch. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Xiao Guangrong
Decrease sp->unsync_children after clearing the unsync_child_bitmap bit. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Xiao Guangrong
If the sync-sp is only synced transiently, don't mark its ptes notrap. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Xiao Guangrong
The sync page is already write-protected in mmu_sync_children(); don't write-protect it again. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity
On Intel, we call skip_emulated_instruction() even if we injected a #GP, resulting in the #GP pointing at the wrong address. Fix by injecting the exception and skipping the instruction at the same place, so we do just one or the other. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Xiao Guangrong
Delay the local tlb flush until guest entry; this can reduce the vpid flush frequency and reduce remote tlb flush IPIs (if the KVM_REQ_TLB_FLUSH bit is already set, no IPI is sent). Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
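A rough sketch of the request-bit pattern this describes, using made-up names; the real kvm_mmu_flush_tlb()/KVM_REQ_TLB_FLUSH machinery in arch/x86/kvm differs in detail:

```c
#define REQ_TLB_FLUSH (1UL << 0)

struct vcpu {
	unsigned long requests;
};

/* Instead of flushing immediately, just record that a flush is needed. */
static void mmu_flush_tlb(struct vcpu *vcpu)
{
	vcpu->requests |= REQ_TLB_FLUSH;
}

/* At guest entry, perform at most one flush; a remote IPI can likewise be
 * skipped when the target vcpu already has the bit set. */
static void enter_guest(struct vcpu *vcpu)
{
	if (vcpu->requests & REQ_TLB_FLUSH) {
		vcpu->requests &= ~REQ_TLB_FLUSH;
		/* the actual hardware tlb/vpid flush would happen here */
	}
}
```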
-
Committed by Xiao Guangrong
Use the kvm_mmu_flush_tlb() function instead of calling kvm_x86_ops->tlb_flush(vcpu) directly. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
This remote tlb flush is not necessary, since we have already synced while the sp was zapped. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
Collect remote tlb flushes in the kvm_mmu_pte_write() path. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
Now we can safely traverse the sp hlist. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
Use kvm_mmu_prepare_zap_page() and kvm_mmu_commit_zap_page() instead of kvm_mmu_zap_page(), which can reduce remote tlb flush IPIs. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
In a later patch, we will change the way sps are zapped, like so: kvm_mmu_prepare_zap_page A; kvm_mmu_prepare_zap_page B; kvm_mmu_prepare_zap_page C; ...; kvm_mmu_commit_zap_page [zapping multiple sps needs only one call to kvm_mmu_commit_zap_page]. In __kvm_mmu_free_some_pages(), the free page count is read from 'vcpu->kvm->arch.n_free_mmu_pages' inside the loop, which hinders applying kvm_mmu_prepare_zap_page() and kvm_mmu_commit_zap_page(), since kvm_mmu_prepare_zap_page() does not free the sp. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
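A minimal sketch of the prepare/commit batching pattern outlined above (see also the next entry); the function names follow the commit message, but the list handling and bodies are illustrative, not the kernel code:

```c
#include <stddef.h>

struct kvm_mmu_page {
	struct kvm_mmu_page *next;   /* stand-in for the kernel's list_head */
};

/* Unlink the sp from the mmu structures and queue it; do not flush remote
 * tlbs or free anything yet. */
static void prepare_zap_page(struct kvm_mmu_page *sp,
			     struct kvm_mmu_page **invalid_list)
{
	sp->next = *invalid_list;
	*invalid_list = sp;
}

/* One commit per batch: a single remote tlb flush, then free everything. */
static void commit_zap_page(struct kvm_mmu_page **invalid_list)
{
	if (*invalid_list == NULL)
		return;
	/* kvm_flush_remote_tlbs() would be called here, exactly once */
	while (*invalid_list) {
		struct kvm_mmu_page *sp = *invalid_list;
		*invalid_list = sp->next;
		/* free sp here */
	}
}
```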
-
Committed by Xiao Guangrong
Split kvm_mmu_zap_page() into kvm_mmu_prepare_zap_page() and kvm_mmu_commit_zap_page(); then we can traverse the hlist safely, and easily gather the remote tlb flushes that occur while pages are zapped. These features are used in later patches. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
Introduce for_each_gfn_sp() and for_each_gfn_indirect_valid_sp() to clean up hlist traversal. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
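A self-contained sketch of the traversal-helper idea: wrap the repeated "walk the hash bucket and filter" pattern so call sites only state what they iterate over. The structures and macro bodies below are illustrative, not the kernel's macros:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gfn_t;

struct sp {
	gfn_t gfn;
	bool direct;
	bool invalid;
	struct sp *hash_next;
};

/* Walk every shadow page in one hash bucket that maps 'gfn'. */
#define for_each_gfn_sp(bucket, sp, gfn)				\
	for ((sp) = (bucket); (sp); (sp) = (sp)->hash_next)		\
		if ((sp)->gfn != (gfn)) { } else

/* Same, but skip direct and invalid pages. */
#define for_each_gfn_indirect_valid_sp(bucket, sp, gfn)			\
	for_each_gfn_sp(bucket, sp, gfn)				\
		if ((sp)->direct || (sp)->invalid) { } else

/* Usage: visit every valid indirect shadow page for a gfn. */
static void sync_gfn(struct sp *bucket, gfn_t gfn)
{
	struct sp *sp;

	for_each_gfn_indirect_valid_sp(bucket, sp, gfn) {
		/* sync sp here */
	}
}
```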
-
Committed by Xiao Guangrong
In kvm_mmu_unprotect_page(), invalid sps can be skipped. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Gui Jianfeng
There's no need to calculate the quadrant if tdp is enabled. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity
When tdp is enabled, the guest's cr0.wp shouldn't have any effect on spte permissions. Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Gui Jianfeng
Since we have is_writable_pte(), make use of it. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Lai Jiangshan
In Documentation/kvm/mmu.txt: "gfn: Either the guest page table containing the translations shadowed by this page, or the base page frame for linear translations. See role.direct." But in __direct_map(), the base gfn calculation is incorrect; it does not compute the right value when level=3 or 4. Fix by using PT64_LVL_ADDR_MASK(), which accounts for all levels correctly. Reported-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
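A sketch of the level-aware base-gfn computation the fix calls for, using constants that mirror x86 long-mode paging (9 index bits per level, 4K base pages); this illustrates the idea rather than reproducing PT64_LVL_ADDR_MASK() itself:

```c
#include <stdint.h>

typedef uint64_t gfn_t;

#define PT64_LEVEL_BITS 9

/* gfns covered by one entry at 'level': 1 at level 1 (4K), 512 at level 2
 * (2M), 512*512 at level 3 (1G), and so on. */
static gfn_t gfns_per_entry(int level)
{
	return (gfn_t)1 << ((level - 1) * PT64_LEVEL_BITS);
}

/* Base gfn of the region mapped at 'level'; correct for level 3 and 4 as
 * well, not just level 2. */
static gfn_t base_gfn(gfn_t gfn, int level)
{
	return gfn & ~(gfns_per_entry(level) - 1);
}
```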
-
Committed by Lai Jiangshan
When sp->role.direct is set, sp->gfns does not contain any essential information: the leaf sptes reachable from this sp map a contiguous guest physical memory range (a linear range), so sp->gfns[i] (if it were set) would equal sp->gfn + i (at PT_PAGE_TABLE_LEVEL). Obviously that is not essential information; we can calculate it when needed. This means we don't need sp->gfns when sp->role.direct=1, so we can save one page for every kvm_mmu_page. Note: access to sp->gfns must be wrapped by kvm_mmu_page_get_gfn() or kvm_mmu_page_set_gfn(); it is only exposed in FNAME(sync_page). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
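A simplified sketch of the accessor idea described above: for direct shadow pages the gfn is computed from sp->gfn and the index, while indirect pages still read the sp->gfns array. Field and function names are illustrative:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gfn_t;

struct mmu_page {
	gfn_t gfn;
	gfn_t *gfns;    /* left NULL when direct, saving one page per sp */
	bool direct;
};

static gfn_t mmu_page_get_gfn(struct mmu_page *sp, int index)
{
	if (sp->direct)
		return sp->gfn + index;   /* linear range: sp->gfn + i */
	return sp->gfns[index];
}

static void mmu_page_set_gfn(struct mmu_page *sp, int index, gfn_t gfn)
{
	if (!sp->direct)
		sp->gfns[index] = gfn;    /* nothing to store for direct sps */
}
```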
-
Committed by Xiao Guangrong
Allow more pages to become unsync at sp-allocation time: if we need to create a new shadow page for a gfn but it is not allowed to be unsync (level > 1), we should sync all of the gfn's unsync pages. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
In the current code, a shadow page can become unsync only if it is the single shadow page for its gfn. This rule is too strict; in fact, we can let all last-level mapping pages (i.e., pte pages) become unsync, and sync them at invlpg or tlb flush time. This patch allows more pages to become unsync at gfn mapping time. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Avi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
Two cases can happen in kvm_mmu_get_page(): in the first, the goal sp is already in the cache; if the sp is unsync, we only need to update it to make sure the mapping is valid, but we do not mark it sync and do not write-protect sp->gfn, since that does not break the unsync rule (one shadow page per gfn). In the second, the goal sp does not exist; we need to create a new sp for the gfn, and the gfn may already have another shadow page, so to keep the unsync rule we should sync (mark sync and write-protect) the gfn's unsync shadow pages. After enabling multiple unsync shadow pages, we sync those shadow pages only when the new sp is not allowed to become unsync (also per the unsync rule; the new rule is: all pte pages are allowed to become unsync). Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
Split kvm_sync_page() into kvm_sync_page() and kvm_sync_page_transient() to clarify the code, addressing Avi's suggestion. kvm_sync_page_transient() only updates the shadow page; it does not mark it sync and does not write-protect sp->gfn. It will be used by a later patch. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Xiao Guangrong
Remove the rmap before clearing the spte; otherwise it will trigger a BUG_ON() in functions such as rmap_write_protect(). Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Xiao Guangrong
Use kmem_cache_free() to free objects allocated by kmem_cache_alloc(). Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
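A minimal kernel-style sketch of the pairing the patch enforces; the cache name and helpers are illustrative:

```c
#include <linux/slab.h>

static struct kmem_cache *example_cache;   /* hypothetical cache */

static void *example_alloc(void)
{
	return kmem_cache_zalloc(example_cache, GFP_KERNEL);
}

static void example_free(void *obj)
{
	/* Objects that came from kmem_cache_alloc()/kmem_cache_zalloc()
	 * should go back through kmem_cache_free(), not plain kfree(). */
	kmem_cache_free(example_cache, obj);
}
```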
-
Committed by Sheng Yang
mmu.free() already sets root_hpa to INVALID_PAGE; there is no need to do it again in destroy_kvm_mmu(). kvm_x86_ops->set_cr4() and set_efer() already assign cr4/efer to vcpu->arch.cr4/efer; no need to do it again later. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Marcelo Tosatti
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity
We drop the mmu lock between freeing memory and allocating the roots; this allows another vcpu to sneak in and allocate memory. While the race is benign (resulting only in temporary overallocation, not oom), it is simple and easy to fix by moving the freeing close to the allocation. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Gleb Natapov
Do not kill the VM when instruction emulation fails. Inject #UD and report the failure to userspace instead. Userspace may choose to re-enter the guest if the vcpu was in user mode (cpl == 3), in which case the guest OS will kill the offending process and continue running. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
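A condensed, self-contained sketch of the policy described here; the real handling uses KVM's exception-queueing and kvm_run exit machinery, so all names below are illustrative:

```c
#include <stdbool.h>

enum handle_result { RESUME_GUEST, EXIT_TO_USERSPACE };

struct vcpu_state {
	int cpl;            /* guest privilege level at the failure point */
	bool ud_pending;    /* stand-in for the queued #UD exception */
};

static enum handle_result handle_emulation_failure(struct vcpu_state *vcpu)
{
	/* Instead of killing the whole VM, queue #UD for the guest ... */
	vcpu->ud_pending = true;

	/* ... and report the failure to userspace; userspace may simply
	 * re-enter the guest when cpl == 3, letting the guest OS kill only
	 * the offending process. */
	return EXIT_TO_USERSPACE;
}
```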
-
Committed by Gui Jianfeng
Currently, kvm_mmu_zap_page() returns the number of freed child sps. This might confuse the caller, because the caller doesn't know the number actually freed. Let's make kvm_mmu_zap_page() return the number of pages it actually freed. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Huang Ying
In common cases, a guest SRAO MCE will cause the corresponding poisoned page to be unmapped and SIGBUS to be sent to QEMU-KVM, which then relays the MCE to the guest OS. But it has been reported that if the poisoned page is accessed in the guest after unmapping and before the MCE is relayed to the guest OS, userspace will be killed. The reason is as follows: because the poisoned page has been unmapped, the guest access causes a guest exit and kvm_mmu_page_fault is called. kvm_mmu_page_fault cannot get the poisoned page for the fault address, so kernel and user space MMIO processing are tried in turn. In user MMIO processing, the poisoned page is accessed again, and userspace is killed by force_sig_info. To fix the bug, kvm_mmu_page_fault sends a HWPOISON signal to QEMU-KVM and does not try kernel and user space MMIO processing for the poisoned page. [xiao: fix warning introduced by avi] Reported-by: Max Asbock <masbock@linux.vnet.ibm.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
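A kernel-style sketch of the fix's shape: when the faulting page turns out to be hwpoisoned, deliver SIGBUS with BUS_MCEERR_AR to the faulting task (QEMU-KVM) instead of falling through to MMIO emulation. Simplified and illustrative; the real helpers live in virt/kvm and arch/x86/kvm:

```c
#include <linux/mm.h>
#include <linux/sched.h>
#include <linux/signal.h>

static void send_hwpoison_signal(unsigned long address, struct task_struct *tsk)
{
	siginfo_t info;

	info.si_signo    = SIGBUS;
	info.si_errno    = 0;
	info.si_code     = BUS_MCEERR_AR;           /* action required */
	info.si_addr     = (void __user *)address;  /* faulting address */
	info.si_addr_lsb = PAGE_SHIFT;              /* granularity: one page */

	send_sig_info(SIGBUS, &info, tsk);
}
```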
-
- 19 July 2010, 1 commit
-
-
Committed by Dave Chinner
The current shrinker implementation requires the registered callback to have global state to work from. This makes it difficult to shrink caches that are not global (e.g. per-filesystem caches). Pass the shrinker structure to the callback so that users can embed the shrinker structure in the context the shrinker needs to operate on, and get back to it in the callback via container_of(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
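A self-contained sketch of the pattern this enables: embed the shrinker in a per-filesystem context and recover the context in the callback via container_of(). The structure layout and callback signature are illustrative; the real kernel API differs in detail:

```c
#include <stddef.h>
#include <stdio.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct shrinker {
	int (*shrink)(struct shrinker *s, int nr_to_scan);
};

struct fs_context {
	int cached_objects;
	struct shrinker shrinker;   /* embedded: no global state required */
};

static int fs_shrink(struct shrinker *s, int nr_to_scan)
{
	struct fs_context *fs = container_of(s, struct fs_context, shrinker);

	if (nr_to_scan > fs->cached_objects)
		nr_to_scan = fs->cached_objects;
	fs->cached_objects -= nr_to_scan;   /* drop some per-fs cached objects */
	return fs->cached_objects;          /* report what remains */
}

int main(void)
{
	struct fs_context fs = {
		.cached_objects = 100,
		.shrinker = { .shrink = fs_shrink },
	};

	fs.shrinker.shrink(&fs.shrinker, 10);
	printf("%d objects remain\n", fs.cached_objects);
	return 0;
}
```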
-
- 13 July 2010, 1 commit
-
-
Committed by Xiao Guangrong
After removing an rmap, we should flush every vcpu's tlb. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-