Commits · 8df25a328a6ca3bd0f048278f4d5ae0a1f6fadc1 · openeuler / raspberrypi-kernel

24 Oct, 2010 · 8 commits

KVM: MMU: Track page fault data in struct vcpu · 8df25a32

Committed by Joerg Roedel on Sep 10, 2010

This patch introduces a struct with two new fields in
vcpu_arch for x86:

	* fault.address
	* fault.error_code

This will be used to correctly propagate page faults back
into the guest when we could have either an ordinary page
fault or a nested page fault. In the case of a nested page
fault the fault-address is different from the original
address that should be walked. So we need to keep track
of the real fault-address.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
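
As a rough illustration of the two fields this commit describes, the sketch below models them in standalone C. The names `toy_vcpu_arch` and `toy_record_fault` and the field widths are assumptions made for the example; the commit message only names `fault.address` and `fault.error_code`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the x86 per-vcpu architecture state. */
struct toy_vcpu_arch {
    struct {
        uint64_t address;     /* address that actually faulted */
        uint32_t error_code;  /* page fault error code to report */
    } fault;
};

/* Remember the real fault address and error code so they can be
 * propagated to the guest later, for ordinary and nested faults alike. */
void toy_record_fault(struct toy_vcpu_arch *arch,
                      uint64_t address, uint32_t error_code)
{
    assert(arch != NULL);
    arch->fault.address = address;
    arch->fault.error_code = error_code;
}
```

Keeping both values together means the code that later injects the fault does not have to re-derive which address was really at fault.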

KVM: MMU: Let is_rsvd_bits_set take mmu context instead of vcpu · 3241f22d

Committed by Joerg Roedel on Sep 10, 2010

This patch changes the is_rsvd_bits_set() function prototype to
take only a kvm_mmu context instead of a full vcpu.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Introduce get_cr3 function pointer · 5777ed34

Committed by Joerg Roedel on Sep 10, 2010

This function pointer in the MMU context is required to
implement Nested Nested Paging.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Check for root_level instead of long mode · 957446af

Committed by Joerg Roedel on Sep 10, 2010

The walk_addr function checks for !is_long_mode in its
64-bit version. But what is meant here is a check for PAE
paging. Change the condition to really check for PAE paging
so that it also works with nested nested paging.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: support disable/enable mmu audit dynamicly · 8b1fe17c

Committed by Xiao Guangrong on Aug 30, 2010

Add a r/w module parameter named 'mmu_audit' that enables or
disables the audit code at runtime:

enable:
  echo 1 > /sys/module/kvm/parameters/mmu_audit

disable:
  echo 0 > /sys/module/kvm/parameters/mmu_audit

This patch does not change the logic.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: fix wrong not write protected sp report · bc32ce21

Committed by Xiao Guangrong on Aug 28, 2010

The audit code currently reports some sps as not write-protected. This is
just a bug in audit_write_protection(), since:

- an invalid sp does not need to be write-protected
- it uses an uninitialized local variable ('gfn')
- it calls kvm_mmu_audit() outside of mmu_lock's protection

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: combine guest pte read between fetch and pte prefetch · 189be38d

Committed by Xiao Guangrong on Aug 22, 2010

Combine the guest pte reads done by the guest pte check in the fetch path
and by pte prefetch.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: prefetch ptes when intercepted guest #PF · 957ed9ef

Committed by Xiao Guangrong on Aug 22, 2010

Support prefetching ptes when intercepting a guest #PF, to avoid #PFs on
later accesses.

If we meet any failure in the prefetch path, we exit it and do not
try other ptes, so that it does not become a heavy path.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
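
The "exit on first failure" rule from this commit can be sketched in standalone C as follows. This is a toy model, not the kernel's prefetch code: `toy_prefetch_ptes`, the flat arrays, and the assumption that bit 0 marks a present pte are all invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PTE_PRESENT 1ull   /* assumed: bit 0 marks a present pte */

/* Try to install up to `count` shadow ptes starting at `start`.
 * Stops at the first gpte that cannot be mapped instead of skipping it,
 * so a failure never turns the prefetch into a long scan.
 * Returns the number of sptes installed. */
int toy_prefetch_ptes(const uint64_t *gptes, uint64_t *sptes,
                      int start, int count)
{
    int installed = 0;

    assert(start >= 0 && count >= 0);
    for (int i = start; i < start + count; i++) {
        if (!(gptes[i] & TOY_PTE_PRESENT))
            break;              /* first failure: give up on the rest */
        sptes[i] = gptes[i];    /* toy "mapping": copy the entry */
        installed++;
    }
    return installed;
}
```

Breaking out of the loop rather than continuing keeps the cost of a failed prefetch bounded, which matches the stated goal of not making the fault path heavy.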

02 Aug, 2010 · 11 commits

KVM: MMU: add missing reserved bits check in speculative path · fa1de2bf

Committed by Xiao Guangrong on Jul 16, 2010

In the speculative path, we should check the guest pte's reserved bits just as
the real processor does.

Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Eliminate redundant temporaries in FNAME(fetch) · 24157aaf

Committed by Avi Kivity on Jul 13, 2010

'level' and 'sptep' are aliases for 'iterator.level' and 'iterator.sptep', no
need for them.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: Validate all gptes during fetch, not just those used for new pages · 5991b332

Committed by Avi Kivity on Jul 13, 2010

Currently, when we fetch an spte, we only verify that gptes match those that
the walker saw if we build new shadow pages for them.

However, this misses the following race:

  vcpu1            vcpu2

  walk
                  change gpte
                  walk
                  instantiate sp

  fetch existing sp

Fix by validating every gpte, regardless of whether it is used for building
a new sp or not.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
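
A minimal sketch of "validate every gpte" in standalone C, assuming the walker keeps a snapshot of each gpte together with its location. The structure `toy_walker` and the function `toy_gptes_unchanged` are hypothetical and only model the idea; they are not the kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_LEVELS 4

/* Hypothetical snapshot taken by the guest page table walker. */
struct toy_walker {
    int top_level;                          /* highest level walked, 1-based */
    uint64_t seen[TOY_MAX_LEVELS];          /* gpte value seen at each level */
    const uint64_t *where[TOY_MAX_LEVELS];  /* location of each gpte */
};

/* Return true only if every gpte still matches what the walker saw,
 * whether or not a new shadow page is being built for that level. */
bool toy_gptes_unchanged(const struct toy_walker *w)
{
    assert(w->top_level >= 1 && w->top_level <= TOY_MAX_LEVELS);
    for (int level = w->top_level; level >= 1; level--) {
        if (*w->where[level - 1] != w->seen[level - 1])
            return false;   /* changed behind our back: retry the fault */
    }
    return true;
}
```

In the race from the diagram, vcpu1 would see a mismatch at the level vcpu2 changed and abandon the fetch instead of reusing the existing sp.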

KVM: MMU: Simplify spte fetch() function · 0b3c9333

Committed by Avi Kivity on Jul 13, 2010

Partition the function into three sections:

- fetching indirect shadow pages (host_level > guest_level)
- fetching direct shadow pages (page_level < host_level <= guest_level)
- the final spte (page_level == host_level)

Instead of the current spaghetti.

A slight change from the original code is that we call validate_direct_spte()
more often: previously we called it only for gw->level, now we also call it for
lower levels.  The change should have no effect.

[xiao: fix regression caused by validate_direct_spte() called too late]

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
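
The three level ranges listed in this commit can be written as a small classifier. The sketch below is illustrative only: the enum and `toy_classify` are hypothetical names, and the real function walks the levels in a loop rather than classifying one level at a time.

```c
#include <assert.h>

enum toy_fetch_section {
    TOY_INDIRECT,   /* host_level > guest_level */
    TOY_DIRECT,     /* page_level < host_level <= guest_level */
    TOY_FINAL,      /* page_level == host_level */
    TOY_NONE        /* below the target page level: nothing to do */
};

/* Decide which of the three sections handles a given shadow level. */
enum toy_fetch_section toy_classify(int host_level, int guest_level,
                                    int page_level)
{
    assert(page_level >= 1 && page_level <= guest_level);
    if (host_level > guest_level)
        return TOY_INDIRECT;
    if (host_level > page_level)
        return TOY_DIRECT;
    if (host_level == page_level)
        return TOY_FINAL;
    return TOY_NONE;
}
```

For a guest walk that ends at level 2 and a host page mapped at level 1, levels 4 and 3 fall in the indirect section, level 2 in the direct section, and level 1 is the final spte.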

KVM: MMU: Add gpte_valid() helper · 39c8c672

Committed by Avi Kivity on Jul 13, 2010

Move the code to check whether a gpte has changed since we fetched it into
a helper.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: Add validate_direct_spte() helper · a357bd22

Committed by Avi Kivity on Jul 13, 2010

Add a helper to verify that a direct shadow page is valid wrt the required
access permissions; drop the page if it is not valid.

Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: Add drop_large_spte() helper · a3aa51cf

Committed by Avi Kivity on Jul 13, 2010

To clarify spte fetching code, move large spte handling into a helper.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: Add link_shadow_page() helper · 32ef26a3

Committed by Avi Kivity on Jul 13, 2010

To simplify the process of fetching an spte, add a helper that links
a shadow page to an spte.

Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: Keep going on permission error · f59c1d2d

Committed by Avi Kivity on Jul 06, 2010

Real hardware disregards permission errors when computing page fault error
code bit 0 (page present).  Do the same.

Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Only indicate a fetch fault in page fault error code if nx is enabled · b0eeec29

Committed by Avi Kivity on Jul 06, 2010

Bit 4 of the page fault error code is set only if EFER.NX is set.

Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
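
The two error-code rules from this commit and the previous one (keep walking past a permission error so bit 0 reflects presence, and report a fetch fault in bit 4 only when NX is enabled) can be combined in one small sketch. The bit positions follow the x86 page fault error code; the walk model, `toy_level`, and `toy_walk_error_code` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PFERR_PRESENT (1u << 0)
#define TOY_PFERR_WRITE   (1u << 1)
#define TOY_PFERR_USER    (1u << 2)
#define TOY_PFERR_FETCH   (1u << 4)

/* Hypothetical per-level result of a guest page table walk. */
struct toy_level {
    bool present;    /* entry is present */
    bool permitted;  /* entry allows the requested access */
};

/* Returns -1 if the access succeeds, otherwise the error code. */
int toy_walk_error_code(const struct toy_level *levels, int n,
                        bool write, bool user, bool fetch, bool nx_enabled)
{
    bool present = true, eperm = false;
    uint32_t code = 0;

    assert(n >= 1);
    for (int i = 0; i < n; i++) {
        if (!levels[i].present) {
            present = false;
            break;
        }
        if (!levels[i].permitted)
            eperm = true;   /* keep going: a lower level may be non-present */
    }
    if (present && !eperm)
        return -1;
    if (present)
        code |= TOY_PFERR_PRESENT;
    if (write)
        code |= TOY_PFERR_WRITE;
    if (user)
        code |= TOY_PFERR_USER;
    if (fetch && nx_enabled)
        code |= TOY_PFERR_FETCH;   /* bit 4 only when NX is enabled */
    return (int)code;
}
```

With this model, a permission error at an upper level followed by a non-present entry lower down yields an error code with bit 0 clear.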

KVM: MMU: Introduce drop_spte() · be38d276

Committed by Avi Kivity on Jun 06, 2010

When we call rmap_remove(), we (almost) always immediately follow it by
an __set_spte() to a nonpresent pte.  Since we need to perform the two
operations atomically, to avoid losing the dirty and accessed bits, introduce
a helper drop_spte() and convert all call sites.

The operation is still nonatomic at this point.

Signed-off-by: Avi Kivity <avi@redhat.com>

01 Aug, 2010 · 16 commits

KVM: MMU: cleanup FNAME(fetch)() functions · 84754cd8

Committed by Xiao Guangrong on Jun 30, 2010

Clean up this function, since we already have the direct sp's access.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: fix direct sp's access corrupted · 9e7b0e7f

Committed by Xiao Guangrong on Jun 30, 2010

If the mapping is writable but the dirty flag is not set, we will find
the read-only direct sp and set up the mapping. Then, if a write #PF
occurs, we will mark this mapping writable in the read-only direct sp;
now other, really read-only mappings will happily write to it without a #PF.

It may hurt the guest's COW.

Fixed by re-installing the mapping when a write #PF occurs.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: fix conflict access permissions in direct sp · 5fd5387c

Committed by Xiao Guangrong on Jun 30, 2010

In non-direct mapping, we mark an sp as 'direct' when we map the
guest's large page, but its access is encoded from the upper page-structure
entry only and does not include the last mapping, which causes an access
conflict.

For example, take this mapping:
        [W]
      / PDE1 -> |---|
  P[W]          |   | LPA
      \ PDE2 -> |---|
        [R]

P has two children, PDE1 and PDE2, and both PDE1 and PDE2 map the
same large page (LPA). P's access is WR, PDE1's access is WR, and
PDE2's access is RO (considering only read-write permissions here).

When the guest accesses PDE1, we create a direct sp for LPA; the sp's
access comes from P and is W, so we mark the ptes in this sp as W.

Then, when the guest accesses PDE2, we find LPA's shadow page, which is the
same as PDE1's, and mark the ptes as RO.

So, if the guest accesses PDE1 again, an incorrect #PF occurs.

Fixed by encoding the last mapping's access into the direct shadow page.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
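
To make the PDE1/PDE2 example concrete, the sketch below folds the last mapping's access into the access recorded for the direct shadow page. The access bit values and the helper `toy_direct_sp_access` are assumptions for illustration; the commit message only says that the last mapping's access is encoded into the direct sp.

```c
#include <assert.h>

/* Assumed toy access bits, not the kernel's definitions. */
#define TOY_ACC_EXEC  1u
#define TOY_ACC_WRITE 2u
#define TOY_ACC_USER  4u

/* Access recorded for a direct sp: the upper-level access restricted
 * by the access of the last (large page) mapping. */
unsigned toy_direct_sp_access(unsigned upper_access, unsigned last_access)
{
    return upper_access & last_access;
}
```

With P writable, PDE1 writable and PDE2 read-only, the two results differ, so a lookup keyed on the access would no longer hand PDE2 the shadow page that was created for PDE1.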

KVM: Remove memory alias support · a1f4d395

Committed by Avi Kivity on Jun 21, 2010

As advertised in feature-removal-schedule.txt.  Equivalent support is provided
by overlapping memory regions.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: don't mark pte notrap if it's just sync transient · be71e061

Committed by Xiao Guangrong on Jun 11, 2010

If the sp is only synced transiently, don't mark its pte notrap.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: cleanup for dirty page judgment · cb83cad2

Committed by Xiao Guangrong on Jun 11, 2010

Use a wrapper function to clean up the page dirty judgment.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: rename 'page' and 'shadow_page' to 'sp' · ac3cd03c

Committed by Xiao Guangrong on Jun 11, 2010

Rename 'page' and 'shadow_page' to 'sp' to better fit the context.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Fix unused but set warnings · a24e8099

Committed by Andi Kleen on Jun 10, 2010

No real bugs in this one.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: calculate correct gfn for small host pages backing large guest pages · 3af1817a

Committed by Lai Jiangshan on May 26, 2010

In Documentation/kvm/mmu.txt:
  gfn:
    Either the guest page table containing the translations shadowed by this
    page, or the base page frame for linear translations. See role.direct.

But in function FNAME(fetch)(), sp->gfn is incorrect when one of the following
situations occurs:

 1) the guest uses 32-bit paging and the guest PDE maps a 4-MByte page
    (backed by 4k host pages); FNAME(fetch)() misses handling the quadrant.

    And if the guest uses pse-36, "table_gfn = gpte_to_gfn(gw->ptes[level - delta]);"
    is incorrect.

 2) the guest uses long mode paging and the guest PDPTE maps a 1-GByte page
    (backed by 4k or 2M host pages).

So we fix it to match the documentation and the code, which
requires sp->gfn to be correct when sp->role.direct=1.

We use the goal mapping gfn (gw->gfn) to calculate the base page frame
for linear translations; it is simple and easy to understand.

Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Reported-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
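
One way to picture "calculate the base page frame from the goal mapping gfn" is to round the gfn down to the start of the large guest page. The sketch below assumes 64-bit paging with 9 index bits per level and does not model the 32-bit quadrant case; `toy_direct_base_gfn` and its `level` parameter are hypothetical names for this example.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_LEVEL_BITS 9   /* 512 entries per 64-bit page table level */

/* Round the goal gfn down to the first 4k frame covered by a guest
 * mapping at `level` (1 = 4k page, 2 = 2M page, 3 = 1G page). */
uint64_t toy_direct_base_gfn(uint64_t goal_gfn, int level)
{
    assert(level >= 1 && level <= 4);
    uint64_t pages = 1ull << ((level - 1) * TOY_LEVEL_BITS);
    return goal_gfn & ~(pages - 1);
}
```

Every fault inside the same large guest page then produces the same base gfn, which is what a lookup of the direct shadow page needs.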

KVM: MMU: Don't allocate gfns page for direct mmu pages · 2032a93d

Committed by Lai Jiangshan on May 26, 2010

When sp->role.direct is set, sp->gfns does not contain any essential
information: leaf sptes reachable from this sp are for a contiguous
guest physical memory range (a linear range).
So sp->gfns[i] (if it was set) equals sp->gfn + i. (PT_PAGE_TABLE_LEVEL)
Obviously, it is not essential information; we can calculate it when needed.

It means we don't need sp->gfns when sp->role.direct=1,
so we can save one page for every such kvm_mmu_page.

Note:
  Access to sp->gfns must be wrapped by kvm_mmu_page_get_gfn()
  or kvm_mmu_page_set_gfn().
  It is only exposed in FNAME(sync_page).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
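
The accessor idea can be sketched in standalone C as below. This is a simplified model covering only the last page table level, where the message states that sp->gfns[i] equals sp->gfn + i; `toy_mmu_page` and the two `toy_page_*` helpers are hypothetical and are not the kernel's kvm_mmu_page_get_gfn() or kvm_mmu_page_set_gfn().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced shadow page descriptor. */
struct toy_mmu_page {
    bool direct;      /* maps a linear guest physical range */
    uint64_t gfn;     /* base frame (direct) or guest table frame */
    uint64_t *gfns;   /* per-entry gfns; not allocated when direct */
};

/* Read the gfn for entry `index`, computing it for direct pages. */
uint64_t toy_page_get_gfn(const struct toy_mmu_page *sp, int index)
{
    if (sp->direct)
        return sp->gfn + (uint64_t)index;   /* linear: nothing stored */
    return sp->gfns[index];
}

/* Store the gfn for entry `index`; a direct page has nothing to store. */
void toy_page_set_gfn(struct toy_mmu_page *sp, int index, uint64_t gfn)
{
    if (sp->direct) {
        assert(gfn == toy_page_get_gfn(sp, index));
        return;
    }
    sp->gfns[index] = gfn;
}
```

Routing every access through the two helpers is what allows the array to be left unallocated for direct pages.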

KVM: Update Red Hat copyrights · 221d059d

Committed by Avi Kivity on May 23, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: only update unsync page in invlpg path · f78978aa

Committed by Xiao Guangrong on May 15, 2010

Only unsync pages need to be updated at invlpg time, since other shadow
pages are write-protected.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: unalias gfn before sp->gfns[] comparison in sync_page · f55c3f41

Committed by Xiao Guangrong on May 13, 2010

sp->gfns[] contains unaliased gfns, but a gpte might contain a pointer
to an aliased region.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: Fix debug output error in walk_addr() · 518c5a05

Committed by Gui Jianfeng on May 05, 2010

Fix a debug output error in walk_addr().

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: mark page table dirty when a pte is actually modified · f3b8c964

Committed by Gui Jianfeng on May 05, 2010

Sometimes cmpxchg_gpte doesn't modify the gpte; in that case, don't mark
the page table page as dirty.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
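
The rule "mark the page table dirty only when the gpte was actually modified" can be illustrated with a C11 compare-and-exchange. This is a userspace sketch under stated assumptions: `toy_set_gpte_bits` and the `table_dirty` flag are hypothetical stand-ins for cmpxchg_gpte and the dirty marking, not the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Try to set `bits` in a gpte that the walker last saw as `seen`.
 * The table is marked dirty only if the exchange really wrote the gpte.
 * Returns false if the gpte changed underneath us (caller rewalks). */
bool toy_set_gpte_bits(_Atomic uint64_t *gpte, uint64_t seen,
                       uint64_t bits, bool *table_dirty)
{
    uint64_t expected = seen;

    if (!atomic_compare_exchange_strong(gpte, &expected, seen | bits))
        return false;       /* nothing was written: leave dirty state alone */
    *table_dirty = true;    /* the gpte was modified */
    return true;
}
```

A failed exchange leaves both the gpte and the dirty flag untouched, so a stale walk cannot cause a spurious dirty marking.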

KVM: Avoid killing userspace through guest SRAO MCE on unmapped pages · bf998156

Committed by Huang Ying on May 31, 2010

In common cases, a guest SRAO MCE causes the corresponding poisoned page
to be unmapped and SIGBUS to be sent to QEMU-KVM; QEMU-KVM then relays
the MCE to the guest OS.

But it is reported that if the poisoned page is accessed in the guest
after unmapping and before the MCE is relayed to the guest OS, userspace
will be killed.

The reason is as follows. Because the poisoned page has been unmapped,
a guest access causes a guest exit and kvm_mmu_page_fault is
called. kvm_mmu_page_fault cannot get the poisoned page for the fault
address, so kernel and user space MMIO processing is tried in turn. In
user MMIO processing, the poisoned page is accessed again, and userspace
is then killed by force_sig_info.

To fix the bug, kvm_mmu_page_fault sends a HWPOISON signal to QEMU-KVM
and does not try kernel and user space MMIO processing for the poisoned
page.

[xiao: fix warning introduced by avi]

Reported-by: Max Asbock <masbock@linux.vnet.ibm.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

23 Jul, 2010 · 1 commit

KVM: MMU: fix conflict access permissions in direct sp · 6aa0b9de

Committed by Xiao Guangrong on Jun 30, 2010

In non-direct mapping, we mark an sp as 'direct' when we map the
guest's large page, but its access is encoded from the upper page-structure
entry only and does not include the last mapping, which causes an access
conflict.

For example, take this mapping:
        [W]
      / PDE1 -> |---|
  P[W]          |   | LPA
      \ PDE2 -> |---|
        [R]

P has two children, PDE1 and PDE2, and both PDE1 and PDE2 map the
same large page (LPA). P's access is WR, PDE1's access is WR, and
PDE2's access is RO (considering only read-write permissions here).

When the guest accesses PDE1, we create a direct sp for LPA; the sp's
access comes from P and is W, so we mark the ptes in this sp as W.

Then, when the guest accesses PDE2, we find LPA's shadow page, which is the
same as PDE1's, and mark the ptes as RO.

So, if the guest accesses PDE1 again, an incorrect #PF occurs.

Fixed by encoding the last mapping's access into the direct shadow page.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

19 May, 2010 · 2 commits

KVM: MMU: cleanup invlpg code · 884a0ff0

Committed by Xiao Guangrong on Apr 28, 2010

Use is_last_spte() to clean up the invlpg code.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: fix for calculating gpa in invlpg code · 22c9b2d1

Committed by Xiao Guangrong on Apr 28, 2010

If the guest is 32-bit, we should use 'quadrant' to adjust the gpa
offset.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

17 May, 2010 · 2 commits

KVM: MMU: Make use of is_large_pte() in walker · 814a59d2

Committed by Gui Jianfeng on Apr 16, 2010

Make use of is_large_pte() instead of checking the PT_PAGE_SIZE_MASK
bit directly.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Move sync_page() first pte address calculation out of loop · 51fb60d8

Committed by Gui Jianfeng on Apr 16, 2010

Move the first pte address calculation out of the loop to save some cycles.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>