1. 03 Mar 2016 - 3 commits
  2. 23 Feb 2016 - 3 commits
  3. 16 Jan 2016 - 1 commit
    • kvm: rename pfn_t to kvm_pfn_t · ba049e93
      Authored by Dan Williams
      To date, we have implemented two I/O usage models for persistent memory,
      PMEM (a persistent "ram disk") and DAX (mmap persistent memory into
      userspace).  This series adds a third, DAX-GUP, that allows DAX mappings
      to be the target of direct-i/o.  It allows userspace to coordinate
      DMA/RDMA from/to persistent memory.
      
      The implementation leverages the ZONE_DEVICE mm-zone that went into
      4.3-rc1 (also discussed at kernel summit) to flag pages that are owned
      and dynamically mapped by a device driver.  The pmem driver, after
      mapping a persistent memory range into the system memmap via
      devm_memremap_pages(), arranges for DAX to distinguish pfn-only versus
      page-backed pmem-pfns via flags in the new pfn_t type.
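
      As a concrete illustration, here is a minimal user-space model of the
      pfn_t idea; the real definitions live in include/linux/pfn_t.h, and
      the bit positions below are illustrative rather than the kernel's:

        #include <stdint.h>
        #include <stdio.h>

        typedef struct { uint64_t val; } pfn_t;   /* pfn plus type flags */

        #define PFN_DEV (1ULL << 61)  /* illustrative: device-owned range */
        #define PFN_MAP (1ULL << 60)  /* illustrative: backed by a memmap */

        static pfn_t pfn_to_pfn_t(uint64_t pfn, uint64_t flags)
        {
            return (pfn_t){ .val = pfn | flags };
        }

        /* Page-backed unless it is a device pfn with no memmap entry. */
        static int pfn_t_has_page(pfn_t p)
        {
            return !(p.val & PFN_DEV) || (p.val & PFN_MAP);
        }

        int main(void)
        {
            pfn_t mapped = pfn_to_pfn_t(0x1234, PFN_DEV | PFN_MAP);
            pfn_t raw    = pfn_to_pfn_t(0x5678, PFN_DEV);

            printf("mapped pmem has page: %d\n", pfn_t_has_page(mapped)); /* 1 */
            printf("raw pmem has page:    %d\n", pfn_t_has_page(raw));    /* 0 */
            return 0;
        }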
      
      The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn, flags the
      resulting pte(s) inserted into the process page tables with a new
      _PAGE_DEVMAP flag.  Later, when get_user_pages() walks the ptes, it keys
      off _PAGE_DEVMAP to pin the device hosting the page range, keeping it
      active.  Finally, get_page() and put_page() are modified to take
      references against the page mapping established by the device driver.
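
      A hedged sketch of that pinning step, with the page-table walk and the
      device-pagemap lookup reduced to stubs (get_dev_pagemap() here is a
      stand-in refcount bump, not the kernel's signature):

        #include <stdbool.h>
        #include <stdint.h>

        #define _PAGE_PRESENT (1ULL << 0)
        #define _PAGE_DEVMAP  (1ULL << 58)   /* illustrative bit position */

        struct dev_pagemap { long refcount; };

        static bool pte_devmap(uint64_t pte) { return pte & _PAGE_DEVMAP; }

        /* Stand-in for get_dev_pagemap(): pin the hosting device. */
        static void get_dev_pagemap(struct dev_pagemap *pgmap) { pgmap->refcount++; }

        /* The extra step gup takes for device pages: before taking the
         * usual page reference, pin the device's pagemap so the driver
         * cannot tear the mapping down while DMA/RDMA is in flight. */
        static bool gup_one_pte(uint64_t pte, struct dev_pagemap *pgmap)
        {
            if (!(pte & _PAGE_PRESENT))
                return false;
            if (pte_devmap(pte))
                get_dev_pagemap(pgmap);
            /* ...then get_page() on the backing struct page as usual. */
            return true;
        }

        int main(void)
        {
            struct dev_pagemap pgmap = { 0 };
            return !gup_one_pte(_PAGE_PRESENT | _PAGE_DEVMAP, &pgmap);
        }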
      
      This need for "struct page" entries to back persistent memory requires
      memory capacity to store the memmap array.  Given that the memmap array
      for a large pool of persistent memory may exhaust available DRAM,
      introduce a mechanism to allocate the memmap from persistent memory
      itself.  The new "struct vmem_altmap *" parameter to
      devm_memremap_pages() enables arch_add_memory() to use reserved pmem
      capacity rather than the page allocator.
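
      The arithmetic behind that concern is worth spelling out: at a typical
      sizeof(struct page) of 64 bytes per 4 KiB page, the memmap costs about
      1.6% of capacity, which this small calculation makes concrete:

        #include <stdio.h>

        int main(void)
        {
            const unsigned long long pmem_bytes  = 1ULL << 40;  /* 1 TiB pmem */
            const unsigned long long page_size   = 4096;
            const unsigned long long page_struct = 64;  /* typical sizeof(struct page) */

            unsigned long long npages = pmem_bytes / page_size;
            unsigned long long memmap = npages * page_struct;

            /* Prints 16384 MiB: mapping 1 TiB of pmem needs 16 GiB of
             * memmap, which is why the altmap carves it out of pmem. */
            printf("memmap for 1 TiB of pmem: %llu MiB\n", memmap >> 20);
            return 0;
        }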
      
      This patch (of 18):
      
      The core has developed a need for a "pfn_t" type [1].  Move the existing
      pfn_t in KVM to kvm_pfn_t [2].
      
      [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
      [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ba049e93
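
      The shape of the KVM-side rename, sketched with stand-in typedefs (see
      include/linux/kvm_types.h and the patch for the real declarations):

        #include <stdint.h>

        typedef uint64_t u64;
        typedef u64 gfn_t;          /* guest frame number, as in KVM */
        struct kvm;

        /* Before, KVM privately owned the generic name:
         *     typedef u64 pfn_t;
         *     pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn);
         * After, the typedef is KVM-scoped, freeing pfn_t for the core: */
        typedef u64 kvm_pfn_t;
        kvm_pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn);

        int main(void) { return 0; }
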
  4. 07 Jan 2016 - 1 commit
  5. 19 Dec 2015 - 1 commit
  6. 26 Nov 2015 - 11 commits
  7. 10 Nov 2015 - 1 commit
  8. 19 Oct 2015 - 1 commit
    • KVM: x86: MMU: Initialize force_pt_level before calling mapping_level() · 8c85ac1c
      Authored by Takuya Yoshikawa
      Commit fd136902 ("KVM: x86: MMU: Move mapping_level_dirty_bitmap()
      call in mapping_level()") forgot to initialize force_pt_level to false
      in FNAME(page_fault)() before calling mapping_level() like
      nonpaging_map() does.  This can sometimes result in unnecessarily
      forcing page-table-level (4K) mappings.
      
      Fix this and move the first *force_pt_level check in mapping_level()
      before kvm_vcpu_gfn_to_memslot() call to make it a bit clearer that
      the variable must be initialized before mapping_level() gets called.
      
      This change can also avoid calling kvm_vcpu_gfn_to_memslot() when the
      !check_hugepage_cache_consistency() check in tdp_page_fault() forces
      page-table-level mapping (a minimal model follows this entry).
      Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      8c85ac1c
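
      A minimal model of the fix, with KVM's memslot and hugepage logic
      reduced to stubs; only the initialization and the hoisted check mirror
      the commit:

        #include <stdbool.h>
        #include <stdio.h>

        #define PT_PAGE_TABLE_LEVEL 1   /* 4K mappings */
        #define PT_DIRECTORY_LEVEL  2   /* 2M mappings */

        static bool gfn_backed_by_hugepage(unsigned long gfn)
        {
            (void)gfn;
            return true;   /* stub for the memslot/hugepage lookup */
        }

        static int mapping_level(unsigned long gfn, bool *force_pt_level)
        {
            /* The fix also hoists this check above the memslot lookup:
             * if the caller already forced 4K pages, nothing to compute. */
            if (*force_pt_level)
                return PT_PAGE_TABLE_LEVEL;

            return gfn_backed_by_hugepage(gfn) ? PT_DIRECTORY_LEVEL
                                               : PT_PAGE_TABLE_LEVEL;
        }

        static int page_fault(unsigned long gfn)
        {
            bool force_pt_level = false;   /* the missing initialization */
            return mapping_level(gfn, &force_pt_level);
        }

        int main(void)
        {
            printf("level = %d\n", page_fault(0x1000));
            return 0;
        }
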
  9. 16 Oct 2015 - 5 commits
  10. 01 Oct 2015 - 1 commit
  11. 25 Sep 2015 - 2 commits
    • KVM: x86: fix off-by-one in reserved bits check · 58c95070
      Authored by Paolo Bonzini
      29ecd660 ("KVM: x86: avoid uninitialized variable warning",
      2015-09-06) introduced a not-so-subtle problem, which probably
      escaped review because it was not part of the patch context.
      
      Before the patch, leaf was always equal to iterator.level.  After,
      it is equal to iterator.level - 1 in the call to is_shadow_zero_bits_set,
      and when is_shadow_zero_bits_set does another "-1" the check on
      reserved bits becomes incorrect.  Using "iterator.level" in the call
      fixes the check (a toy model follows this entry) and resolves this
      call trace:
      
      WARNING: CPU: 2 PID: 17000 at arch/x86/kvm/mmu.c:3385 handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]()
      Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd fam15h_power amd64_edac_mod k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq
      [...]
      Call Trace:
        dump_stack+0x4e/0x84
        warn_slowpath_common+0x95/0xe0
        warn_slowpath_null+0x1a/0x20
        handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]
        tdp_page_fault+0x231/0x290 [kvm]
        ? emulator_pio_in_out+0x6e/0xf0 [kvm]
        kvm_mmu_page_fault+0x36/0x240 [kvm]
        ? svm_set_cr0+0x95/0xc0 [kvm_amd]
        pf_interception+0xde/0x1d0 [kvm_amd]
        handle_exit+0x181/0xa70 [kvm_amd]
        ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
        kvm_arch_vcpu_ioctl_run+0x6f6/0x1730 [kvm]
        ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
        ? preempt_count_sub+0x9b/0xf0
        ? mutex_lock_killable_nested+0x26f/0x490
        ? preempt_count_sub+0x9b/0xf0
        kvm_vcpu_ioctl+0x358/0x710 [kvm]
        ? __fget+0x5/0x210
        ? __fget+0x101/0x210
        do_vfs_ioctl+0x2f4/0x560
        ? __fget_light+0x29/0x90
        SyS_ioctl+0x4c/0x90
        entry_SYSCALL_64_fastpath+0x16/0x73
      ---[ end trace 37901c8686d84de6 ]---
      Reported-by: Borislav Petkov <bp@alien8.de>
      Tested-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      58c95070
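
      A toy model of the off-by-one: the helper indexes its reserved-bits
      table with level - 1, so passing a value that was already decremented
      walks off the table (the masks below are illustrative, not KVM's):

        #include <stdint.h>
        #include <stdio.h>

        /* Illustrative reserved-bit masks, indexed by (level - 1). */
        static const uint64_t rsvd_bits[4] = {
            0xfff0000000000000ULL, 0xfff0000000000000ULL,
            0xfff0000000000000ULL, 0xfff0000000000000ULL,
        };

        static int is_shadow_zero_bits_set(uint64_t spte, int level)
        {
            return (spte & rsvd_bits[level - 1]) != 0;  /* internal "-1" */
        }

        int main(void)
        {
            int iterator_level = 1;         /* walk ended at the 4K level   */
            int leaf = iterator_level - 1;  /* post-29ecd660: one too small */
            (void)leaf;  /* passing leaf would index rsvd_bits[-1] */

            /* The fix passes the walker's level directly: */
            printf("%d\n", is_shadow_zero_bits_set(0x0, iterator_level));
            return 0;
        }
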
    • KVM: x86: use correct page table format to check nested page table reserved bits · 6fec2144
      Authored by Paolo Bonzini
      Intel CPUID on AMD host or vice versa is a weird case, but it can
      happen.  Handle it by checking the host CPU vendor instead of the
      guest's in reset_tdp_shadow_zero_bits_mask.  For speed, the
      check uses the fact that Intel EPT has an X (executable) bit while
      AMD NPT has NX; a standalone sketch of the trick follows this entry.
      Reported-by: Borislav Petkov <bp@alien8.de>
      Tested-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6fec2144
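
      The trick, modeled standalone: with TDP the shadow masks reflect the
      host mechanism (EPT vs. NPT), so "is there an executable-page mask"
      identifies the host vendor regardless of what CPUID the guest sees.
      Names follow the commit, the values are illustrative, and the kernel
      version additionally warns unless TDP is enabled:

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        static uint64_t shadow_x_mask;  /* nonzero iff an EPT-style X bit is in use */

        static void vmx_setup_masks(void) { shadow_x_mask = 1ULL << 2; } /* EPT: X  */
        static void svm_setup_masks(void) { shadow_x_mask = 0; }         /* NPT: NX */

        /* Fast vendor check from the commit: AMD NPT has no X bit. */
        static bool boot_cpu_is_amd(void)
        {
            return shadow_x_mask == 0;
        }

        int main(void)
        {
            vmx_setup_masks();
            printf("host: %s\n", boot_cpu_is_amd() ? "AMD (NPT)" : "Intel (EPT)");
            svm_setup_masks();
            printf("host: %s\n", boot_cpu_is_amd() ? "AMD (NPT)" : "Intel (EPT)");
            return 0;
        }
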
  12. 06 Sep 2015 - 1 commit
  13. 05 Aug 2015 - 9 commits