Commits · 4b16184c1ccafa4b0c188c622ea532fb90e6f5b0 · openeuler / Kernel

24 Oct, 2010 35 commits

KVM: SVM: Initialize Nested Nested MMU context on VMRUN · 4b16184c

Committed by Joerg Roedel on Sep 10, 2010

This patch adds code to initialize the Nested Nested Paging
MMU context when the L1 guest executes a VMRUN instruction
and has nested paging enabled in its VMCB.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Track NX state in struct kvm_mmu · 2d48a985

Committed by Joerg Roedel on Sep 10, 2010

With Nested Paging emulation the NX state between the two
MMU contexts may differ. To make sure that the right fault
error code is always recorded, this patch moves the NX state
into struct kvm_mmu so that the code can distinguish between
L1 and L2 NX state.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Allow long mode shadows for legacy page tables · 81407ca5

Committed by Joerg Roedel on Sep 10, 2010

Currently the KVM softmmu implementation cannot shadow a 32
bit legacy or PAE page table with a long mode page table.
This is a required feature for nested paging emulation
because the nested page table must always be in host format.
So this patch implements the missing pieces to allow long
mode page tables for page table types.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Refactor mmu_alloc_roots function · 651dd37a

Committed by Joerg Roedel on Sep 10, 2010

This patch factors out the direct-mapping paths of the
mmu_alloc_roots function into a separate function. This
makes it a lot easier to avoid all the unnecessary checks
done in the shadow path which may break when running direct.
In fact, this patch already fixes a problem when running PAE
guests on a PAE shadow page table.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Introduce kvm_pdptr_read_mmu · d41d1895

Committed by Joerg Roedel on Sep 10, 2010

This function is implemented to load the pdptr pointers of
the currently running guest (l1 or l2 guest). Therefore it
takes the current paging mode into account and can read pdptrs
out of l2 guest physical memory.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Introduce init_kvm_nested_mmu() · 02f59dc9

Committed by Joerg Roedel on Sep 10, 2010

This patch introduces the init_kvm_nested_mmu() function
which is used to re-initialize the nested mmu when the l2
guest changes its paging mode.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Implement nested gva_to_gpa functions · 6539e738

Committed by Joerg Roedel on Sep 10, 2010

This patch adds the functions to do a nested l2_gva to
l1_gpa page table walk.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: X86: Introduce pointer to mmu context used for gva_to_gpa · 14dfe855

Committed by Joerg Roedel on Sep 10, 2010

This patch introduces the walk_mmu pointer which points to
the mmu-context currently used for gva_to_gpa translations.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Track page fault data in struct vcpu · 8df25a32

Committed by Joerg Roedel on Sep 10, 2010

This patch introduces a struct with two new fields in
vcpu_arch for x86:

	* fault.address
	* fault.error_code

This will be used to correctly propagate page faults back
into the guest when we could have either an ordinary page
fault or a nested page fault. In the case of a nested page
fault the fault-address is different from the original
address that should be walked. So we need to keep track
of the real fault-address.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
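
The idea of keeping the real fault address next to the error code can be sketched in plain C. This is a minimal illustration, not the kernel code: `struct fault_info`, `struct vcpu_sketch`, `record_fault()`, and the field types are assumptions chosen for the example; only the two field names come from the commit message.

```c
#include <stdint.h>

/* Hypothetical stand-in for the fault record described above. */
struct fault_info {
	uint64_t address;    /* address that actually faulted */
	uint32_t error_code; /* page-fault error code bits */
};

/* Hypothetical, heavily reduced vcpu state. */
struct vcpu_sketch {
	struct fault_info fault;
};

/*
 * Record the address to report to the guest. For an ordinary fault this is
 * the address being walked; for a nested fault it is the address at which
 * the nested walk failed, which differs from the walked address.
 */
static void record_fault(struct vcpu_sketch *vcpu, uint64_t walked_addr,
			 uint64_t failing_addr, int nested,
			 uint32_t error_code)
{
	vcpu->fault.address = nested ? failing_addr : walked_addr;
	vcpu->fault.error_code = error_code;
}
```

Whoever later injects the fault can then read `fault.address` without needing to know which kind of walk produced it.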

KVM: MMU: Let is_rsvd_bits_set take mmu context instead of vcpu · 3241f22d

Committed by Joerg Roedel on Sep 10, 2010

This patch changes is_rsvd_bits_set() function prototype to
take only a kvm_mmu context instead of a full vcpu.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Introduce kvm_init_shadow_mmu helper function · 52fde8df

Committed by Joerg Roedel on Sep 10, 2010

Some logic of the init_kvm_softmmu function is required to
build the Nested Nested Paging context. So factor the
required logic into a separate function and export it.
Also make the whole init path suitable for more than one mmu
context.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Introduce inject_page_fault function pointer · cb659db8

Committed by Joerg Roedel on Sep 10, 2010

This patch introduces an inject_page_fault function pointer
into struct kvm_mmu which will be used to inject a page
fault. This will be used later when Nested Nested Paging is
implemented.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Introduce get_cr3 function pointer · 5777ed34

Committed by Joerg Roedel on Sep 10, 2010

This function pointer in the MMU context is required to
implement Nested Nested Paging.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: X86: Introduce a tdp_set_cr3 function · 1c97f0a0

Committed by Joerg Roedel on Sep 10, 2010

This patch introduces a special set_tdp_cr3 function pointer
in kvm_x86_ops which is only used for tdp enabled mmu
contexts. This allows removing some hacks from svm code.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Make set_cr3 a function pointer in kvm_mmu · f43addd4

Committed by Joerg Roedel on Sep 10, 2010

This is necessary to implement Nested Nested Paging. As a
side effect this allows some cleanups in the SVM nested
paging code.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
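
The commits above move several operations (get_cr3, set_cr3, inject_page_fault) into per-context function pointers. The sketch below shows why that helps, assuming a hypothetical reduced context with a single `get_cr3` hook; `struct mmu_ctx`, `struct vcpu_ctx`, and the helper names are illustrative and do not reproduce the kernel's definitions.

```c
#include <stdint.h>

struct vcpu_ctx;

/* Hypothetical MMU context: behaviour is selected per context. */
struct mmu_ctx {
	uint64_t (*get_cr3)(struct vcpu_ctx *vcpu);
};

/* Hypothetical vcpu state holding two possible page-table roots. */
struct vcpu_ctx {
	uint64_t l1_cr3;     /* root used by the L1 guest */
	uint64_t nested_cr3; /* root the L1 guest supplied for its L2 guest */
	struct mmu_ctx mmu;
};

static uint64_t get_l1_cr3(struct vcpu_ctx *vcpu)
{
	return vcpu->l1_cr3;
}

static uint64_t get_nested_cr3(struct vcpu_ctx *vcpu)
{
	return vcpu->nested_cr3;
}

/* Generic code asks the context and never needs to know which mode is active. */
static uint64_t guest_root(struct vcpu_ctx *vcpu)
{
	return vcpu->mmu.get_cr3(vcpu);
}
```

Switching a context between ordinary and nested operation then amounts to installing a different set of hooks, rather than adding mode checks to every caller.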

KVM: MMU: Make tdp_enabled a mmu-context parameter · c5a78f2b

Committed by Joerg Roedel on Sep 10, 2010

This patch changes the tdp_enabled flag from its global
meaning to the mmu-context and renames it to direct_map
there. This is necessary for Nested SVM with emulation of
Nested Paging where we need an extra MMU context to shadow
the Nested Nested Page Table.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Fix 32 bit legacy paging with NPT · f87f9288

Committed by Joerg Roedel on Sep 02, 2010

This patch fixes 32 bit legacy paging with NPT enabled. The
mmu_check_root call on the top-level of the loop causes
root_gfn to take values (in the tdp_enabled path) which are
outside of guest memory. So the mmu_check_root call fails at
some point in the loop iteration causing the guest to
triple-fault.
This patch changes the mmu_check_root calls to the places
where they are really necessary. As a side-effect it
introduces a check for the root of a pae page table too.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: move audit to a separate file · 2f4f3372

Committed by Xiao Guangrong on Aug 30, 2010

Move the audit code from arch/x86/kvm/mmu.c to arch/x86/kvm/mmu_audit.c
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: support disable/enable mmu audit dynamicly · 8b1fe17c

Committed by Xiao Guangrong on Aug 30, 2010

Add a read/write module parameter named 'mmu_audit' that enables or
disables auditing at runtime:

enable:
  echo 1 > /sys/module/kvm/parameters/mmu_audit

disable:
  echo 0 > /sys/module/kvm/parameters/mmu_audit

This patch does not change the logic.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: remove count_rmaps() · 8e0e8afa

Committed by Xiao Guangrong on Aug 28, 2010

Nothing is checked in count_rmaps(), so remove it
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: rewrite audit_mappings_page() function · 365fb3fd

Committed by Xiao Guangrong on Aug 28, 2010

There is a bug in this function: gfn_to_pfn() and kvm_mmu_gva_to_gpa_read() are called in
atomic context (kvm_mmu_audit() is called under the protection of the mmu_lock spinlock).

This patch fixes it by:
- introducing gfn_to_pfn_atomic to use instead of gfn_to_pfn
- getting the mapping gfn from kvm_mmu_page_get_gfn()

It also adds a check for 'notrap' ptes in unsync/direct sps
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: fix wrong not write protected sp report · bc32ce21

Committed by Xiao Guangrong on Aug 28, 2010

The audit code currently reports that some sps are not write protected. This is just a
bug in audit_write_protection(), because:

- an invalid sp does not need to be write protected
- it uses an uninitialized local variable ('gfn')
- it calls kvm_mmu_audit() outside of mmu_lock's protection
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: check rmap for every spte · 0beb8d66

Committed by Xiao Guangrong on Aug 28, 2010

Read-only sptes also have reverse mappings, so fix the code to check them,
and rename the function to match what it does
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: fix compile warning in audit code · 9ad17b10

Committed by Xiao Guangrong on Aug 28, 2010

fix:

arch/x86/kvm/mmu.c: In function ‘kvm_mmu_unprotect_page’:
arch/x86/kvm/mmu.c:1741: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 3 has type ‘gfn_t’
arch/x86/kvm/mmu.c:1745: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 3 has type ‘gfn_t’
arch/x86/kvm/mmu.c: In function ‘mmu_unshadow’:
arch/x86/kvm/mmu.c:1761: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 3 has type ‘gfn_t’
arch/x86/kvm/mmu.c: In function ‘set_spte’:
arch/x86/kvm/mmu.c:2005: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 3 has type ‘gfn_t’
arch/x86/kvm/mmu.c: In function ‘mmu_set_spte’:
arch/x86/kvm/mmu.c:2033: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 7 has type ‘gfn_t’
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
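
Warnings of this kind appear when a 64-bit value is passed to a `%lx` conversion on a platform where `long` is narrower or a different type. A common way to make the format and argument agree is to print with `%llx` and cast to `unsigned long long`; the sketch below shows that approach. The `gfn_t` typedef and `format_gfn()` helper are assumptions for this example, and the sketch is not taken from the patch itself.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption for this sketch: the frame number is a 64-bit integer. */
typedef uint64_t gfn_t;

/*
 * Format a frame number in hex. The explicit cast keeps the argument type
 * in step with the %llx conversion regardless of how uint64_t is defined.
 */
static int format_gfn(char *buf, size_t len, gfn_t gfn)
{
	return snprintf(buf, len, "gfn %llx", (unsigned long long)gfn);
}
```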

KVM: MMU: prefetch ptes when intercepted guest #PF · 957ed9ef

Committed by Xiao Guangrong on Aug 22, 2010

Support prefetching ptes when a guest #PF is intercepted, to avoid
#PFs on later accesses

If any failure occurs in the prefetch path, we exit it and do not
try the other ptes, so that it does not become a heavy path
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: fix missing percpu counter destroy · 45bf21a8

Committed by Wei Yongjun on Aug 23, 2010

commit ad05c88266b4cce1c820928ce8a0fb7690912ba1
(KVM: create aggregate kvm_total_used_mmu_pages value)
introduced the percpu counter kvm_total_used_mmu_pages but never
destroys it, which may cause an oops on rmmod & modprobe.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: fix regression from rework mmu_shrink() code · 80b63faf

Committed by Xiaotian Feng on Aug 24, 2010

The latest kvm mmu_shrink code rework makes the kernel change kvm->arch.n_used_mmu_pages/
kvm->arch.n_max_mmu_pages in kvm_mmu_free_page/kvm_mmu_alloc_page, which is called
by kvm_mmu_commit_zap_page. So kvm->arch.n_used_mmu_pages or
kvm_mmu_available_pages(vcpu->kvm) is unchanged after kvm_mmu_prepare_zap_page().
This caused kvm_mmu_change_mmu_pages/__kvm_mmu_free_some_pages to loop forever.
Moving kvm_mmu_commit_zap_page makes the while loop behave normally.
Reported-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Tested-by: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: create aggregate kvm_total_used_mmu_pages value · 45221ab6

Committed by Dave Hansen on Aug 19, 2010

Of slab shrinkers, the VM code says:

 * Note that 'shrink' will be passed nr_to_scan == 0 when the VM is
 * querying the cache size, so a fastpath for that case is appropriate.

and it *means* it.  Look at how it calls the shrinkers:

    nr_before = (*shrinker->shrink)(0, gfp_mask);
    shrink_ret = (*shrinker->shrink)(this_scan, gfp_mask);

So, if you do anything stupid in your shrinker, the VM will doubly
punish you.

The mmu_shrink() function takes the global kvm_lock, then acquires
every VM's kvm->mmu_lock in sequence.  If we have 100 VMs, then
we're going to take 101 locks.  We do it twice, so each call takes
202 locks.  If we're under memory pressure, we can have each cpu
trying to do this.  It can get really hairy, and we've seen lock
spinning in mmu_shrink() be the dominant entry in profiles.

This is guaranteed to optimize at least half of those lock
acquisitions away.  It removes the need to take any of the locks
when simply trying to count objects.

A 'percpu_counter' can be a large object, but we only have one
of these for the entire system.  There are not any better
alternatives at the moment, especially ones that handle CPU
hotplug.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
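
The lock-free counting path described above can be illustrated with a small userspace sketch. It is a simplification under stated assumptions: a plain global `total_used_pages` stands in for the percpu counter, `locks_taken` merely counts where a real implementation would acquire the global and per-VM locks, and `struct vm_sketch`, `account_pages()`, and `shrink_sketch()` are hypothetical names. Unlike this sketch, a real shrinker need not walk every VM.

```c
#include <stddef.h>

/* Hypothetical per-VM state: only the number of pages in use. */
struct vm_sketch {
	long used_pages;
};

static long total_used_pages; /* stand-in for the aggregate counter */
static int locks_taken;       /* counts simulated lock acquisitions */

/* Every change to a VM's page count also updates the aggregate. */
static void account_pages(struct vm_sketch *vm, long delta)
{
	vm->used_pages += delta;
	total_used_pages += delta;
}

static long shrink_sketch(struct vm_sketch *vms, size_t nr_vms, int nr_to_scan)
{
	/* Query-only call: answer from the aggregate without any locking. */
	if (nr_to_scan == 0)
		return total_used_pages;

	locks_taken++; /* simulated global list lock */
	for (size_t i = 0; i < nr_vms && nr_to_scan > 0; i++) {
		long freed;

		locks_taken++; /* simulated per-VM lock */
		freed = vms[i].used_pages < nr_to_scan ?
			vms[i].used_pages : nr_to_scan;
		account_pages(&vms[i], -freed);
		nr_to_scan -= (int)freed;
	}
	return total_used_pages;
}
```

With this shape, the size query that precedes each real scan costs no lock acquisitions at all, which is where the "at least half" saving comes from.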

KVM: replace x86 kvm n_free_mmu_pages with n_used_mmu_pages · 49d5ca26

Committed by Dave Hansen on Aug 19, 2010

Doing this makes the code much more readable.  That's
borne out by the fact that this patch removes code.  "used"
also happens to be the number that we need to return back to
the slab code when our shrinker gets called.  Keeping this
value as opposed to free makes the next patch simpler.

So, 'struct kvm' is kzalloc()'d.  'struct kvm_arch' is a
structure member (and not a pointer) of 'struct kvm'.  That
means they start out zeroed.  I _think_ they get initialized
properly by kvm_mmu_change_mmu_pages().  But, that only happens
via kvm ioctls.

Another benefit of storing 'used' instead of 'free' is
that the values are consistent from the moment the structure is
allocated: no negative "used" value.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
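
The benefit of tracking "used" rather than "free" is easy to see in a reduced form. In the sketch below, `struct kvm_arch_sketch` and `mmu_available_pages()` are hypothetical stand-ins; only the field names follow the ones mentioned in this series. Deriving the available count as the maximum minus the used count is an assumption about how such a helper could be written.

```c
/* Hypothetical reduced form of the per-VM MMU accounting fields. */
struct kvm_arch_sketch {
	unsigned int n_used_mmu_pages; /* pages currently allocated */
	unsigned int n_max_mmu_pages;  /* high watermark: pages that may be allocated */
};

/* "Available" is derived on demand instead of being stored separately. */
static unsigned int mmu_available_pages(const struct kvm_arch_sketch *arch)
{
	return arch->n_max_mmu_pages - arch->n_used_mmu_pages;
}
```

A zero-initialized structure is already consistent under this scheme: zero pages used, zero allowed, zero available.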

KVM: rename x86 kvm->arch.n_alloc_mmu_pages · 39de71ec

Committed by Dave Hansen on Aug 19, 2010

arch.n_alloc_mmu_pages is a poor choice of name. This value truly
means, "the number of pages which _may_ be allocated".  But,
reading the name, "n_alloc_mmu_pages" implies "the number of allocated
mmu pages", which is dead wrong.

It's really the high watermark, so let's give it a name to match:
nr_max_mmu_pages.  This change will make the next few patches
much more obvious and easy to read.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: abstract kvm x86 mmu->n_free_mmu_pages · e0df7b9f

Committed by Dave Hansen on Aug 19, 2010

"free" is a poor name for this value.  In this context, it means,
"the number of mmu pages which this kvm instance should be able to
allocate."  But "free" implies much more that the objects are there
and ready for use.  "available" is a much better description, especially
when you see how it is calculated.

In this patch, we abstract its use into a function.  We'll soon
replace the function's contents by calculating the value in a
different way.

All of the reads of n_free_mmu_pages are taken care of in this
patch.  The modification sites will be handled in a patch
later in the series.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: mark page dirty only when page is really written · 4132779b

Committed by Xiao Guangrong on Aug 02, 2010

Mark a page dirty only when it is really written. This is more exact,
and also fixes dirty page marking in the speculation path
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: move bits lost judgement into a separate function · 8672b721

Committed by Xiao Guangrong on Aug 02, 2010

Introduce the spte_has_volatile_bits() function to judge whether spte
bits can be lost. It is more readable and will help us clean up code
later
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: using kvm_set_pfn_accessed() instead of mark_page_accessed() · 251464c4

Committed by Xiao Guangrong on Aug 02, 2010

It's a small cleanup that uses kvm_set_pfn_accessed() instead
of mark_page_accessed()
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: remove valueless output message · 19ada5c4

Committed by Xiao Guangrong on Jul 27, 2010

After commit 53383eaad08d, '*spte' has already been updated before
rmap_remove() is called (in most cases it is 'shadow_trap_nonpresent_pte'), so
remove this information from the error message
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

07 Aug, 2010 1 commit

x86, kvm: Remove cast obsoleted by set_64bit() prototype cleanup · 7645e432

Committed by H. Peter Anvin on Aug 06, 2010

KVM ended up having to put a pretty ugly wrapper around set_64bit()
in order to get the type right.  Now set_64bit() takes the expected
u64 type, and this wrapper can be cleaned up.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Avi Kivity <avi@redhat.com>
LKML-Reference: <4C5C4E7A.8040603@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

02 Aug, 2010 4 commits

KVM: MMU: using __xchg_spte more smarter · 9a3aad70

Committed by Xiao Guangrong on Jul 16, 2010

Sometimes setting the spte atomically is not needed; this patch calls
__xchg_spte() more selectively

Note: if the old mapping's accessed bit is already set, no atomic operation
is needed since the accessed bit cannot be lost
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
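
The decision described above, combined with the "not present" case from the neighbouring commit in this group, can be sketched with C11 atomics. The bit positions, `update_spte()`, and the `used_atomic` out-parameter are assumptions made for this illustration; the kernel's own helpers and masks differ.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, chosen for this sketch only. */
#define SPTE_PRESENT  (1ULL << 0)
#define SPTE_ACCESSED (1ULL << 5)

/*
 * Install new_spte and return the old value. An atomic exchange is needed
 * only while hardware could still set the accessed bit behind our back,
 * that is, when the old entry is present and its accessed bit is clear.
 */
static uint64_t update_spte(_Atomic uint64_t *sptep, uint64_t new_spte,
			    bool *used_atomic)
{
	uint64_t old = atomic_load_explicit(sptep, memory_order_relaxed);

	if (!(old & SPTE_PRESENT) || (old & SPTE_ACCESSED)) {
		/* Nothing can be lost: a plain store is sufficient. */
		atomic_store_explicit(sptep, new_spte, memory_order_relaxed);
		*used_atomic = false;
		return old;
	}

	/* The accessed bit may be set concurrently: capture it atomically. */
	*used_atomic = true;
	return atomic_exchange(sptep, new_spte);
}
```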

KVM: MMU: cleanup spte set and accssed/dirty tracking · e4b502ea

Committed by Xiao Guangrong on Jul 16, 2010

Introduce set_spte_track_bits() to clean up the current code
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: don't atomicly set spte if it's not present · be233d49

Committed by Xiao Guangrong on Jul 16, 2010

If the old mapping is not present, spte.a cannot be lost, so no
atomic operation is needed to set it
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: fix page dirty tracking lost while sync page · 9ed5520d

Committed by Xiao Guangrong on Jul 16, 2010

In the sync-page path, if spte.writable is changed, page dirty tracking
is lost. For example:

assume spte.writable = 0 in an unsync page. When it is synced, the spte is mapped
writable (that is, spte.writable = 1). Later the guest writes spte.gfn, which means
spte.gfn is dirty. Then the guest changes this mapping to read-only, and after it is
synced, spte.writable = 0

So, when the host releases the spte, it detects spte.writable = 0 and does not mark
the page dirty
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
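
The scenario above can be replayed in a few lines of C. This sketch shows one way to avoid the loss: conservatively mark the page dirty at the moment a writable entry becomes read-only. The bit position, `struct page_sketch`, and both helper functions are hypothetical and are not claimed to match the actual patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, chosen for this sketch only. */
#define SPTE_WRITABLE (1ULL << 1)

/* Hypothetical stand-in for the host page backing spte.gfn. */
struct page_sketch {
	bool dirty;
};

/*
 * Install a new spte. If the old entry was writable and the new one is not,
 * the guest may already have written the page, so record that before the
 * writable bit disappears.
 */
static void set_spte_sketch(uint64_t *sptep, uint64_t new_spte,
			    struct page_sketch *page)
{
	if ((*sptep & SPTE_WRITABLE) && !(new_spte & SPTE_WRITABLE))
		page->dirty = true;
	*sptep = new_spte;
}

/* On release, only a still-writable spte marks the page dirty. */
static void release_spte_sketch(uint64_t *sptep, struct page_sketch *page)
{
	if (*sptep & SPTE_WRITABLE)
		page->dirty = true;
	*sptep = 0;
}
```

Without the check in `set_spte_sketch()`, the release path alone would see a read-only entry and the earlier write would go unrecorded.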

openeuler / Kernel: last synced successfully more than 1 year ago