1. 20 Sep 2012 (3 commits)
    • KVM: MMU: Optimize pte permission checks · 97d64b78
      Committed by Avi Kivity
      walk_addr_generic() permission checks are a maze of branchy code, which is
      performed four times per lookup.  It depends on the type of access, efer.nxe,
      cr0.wp, cr4.smep, and in the near future, cr4.smap.
      
      Optimize this away by precalculating all variants and storing them in a
      bitmap.  The bitmap is recalculated when rarely-changing variables change
      (cr0, cr4) and is indexed by the often-changing variables (page fault error
      code, pte access permissions).
      
      The permission check is moved to the end of the loop, otherwise an SMEP
      fault could be reported as a false positive, when PDE.U=1 but PTE.U=0.
      Noted by Xiao Guangrong.
      
      The result is short, branch-free code.
      Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: MMU: Move gpte_access() out of paging_tmpl.h · 3d34adec
      Committed by Avi Kivity
      We no longer rely on paging_tmpl.h defines, so the function can move
      to mmu.c.
      
      Rely on zero extension to 64 bits to get the correct nx behaviour.
      Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: MMU: Push clean gpte write protection out of gpte_access() · 8ea667f2
      Committed by Avi Kivity
      gpte_access() computes the access permissions of a guest pte and also
      write-protects clean gptes.  This is wrong when we are servicing a
      write fault (since we'll be setting the dirty bit momentarily) but
      correct when instantiating a speculative spte, or when servicing a
      read fault (since we'll want to trap a following write in order to
      set the dirty bit).
      
      It doesn't seem to hurt in practice, but in order to make the code
      readable, push the write protection out of gpte_access() and into
      a new protect_clean_gpte() which is called explicitly when needed.
      Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
  2. 10 Sep 2012 (1 commit)
  3. 22 Aug 2012 (3 commits)
  4. 06 Aug 2012 (5 commits)
  5. 26 Jul 2012 (1 commit)
  6. 20 Jul 2012 (3 commits)
  7. 19 Jul 2012 (8 commits)
  8. 11 Jul 2012 (7 commits)
  9. 09 Jul 2012 (1 commit)
    • KVM: MMU: Force cr3 reload with two dimensional paging on mov cr3 emulation · e676505a
      Committed by Avi Kivity
      Currently the MMU's ->new_cr3() callback does nothing when guest paging
      is disabled or when two-dimensional paging (e.g. EPT on Intel) is active.
      This means that an emulated write to cr3 can be lost; kvm_set_cr3() will
      write vcpu->arch.cr3, but the GUEST_CR3 field in the VMCS will retain its
      old value and this is what the guest sees.
      
      This bug did not have any effect until now because:
      - with unrestricted guest, or with svm, we never emulate a mov cr3 instruction
      - without unrestricted guest, and with paging enabled, we also never emulate a
        mov cr3 instruction
      - without unrestricted guest, but with paging disabled, the guest's cr3 is
        ignored until the guest enables paging; at this point the value from arch.cr3
        is loaded correctly by the mov cr0 instruction which turns on paging
      
      However, the patchset that enables big real mode causes us to emulate mov cr3
      instructions in protected mode sometimes (when guest state is not virtualizable
      by vmx); this mov cr3 is effectively ignored and will crash the guest.
      
      The fix is to make nonpaging_new_cr3() call mmu_free_roots() to force a cr3
      reload.  This is awkward because now all the new_cr3 callbacks do the same
      thing, and because mmu_free_roots() is somewhat of an overkill; but fixing
      that is more complicated and will be done after this minimal fix.
      
      Observed in the Windows XP 32-bit installer while bringing up secondary vcpus.
      Signed-off-by: Avi Kivity <avi@redhat.com>
  10. 04 Jul 2012 (1 commit)
    • KVM: MMU: fix shrinking page from the empty mmu · 85b70591
      Committed by Xiao Guangrong
      Fix:
      
       [ 3190.059226] BUG: unable to handle kernel NULL pointer dereference at           (null)
       [ 3190.062224] IP: [<ffffffffa02aac66>] mmu_page_zap_pte+0x10/0xa7 [kvm]
       [ 3190.063760] PGD 104f50067 PUD 112bea067 PMD 0
       [ 3190.065309] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
       [ 3190.066860] CPU 1
      [ ...... ]
       [ 3190.109629] Call Trace:
       [ 3190.111342]  [<ffffffffa02aada6>] kvm_mmu_prepare_zap_page+0xa9/0x1fc [kvm]
       [ 3190.113091]  [<ffffffffa02ab2f5>] mmu_shrink+0x11f/0x1f3 [kvm]
       [ 3190.114844]  [<ffffffffa02ab25d>] ? mmu_shrink+0x87/0x1f3 [kvm]
       [ 3190.116598]  [<ffffffff81150c9d>] ? prune_super+0x142/0x154
       [ 3190.118333]  [<ffffffff8110a4f4>] ? shrink_slab+0x39/0x31e
       [ 3190.120043]  [<ffffffff8110a687>] shrink_slab+0x1cc/0x31e
       [ 3190.121718]  [<ffffffff8110ca1d>] do_try_to_free_pages
      
      This is caused by shrinking pages from an empty mmu: although we checked
      n_used_mmu_pages, the check is useless because it is done outside mmu_lock.
      Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
  11. 14 Jun 2012 (1 commit)
  12. 12 Jun 2012 (1 commit)
  13. 06 Jun 2012 (1 commit)
    • KVM: disable uninitialized var warning · 79f702a6
      Committed by Michael S. Tsirkin
      I see this in 3.5-rc1:
      
      arch/x86/kvm/mmu.c: In function ‘kvm_test_age_rmapp’:
      arch/x86/kvm/mmu.c:1271: warning: ‘iter.desc’ may be used uninitialized in this function
      
      The line in question was introduced by commit 1e3f42f0:
      
       static int kvm_test_age_rmapp(struct kvm *kvm, unsigned long *rmapp,
                                    unsigned long data)
       {
      -       u64 *spte;
      +       u64 *sptep;
      +       struct rmap_iterator iter;   <- line 1271
              int young = 0;
      
              /*
      
      The reason, I think, is that the compiler assumes that
      the rmap value could be 0, so
      
       static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
      {
              if (!rmap)
                      return NULL;
      
              if (!(rmap & 1)) {
                      iter->desc = NULL;
                      return (u64 *)rmap;
              }
      
              iter->desc = (struct pte_list_desc *)(rmap & ~1ul);
              iter->pos = 0;
              return iter->desc->sptes[iter->pos];
      }
      
      will not initialize iter.desc, but the compiler isn't
      smart enough to see that
      
              for (sptep = rmap_get_first(*rmapp, &iter); sptep;
                   sptep = rmap_get_next(&iter)) {
      
      will immediately exit in this case.
      I checked by adding
              if (!*rmapp)
                      goto out;
      on top which is clearly equivalent but disables the warning.
      
      This patch uses uninitialized_var to disable the warning without
      increasing code size.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
  14. 05 Jun 2012 (2 commits)
  15. 28 May 2012 (1 commit)
  16. 17 May 2012 (1 commit)
    • KVM: MMU: Don't use RCU for lockless shadow walking · c142786c
      Committed by Avi Kivity
      Using RCU for lockless shadow walking can increase the amount of memory
      in use by the system, since RCU grace periods are unpredictable.  We also
      have an unconditional write to a shared variable (reader_counter), which
      isn't good for scaling.
      
      Replace that with a scheme similar to x86's get_user_pages_fast(): disable
      interrupts during lockless shadow walk to force the freer
      (kvm_mmu_commit_zap_page()) to wait for the TLB flush IPI to find the
      processor with interrupts enabled.
      
      We also add a new vcpu->mode, READING_SHADOW_PAGE_TABLES, to prevent
      kvm_flush_remote_tlbs() from avoiding the IPI.
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>