Commits · 2b4b5af8f8e7296bc27c52023ab6bb8f53db3a2b · openeuler / raspberrypi-kernel

23 Jul, 2012 2 commits

KVM: Choose better candidate for directed yield · 06e48c51

Authored by Raghavendra K T on Jul 19, 2012

Currently, on large vcpu guests, there is a high probability of
yielding to the same vcpu that recently did a pause-loop exit or had
cpu relax intercepted. Such a yield can lead to the vcpu spinning
again and hence degrade performance.

The patchset keeps track of pause-loop exits and cpu relax interceptions
and gives a chance to a vcpu which:
 (a) Has not done a pause-loop exit or had cpu relax intercepted at all
     (probably a preempted lock holder)
 (b) Was skipped in the last iteration because it did a pause-loop exit
     or had cpu relax intercepted, and has probably become eligible now
     (next eligible lock holder)
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> # on s390x
Signed-off-by: Avi Kivity <avi@redhat.com>
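
The eligibility rule in (a) and (b) can be sketched in user-space C. This is only an illustration: the struct, its field names, and the function name are hypothetical, and toggling the flag on every check is an assumption about how "skipped in the last iteration" might be tracked.

```c
#include <stdbool.h>

/* Hypothetical, user-space stand-in for the per-vcpu spin-loop state. */
struct vcpu_spin_state {
	bool in_spin_loop;  /* vcpu did a pause-loop exit / cpu relax intercept */
	bool dy_eligible;   /* vcpu was skipped on the previous check */
};

/*
 * Decide whether this vcpu is a reasonable target for a directed yield.
 * (a) A vcpu that is not spinning is always eligible.
 * (b) A spinning vcpu is eligible only if it was skipped last time.
 * A spinning vcpu has its flag toggled on each check, so it alternates
 * between being skipped and being eligible.
 */
bool eligible_for_directed_yield(struct vcpu_spin_state *v)
{
	bool eligible = !v->in_spin_loop || v->dy_eligible;

	if (v->in_spin_loop)
		v->dy_eligible = !v->dy_eligible;

	return eligible;
}
```

With this rule, a vcpu that keeps spinning is passed over on one check and considered on the next, while a vcpu that never spun is always a candidate.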

KVM: Note down when cpu relax intercepted or pause loop exited · 4c088493

Authored by Raghavendra K T on Jul 18, 2012

Noting which vcpu did a pause-loop exit or had cpu relax intercepted helps
in filtering the right candidate to yield to. A wrong selection of vcpu,
i.e., a vcpu that just did a pl-exit or had cpu relax intercepted, may
contribute to performance degradation.
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> # on s390x
Signed-off-by: Avi Kivity <avi@redhat.com>

20 Jul, 2012 3 commits

KVM: remove the unused parameter of gfn_to_pfn_memslot · d5661048

Authored by Xiao Guangrong on Jul 17, 2012

The parameter 'kvm' is not used in gfn_to_pfn_memslot, so we can happily
remove it.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: make bad_pfn static to kvm_main.c · ca0565f5

Authored by Xiao Guangrong on Jul 17, 2012

bad_pfn is not used outside of kvm_main.c, so mark it static; also move it
near hwpoison_pfn and fault_pfn.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: using get_fault_pfn to get the fault pfn · 903816fa

Authored by Xiao Guangrong on Jul 17, 2012

Use get_fault_pfn to clean up the code.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

19 Jul, 2012 1 commit

KVM: Introduce kvm_unmap_hva_range() for kvm_mmu_notifier_invalidate_range_start() · b3ae2096

Authored by Takuya Yoshikawa on Jul 02, 2012

When we tested KVM under memory pressure, with THP enabled on the host,
we noticed that the MMU notifier took a long time to invalidate huge pages.

Since the invalidation was done with mmu_lock held, it not only wasted
CPU time but also made the host less responsive.

This patch mitigates this by using kvm_handle_hva_range().
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Cc: Alexander Graf <agraf@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
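
As a rough illustration of range-based handling, the sketch below clamps a host virtual address range to one memory slot and converts it to a guest frame range in a single step, instead of visiting each page separately. The struct, the function name, and the 4 KiB page size are assumptions made for this example, not the kernel's actual definitions.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12UL  /* assumption: 4 KiB pages */

/* Hypothetical, simplified memslot: guest frames backed by host VAs. */
struct slot {
	unsigned long base_gfn;        /* first guest frame number */
	unsigned long npages;          /* slot length in pages */
	unsigned long userspace_addr;  /* host virtual address of first page */
};

/*
 * Clamp the host range [start, end) to one slot and convert it to a guest
 * frame range [*gfn_start, *gfn_end). Returns false if they do not overlap.
 */
bool hva_range_to_gfn_range(const struct slot *s,
			    unsigned long start, unsigned long end,
			    unsigned long *gfn_start, unsigned long *gfn_end)
{
	unsigned long page = 1UL << SKETCH_PAGE_SHIFT;
	unsigned long slot_end = s->userspace_addr +
				 (s->npages << SKETCH_PAGE_SHIFT);
	unsigned long lo = start > s->userspace_addr ? start : s->userspace_addr;
	unsigned long hi = end < slot_end ? end : slot_end;

	if (lo >= hi)
		return false;

	*gfn_start = s->base_gfn +
		     ((lo - s->userspace_addr) >> SKETCH_PAGE_SHIFT);
	/* Round up so a partially covered last page is included. */
	*gfn_end = s->base_gfn +
		   ((hi - s->userspace_addr + page - 1) >> SKETCH_PAGE_SHIFT);
	return true;
}
```

A caller would run this once per slot and then process the resulting frame range under the lock, rather than taking a per-page path for every address in the range.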

07 Jul, 2012 1 commit

KVM: handle last_boosted_vcpu = 0 case · 5cfc2aab

Authored by Rik van Riel on Jun 19, 2012

If last_boosted_vcpu == 0, then we fall through all test cases and
may end up with all VCPUs pouncing on vcpu 0.  With a large enough
guest, this can result in enormous runqueue lock contention, which
can prevent vcpu0 from running, leading to a livelock.

Changing < to <= makes sure we properly handle that case.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
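
The effect of changing < to <= can be seen in a small simulation. The two-pass loop below is an assumed shape for the candidate scan (first the vcpus after the last boosted one, then wrap around); the function name and the `inclusive` switch are hypothetical and exist only to compare the two comparisons.

```c
#include <stdbool.h>

/*
 * Return the index of the first vcpu that would be examined as a yield
 * target, or -1 if none is. `inclusive` selects "<=" (true) or "<" (false)
 * for the first-pass skip test.
 */
int first_candidate(int nr_vcpus, int last_boosted, bool inclusive)
{
	for (int pass = 0; pass < 2; pass++) {
		for (int i = 0; i < nr_vcpus; i++) {
			bool skip = inclusive ? i <= last_boosted
					      : i < last_boosted;

			if (!pass && skip) {
				/* Jump ahead; the loop increment moves past it. */
				i = last_boosted;
				continue;
			} else if (pass && i > last_boosted) {
				break;
			}
			return i;
		}
	}
	return -1;
}
```

With `last_boosted == 0` and the `<` comparison, the scan starts at vcpu 0 every time; with `<=`, it starts at vcpu 1 and only wraps back to vcpu 0 in the second pass.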

04 Jul, 2012 1 commit

KVM: fix fault page leak · f4119304

Authored by Xiao Guangrong on Jul 03, 2012

We forgot to free fault_page.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

03 Jul, 2012 1 commit

KVM: Pass kvm_irqfd to functions · d4db2935

Authored by Alex Williamson on Jun 29, 2012

Prune this down to just the struct kvm_irqfd so we can avoid
changing function definition for every flag or field we use.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

18 Jun, 2012 1 commit

KVM: use KVM_CAP_IRQ_ROUTING to protect the routing related code · 9900b4b4

Authored by Marc Zyngier on Jun 15, 2012

The KVM code sometimes uses CONFIG_HAVE_KVM_IRQCHIP to protect
code that is related to IRQ routing, which not all in-kernel
irqchips may support.

Use KVM_CAP_IRQ_ROUTING instead.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

05 Jun, 2012 2 commits

KVM: Avoid wasting pages for small lpage_info arrays · c1a7b32a

Authored by Takuya Yoshikawa on May 20, 2012

lpage_info is created for each large level even when the memory slot is
not for RAM. This means that when we add one slot for a PCI device, we
end up allocating at least KVM_NR_PAGE_SIZES - 1 pages by vmalloc().

To make things worse, there is an increasing number of devices which
would result in more pages being wasted this way.

This patch mitigates this problem by using kvm_kvzalloc().
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
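
To see why the allocator choice matters for small arrays, here is a minimal model. It assumes that a kvm_kvzalloc()-style helper uses a page-granular, vmalloc-like allocation only for sizes above one page and a slab-like allocation otherwise; the names, the threshold, and the 4 KiB page size are assumptions of this sketch.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumption: 4 KiB pages */

enum backing { SLAB_LIKE, VMALLOC_LIKE };

/* Pick a backing allocator by size (assumed threshold: one page). */
enum backing pick_backing(size_t size)
{
	return size > SKETCH_PAGE_SIZE ? VMALLOC_LIKE : SLAB_LIKE;
}

/*
 * Bytes reserved for a request. Page-granular allocations are rounded up
 * to whole pages; slab rounding is ignored here for simplicity.
 */
size_t bytes_reserved(size_t size, enum backing b)
{
	if (b == VMALLOC_LIKE)
		return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE *
		       SKETCH_PAGE_SIZE;
	return size;
}
```

Under these assumptions a 64-byte lpage_info array costs a full page when it is always allocated page-granularly, but only about 64 bytes when the size-based choice is used.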

KVM: Separate out dirty_bitmap allocation code as kvm_kvzalloc() · 92eca8fa

Authored by Takuya Yoshikawa on May 20, 2012

Will be used for lpage_info allocation later.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

01 May, 2012 1 commit

KVM: s390: Implement the directed yield (diag 9c) hypervisor call for KVM · 41628d33

Authored by Konstantin Weitz on Apr 25, 2012

This patch implements the directed yield hypercall found on other
System z hypervisors. It delegates execution time to the virtual cpu
specified in the instruction's parameter.

Useful to avoid long spinlock waits in the guest.

Christian Borntraeger: moved common code in virt/kvm/
Signed-off-by: Konstantin Weitz <WEITZKON@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

24 Apr, 2012 1 commit

KVM: Introduce direct MSI message injection for in-kernel irqchips · 07975ad3

Authored by Jan Kiszka on Mar 29, 2012

Currently, MSI messages can only be injected to in-kernel irqchips by
defining a corresponding IRQ route for each message. This is not only
unwieldy if the MSI messages are generated "on the fly" by user space;
IRQ routes are also a limited resource that user space has to manage
carefully.

By providing a direct injection path, we can both avoid using up limited
resources and simplify the necessary steps for user land.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

12 Apr, 2012 1 commit

KVM: unmap pages from the iommu when slots are removed · 32f6daad

Authored by Alex Williamson on Apr 11, 2012

We've been adding new mappings, but not destroying old mappings.
This can lead to a page leak as pages are pinned using
get_user_pages, but only unpinned with put_page if they still
exist in the memslots list on vm shutdown.  A memslot that is
destroyed while an iommu domain is enabled for the guest will
therefore result in an elevated page reference count that is
never cleared.

Additionally, without this fix, the iommu is only programmed
with the first translation for a gpa.  This can result in
peer-to-peer errors if a mapping is destroyed and replaced by a
new mapping at the same gpa as the iommu will still be pointing
to the original, pinned memory address.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

08 Apr, 2012 4 commits

KVM: Remove unused dirty_bitmap_head and nr_dirty_pages · 93474b25

Authored by Takuya Yoshikawa on Mar 01, 2012

Now that we do neither double buffering nor heuristic selection of the
write protection method, these are not needed anymore.

Note: some drivers have their own implementation of set_bit_le() and
making it generic needs a bit of work; so we use test_and_set_bit_le()
and will later replace it with generic set_bit_le().
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: fix kvm_vcpu_kick build failure on S390 · 8c84780d

Authored by Marcelo Tosatti on Mar 14, 2012

S390's kvm_vcpu_stat does not contain a halt_wakeup member.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Factor out kvm_vcpu_kick to arch-generic code · b6d33834

Authored by Christoffer Dall on Mar 08, 2012

The kvm_vcpu_kick function performs roughly the same functionality on
almost all architectures, so we shouldn't have separate copies.

PowerPC keeps a pointer to interchanging waitqueues on the vcpu_arch
structure and to accommodate this special need a
__KVM_HAVE_ARCH_VCPU_GET_WQ define and accompanying function
kvm_arch_vcpu_wq have been defined. For all other architectures this
is a generic inline that just returns &vcpu->wq.
Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: resize kvm_io_range array dynamically · a1300716

Authored by Amos Kong on Mar 09, 2012

This patch allows the kvm_io_range array to be resized dynamically.
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
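
One way to grow such an array is to allocate a new container with room for one more entry and copy the old entries across. The sketch below shows that copy-on-grow approach in user-space C; the struct layouts and the function name are hypothetical, and the kernel patch may well be structured differently.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified versions of the range and bus structures. */
struct io_range {
	unsigned long addr;
	int len;
};

struct io_bus {
	int dev_count;
	struct io_range range[];  /* flexible array sized at allocation time */
};

/*
 * Return a new bus holding all ranges of `old` plus `r`, or NULL on
 * allocation failure. The old bus is left untouched, so the caller can
 * publish the new one first and free the old one afterwards.
 */
struct io_bus *io_bus_add(const struct io_bus *old, struct io_range r)
{
	int n = old ? old->dev_count : 0;
	struct io_bus *bus = malloc(sizeof(*bus) +
				    (n + 1) * sizeof(bus->range[0]));

	if (!bus)
		return NULL;
	if (n)
		memcpy(bus->range, old->range, n * sizeof(bus->range[0]));
	bus->range[n] = r;
	bus->dev_count = n + 1;
	return bus;
}
```

Sizing the array at allocation time means the structure no longer needs a compile-time maximum number of ranges.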

08 Mar, 2012 7 commits

KVM: use correct tlbs dirty type in cmpxchg · bec87d6e

Authored by Alex Shi on Mar 04, 2012

The 'int' type is not suitable for a 'long' object, so correct it.
Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Ensure all vcpus are consistent with in-kernel irqchip settings · 3e515705

Authored by Avi Kivity on Mar 05, 2012

If some vcpus are created before KVM_CREATE_IRQCHIP, then
irqchip_in_kernel() and vcpu->arch.apic will be inconsistent, leading
to potential NULL pointer dereferences.

Fix by:
- ensuring that no vcpus are installed when KVM_CREATE_IRQCHIP is called
- ensuring that a vcpu has an apic if it is installed after KVM_CREATE_IRQCHIP

This is somewhat long winded because vcpu->arch.apic is created without
kvm->lock held.

Based on earlier patch by Michael Ellerman.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: mmu_notifier: Flush TLBs before releasing mmu_lock · 565f3be2

Authored by Takuya Yoshikawa on Feb 10, 2012

Other threads may process the same page in that small window, skip the
TLB flush, and then return before these functions do the flush.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Introduce kvm_memory_slot::arch and move lpage_info into it · db3fe4eb

Authored by Takuya Yoshikawa on Feb 08, 2012

Some members of kvm_memory_slot are not used by every architecture.

This patch is the first step to make this difference clear by
introducing kvm_memory_slot::arch;  lpage_info is moved into it.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Simplify ifndef conditional usage in __kvm_set_memory_region() · 189a2f7b

Authored by Takuya Yoshikawa on Feb 08, 2012

Narrow down the controlled text inside the conditional so that it will
include lpage_info and rmap stuff only.

For this we change the way we check whether the slot is being created
from "if (npages && !new.rmap)" to "if (npages && !old.npages)".

We also stop checking if lpage_info is NULL when we create lpage_info
because we do it from inside the slot creation code block.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Split lpage_info creation out from __kvm_set_memory_region() · a64f273a

Authored by Takuya Yoshikawa on Feb 08, 2012

This makes it easy to make lpage_info architecture specific.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Introduce gfn_to_index() which returns the index for a given level · fb03cb6f

Authored by Takuya Yoshikawa on Feb 08, 2012

This patch cleans up the code and removes the "(void)level;" warning
suppressor.

Note that we can also use this for PT_PAGE_TABLE_LEVEL to treat every
level uniformly later.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
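
A helper of this kind can be sketched as follows. The shift macro assumes x86-style paging, where each additional level covers 9 more gfn bits and level 1 is the base page-table level; the macro name and the typedef are local to this example.

```c
typedef unsigned long long gfn_t;

/* Assumption: each level above 1 covers 9 more gfn bits (x86-style). */
#define HPAGE_GFN_SHIFT(level) (((level) - 1) * 9)

/* Index of `gfn` within a slot starting at `base_gfn`, at a given level. */
gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level)
{
	/* At level 1 the shift is 0, so the index is simply gfn - base_gfn. */
	return (gfn >> HPAGE_GFN_SHIFT(level)) -
	       (base_gfn >> HPAGE_GFN_SHIFT(level));
}
```

Because the level-1 shift is zero, the same expression works for the base level, which is the uniform treatment mentioned in the note above.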

05 Mar, 2012 4 commits

KVM: Move gfn_to_memslot() to kvm_host.h · 9d4cba7f

Authored by Paul Mackerras on Jan 12, 2012

This moves __gfn_to_memslot() and search_memslots() from kvm_main.c to
kvm_host.h to reduce the code duplication caused by the need for
non-modular code in arch/powerpc/kvm/book3s_hv_rm_mmu.c to call
gfn_to_memslot() in real mode.

Rather than putting gfn_to_memslot() itself in a header, which would
lead to increased code size, this puts __gfn_to_memslot() in a header.
Then, the non-modular uses of gfn_to_memslot() are changed to call
__gfn_to_memslot() instead.  This way there is only one place in the
source code that needs to be changed should the gfn_to_memslot()
implementation need to be modified.

On powerpc, the Book3S HV style of KVM has code that is called from
real mode which needs to call gfn_to_memslot() and thus needs this.
(Module code is allocated in the vmalloc region, which can't be
accessed in real mode.)

With this, we can remove builtin_gfn_to_memslot() from book3s_hv_rm_mmu.c.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Add barriers to allow mmu_notifier_retry to be used locklessly · a355aa54

Authored by Paul Mackerras on Dec 12, 2011

This adds an smp_wmb in kvm_mmu_notifier_invalidate_range_end() and an
smp_rmb in mmu_notifier_retry() so that mmu_notifier_retry() will give
the correct answer when called without kvm->mmu_lock being held.
PowerPC Book3S HV KVM wants to use a bitlock per guest page rather than
a single global spinlock in order to improve the scalability of updates
to the guest MMU hashed page table, and so needs this.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
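
The barrier pairing can be modelled in user space with C11 fences standing in for smp_wmb() and smp_rmb(). The struct and function names below are hypothetical, and the sketch assumes the check works on a sequence counter plus an in-progress counter; it is not the kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two notifier fields kept per VM. */
struct notifier_state {
	atomic_ulong seq;   /* bumped when an invalidation finishes */
	atomic_long count;  /* number of invalidations in progress */
};

void invalidate_range_start(struct notifier_state *s)
{
	atomic_fetch_add_explicit(&s->count, 1, memory_order_relaxed);
}

void invalidate_range_end(struct notifier_state *s)
{
	atomic_fetch_add_explicit(&s->seq, 1, memory_order_relaxed);
	/* Write barrier: publish the new seq before count drops. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_sub_explicit(&s->count, 1, memory_order_relaxed);
}

/* Returns true when the caller must retry its page-table update. */
bool notifier_retry(struct notifier_state *s, unsigned long seen_seq)
{
	if (atomic_load_explicit(&s->count, memory_order_relaxed))
		return true;
	/* Read barrier: pairs with the release fence above. */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->seq, memory_order_relaxed) != seen_seq;
}
```

The ordering ensures that a reader who observes the count back at zero also observes the incremented sequence number, so a lockless caller cannot miss an invalidation that just completed.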

KVM: s390: ucontrol: export SIE control block to user · 5b1c1493

Authored by Carsten Otte on Jan 04, 2012

This patch exports the s390 SIE hardware control block to userspace
via the mapping of the vcpu file descriptor. In order to do so,
a new arch callback named kvm_arch_vcpu_fault is introduced for all
architectures. It allows architecture-specific pages to be mapped.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: s390: add parameter for KVM_CREATE_VM · e08b9637

Authored by Carsten Otte on Jan 04, 2012

This patch introduces a new config option for user controlled kernel
virtual machines. It introduces a parameter to KVM_CREATE_VM that
allows setting bits that alter the capabilities of the newly created
virtual machine.
The parameter is passed to kvm_arch_init_vm for all architectures.
The only valid modifier bit for now is KVM_VM_S390_UCONTROL.
This requires CAP_SYS_ADMIN privileges and creates a user controlled
virtual machine on s390 architectures.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

01 Feb, 2012 1 commit

KVM: Fix __set_bit() race in mark_page_dirty() during dirty logging · 50e92b3c

Authored by Takuya Yoshikawa on Jan 04, 2012

It is possible that the __set_bit() in mark_page_dirty() is called
simultaneously on the same region of memory, which may result in only
one bit being set, because some callers do not take mmu_lock before
mark_page_dirty().

This problem is hard to reproduce because when we reach mark_page_dirty()
beginning from, e.g., tdp_page_fault(), mmu_lock is being held during
__direct_map():  making kvm-unit-tests' dirty log api test write to two
pages concurrently was not useful for this reason.

So we have confirmed that there can actually be a race condition by
checking if some callers really reach there without holding mmu_lock
using spin_is_locked():  probably they were from kvm_write_guest_page().

To fix this race, this patch changes the bit operation to the atomic
version:  note that nr_dirty_pages also suffers from the race but we do
not need exactly correct numbers for now.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
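
The difference between a plain and an atomic bit set can be illustrated with C11 atomics. A plain `word |= mask` is a separate load, OR, and store, so two concurrent writers to the same word can overwrite each other's bit; a single atomic read-modify-write cannot. The function names below are local to this sketch and only mirror the idea, not the kernel's bit operations.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Atomically set bit `nr` in a bitmap made of atomic words. */
void set_bit_atomic(atomic_ulong *bitmap, unsigned long nr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_WORD);

	/* One atomic read-modify-write: concurrent setters lose no bits. */
	atomic_fetch_or_explicit(&bitmap[nr / BITS_PER_WORD], mask,
				 memory_order_relaxed);
}

/* Read bit `nr` back from the bitmap. */
bool test_bit(atomic_ulong *bitmap, unsigned long nr)
{
	unsigned long word = atomic_load_explicit(&bitmap[nr / BITS_PER_WORD],
						  memory_order_relaxed);

	return (word >> (nr % BITS_PER_WORD)) & 1UL;
}
```

Relaxed ordering is enough in this sketch because only the bit update itself needs to be indivisible; no other data is published through it.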

27 Dec, 2011 9 commits

KVM: ensure that debugfs entries have been created · 4f69b680

Authored by Hamo on Dec 15, 2011

By checking the return value from kvm_init_debug, we
can ensure that the entries under debugfs for KVM have
been created correctly.
Signed-off-by: Yang Bai <hamo.by@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: drop bsp_vcpu pointer from kvm struct · d546cb40

Authored by Gleb Natapov on Dec 15, 2011

Drop bsp_vcpu pointer from kvm struct since its only use is incorrect
anyway.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Use memdup_user instead of kmalloc/copy_from_user · ff5c2c03

Authored by Sasha Levin on Dec 04, 2011

Switch to using memdup_user when possible. This makes the code
smaller and more compact, and prevents errors.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Use kmemdup() instead of kmalloc/memcpy · cdfca7b3

Authored by Sasha Levin on Dec 04, 2011

Switch to kmemdup() in two places to shorten the code and avoid possible bugs.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
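
The pattern being replaced is "allocate, check, copy". A user-space analogue of a duplicate helper, using malloc() and memcpy() in place of the kernel allocators, might look like this; `dup_bytes` is a hypothetical name chosen for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate a buffer of `len` bytes and copy `src` into it in one step. */
void *dup_bytes(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;  /* NULL on allocation failure; the caller frees */
}
```

Folding the two steps into one helper leaves a single call and a single failure check at each call site, which removes the chance of copying with a mismatched length.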

KVM: introduce a table to map slot id to index in memslots array · f85e2cb5

Authored by Xiao Guangrong on Nov 24, 2011

Getting the dirty log is a frequent operation when framebuffer-based
displays are used (for example, Xwindow), so we introduce a mapping table
to speed up id_to_memslot().
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
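
A mapping table of this kind can be sketched as an array indexed by slot id that stores each slot's current position. The struct layout, the fixed slot count, and the rebuild function below are assumptions made for illustration.

```c
#define MEM_SLOTS 8  /* assumption: small fixed slot count for illustration */

struct memslot {
	int id;
	unsigned long npages;
};

struct memslots {
	struct memslot slots[MEM_SLOTS];
	int id_to_index[MEM_SLOTS];  /* slot id -> position in slots[] */
};

/* Rebuild the table after slots[] has been reordered. */
void rebuild_id_table(struct memslots *ms)
{
	for (int i = 0; i < MEM_SLOTS; i++)
		ms->id_to_index[ms->slots[i].id] = i;
}

/* O(1) lookup instead of scanning slots[] for a matching id. */
struct memslot *id_to_memslot(struct memslots *ms, int id)
{
	return &ms->slots[ms->id_to_index[id]];
}
```

The table has to be refreshed whenever the array order changes, which is the price paid for the constant-time lookup.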

KVM: sort memslots by its size and use line search · bf3e05bc

Authored by Xiao Guangrong on Nov 24, 2011

Sort memslots based on their size and use linear search to find them, so
that the larger memslots have a better fit.

The idea is from Avi.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
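
The idea can be sketched as follows: order the slots from largest to smallest, then scan them in order, so the slots that cover the most guest frames are tried first. The struct and function names are illustrative, and qsort() is used here as a stand-in for whatever sort the patch applies.

```c
#include <stddef.h>
#include <stdlib.h>

struct memslot {
	unsigned long base_gfn;
	unsigned long npages;
};

/* Comparator that orders slots from largest to smallest. */
int cmp_npages_desc(const void *a, const void *b)
{
	const struct memslot *x = a, *y = b;

	return (x->npages < y->npages) - (x->npages > y->npages);
}

void sort_memslots(struct memslot *slots, size_t n)
{
	qsort(slots, n, sizeof(*slots), cmp_npages_desc);
}

/* Linear search; returns the slot containing `gfn`, or NULL if none. */
struct memslot *search_memslots(struct memslot *slots, size_t n,
				unsigned long gfn)
{
	for (size_t i = 0; i < n; i++)
		if (gfn >= slots[i].base_gfn &&
		    gfn < slots[i].base_gfn + slots[i].npages)
			return &slots[i];
	return NULL;
}
```

If lookups are spread roughly evenly over guest memory, most of them hit one of the first few (largest) slots, so a simple scan stays cheap.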

KVM: introduce id_to_memslot function · 28a37544

Authored by Xiao Guangrong on Nov 24, 2011

Introduce id_to_memslot to get a memslot by its slot id.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: introduce kvm_for_each_memslot macro · be6ba0f0

Authored by Xiao Guangrong on Nov 24, 2011

Introduce kvm_for_each_memslot to walk all valid memslots.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: introduce update_memslots function · be593d62

Authored by Xiao Guangrong on Nov 24, 2011

Introduce update_memslots to update the slot that will be written to
kvm->memslots.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>